Validator · Set #2

ndcg_at10_bm25_appliances_1000_medium

Pass Rate

33.3%

Avg Time

594.2s

Total Runs

227

Passed

3 (1%)

Failed

6 (3%)

Errors

218 (96%)

Methodology

Reference Paper · arXiv:1506.08839

Inferring Networks of Substitutable and Complementary Products

Julian McAuley · Rahul Pandey · Jure Leskovec

This evaluation uses the Amazon product graph dataset introduced in the paper. Recommendations are scored against ground-truth substitute and complementary edges using BM25 retrieval, and ranking quality is measured with NDCG@10 and Recall@K.
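
As a rough illustration of the two metrics named above, the following sketch computes NDCG@k and Recall@k for a single ranked list. It assumes binary relevance (an item counts as relevant if it appears among the ground-truth edges) and the standard log2 discount. The function names and the sample product IDs are hypothetical, and the validator's actual scoring code is not shown on this page.

```python
import math


def dcg_at_k(gains, k):
    # Discounted cumulative gain: the item at 0-based rank r is discounted by log2(r + 2).
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    # Binary relevance is an assumption: 1.0 if the item is a ground-truth edge, else 0.0.
    gains = [1.0 if item in relevant_ids else 0.0 for item in ranked_ids]
    # The ideal ranking places every relevant item first.
    ideal = [1.0] * min(len(relevant_ids), k)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    # Fraction of the ground-truth items that appear in the top k.
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Hypothetical example: four ranked candidates, two ground-truth items.
ranked = ["B1", "B2", "B3", "B4"]
relevant = {"B2", "B4"}
ndcg = ndcg_at_k(ranked, relevant, k=10)
recall = recall_at_k(ranked, relevant, k=3)
```

In this example the relevant items sit at ranks 2 and 4, so NDCG@10 is about 0.65, and Recall@3 is 0.5 because only one of the two relevant items is in the top three.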

All Runs

98 shown

#	Artifact	Scored	Score	Duration	Latest
1	MONROE v1 · f2752b1f	4/4	21.3%	431.40s	5/4/2026
2	rec v1 · d22dd39c	1/4	0.0%	453.13s	5/5/2026
3	GARFIELD v1 · d5f78b42	4/5	0.0%	228.31s	5/4/2026
4	GRANT v1 · ff9c56f6	0/4	—	—	5/6/2026
5	MAX v1 · d9476dcf	0/4	—	—	5/5/2026
6	BUCHANAN v1 · 24ddd201	0/4	—	—	5/4/2026
7	candidate v1 · 76825e93	0/4	—	—	5/4/2026
8	bitrecs recommender v1 · 46b1db45	0/4	—	—	5/4/2026
9	Top v1 · acbca673	0/4	—	—	5/4/2026
10	canary v1 · f48c8cff	0/4	—	—	5/4/2026
11	ULYSSES v1 · 9ef0c4d8	0/4	—	—	5/3/2026
12	preliminary v1 · 74ca94c1	0/4	—	—	5/3/2026
13	gerhig v1 · ccf00759	0/4	—	—	5/3/2026
14	PATTON v1 · 4c8e70f5	0/5	—	—	5/3/2026
15	sketch v1 · 62789429	0/4	—	—	5/3/2026
16	Ranker v1 · 69a0f7eb	0/5	—	—	5/2/2026
17	draft v1 · 6da53bdc	0/4	—	—	5/1/2026
18	gemma catalog ranker v1 · afbc15a1	0/5	—	—	5/1/2026
19	reason v1 · a3bd38aa	0/4	—	—	5/1/2026
20	Turbo Recs v1 · d9773028	0/5	—	—	5/1/2026
21	rules v1 · 90bb3fb7	0/4	—	—	5/1/2026
22	max v1 · bb9cec44	0/4	—	—	4/30/2026
23	tst v1 · d1e9d5bb	0/5	—	—	4/30/2026

Test Cases (1)

Test Name	Category	Runs	Pass	Fail	Pass Rate
ndcg_at10_bm25_appliances_1000_medium	default	9	3	6	33%
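
The figures on this page suggest that the pass rate is computed over scored runs only (passes plus failures), with errored runs excluded. That reading is an inference from the numbers shown, not a documented rule. A minimal sketch of the arithmetic, using the counts from the table above and the Total Runs figure (the function name is hypothetical):

```python
def pass_rate(passed, failed):
    # Assumption: errored runs are excluded from the denominator.
    scored = passed + failed
    return passed / scored if scored else 0.0


passed, failed, total_runs = 3, 6, 227

# Runs that neither passed nor failed are treated as errors here.
errors = total_runs - (passed + failed)
rate = pass_rate(passed, failed)

print(f"{rate:.1%} pass rate, {errors} errored runs")
```

Under this assumption, 3 passes out of 9 scored runs gives 33.3%, and the remaining 218 of the 227 runs are errors.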