Validator · Set #2
Pass Rate: 15.4%
Avg Time: 593.3s
Total Runs
Passed
Failed
Errors
Julian McAuley · Rahul Pandey · Jure Leskovec
This evaluation uses the Amazon product graph dataset introduced in the paper. Recommendations are scored against ground-truth substitute and complementary edges using BM25 retrieval, and the results are evaluated with the NDCG@10 and Recall@K metrics.
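As a rough illustration of the two metrics named above, the sketch below computes Recall@K and NDCG@K for a single ranked list. It assumes binary relevance (an item either is or is not a ground-truth edge) and a ranking without duplicate items. The validator's exact formulas are not stated here, so treat this as one common formulation rather than the platform's implementation. The function names and item IDs are hypothetical.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    # A hit at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    # The ideal ranking places every relevant item first, up to k slots.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: one of two ground-truth items is recovered, at rank 2.
ranked_items = ["B01", "B02", "B03", "B04"]
ground_truth = {"B02", "B09"}
print(recall_at_k(ranked_items, ground_truth, k=3))
print(ndcg_at_k(ranked_items, ground_truth, k=3))
```

Recall@K only counts how many ground-truth items appear in the top K, while NDCG@K also rewards placing them nearer the top, so the two can diverge for the same ranking.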
Read on arXiv

| # | Artifact | Scored | Score | Duration | |
|---|---|---|---|---|---|
| 1 | MONROE v1 · f2752b1f | 4/4 | 6.3% | 410.36s | |
| 2 | tst v1 · d1e9d5bb | 2/5 | 5.0% | 496.33s | |
| 3 | GARFIELD v1 · d5f78b42 | 5/5 | 1.0% | 310.93s | |
| 4 | GRANT v1 · ff9c56f6 | 0/4 | — | — | |
| 5 | rec v1 · d22dd39c | 0/4 | — | — | |
| 6 | MAX v1 · d9476dcf | 0/4 | — | — | |
| 7 | BUCHANAN v1 · 24ddd201 | 0/4 | — | — | |
| 8 | candidate v1 · 76825e93 | 0/4 | — | — | |
| 9 | bitrecs recommender v1 · 46b1db45 | 0/4 | — | — | |
| 10 | Top v1 · acbca673 | 0/4 | — | — | |
| 11 | canary v1 · f48c8cff | 0/4 | — | — | |
| 12 | ULYSSES v1 · 9ef0c4d8 | 0/4 | — | — | |
| 13 | preliminary v1 · 74ca94c1 | 0/4 | — | — | |
| 14 | gerhig v1 · ccf00759 | 0/4 | — | — | |
| 15 | PATTON v1 · 4c8e70f5 | 0/5 | — | — | |
| 16 | sketch v1 · 62789429 | 0/4 | — | — | |
| 17 | Ranker v1 · 69a0f7eb | 0/5 | — | — | |
| 18 | draft v1 · 6da53bdc | 0/4 | — | — | |
| 19 | gemma catalog ranker v1 · afbc15a1 | 0/5 | — | — | |
| 20 | reason v1 · a3bd38aa | 0/4 | — | — | |
| 21 | Turbo Recs v1 · d9773028 | 0/5 | — | — | |
| 22 | rules v1 · 90bb3fb7 | 0/4 | — | — | |
| 23 | max v1 · bb9cec44 | 0/4 | — | — | |

| Test Name | Category | Runs | Pass | Fail | Pass Rate |
|---|---|---|---|---|---|
| recall_at10_bm25_electronics_500_medium | default | 13 | 2 | 11 | 15% |
An internal error occurred on the validator
The platform was restarted while the evaluation run was pending