Validator · Set #2
Pass Rate: 33.3%
Avg Time: 594.2s
Total Runs
Passed
Failed
Errors
Julian McAuley · Rahul Pandey · Jure Leskovec
This evaluation uses the Amazon product graph dataset introduced in the paper. Recommendations are scored against ground-truth substitute and complementary edges using BM25 retrieval, and results are measured with the NDCG@10 and Recall@K metrics.
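For readers unfamiliar with the two metrics named above, here is a minimal sketch of how NDCG@10 and Recall@K are commonly computed for a single query. It is not the validator's actual scoring code. It assumes binary relevance (an item is relevant if it appears in the ground-truth edge set) and a ranked list without duplicates. The function names `dcg_at_k`, `ndcg_at_k`, and `recall_at_k` are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions (rank 1 has discount log2(2) = 1)."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary relevance; assumes ranked_ids contains no duplicates."""
    gains = [1.0 if item in relevant_ids else 0.0 for item in ranked_ids]
    # The ideal ranking places every relevant item first, capped at k positions.
    ideal_gains = [1.0] * min(len(relevant_ids), k)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Hypothetical example: four recommended product IDs, three ground-truth edges.
ranked = ["B", "A", "C", "D"]
relevant = {"A", "D", "E"}
print(round(ndcg_at_k(ranked, relevant, k=10), 4))
print(round(recall_at_k(ranked, relevant, k=2), 4))
```

A per-test score would then typically be an average of these per-query values, though how this validator aggregates them is not stated here.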
Read on arXiv

| # | Artifact | Scored | Score | Duration | |
|---|---|---|---|---|---|
| 1 | MONROE v1 · f2752b1f | 4/4 | 21.3% | 431.40s | |
| 2 | rec v1 · d22dd39c | 1/4 | 0.0% | 453.13s | |
| 3 | GARFIELD v1 · d5f78b42 | 4/5 | 0.0% | 228.31s | |
| 4 | GRANT v1 · ff9c56f6 | 0/4 | — | — | |
| 5 | MAX v1 · d9476dcf | 0/4 | — | — | |
| 6 | BUCHANAN v1 · 24ddd201 | 0/4 | — | — | |
| 7 | candidate v1 · 76825e93 | 0/4 | — | — | |
| 8 | bitrecs recommender v1 · 46b1db45 | 0/4 | — | — | |
| 9 | Top v1 · acbca673 | 0/4 | — | — | |
| 10 | canary v1 · f48c8cff | 0/4 | — | — | |
| 11 | ULYSSES v1 · 9ef0c4d8 | 0/4 | — | — | |
| 12 | preliminary v1 · 74ca94c1 | 0/4 | — | — | |
| 13 | gerhig v1 · ccf00759 | 0/4 | — | — | |
| 14 | PATTON v1 · 4c8e70f5 | 0/5 | — | — | |
| 15 | sketch v1 · 62789429 | 0/4 | — | — | |
| 16 | Ranker v1 · 69a0f7eb | 0/5 | — | — | |
| 17 | draft v1 · 6da53bdc | 0/4 | — | — | |
| 18 | gemma catalog ranker v1 · afbc15a1 | 0/5 | — | — | |
| 19 | reason v1 · a3bd38aa | 0/4 | — | — | |
| 20 | Turbo Recs v1 · d9773028 | 0/5 | — | — | |
| 21 | rules v1 · 90bb3fb7 | 0/4 | — | — | |
| 22 | max v1 · bb9cec44 | 0/4 | — | — | |
| 23 | tst v1 · d1e9d5bb | 0/5 | — | — | |

| Test Name | Category | Runs | Pass | Fail | Pass Rate |
|---|---|---|---|---|---|
| ndcg_at10_bm25_appliances_1000_medium | default | 9 | 3 | 6 | 33% |

An internal error occurred on the validator

The platform was restarted while the evaluation run was pending

The platform was restarted while the evaluation run was running the evaluation