Bitrecs Dashboard

Validator · Set #2

ndcg_at10_bm25_appliances_1000_medium

Pass Rate

33.3%

Avg Time

594.2s

Total Runs

227

Passed

3 (1%)

Failed

6 (3%)

Errors

218 (96%)
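
The summary figures appear to fit together if the counts are read as 3 passed, 6 failed, and 218 errored runs (summing to the 227 total runs), with the pass rate computed over scored runs only. The sketch below reproduces that arithmetic. The function name `summarize_runs` and the rule that errors are excluded from the pass rate are assumptions inferred from the numbers, not something the dashboard states.

```python
def summarize_runs(passed: int, failed: int, errors: int) -> dict:
    """Recompute dashboard-style summary figures from raw run counts.

    Assumption: the pass rate is taken over scored runs (passed + failed),
    while the per-status shares are taken over all runs, errors included.
    """
    total = passed + failed + errors
    scored = passed + failed
    return {
        "total_runs": total,
        "pass_rate": passed / scored if scored else 0.0,
        "passed_share": passed / total if total else 0.0,
        "failed_share": failed / total if total else 0.0,
        "errors_share": errors / total if total else 0.0,
    }


summary = summarize_runs(passed=3, failed=6, errors=218)
print(f"total runs: {summary['total_runs']}")
print(f"pass rate:  {summary['pass_rate']:.1%}")
print(f"errors:     {summary['errors_share']:.0%}")
```

Under this reading, only 9 of the 227 runs produced a score, so the 33.3% pass rate describes a very small sample.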

Methodology

Reference Paper: arXiv:1506.08839

Inferring Networks of Substitutable and Complementary Products

Julian McAuley · Rahul Pandey · Jure Leskovec

This evaluation uses the Amazon product graph dataset introduced in the paper. Recommendations are scored against ground-truth substitute and complementary edges using BM25 retrieval, and ranking quality is measured with the NDCG@10 and Recall@K metrics.
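
As a rough illustration of the two metrics named above, the following sketch computes NDCG@10 and Recall@K for one ranked recommendation list with binary relevance. It is not the validator's code: the function names, the binary-gain assumption (an item is relevant if it is a ground-truth edge, otherwise not), and the example product IDs are all hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions (log2 discount)."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary gains; assumes ranked_ids contains no duplicates."""
    relevant = set(relevant_ids)
    gains = [1.0 if item in relevant else 0.0 for item in ranked_ids[:k]]
    # The ideal ranking places every relevant item first.
    ideal = dcg_at_k([1.0] * min(len(relevant), k), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top k recommendations."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


# Hypothetical product IDs: two ground-truth edges, found at ranks 1 and 3.
ranked = ["B001", "B002", "B003", "B004"]
truth = {"B001", "B003"}
print(round(ndcg_at_k(ranked, truth, k=10), 4))
print(recall_at_k(ranked, truth, k=2))
```

NDCG rewards placing relevant items near the top of the list, while Recall@K only checks whether they appear within the first K positions, so the two can disagree on the same list.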

All Runs

98 shown
| # | Artifact | Scored | Score | Duration | Latest |
|---|----------|--------|-------|----------|--------|
| 1 | MONROE (v1 · f2752b1f) | 4/4 | 21.3% | 431.40s | 5/4/2026 |
| 2 | rec (v1 · d22dd39c) | 1/4 | 0.0% | 453.13s | 5/5/2026 |
| 3 | GARFIELD (v1 · d5f78b42) | 4/5 | 0.0% | 228.31s | 5/4/2026 |
| 4 | GRANT (v1 · ff9c56f6) | 0/4 | - | - | 5/6/2026 |
| 5 | MAX (v1 · d9476dcf) | 0/4 | - | - | 5/5/2026 |
| 6 | BUCHANAN (v1 · 24ddd201) | 0/4 | - | - | 5/4/2026 |
| 7 | candidate (v1 · 76825e93) | 0/4 | - | - | 5/4/2026 |
| 8 | bitrecs recommender (v1 · 46b1db45) | 0/4 | - | - | 5/4/2026 |
| 9 | Top (v1 · acbca673) | 0/4 | - | - | 5/4/2026 |
| 10 | canary (v1 · f48c8cff) | 0/4 | - | - | 5/4/2026 |
| 11 | ULYSSES (v1 · 9ef0c4d8) | 0/4 | - | - | 5/3/2026 |
| 12 | preliminary (v1 · 74ca94c1) | 0/4 | - | - | 5/3/2026 |
| 13 | gerhig (v1 · ccf00759) | 0/4 | - | - | 5/3/2026 |
| 14 | PATTON (v1 · 4c8e70f5) | 0/5 | - | - | 5/3/2026 |
| 15 | sketch (v1 · 62789429) | 0/4 | - | - | 5/3/2026 |
| 16 | Ranker (v1 · 69a0f7eb) | 0/5 | - | - | 5/2/2026 |
| 17 | draft (v1 · 6da53bdc) | 0/4 | - | - | 5/1/2026 |
| 18 | gemma catalog ranker (v1 · afbc15a1) | 0/5 | - | - | 5/1/2026 |
| 19 | reason (v1 · a3bd38aa) | 0/4 | - | - | 5/1/2026 |
| 20 | Turbo Recs (v1 · d9773028) | 0/5 | - | - | 5/1/2026 |
| 21 | rules (v1 · 90bb3fb7) | 0/4 | - | - | 5/1/2026 |
| 22 | max (v1 · bb9cec44) | 0/4 | - | - | 4/30/2026 |
| 23 | tst (v1 · d1e9d5bb) | 0/5 | - | - | 4/30/2026 |

Test Cases (1)

| Test Name | Category | Runs | Pass | Fail | Pass Rate |
|-----------|----------|------|------|------|-----------|
| ndcg_at10_bm25_appliances_1000_medium | default | 9 | 3 | 6 | 33% |

Errors

Code 200: 214 (98%)

An internal error occurred on the validator

Code 300: 3 (1%)

The platform was restarted while the evaluation run was pending

Code 304: 1 (0%)

The platform was restarted while the evaluation run was running the evaluation