apps/learning-api/evals-playground/TEST_SET.md
Quiz Eval Test Set
The quiz eval test set is a 45-deck manifest defined in test_set.py. It is used by the eval pipeline to run paired comparisons across audience, subject, and language slices.
Composition
Audience:
- university: 18 decks
- highschool: 18 decks
- other: 9 decks
Language:
- English: 23 decks
- German: 10 decks
- French: 7 decks
- Dutch: 5 decks
Subject:
- base/highschool: 18 decks
- other: 9 decks
- business: 4 decks
- law: 4 decks
- medicine: 4 decks
- humanities: 2 decks
- science: 2 decks
- social sciences: 2 decks
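Each of the three breakdowns above should account for all 45 decks, and the university subjects should add up to the 18 university decks. A minimal sketch of that consistency check follows. The dictionaries are copied from the lists above; the names `AUDIENCE`, `LANGUAGE`, `SUBJECT`, and `slice_totals` are illustrative and are not taken from test_set.py.

```python
# Slice counts copied from the composition lists in this document.
# These dictionaries are illustrative; test_set.py may store the manifest differently.
AUDIENCE = {"university": 18, "highschool": 18, "other": 9}
LANGUAGE = {"English": 23, "German": 10, "French": 7, "Dutch": 5}
SUBJECT = {
    "base/highschool": 18,
    "other": 9,
    "business": 4,
    "law": 4,
    "medicine": 4,
    "humanities": 2,
    "science": 2,
    "social sciences": 2,
}

EXPECTED_TOTAL = 45


def slice_totals():
    """Return the number of decks covered by each breakdown."""
    return {
        "audience": sum(AUDIENCE.values()),
        "language": sum(LANGUAGE.values()),
        "subject": sum(SUBJECT.values()),
    }


def university_subject_total():
    """Sum the subject buckets that are neither highschool nor other."""
    non_university = {"base/highschool", "other"}
    return sum(n for name, n in SUBJECT.items() if name not in non_university)


if __name__ == "__main__":
    for name, total in slice_totals().items():
        status = "ok" if total == EXPECTED_TOTAL else "MISMATCH"
        print(f"{name}: {total} decks ({status})")
```

Running a check like this after editing the manifest can catch a deck that was added to one breakdown but not the others.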
Local Source Files
Raw source documents live in a local-only folder:
Test-set/
That folder is gitignored because it contains eval materials and can be large. The expected layout is:
Test-set/
  university/{subject}/{language}/{deck_id}__{name}.pdf
  university/{subject}/{language}/{deck_id}__{name}.docx
  university/{subject}/{language}/{deck_id}__{name}.pptx
  highschool/{language}/{deck_id}__{name}.pdf
  other/{language}/other/{deck_id}__{name}.pdf
Only files whose names start with a deck id defined in test_set.py are picked
up by the local ingestion script; any other file in the folder is ignored.
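The filename rule can be sketched as follows. This is not the actual ingestion script: `find_deck_files` is a hypothetical helper, and it assumes that deck ids are plain strings containing no double underscore, so that the `__` separator from the layout above cleanly splits the id from the name.

```python
from pathlib import Path

# Separator between deck id and name, taken from the {deck_id}__{name} layout.
SEPARATOR = "__"


def find_deck_files(root, deck_ids):
    """Map each known deck id to the files under root whose names start with it.

    Hypothetical helper: assumes deck ids never contain the "__" separator.
    """
    known = set(deck_ids)
    matches = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Split "deckid__name.ext" into the id and the rest of the filename.
        deck_id, sep, _rest = path.name.partition(SEPARATOR)
        if sep and deck_id in known:
            matches.setdefault(deck_id, []).append(path)
    return matches
```

A call such as `find_deck_files("Test-set", deck_ids)`, with `deck_ids` taken from the manifest, would return only the files that belong to a known deck. Files without the `__` separator or with an unknown id are skipped.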