Benchmark

TERM
Benchmark
DEFINITION
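A standardized test that measures an AI model's performance on a fixed set of tasks, so that different models can be compared on the same inputs with the same scoring rule. Examples: MMLU (multiple-choice questions across many academic subjects) and HumanEval (Python code generation checked by unit tests).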
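
EXAMPLE
A minimal sketch of how a benchmark score might be computed, assuming a simple exact-match scoring rule. The dataset, the toy_model function, and the exact_match_accuracy helper are all hypothetical names invented for illustration. Real benchmarks such as MMLU and HumanEval ship their own data and scoring harnesses.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical toy benchmark: (question, reference answer) pairs.
TOY_BENCHMARK = [
    ("2 + 2", "4"),
    ("Capital of France", "Paris"),
    ("Opposite of hot", "cold"),
]


def toy_model(question: str) -> str:
    """Stand-in for a real model: answers from a fixed lookup table."""
    canned = {
        "2 + 2": "4",
        "Capital of France": "paris",
        "Opposite of hot": "warm",  # deliberately wrong
    }
    return canned.get(question, "")


def exact_match_accuracy(
    model: Callable[[str], str],
    items: Sequence[Tuple[str, str]],
) -> float:
    """Fraction of items where the model output equals the reference,
    ignoring case and surrounding whitespace."""
    if not items:
        raise ValueError("benchmark has no items")
    correct = sum(
        model(question).strip().lower() == reference.strip().lower()
        for question, reference in items
    )
    return correct / len(items)


score = exact_match_accuracy(toy_model, TOY_BENCHMARK)
print(f"{score:.2f}")  # prints 0.67 (2 of 3 answers match)
```

Because every model is run on the same items and graded by the same rule, the resulting scores are directly comparable, which is the point of a benchmark.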