Value ranking
Best value on MMLU Pro
A harder version of MMLU, the benchmark that tests knowledge across 57 academic subjects; it offers more answer choices per question, which makes guessing less rewarding.
“Value” is the normalized benchmark score (0–100 within this leaderboard cohort) divided by the input price per million tokens. A higher value means more capability per dollar on this axis only; always sanity-check latency, context length, and your real workload.
- 1. Qwen2.5 72B Instruct, Alibaba (Qwen): value 111.11 (100.0 / $0.90/M)
- 2. Llama 3.3 70B Instruct, Meta: value 84.38 (74.3 / $0.88/M)
- 3. Llama 3.1 70B Instruct, Meta: value 82.15 (72.3 / $0.88/M)
- 4. Mixtral 8x22B, Mistral: value 0.00 (0.0 / $1.20/M)
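The value figures in the list above can be approximated with a short sketch of the stated formula (normalized score divided by input price per million tokens). The `Entry` class and `value` function are illustrative names, not part of any published tool, and the inputs are the rounded figures shown in the list, so recomputed values may differ from the displayed ones in the second decimal place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical record for one leaderboard row."""
    model: str
    normalized_score: float   # 0-100 within this leaderboard cohort
    input_price_per_m: float  # input price per million tokens, in dollars


def value(entry: Entry) -> float:
    """Normalized score divided by input price per million tokens."""
    if entry.input_price_per_m <= 0:
        raise ValueError("input price must be positive")
    return entry.normalized_score / entry.input_price_per_m


# Figures copied from the list above. The normalized scores are displayed
# rounded to one decimal, so results here are approximations.
entries = [
    Entry("Qwen2.5 72B Instruct", 100.0, 0.90),
    Entry("Llama 3.3 70B Instruct", 74.3, 0.88),
    Entry("Llama 3.1 70B Instruct", 72.3, 0.88),
    Entry("Mixtral 8x22B", 0.0, 1.20),
]

# Rank by value, highest first.
ranked = sorted(entries, key=value, reverse=True)

for rank, entry in enumerate(ranked, start=1):
    print(f"{rank}. {entry.model}: {value(entry):.2f}")
```

Because the score is normalized within the cohort, a value of 0.00 only means the model scored lowest among the models compared here, and a small price difference can reorder models whose scores are close.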
AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.