Value ranking
Best value on LiveCodeBench
LiveCodeBench is a continuously refreshed coding benchmark that draws problems from LeetCode, AtCoder, and Codeforces; the rolling problem set reduces benchmark contamination.
“Value” is the normalized benchmark score (scaled 0–100 within this leaderboard cohort) divided by the input price in dollars per million tokens. A higher value means more capability per dollar on this axis only; always sanity-check latency, context length, and performance on your real workload. A sketch of the computation follows the table.
| Rank | Model | Vendor | Value | Score (0–100) | Input price ($/M tokens) |
|------|-------|--------|-------|---------------|--------------------------|
| 1 | Qwen3 235B | Alibaba (Qwen) | 433.20 | 86.6 | $0.20 |
| 2 | Gemini 2.5 Flash | Google | 254.63 | 76.4 | $0.30 |
| 3 | DeepSeek R1 | DeepSeek | 171.51 | 94.3 | $0.55 |
| 4 | DeepSeek V3 | DeepSeek | 100.41 | 27.1 | $0.27 |
| 5 | o4-mini | OpenAI | 90.91 | 100.0 | $1.10 |
| 6 | Gemini 2.5 Pro | Google | 75.31 | 94.1 | $1.25 |
| 7 | o3-mini | OpenAI | 61.62 | 67.8 | $1.10 |
| 8 | Claude Sonnet 4 (Thinking) | Anthropic | 21.25 | 63.8 | $3.00 |
| 9 | Claude Sonnet 4 | Anthropic | 15.39 | 46.2 | $3.00 |
| 10 | o3 | OpenAI | 9.51 | 95.1 | $10.00 |
| 11 | Claude 3.5 Sonnet | Anthropic | 8.51 | 25.5 | $3.00 |
| 12 | Claude Opus 4 (Thinking) | Anthropic | 4.49 | 67.4 | $15.00 |
| 13 | Claude Opus 4 | Anthropic | 3.46 | 51.9 | $15.00 |
| 14 | GPT-4o | OpenAI | 2.15 | 5.4 | $2.50 |
| 15 | GPT-4 Turbo | OpenAI | 0.35 | 3.5 | $10.00 |
| 16 | GPT-4o mini | OpenAI | 0.00 | 0.0 | $0.15 |
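For concreteness, here is a minimal Python sketch of the value computation described above, using four rows from the table. The `Model` dataclass and `value` helper are illustrative names, not part of any published tooling, and recomputed values can differ from the table in the last decimal if the leaderboard divides unrounded scores.

```python
# Minimal sketch of the "value" metric: normalized benchmark score
# divided by input price in USD per million tokens. Data taken from
# the table above; names and structure here are illustrative only.
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    score: float  # normalized LiveCodeBench score, 0-100 within this cohort
    price: float  # input price, USD per million tokens

MODELS = [
    Model("Qwen3 235B", 86.6, 0.20),
    Model("Gemini 2.5 Flash", 76.4, 0.30),
    Model("DeepSeek R1", 94.3, 0.55),
    Model("o3", 95.1, 10.00),
]

def value(m: Model) -> float:
    """Value = score / price; the displayed table may use unrounded scores."""
    return m.score / m.price

# Sorting descending by value reproduces the ranking order of the table.
for rank, m in enumerate(sorted(MODELS, key=value, reverse=True), start=1):
    print(f"{rank}. {m.name}: {value(m):.2f} ({m.score} / ${m.price:.2f}/M)")
```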
AI Model Analyzer does not recommend specific vendors; rankings are derived from public data only.