Scenario guide
Best AI models for Realtime Chat / Voice
A streaming consumer chat or voice assistant. Speed and time-to-first-token matter as much as raw quality — a slightly less smart model that responds instantly often beats a frontier model that pauses to think.
Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
- 1. Gemini 2.0 Flash | Google | Score 91.3 | Q 89.9 | In $0.10/M
- 2. DeepSeek V3 | DeepSeek | Score 89.9 | Q 95.8 | In $0.27/M
- 3. Gemini 1.5 Flash | Google | Score 88.9 | Q 84.1 | In $0.08/M
- 4. DeepSeek R1 | DeepSeek | Score 86.4 | Q 96.3 | In $0.55/M
- 5. Gemini 3 Flash | Google | Score 84.5 | Q 92.3 | In $0.30/M
- 6. Gemini 2.5 Flash | Google | Score 83.0 | Q 90.1 | In $0.30/M
- 7. GLM-4.6 | Zhipu AI (GLM) | Score 83.0 | Q 90.1 | In $0.50/M
- 8. Qwen3 235B (Thinking) | Alibaba (Qwen) | Score 82.3 | Q 81.4 | In $0.20/M
- 9. DeepSeek V3 (Thinking) | DeepSeek | Score 82.2 | Q 84.8 | In $0.27/M
- 10. GLM-4.7 | Zhipu AI (GLM) | Score 81.3 | Q 87.8 | In $0.50/M
- 11. GPT-5.4 | OpenAI | Score 81.2 | Q 100.0 | In $1.50/M
- 12. GPT-5.5 | OpenAI | Score 80.4 | Q 98.9 | In $1.50/M
- 13. Kimi K2 | Moonshot (Kimi) | Score 77.1 | Q 84.0 | In $0.60/M
- 14. Gemini 3 Pro | Google | Score 76.5 | Q 91.9 | In $1.25/M
- 15. GPT-5.1 | OpenAI | Score 75.4 | Q 90.4 | In $1.25/M
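The per-benchmark min-max normalisation and the "missing scores are skipped without penalty" rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the leaderboard's actual code: the 0 to 100 scale, the benchmark names, the weights, and the function names are all hypothetical, and the cost-blending step is left out because the page does not specify it.

```python
from typing import Dict, Optional

# model -> benchmark -> raw score (None when the model has no score)
Scores = Dict[str, Dict[str, Optional[float]]]


def min_max_normalise(raw: Scores) -> Scores:
    """Rescale each benchmark to [0, 100] across the models that report it."""
    benchmarks = {b for per_model in raw.values() for b in per_model}
    bounds = {}
    for b in benchmarks:
        values = [s[b] for s in raw.values() if s.get(b) is not None]
        if values:
            bounds[b] = (min(values), max(values))

    out: Scores = {}
    for model, per_model in raw.items():
        out[model] = {}
        for b, v in per_model.items():
            if v is None or b not in bounds:
                out[model][b] = None  # keep the gap; do not treat it as zero
                continue
            lo, hi = bounds[b]
            # If every model ties on a benchmark, give all of them full marks.
            out[model][b] = 100.0 if hi == lo else 100.0 * (v - lo) / (hi - lo)
    return out


def weighted_quality(
    norm: Dict[str, Optional[float]], weights: Dict[str, float]
) -> Optional[float]:
    """Weighted mean over the benchmarks a model has; missing ones are skipped."""
    num = 0.0
    den = 0.0
    for b, w in weights.items():
        v = norm.get(b)
        if v is None:
            continue  # skipped without penalty: its weight leaves the denominator too
        num += w * v
        den += w
    return num / den if den > 0 else None


# Hypothetical raw scores and scenario weights, for illustration only.
raw = {
    "model_a": {"bench_speed": 10.0, "bench_reasoning": 80.0},
    "model_b": {"bench_speed": 20.0, "bench_reasoning": None},
    "model_c": {"bench_speed": 30.0, "bench_reasoning": 60.0},
}
weights = {"bench_speed": 0.25, "bench_reasoning": 0.75}

norm = min_max_normalise(raw)
quality = {m: weighted_quality(norm[m], weights) for m in raw}
```

In this sketch, `model_b` has no `bench_reasoning` score, so its quality is computed from `bench_speed` alone rather than being dragged down by an implicit zero. One consequence of this design is that a model with few reported benchmarks is judged only on those, which can flatter or hurt it depending on which scores are missing.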