Scenario guide
Best AI models for Local / Self-Hosted Coding
A coding assistant you run on your own hardware: no API bills, full privacy, and offline operation. The ranking is restricted to open-weight, self-hostable models and is weighted toward practical coding and agentic ability, with a smaller share for the self-correction that smaller local models rely on inside a good harness. There is no cost axis because inference runs on hardware you already own.
Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
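The normalisation described above can be sketched as follows. This is a minimal illustration, not the leaderboard's actual implementation: the model names, benchmark names, and weights are hypothetical, and "skipped without penalty" is assumed to mean that the remaining weights are renormalised over the benchmarks a model actually reports.

```python
from typing import Dict, Optional

# model -> benchmark -> raw score (None or absent when the score is missing)
Scores = Dict[str, Dict[str, Optional[float]]]


def min_max_normalise(raw: Scores) -> Scores:
    """Rescale each benchmark to [0, 1] across the models that report it."""
    benchmarks = {bench for per_model in raw.values() for bench in per_model}
    normalised: Scores = {model: {} for model in raw}
    for bench in benchmarks:
        present = [s[bench] for s in raw.values() if s.get(bench) is not None]
        if not present:
            continue  # no model reports this benchmark
        lo, hi = min(present), max(present)
        for model, per_model in raw.items():
            value = per_model.get(bench)
            if value is None:
                normalised[model][bench] = None  # stays missing
            elif hi == lo:
                # Assumption: every model ties, so use a neutral midpoint.
                normalised[model][bench] = 0.5
            else:
                normalised[model][bench] = (value - lo) / (hi - lo)
    return normalised


def scenario_score(
    norm: Dict[str, Optional[float]], weights: Dict[str, float]
) -> Optional[float]:
    """Weighted mean on a 0-100 scale over the benchmarks that are present."""
    total = 0.0
    weight_sum = 0.0
    for bench, weight in weights.items():
        value = norm.get(bench)
        if value is None:
            continue  # missing score: skip it and do not count its weight
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        return None  # nothing to score
    return 100.0 * total / weight_sum


# Hypothetical data, for illustration only.
raw = {
    "model_a": {"coding": 80.0, "agentic": 60.0, "self_correction": 70.0},
    "model_b": {"coding": 60.0, "agentic": None, "self_correction": 50.0},
    "model_c": {"coding": 40.0, "agentic": 20.0},
}
weights = {"coding": 0.5, "agentic": 0.3, "self_correction": 0.2}

norm = min_max_normalise(raw)
scores = {model: scenario_score(norm[model], weights) for model in raw}
for model, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {score:.1f}")
```

Dividing by the sum of the weights actually used is what keeps a missing benchmark from dragging a model down; treating a missing score as zero instead would penalise models for gaps in coverage.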
- 1. DeepSeek R1 | DeepSeek | Score 86.2 | Q 86.2 | In $0.55/M
- 2. DeepSeek V3 (Thinking) | DeepSeek | Score 78.3 | Q 78.3 | In $0.27/M
- 3. Kimi K2 | Moonshot (Kimi) | Score 77.2 | Q 77.2 | In $0.60/M
- 4. Qwen3 235B | Alibaba (Qwen) | Score 73.2 | Q 73.2 | In $0.20/M
- 5. Qwen3 235B (Thinking) | Alibaba (Qwen) | Score 66.5 | Q 66.5 | In $0.20/M
- 6. GLM-4.7 | Zhipu AI (GLM) | Score 63.2 | Q 63.2 | In $0.50/M
- 7. DeepSeek V3 | DeepSeek | Score 61.0 | Q 61.0 | In $0.27/M
- 8. GLM-4.6 | Zhipu AI (GLM) | Score 60.7 | Q 60.7 | In $0.50/M
- 9. Llama 3.3 70B Instruct | Meta | Score 58.5 | Q 58.5 | In $0.88/M
- 10. Qwen2.5 72B Instruct | Alibaba (Qwen) | Score 51.2 | Q 51.2 | In $0.90/M
- 11. Llama 3.1 70B Instruct | Meta | Score 46.8 | Q 46.8 | In $0.88/M
- 12. Llama 4 Scout | Meta | Score 25.4 | Q 25.4 | In $0.18/M
- 13. Llama 3.1 405B Instruct | Meta | Score 21.2 | Q 21.2 | In $3.50/M
- 14. Mistral Large 2 | Mistral | Score 20.6 | Q 20.6 | In $2.00/M
- 15. Llama 4 Maverick | Meta | Score 19.4 | Q 19.4 | In $0.27/M