Scenario guide

Best AI models for Local / Self-Hosted Coding

A coding assistant you run on your own hardware: no API bills, full privacy, works offline. This ranking is restricted to open-weight, self-hostable models and weighted toward practical coding and agentic ability, with a slice for the self-correction that smaller local models lean on inside a good harness. There is no cost axis because inference runs on hardware you already own.

Rankings use the same scenario weights and cost blending as the interactive leaderboard on AI Model Analyzer. Data is min-max normalised per benchmark; missing scores are skipped without penalty.
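As a rough illustration of the scoring described above, here is a minimal Python sketch of per-benchmark min-max normalisation followed by a weighted average. It rests on assumptions not stated on this page: the benchmark names, model names, and weights are hypothetical; "skipped without penalty" is read as renormalising the weights over the benchmarks a model actually has; and the result is scaled to 0-100. The leaderboard's real implementation may differ.

```python
from typing import Optional

Scores = dict[str, Optional[float]]  # model name -> raw score (None if missing)


def min_max_normalise(raw: Scores) -> Scores:
    """Scale one benchmark's scores to [0, 1]; missing scores stay None."""
    present = [v for v in raw.values() if v is not None]
    if not present:
        return {model: None for model in raw}
    lo, hi = min(present), max(present)
    span = hi - lo
    return {
        model: None if v is None else (0.0 if span == 0 else (v - lo) / span)
        for model, v in raw.items()
    }


def scenario_score(
    normalised: dict[str, Scores], weights: dict[str, float], model: str
) -> Optional[float]:
    """Weighted mean over the benchmarks the model has a score for.

    Assumption: a missing score drops that benchmark's weight from the
    denominator, so the model is not penalised for the gap.
    """
    total = 0.0
    weight_sum = 0.0
    for bench, weight in weights.items():
        value = normalised.get(bench, {}).get(model)
        if value is None:
            continue  # skip missing scores without penalty
        total += weight * value
        weight_sum += weight
    if weight_sum == 0:
        return None  # no usable benchmarks for this model
    return 100 * (total / weight_sum)


# Hypothetical data: two benchmarks, three models, one missing score.
raw_scores = {
    "coding": {"model_a": 80.0, "model_b": 60.0, "model_c": 40.0},
    "agentic": {"model_a": 50.0, "model_b": None, "model_c": 30.0},
}
weights = {"coding": 0.6, "agentic": 0.4}  # hypothetical scenario weights

normalised = {bench: min_max_normalise(s) for bench, s in raw_scores.items()}
ranking = sorted(
    ((scenario_score(normalised, weights, m), m) for m in raw_scores["coding"]),
    reverse=True,
)
for score, model in ranking:
    print(f"{model}: {score:.1f}")
```

In this sketch model_b has no agentic score, so it is ranked on the coding benchmark alone instead of being treated as scoring zero on the missing one.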

Rank  Model                     Provider         Score  Q      In
1     DeepSeek R1               DeepSeek         86.2   86.2   $0.55/M
2     DeepSeek V3 (Thinking)    DeepSeek         77.9   77.9   $0.27/M
3     Kimi K2                   Moonshot (Kimi)  75.0   75.0   $0.60/M
4     Qwen3 235B                Alibaba (Qwen)   73.1   73.1   $0.20/M
5     Qwen3 235B (Thinking)     Alibaba (Qwen)   63.1   63.1   $0.20/M
6     DeepSeek V3               DeepSeek         61.0   61.0   $0.27/M
7     GLM-4.7                   Zhipu AI (GLM)   60.4   60.4   $0.50/M
8     GLM-4.6                   Zhipu AI (GLM)   60.2   60.2   $0.50/M
9     Llama 3.3 70B Instruct    Meta             58.3   58.3   $0.88/M
10    Qwen2.5 72B Instruct      Alibaba (Qwen)   50.7   50.7   $0.90/M
11    Llama 3.1 70B Instruct    Meta             46.6   46.6   $0.88/M
12    Llama 4 Scout             Meta             24.9   24.9   $0.18/M
13    Llama 3.1 405B Instruct   Meta             20.8   20.8   $3.50/M
14    Mistral Large 2           Mistral          20.3   20.3   $2.00/M
15    Llama 4 Maverick          Meta             19.2   19.2   $0.27/M
