What's the Cheapest LLM for Coding?

Finding the cheapest LLM for coding requires balancing price with coding ability. Here are the most affordable coding-capable models, ranked by a weighted cost metric (60% input, 40% output):


  • Phi-4 (Microsoft): $0.070/M input, $0.140/M output. Coding ELO: 1130. Speed: 160 tok/s.
  • Gemini 2.0 Flash Lite (Google): $0.075/M input, $0.300/M output. Coding ELO: 1170. Speed: 180 tok/s.
  • Llama 4 Scout (Meta): $0.100/M input, $0.300/M output. Coding ELO: 1230. Speed: 110 tok/s.
  • Mistral Small (Mistral): $0.100/M input, $0.300/M output. Coding ELO: 1160. Speed: 120 tok/s.
  • Gemini 2.0 Flash (Google): $0.100/M input, $0.400/M output. Coding ELO: 1240. Speed: 160 tok/s.
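The ranking above can be reproduced directly from the listed prices. Below is a minimal sketch of the blended-cost calculation (60% input, 40% output, both in $ per million tokens), using the figures from the list:

```python
# Prices from the list above: (input $/M, output $/M).
PRICES = {
    "Phi-4": (0.070, 0.140),
    "Gemini 2.0 Flash Lite": (0.075, 0.300),
    "Llama 4 Scout": (0.100, 0.300),
    "Mistral Small": (0.100, 0.300),
    "Gemini 2.0 Flash": (0.100, 0.400),
}

def weighted_cost(input_price: float, output_price: float) -> float:
    """Blend input/output $/M-token prices 60/40, as in the ranking."""
    return 0.6 * input_price + 0.4 * output_price

# Sort ascending by blended cost; ties keep list order.
ranking = sorted(PRICES, key=lambda m: weighted_cost(*PRICES[m]))
for model in ranking:
    print(f"{model}: ${weighted_cost(*PRICES[model]):.3f}/M blended")
```

Running this confirms Phi-4 as the cheapest at $0.098/M blended, with Gemini 2.0 Flash the most expensive of the five at $0.220/M.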

For simple code generation and autocomplete, smaller models like GPT-4.1 Nano or Gemini 2.0 Flash Lite are extremely affordable. For complex multi-file refactoring and architecture decisions, investing in Claude Sonnet 4 or GPT-4.1 pays off in fewer iterations and better results.


Cost-saving tips for coding workloads:

  • Use prompt caching for system prompts with coding instructions.
  • Batch non-urgent code reviews through batch APIs.
  • Start with a cheaper model and only escalate to premium models for complex tasks.
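The start-cheap-then-escalate tip can be sketched as a simple router. Everything here is an assumption for illustration: `call_model` is a hypothetical stand-in for your provider's SDK, and the keyword heuristic is a toy example of how you might decide when a task warrants a premium model.

```python
# Hypothetical model identifiers chosen for illustration.
CHEAP_MODEL = "gemini-2.0-flash-lite"
PREMIUM_MODEL = "claude-sonnet-4"

def call_model(model: str, prompt: str) -> str:
    # Hypothetical placeholder; swap in your provider's actual API call.
    return f"[{model}] response to: {prompt}"

def needs_premium(prompt: str) -> bool:
    # Toy heuristic (an assumption): escalate for multi-file
    # refactoring or architecture-level requests.
    keywords = ("refactor", "architecture", "multi-file", "design")
    return any(k in prompt.lower() for k in keywords)

def route(prompt: str) -> str:
    # Default to the cheap model; escalate only when the task looks complex.
    model = PREMIUM_MODEL if needs_premium(prompt) else CHEAP_MODEL
    return call_model(model, prompt)
```

In practice you might escalate on failure (retry with the premium model when the cheap one's output doesn't pass tests) rather than guessing from the prompt, but the routing shape is the same.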
