Cheapest LLM APIs (2026)

The most affordable large language model APIs ranked by price per million tokens — ideal for high-volume workloads, prototyping, and cost-sensitive production apps.

Why Gemini 2.0 Flash Lite is Best for Cheapest LLM APIs

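Gemini 2.0 Flash Lite ranks first for this use case. At $0.075 per million input tokens and $0.300 per million output tokens, it has the lowest listed prices in the comparison, and its listed speed of 180 tok/s is the highest among the top five. Its Arena ELO of 1200 is not the highest here (Gemini 2.0 Flash scores 1260); the ranking also weighs benchmark performance and capability coverage.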

Cost Estimate

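For a typical workload (~50M tokens/month, 60% input / 40% output), the cheapest qualifying model (Gemini 2.0 Flash Lite) costs approximately $8.25/month: 30M input tokens at $0.075/M ($2.25) plus 20M output tokens at $0.300/M ($6.00). Higher-scoring models cost more for the same workload; for example, Gemini 2.0 Flash (ELO 1260) comes to about $11.00/month at its listed rates of $0.100/M input and $0.400/M output.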
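The estimate above is plain arithmetic over the token split and the per-million prices. A minimal sketch follows; the function name `monthly_cost` and its parameters are illustrative assumptions, not part of any provider SDK, and the prices are the ones listed on this page.

```python
def monthly_cost(total_tokens_m: float, input_share: float,
                 input_price: float, output_price: float) -> float:
    """Return the estimated monthly cost in USD.

    total_tokens_m: monthly volume, in millions of tokens
    input_share:    fraction of tokens that are input (0..1)
    input_price:    USD per million input tokens
    output_price:   USD per million output tokens
    """
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m * (1.0 - input_share)
    return input_tokens_m * input_price + output_tokens_m * output_price


# Gemini 2.0 Flash Lite at the prices listed on this page,
# for the assumed 50M tokens/month, 60% input / 40% output workload.
cost = monthly_cost(50, 0.60, input_price=0.075, output_price=0.300)
print(f"${cost:.2f}/month")  # prints $8.25/month
```

Changing `input_share` shows how sensitive the total is to the mix: output tokens are four times the price of input tokens for this model, so output-heavy workloads cost noticeably more.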

Price vs Quality for Cheapest LLM APIs

[Chart: model price (log scale) plotted against quality, colored by provider. Legend entries recovered: Anthropic, Google.]

Top 5 Models Compared

Rank	Model	Provider	Input $/M	Output $/M	Arena ELO	Speed (tok/s)
#1	Gemini 2.0 Flash Lite	Google	$0.075	$0.300	1200	180
#2	GPT-4.1 Nano	OpenAI	$0.100	$0.400	1180	150
#3	GPT-4.1 Mini	OpenAI	$0.400	$1.60	1240	115
#4	Claude Haiku 4	Anthropic	$1.00	$5.00	1220	130
#5	Gemini 2.0 Flash	Google	$0.100	$0.400	1260	160
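Because input and output prices differ, one way to compare the rows on a single number is a blended price for an assumed token mix. The sketch below uses the table's prices and a 60% input / 40% output split over 50M tokens/month; the `Model` dataclass and `blended_price` helper are hypothetical names chosen for illustration, and the result is a price-only ordering, not the page's own ranking method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    elo: int             # Arena ELO as listed in the table


# Values copied from the comparison table above.
MODELS = [
    Model("Gemini 2.0 Flash Lite", 0.075, 0.300, 1200),
    Model("GPT-4.1 Nano", 0.100, 0.400, 1180),
    Model("GPT-4.1 Mini", 0.400, 1.60, 1240),
    Model("Claude Haiku 4", 1.00, 5.00, 1220),
    Model("Gemini 2.0 Flash", 0.100, 0.400, 1260),
]


def blended_price(m: Model, input_share: float = 0.60) -> float:
    """USD per million tokens for the assumed input/output mix."""
    return m.input_price * input_share + m.output_price * (1.0 - input_share)


MONTHLY_TOKENS_M = 50  # assumed workload: 50M tokens/month

# sorted() is stable, so models with equal blended prices keep table order.
by_price = sorted(MODELS, key=blended_price)
for m in by_price:
    monthly = blended_price(m) * MONTHLY_TOKENS_M
    print(f"{m.name:<22} ${blended_price(m):.3f}/M  "
          f"${monthly:,.2f}/month  ELO {m.elo}")
```

Under these assumptions Gemini 2.0 Flash Lite comes to about $8.25/month and Claude Haiku 4 to about $130/month; a different input share will shift the gaps but not the per-token prices themselves.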

#1 Gemini 2.0 Flash Lite (Google)
ELO 1200 | Input $0.075/M | Output $0.300/M
Capabilities: Vision, JSON Mode, Functions, Multimodal

#2 GPT-4.1 Nano (OpenAI)
ELO 1180 | Input $0.100/M | Output $0.400/M
Capabilities: Vision, JSON Mode, Functions, Multimodal

#3 GPT-4.1 Mini (OpenAI)
ELO 1240 | Input $0.400/M | Output $1.60/M
Capabilities: Vision, JSON Mode, Functions, Multimodal

#4 Claude Haiku 4 (Anthropic)
ELO 1220 | Input $1.00/M | Output $5.00/M
Capabilities: Vision, JSON Mode, Functions, Multimodal

#5 Gemini 2.0 Flash (Google)
ELO 1260 | Input $0.100/M | Output $0.400/M
Capabilities: Vision, JSON Mode, Functions, Multimodal, Code Exec

#6 Llama 4 Scout
