LLM Pricing Calculator — Compare AI API Costs (GPT, Claude, Gemini, Gemma) [2026]
Compare AI API costs across providers. Enter token counts to see pricing for OpenAI GPT-4o, Anthropic Claude, Google Gemini, Mistral, DeepSeek and more. Calculate monthly AI spend. Free cost calculator.
Cheapest Overall
$0.00/mo
Gemma 4 (self-hosted)
Most Expensive
$157.50/mo
Claude Opus 4
Potential Monthly Savings
$156.75
vs cheapest paid option
| Model | Provider | Input $/1M | Output $/1M | Cost/Request | Monthly Cost |
|---|---|---|---|---|---|
| Gemma 4 (self-hosted, Open Source) | Self-Hosted | $0 | $0 | $0.00 | $0.00 |
| Llama 3 (self-hosted, Open Source) | Self-Hosted | $0 | $0 | $0.00 | $0.00 |
| Mistral Small | Mistral | $0.10 | $0.30 | $0.0003 | $0.75 |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | $0.0003 | $0.90 |
| GPT-4o Mini | OpenAI | $0.15 | $0.60 | $0.0004 | $1.35 |
| DeepSeek V3 | DeepSeek | $0.27 | $1.10 | $0.0008 | $2.46 |
| GPT-4.1 Mini | OpenAI | $0.40 | $1.60 | $0.0012 | $3.60 |
| DeepSeek R1 | DeepSeek | $0.55 | $2.19 | $0.0016 | $4.94 |
| Claude Haiku 3.5 | Anthropic | $0.80 | $4.00 | $0.0028 | $8.40 |
| o4-mini | OpenAI | $1.10 | $4.40 | $0.0033 | $9.90 |
| Mistral Large | Mistral | $2.00 | $6.00 | $0.0050 | $15.00 |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | $0.0060 | $18.00 |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | $0.0063 | $18.75 |
| GPT-4o | OpenAI | $2.50 | $10.00 | $0.0075 | $22.50 |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | $0.010 | $31.50 |
| o3 | OpenAI | $10.00 | $40.00 | $0.030 | $90.00 |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 | $0.052 | $157.50 |
Prices reflect published API rates as of early 2026 and may change. Self-hosted models have $0 API cost but require GPU hardware. Use our AI VRAM Calculator to estimate hardware requirements.
How to Use LLM Pricing Calculator
Enter the number of input tokens and output tokens per request, then set how many requests you make per day. The calculator instantly shows the cost per request and estimated monthly cost for every major AI model. Filter by provider to focus on specific APIs. Sort by cost to find the cheapest option. The cheapest model is highlighted in green. For self-hosted models like Gemma 4, API cost is $0 — use our VRAM Calculator to estimate hardware costs instead.
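The math behind the table is simple linear per-token pricing. The sketch below reproduces it, assuming a 30-day month and the table's apparent defaults of 1,000 input tokens, 500 output tokens, and 100 requests/day (these defaults are inferred from the figures above, not stated by the calculator):

```python
def cost_per_request(input_tokens, output_tokens, input_rate, output_rate):
    """Rates are in $ per 1M tokens, as listed in the table above."""
    return (input_tokens / 1_000_000) * input_rate + \
           (output_tokens / 1_000_000) * output_rate

def monthly_cost(per_request, requests_per_day, days=30):
    return per_request * requests_per_day * days

# Example: GPT-4o Mini at $0.15 input / $0.60 output per 1M tokens.
req = cost_per_request(1000, 500, 0.15, 0.60)
print(f"${req:.4f}/request")                    # $0.0005/request (0.00045)
print(f"${monthly_cost(req, 100):.2f}/month")   # $1.35/month
```

The $1.35/month result matches the GPT-4o Mini row in the table, which is how the assumed defaults were checked.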
Common Use Cases
- Comparing API costs across OpenAI, Anthropic, Google, DeepSeek, and Mistral before committing to a provider
- Estimating monthly AI spend for a new product or feature before launch
- Finding the cheapest model that meets your quality requirements for production workloads
- Budgeting AI infrastructure costs for startups and small teams
- Evaluating whether to switch from an expensive model to a cheaper alternative
- Calculating the cost difference between using cloud APIs vs self-hosting open-source models like Gemma 4
Frequently Asked Questions
Which AI model is the cheapest?
For cloud APIs, Mistral Small ($0.10/$0.30 per million tokens) and Google Gemini 2.0 Flash ($0.10/$0.40) are the cheapest in the table, with DeepSeek V3 ($0.27/$1.10) close behind. For zero API cost, self-hosted open-source models like Gemma 4, Llama 3, and Mistral's open-weight models are free to run; you only pay for GPU hardware.
Why are output tokens more expensive than input tokens?
Generating output text (completions) requires sequential computation — the model predicts one token at a time, each depending on the previous one. Processing input tokens can be parallelized across the GPU. This makes output generation 2–5x more compute-intensive, which is reflected in higher per-token pricing.
How accurate are the prices shown?
Prices reflect published API rates as of early 2026. Providers occasionally update pricing — usually downward. Always verify current rates on each provider's pricing page before committing to large-scale usage. The relative cost rankings rarely change significantly.
Should I self-host instead of using APIs?
Self-hosting makes sense if you have high, consistent volume (100K+ requests/day), need data privacy, or already own GPU hardware. For most startups and moderate workloads, cloud APIs are cheaper and simpler. Use our AI VRAM Calculator to estimate hardware costs for self-hosting.
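One way to frame the decision is a break-even calculation: amortized hardware cost versus cumulative API spend. The sketch below is illustrative only; the GPU price, amortization period, and power cost are assumptions, and the real break-even point shifts dramatically depending on which API model you would be replacing and whether a local model matches its quality:

```python
# Assumption: ~$900 GPU amortized over 24 months, plus ~$30/mo for power.
SELF_HOST_MONTHLY = 900 / 24 + 30   # $67.50/mo

def break_even_requests_per_day(api_cost_per_request, days=30):
    """Daily request volume at which self-hosting costs the same as the API."""
    return SELF_HOST_MONTHLY / (api_cost_per_request * days)

# At GPT-4o's $0.0075/request (from the table), break-even is ~300 requests/day;
# at GPT-4o Mini's $0.0004/request it jumps to ~5,600 requests/day.
print(round(break_even_requests_per_day(0.0075)))
print(round(break_even_requests_per_day(0.0004)))
```

Note how the break-even threshold scales inversely with API price: the cheaper the API model you compare against, the more volume you need before self-hosting pays off.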
What is the difference between per-token and per-request pricing?
Most AI APIs charge per token (per million tokens). A few services offer per-request pricing (flat rate regardless of length). Per-token is more common and more predictable for variable-length inputs. This calculator uses per-token pricing as that is the industry standard.
How can I reduce my AI API costs?
Route simple tasks to cheaper models (use GPT-4o Mini or Gemini Flash instead of Opus). Cache repeated prompts. Use batch APIs for non-real-time tasks (typically 50% cheaper). Reduce prompt length by removing unnecessary context. Set max_tokens limits to avoid runaway output costs. Consider fine-tuning a smaller model for specific tasks.
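Routing is often the single biggest lever. A rough sketch of the savings, using per-request costs from the table; the 70% "simple traffic" share is an assumption you would measure for your own workload:

```python
EXPENSIVE = 0.010   # Claude Sonnet 4, $/request (from the table)
CHEAP = 0.0004      # GPT-4o Mini, $/request (from the table)

def blended_cost(simple_share, cheap=CHEAP, expensive=EXPENSIVE):
    """Average cost/request when a fraction of traffic goes to the cheap model."""
    return simple_share * cheap + (1 - simple_share) * expensive

baseline = blended_cost(0.0)   # everything on the expensive model: $0.010
routed = blended_cost(0.7)     # 70% routed to the cheap model: $0.00328
print(f"savings: {1 - routed / baseline:.0%}")   # savings: 67%
```

Batch APIs stack on top of this: if the provider's batch tier really is ~50% cheaper, applying it to the routed traffic roughly halves that portion again.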
Related Tools
AI VRAM Calculator
Calculate VRAM needed to run AI models locally. Select model size, quantization ...
AI Token Counter
Paste text → instantly count tokens for GPT-4o, Claude, Gemini & more. See API c...
JSON Schema Generator
Paste JSON and get a valid JSON Schema instantly. Detects types, required fields...
Explore More Free Tools
Discover more tools from our network — all free, browser-based, and privacy-first.