Written by Espen, who helps business owners install AI systems using open-source tools.
Anthropic Claude API Pricing in February 2026: Every Model, Every Tier
The complete, no-fluff breakdown of what Claude actually costs when you call it through the API. Opus 4.6, Sonnet 4.5, Haiku 4.5 — with real numbers, real-world cost examples, and comparisons to OpenAI and Google.
Claude API pricing in February 2026: Opus 4.6 costs $5/$25 per million tokens (input/output), Sonnet 4.5 costs $3/$15, and Haiku 4.5 costs $1/$5. Prompt caching cuts input costs by 90%. Batch processing saves 50%. Those are the numbers — the rest of this guide explains what they mean for your wallet.
Big update: Opus 4.6 costs 67% less than Opus 4. The old Opus 4 cost $15/$75, while Opus 4.6 costs just $5/$25 (the same price Opus 4.5 already carried, as the legacy table below shows). If you're still on Opus 4, switching to 4.6 cuts your bill by two-thirds and gets you a better model.
This guide covers the Anthropic API specifically — the developer-facing, pay-per-token service you access through api.anthropic.com. If you're looking for Claude Code (the CLI tool) pricing, see our Claude Code pricing guide. For OpenClaw costs, check the OpenClaw pricing guide.
Claude API Pricing 2026: All Models at a Glance
Anthropic offers three model tiers through the API. Each targets a different use case and budget. Note: prompts over 200K tokens cost more on Opus and Sonnet.
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Best For |
|---|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 200K | Complex reasoning, agents, coding |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200K | Best balance of cost & quality |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200K | High-volume, low-latency tasks |
Long-Prompt Pricing (Over 200K Tokens)
If your prompts exceed 200K tokens, Opus 4.6 and Sonnet 4.5 charge more:
| Model | Input >200K | Output >200K |
|---|---|---|
| Opus 4.6 | $10.00 | $37.50 |
| Sonnet 4.5 | $6.00 | $22.50 |
| Haiku 4.5 | $1.00 (no tiered pricing) | $5.00 (no tiered pricing) |
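To make the two tables concrete, here is a minimal Python sketch of a per-request cost estimate. The model keys are labels made up for this example (not API model IDs), and it assumes that once a prompt passes 200K tokens the whole request is billed at the long-prompt rates. Check your invoice or the official pricing page before relying on that.

```python
# Per-million-token prices in USD, copied from the tables above.
# Keys are illustrative labels, not official API model IDs.
PRICES = {
    "opus-4.6": {"standard": (5.00, 25.00), "long": (10.00, 37.50)},
    "sonnet-4.5": {"standard": (3.00, 15.00), "long": (6.00, 22.50)},
    "haiku-4.5": {"standard": (1.00, 5.00), "long": (1.00, 5.00)},  # no tiered pricing
}

LONG_PROMPT_THRESHOLD = 200_000  # input tokens


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request.

    Assumption: a prompt over 200K tokens moves the entire request
    (input and output) onto the long-prompt rates.
    """
    tier = "long" if input_tokens > LONG_PROMPT_THRESHOLD else "standard"
    input_price, output_price = PRICES[model][tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    print(f"${request_cost('sonnet-4.5', 10_000, 1_000):.4f}")
    print(f"${request_cost('opus-4.6', 250_000, 2_000):.4f}")
```

Keeping the prices in one dictionary means a price change is a one-line edit rather than a hunt through your codebase.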
Which model should you pick? For most API use cases, Sonnet 4.5 is the sweet spot. It costs 40% less than Opus on input and output, while handling the vast majority of tasks at near-Opus quality. The gap has narrowed with the 4.6 generation — Opus 4.6 at $5/$25 is now affordable enough for production use cases that need maximum intelligence. Haiku 4.5 is your pick for high-volume classification, routing, and extraction.
Prompt Caching: The Biggest Cost Saver
Prompt caching is the single most impactful way to reduce your Claude API bill. If you're sending the same system prompt, few-shot examples, or documents repeatedly, you're leaving money on the table.
| Model | Standard Input | Cache Write (+25%) | Cache Read (−90%) |
|---|---|---|---|
| Opus 4.6 | $5.00 | $6.25 | $0.50 |
| Sonnet 4.5 | $3.00 | $3.75 | $0.30 |
| Haiku 4.5 | $1.00 | $1.25 | $0.10 |
Here's how it works: the first time you send a prompt with caching enabled, you pay a 25% surcharge to write the content into the cache. Every subsequent request that reuses that cached content pays only 10% of the normal input price. The cache has a 5-minute TTL (time-to-live) that refreshes with each hit.
Prompt Caching: Before vs. After
Let's say you run a chatbot with a 4,000-token system prompt, handling 100 requests/hour on Sonnet 4.5:
Without Caching
100 requests × 4,000 tokens × $3.00/MTok = $1.20/hour → $876/month
With Caching
One-time cache write: 4,000 tokens × $3.75/MTok = $0.015
Cache reads: 99 × 4,000 tokens × $0.30/MTok = $0.119/hour → $87/month (at one request every 36 seconds the 5-minute TTL never lapses, so the write is not repeated)
Savings: 90% ($789/month saved)
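The same comparison as a short Python sketch, if you want to plug in your own traffic. It assumes a 730-hour month (which is what the $876 figure implies) and a cache that stays warm, so every request is treated as a cache read and the write surcharge is paid once.

```python
HOURS_PER_MONTH = 730  # average month; matches the $876 figure above


def monthly_input_cost(requests_per_hour: int, prompt_tokens: int, price_per_mtok: float) -> float:
    """Monthly USD cost of sending the same prompt tokens on every request."""
    hourly = requests_per_hour * prompt_tokens * price_per_mtok / 1_000_000
    return hourly * HOURS_PER_MONTH


# Sonnet 4.5: $3.00 standard input, $3.75 cache write, $0.30 cache read.
uncached = monthly_input_cost(100, 4_000, 3.00)
cached = monthly_input_cost(100, 4_000, 0.30)  # assumes the cache never expires
one_time_write = 4_000 * 3.75 / 1_000_000
savings = 1 - cached / uncached

if __name__ == "__main__":
    print(f"uncached ${uncached:.2f}, cached ${cached:.2f}, saved {savings:.0%}")
```

If your traffic has gaps longer than five minutes, each gap costs one extra cache write, so bursty workloads will land a little above the cached figure.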
When caching saves you money
- Chatbot with a long system prompt — Cache the system prompt. Every user message only pays the cache-read rate on that portion.
- RAG with fixed context — Cache your retrieved documents when the same docs serve multiple queries.
- Few-shot classification — Cache your examples. Only the new input is charged at full price.
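For the chatbot case, caching is switched on by marking the system prompt block with `cache_control`. The sketch below only builds the JSON body for the Messages API and sends nothing. The model alias and the helper name `build_cached_request` are placeholders chosen for this example.

```python
import json


def build_cached_request(system_prompt: str, user_message: str) -> dict:
    """Build a Messages API body whose system prompt is marked as cacheable."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder; use the model ID your account lists
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Marks everything up to and including this block for caching.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # The user turn changes per request, so it stays outside the cached prefix.
        "messages": [{"role": "user", "content": user_message}],
    }


if __name__ == "__main__":
    body = build_cached_request("You are a support agent for Acme.", "Where is my order?")
    print(json.dumps(body, indent=2))
```

To confirm caching is actually working, look at the `usage` object in the response: it reports `cache_creation_input_tokens` and `cache_read_input_tokens` alongside the regular input count.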
💡 Pro tip
At scale, prompt caching on Sonnet 4.5 brings the effective input cost of the cached portion of each prompt down to $0.30/MTok. That's 70% cheaper than Haiku 4.5's standard rate of $1.00/MTok: Sonnet-quality responses at below-budget-model input prices.
Batch Processing: 50% Off Everything
If your workload doesn't need real-time responses, the Message Batches API gives you a flat 50% discount on both input and output tokens. Batches are processed within 24 hours.
| Model | Batch Input (per 1M) | Batch Output (per 1M) |
|---|---|---|
| Opus 4.6 | $2.50 | $12.50 |
| Sonnet 4.5 | $1.50 | $7.50 |
| Haiku 4.5 | $0.50 | $2.50 |
Batch processing is ideal for data labeling, content generation pipelines, bulk summarization, and any workflow where you can queue requests and process results later.
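A rough sketch of what a batch submission looks like, assuming a bulk summarization job. It builds the body for `POST /v1/messages/batches` and estimates the discounted cost. The helper names, the prompt wording, and the model alias are illustrative choices, not part of the API.

```python
def build_batch(documents: dict[str, str], model: str = "claude-sonnet-4-5") -> dict:
    """Build the JSON body for a Message Batches request.

    `documents` maps your own IDs to document text; each ID becomes the
    `custom_id` used to match results back to inputs later.
    """
    return {
        "requests": [
            {
                "custom_id": doc_id,
                "params": {
                    "model": model,  # placeholder alias
                    "max_tokens": 600,
                    "messages": [{"role": "user", "content": f"Summarize:\n\n{text}"}],
                },
            }
            for doc_id, text in documents.items()
        ]
    }


def batch_cost(input_tokens: int, output_tokens: int,
               input_price: float, output_price: float) -> float:
    """USD cost at standard per-MTok prices with the flat 50% batch discount."""
    standard = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return standard * 0.5
```

Results come back asynchronously, so pick `custom_id` values you can look up later (a database key works well).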
Extended Thinking Costs
Claude's extended thinking feature (where the model "thinks" step-by-step before responding) uses additional tokens. These thinking tokens are charged at the output token rate — which is the expensive side.
For Sonnet 4.5, that means thinking tokens cost $15/MTok. For Opus 4.6, $25/MTok. This adds up fast on complex reasoning tasks. A request that generates 5,000 thinking tokens on Opus 4.6 costs an extra $0.125 just for the reasoning.
Watch your thinking budget. Set budget_tokens in the thinking parameter to cap how many tokens Claude spends reasoning. Without a cap, complex prompts can generate tens of thousands of thinking tokens.
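Here is a small sketch of both halves: pricing the thinking tokens and capping them in the request. The function names and model labels are made up for illustration, and the request body is built but never sent. It assumes `max_tokens` has to be larger than `budget_tokens`, since the thinking budget counts toward the output limit.

```python
# Output prices per million tokens (USD), from the pricing table.
OUTPUT_PRICE_PER_MTOK = {"opus-4.6": 25.00, "sonnet-4.5": 15.00}


def thinking_cost(model: str, thinking_tokens: int) -> float:
    """Thinking tokens are billed at the output rate."""
    return thinking_tokens * OUTPUT_PRICE_PER_MTOK[model] / 1_000_000


def build_thinking_request(prompt: str, budget_tokens: int = 4_000,
                           answer_tokens: int = 1_000) -> dict:
    """Build a Messages API body with a capped thinking budget."""
    return {
        "model": "claude-sonnet-4-5",  # placeholder alias
        # Leave room for the visible answer on top of the thinking budget.
        "max_tokens": budget_tokens + answer_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }
```

With a 4,000-token budget on Sonnet 4.5, the worst-case reasoning cost per request is `thinking_cost("sonnet-4.5", 4_000)`, or six cents.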
Claude vs OpenAI API Cost: February 2026 Comparison
Here's how Claude stacks up against the competition in February 2026. We're comparing models at roughly equivalent capability tiers.
Flagship / Premium Tier
| Model | Input (per 1M) | Output (per 1M) | Cached Input |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | $0.50 |
| GPT-5.2 Pro | $21.00 | $168.00 | — |
| Gemini 3 Pro | $2.00 | $12.00 | $0.20 |
The pricing landscape shifted dramatically with Opus 4.6. At $5/$25, it's now 76% cheaper than GPT-5.2 Pro on input and 85% cheaper on output. Gemini 3 Pro remains the cheapest flagship option, but the gap has narrowed considerably.
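If you want to rerun these percentages when prices change, the arithmetic is a one-liner. This is just a helper for the comparison above; the function name is arbitrary.

```python
def percent_cheaper(price: float, reference: float) -> float:
    """How much cheaper `price` is than `reference`, as a percentage."""
    return (1 - price / reference) * 100


if __name__ == "__main__":
    # Opus 4.6 vs GPT-5.2 Pro, using the table above.
    print(f"input:  {percent_cheaper(5.00, 21.00):.0f}% cheaper")
    print(f"output: {percent_cheaper(25.00, 168.00):.0f}% cheaper")
```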
Workhorse / Mid Tier
| Model | Input (per 1M) | Output (per 1M) | Cached Input |
|---|---|---|---|
| Claude Sonnet 4.5 | $3.00 | $15.00 | $0.30 |
| GPT-5.2 | $1.75 | $14.00 | $0.175 |
| Gemini 3 Flash | $0.50 | $3.00 | $0.05 |
This is the tier most developers care about. GPT-5.2 edges out Sonnet 4.5 on input pricing ($1.75 vs $3.00) and is roughly comparable on output ($14 vs $15). Gemini 3 Flash undercuts both significantly — but each model has different strengths in coding, reasoning, and instruction-following.
Budget / Speed Tier
| Model | Input (per 1M) | Output (per 1M) | Cached Input |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | $0.10 |
| GPT-5 mini | $0.25 | $2.00 | $0.025 |
| Gemini 3 Flash | $0.50 | $3.00 | $0.05 |
At the budget tier, GPT-5 mini is the cheapest option. Haiku 4.5 is the most expensive of the three — but it consistently delivers higher quality output. For high-volume classification, extraction, or routing tasks, test all three on your specific workload before choosing based on price alone.
Price isn't everything. A model that costs 2x more but needs half the retries is actually cheaper. Always benchmark on your actual tasks. Claude Sonnet 4.5 consistently excels at instruction-following and coding — if those are your use cases, the modest premium over GPT-5.2 often pays for itself in fewer failed completions.
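One way to put a number on the retry argument is to compare cost per successful completion rather than cost per call. The sketch below assumes you retry until a call succeeds and that attempts are independent, so the expected number of attempts is 1 divided by the success rate. The prices and success rates in the example are invented for illustration, not benchmarks.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected USD spent to get one usable completion.

    Assumes independent attempts retried until success, so the expected
    number of attempts is 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


if __name__ == "__main__":
    # Hypothetical numbers: a cheap model that often needs retries
    # versus a model at twice the price that usually gets it right.
    cheap = cost_per_success(0.010, 0.40)
    pricier = cost_per_success(0.020, 0.90)
    print(f"cheap: ${cheap:.4f}, pricier: ${pricier:.4f}")
```

Measure the success rates on your own tasks; the comparison flips easily when the gap in reliability is small.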
Real-World Cost Examples
Abstract token prices are hard to reason about. Here are five specific production scenarios with exact math, using Sonnet 4.5 unless noted otherwise:
1. Customer Support Chatbot — 1,000 Conversations/Day
Average conversation: 3,000-token system prompt (cached), plus 500 input tokens and 300 output tokens per turn, 5 turns per conversation.
System prompt: 3,000 × $0.30/MTok = $0.0009/conversation (cached, counted as one cache read per conversation; billed on all 5 turns it would be $0.0045)
User messages: 5 turns × 500 tokens × $3.00/MTok = $0.0075
Responses: 5 turns × 300 tokens × $15.00/MTok = $0.0225
Per conversation: $0.031 × 1,000/day × 30 days
Monthly cost: ~$930/month ($31/day)
2. Document Summarization Pipeline — 500 Documents/Day
Average document: 8,000 input tokens, 600 output tokens. Batch processing (50% off).
Input: 500 × 8,000 × $1.50/MTok = $6.00/day
Output: 500 × 600 × $7.50/MTok = $2.25/day
Monthly cost: ~$248/month ($8.25/day) — batch saves $248 vs real-time
3. Code Review Agent — 200 PRs/Day
Average PR diff: 3,000 tokens input + 2,000-token system prompt (cached) + 1,500 output tokens.
System prompt: cached at $0.30/MTok → negligible
Input: 200 × 3,000 × $3.00/MTok = $1.80/day
Output: 200 × 1,500 × $15.00/MTok = $4.50/day
Monthly cost: ~$189/month ($6.30/day)
4. Email Triage System — 2,000 Emails/Day
Classify and prioritize emails. Average: 800 input tokens + 100 output tokens. Using Haiku 4.5 for speed.
Input: 2,000 × 800 × $1.00/MTok = $1.60/day
Output: 2,000 × 100 × $5.00/MTok = $1.00/day
Monthly cost: ~$78/month ($2.60/day) — perfect Haiku use case
5. Content Generation Workflow — 100 Articles/Day
Generate 1,500-word blog posts. Input: 1,000-token prompt + 3,000-token brief. Output: ~2,000 tokens.
Input: 100 × 4,000 × $3.00/MTok = $1.20/day
Output: 100 × 2,000 × $15.00/MTok = $3.00/day
Monthly cost: ~$126/month ($4.20/day), or about $0.042 per article
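The daily and monthly figures in scenarios 2 through 5 all come from the same formula, sketched below so you can swap in your own volumes. It assumes a 30-day month and, like the examples, leaves the cached system prompt out of the code review total.

```python
DAYS_PER_MONTH = 30


def daily_cost(requests: int, input_tokens: int, output_tokens: int,
               input_price: float, output_price: float) -> float:
    """Daily USD cost for `requests` calls with the given per-call token counts."""
    per_call = input_tokens * input_price + output_tokens * output_price
    return requests * per_call / 1_000_000


# (requests/day, input tokens, output tokens, input $/MTok, output $/MTok)
scenarios = {
    "summarization (Sonnet 4.5, batch)": daily_cost(500, 8_000, 600, 1.50, 7.50),
    "code review (Sonnet 4.5)": daily_cost(200, 3_000, 1_500, 3.00, 15.00),
    "email triage (Haiku 4.5)": daily_cost(2_000, 800, 100, 1.00, 5.00),
    "content generation (Sonnet 4.5)": daily_cost(100, 4_000, 2_000, 3.00, 15.00),
}
monthly = {name: round(cost * DAYS_PER_MONTH, 2) for name, cost in scenarios.items()}

if __name__ == "__main__":
    for name, cost in monthly.items():
        print(f"{name}: ${cost:.2f}/month")
```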
Cost Optimization Cheatsheet
🚀 Quick Wins for Cutting Your Claude API Bill
- Enable prompt caching → saves up to 90% on repeated input tokens
- Use batch API for async work → flat 50% off input and output
- Downgrade to Haiku 4.5 for simple tasks → 67% cheaper than Sonnet on both input and output
- Cap extended thinking with budget_tokens → prevents runaway reasoning costs
- Keep system prompts concise → every token is charged on every request (unless cached)
- Set max_tokens → cap output length to avoid verbose responses
- Switch from Opus 4 to Opus 4.6 → 67% cheaper and better performance
- Combine caching + batching → stack discounts for maximum savings
- Monitor usage daily → set alerts in the Anthropic dashboard before surprises
- Use Sonnet 4.5 with caching → effective $0.30/MTok input, cheaper than Haiku standard rate
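In practice only part of each prompt is cached, so the rate you actually pay is a blend. The sketch below estimates that blended input rate under two assumptions: the cache-read and batch multipliers simply multiply together when stacked, and the one-time cache write surcharge is small enough to ignore. Treat the output as a planning estimate, not a billing guarantee.

```python
CACHE_READ_MULTIPLIER = 0.10  # cache reads cost 10% of standard input
BATCH_MULTIPLIER = 0.50       # batch API is a flat 50% off


def effective_input_rate(base_price: float, cached_fraction: float,
                         batch: bool = False) -> float:
    """Blended input price per MTok.

    `cached_fraction` is the share of input tokens served from cache.
    Assumes the discounts stack multiplicatively and ignores the
    cache write surcharge.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    rate = base_price * (cached_fraction * CACHE_READ_MULTIPLIER + (1 - cached_fraction))
    return rate * BATCH_MULTIPLIER if batch else rate


if __name__ == "__main__":
    # Sonnet 4.5 with 80% of input tokens coming from cache.
    print(f"${effective_input_rate(3.00, 0.8):.2f}/MTok")
    print(f"${effective_input_rate(3.00, 0.8, batch=True):.2f}/MTok with batch")
```

On these assumptions, Sonnet 4.5 drops below Haiku 4.5's $1.00 standard rate once roughly three quarters of your input tokens are cache reads.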
Claude API vs Claude Pro vs Claude Max: Which Should You Use?
Anthropic offers three ways to use Claude. Here's when each makes sense:
| Option | Price | Best For | Access |
|---|---|---|---|
| Claude API | Pay-per-token | Building apps, automation, pipelines | Programmatic (REST API) |
| Claude Pro | $20/month ($17/mo annual) | Personal productivity, research, writing | Web, mobile, desktop app |
| Claude Max | From $100/month | Power users, heavy Claude Code usage | Web, mobile, desktop + Claude Code |
When Claude Pro ($20/mo) beats the API
If you're an individual using Claude for writing, research, brainstorming, and analysis through the chat interface, Pro is usually the better deal. Rough rule of thumb: $20 buys about 1.33 million output tokens on Sonnet 4.5 via the API ($15/MTok), before counting input tokens. If you'd use more than that each month, Pro is cheaper outright. Even at lighter usage, say around 700,000 output tokens ($10.50 via the API), Pro's unlimited-ish access at $20/month can still be worth it because you also get Opus access, projects, memory, and Claude Code.
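To sanity-check the rule of thumb against your own usage, you can compute the break-even point directly. This sketch looks at output tokens only, which is a simplification: input tokens add to the API side, so your real break-even arrives sooner. On raw cost alone the API is cheaper below the break-even, and Pro's bundled features are what can tip the decision at lower volumes.

```python
def api_output_cost(output_tokens: int, output_price_per_mtok: float = 15.00) -> float:
    """USD cost of output tokens on the API (default: Sonnet 4.5)."""
    return output_tokens * output_price_per_mtok / 1_000_000


def breakeven_output_tokens(subscription_usd: float,
                            output_price_per_mtok: float = 15.00) -> float:
    """Monthly output tokens at which API spend equals the subscription price."""
    return subscription_usd / output_price_per_mtok * 1_000_000


if __name__ == "__main__":
    print(f"700K output tokens: ${api_output_cost(700_000):.2f}")
    print(f"Pro break-even: {breakeven_output_tokens(20.00):,.0f} output tokens")
```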
When Claude Max ($100–$200/mo) makes sense
Max is for power users who hit Pro's usage limits regularly — especially heavy Claude Code users. The $100/month tier gives you 5x Pro usage, and $200/month gives you 20x. If you're a developer using Claude Code for hours daily, Max pays for itself vs. API costs for equivalent token volume.
When the API wins
The API is the only choice when you need: programmatic access, custom system prompts, tool use, automated pipelines, batch processing, or integration into your own products. There's no substitute for the API when you're building software.
Claude API Free Tier
Anthropic does not offer a free API tier with ongoing free tokens. However:
- Free claude.ai usage — You can use Claude for free through the web interface with limited daily messages. No API access.
- Pay-as-you-go API — The API charges only for tokens consumed. No minimum spend. You can start with just a few dollars of credits.
- Usage tier system — New accounts start at tier 1 with lower rate limits. Limits increase as you spend more ($0 → $4,000+ monthly spend tiers).
If you want to test Claude without spending money, use the free web tier first. When you're ready to build, the API has no setup fees — you pay only for what you use.
Legacy Model Pricing
If you're still using older Claude models, here's what they cost. Consider upgrading — newer models are often cheaper and better:
| Legacy Model | Input (per 1M) | Output (per 1M) | Notes |
|---|---|---|---|
| Opus 4.1 / Opus 4 | $15.00 | $75.00 | 3x the price of Opus 4.6 |
| Opus 4.5 | $5.00 | $25.00 | Same price as Opus 4.6 |
| Sonnet 4 | $3.00 | $15.00 | Same price as Sonnet 4.5 |
| Haiku 3 | $0.25 | $1.25 | Cheapest option, but less capable |
OpenClaw + Claude API: Efficient AI Automation
OpenClaw is a personal AI automation platform that connects to the Claude API (among other providers) to run your AI workflows 24/7. Here's how it keeps costs efficient:
- Smart model routing — OpenClaw can use cheaper models (Haiku) for simple tasks and upgrade to Sonnet or Opus only when complexity requires it.
- Prompt caching by default — System prompts and agent context are cached automatically, saving up to 90% on repeated tokens.
- Batching where possible — Background tasks like email triage and content generation use the batch API for 50% savings.
- Token-aware scheduling — OpenClaw tracks token usage and can throttle or schedule tasks to stay within budget.
- Multi-provider support — Route tasks to the cheapest capable model across Anthropic, OpenAI, and Google.
A typical OpenClaw setup running email monitoring, calendar management, and daily briefings on Sonnet 4.5 costs roughly $15–40/month in API fees — less than a Claude Pro subscription, but with full automation capabilities.
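To give a feel for what rule-based model routing and budget tracking can look like, here is a deliberately simple sketch. It is not OpenClaw's implementation: the task categories, model labels, and class names are all hypothetical, chosen to mirror the tier guidance earlier in this guide.

```python
# Hypothetical task categories; adjust to your own workload.
SIMPLE_TASKS = {"classification", "routing", "extraction"}
COMPLEX_TASKS = {"agent", "complex_coding"}


def pick_model(task_type: str) -> str:
    """Route simple work to Haiku, hard work to Opus, everything else to Sonnet."""
    if task_type in SIMPLE_TASKS:
        return "haiku-4.5"
    if task_type in COMPLEX_TASKS:
        return "opus-4.6"
    return "sonnet-4.5"


class BudgetTracker:
    """Tracks spend against a monthly cap so tasks can be deferred when it is hit."""

    def __init__(self, monthly_budget_usd: float) -> None:
        self.monthly_budget_usd = monthly_budget_usd
        self.spent_usd = 0.0

    def can_afford(self, estimated_cost_usd: float) -> bool:
        return self.spent_usd + estimated_cost_usd <= self.monthly_budget_usd

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd
```

A real router would likely look at prompt length and past failure rates too, but even a static lookup like this keeps routine traffic off the expensive models.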
How to Access the Claude API
Getting started takes about 2 minutes:
- Create an account at console.anthropic.com
- Add a payment method — credit card or prepaid credits
- Generate an API key in the dashboard
- Set spending limits — Anthropic lets you set hard monthly caps
- Make your first call:
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Claude"}]
  }'
Anthropic uses a prepaid/postpaid billing model. You can buy credits upfront or get billed monthly. There are usage-based rate limits that increase as you spend more (tiers range from $0 to $4,000+ in monthly spend).
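If you'd rather make that first call from Python without installing anything, the same request can be sketched with the standard library. The helper names are arbitrary and the model alias is a placeholder, so substitute whichever model ID your console lists. Nothing is sent until you call `send` yourself.

```python
import json
import os
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(prompt: str, api_key: str,
                  model: str = "claude-sonnet-4-5") -> urllib.request.Request:
    """Build (but do not send) a Messages API request mirroring the curl call."""
    body = {
        "model": model,  # placeholder alias
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Example (requires a real key and network access):
# reply = send(build_request("Hello, Claude", os.environ["ANTHROPIC_API_KEY"]))
# print(reply["content"][0]["text"], reply["usage"])
```

The `usage` object in the reply carries the input and output token counts, which is what you multiply by the prices above to track spend per call.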
What's Changed Recently
Anthropic has been actively adjusting pricing and releasing new models. Key recent changes:
- Claude Opus 4.6 launched — The flagship model now costs $5/$25, a 67% price drop from Opus 4's $15/$75
- Claude Sonnet 4.5 launched — Maintains $3/$15 pricing with improved performance
- Claude Haiku 4.5 launched — Upgraded from Haiku 3.5 ($0.80/$4) to $1/$5, slightly more expensive but significantly more capable
- Tiered pricing for long prompts — Prompts over 200K tokens now cost 2x on input (1.5x on output) for Opus and Sonnet
- Extended thinking generally available across all models
- Web search tool — $10 per 1,000 searches, new capability
- Code execution — 50 free hours/day per org, then $0.05/hour
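The two tool charges in the list above are easy to overlook in a budget, so here is a small sketch that prices them. It takes the figures exactly as stated ($10 per 1,000 searches; 50 free code execution hours per day per organization, then $0.05/hour). The function names are made up for this example.

```python
WEB_SEARCH_USD_PER_1000 = 10.00
CODE_EXEC_FREE_HOURS_PER_DAY = 50
CODE_EXEC_USD_PER_HOUR = 0.05


def web_search_cost(searches: int) -> float:
    """USD cost of web search tool calls (token costs are billed separately)."""
    return searches * WEB_SEARCH_USD_PER_1000 / 1_000


def code_execution_cost(hours_today: float) -> float:
    """USD cost of code execution for one day, after the free daily allowance."""
    billable = max(0.0, hours_today - CODE_EXEC_FREE_HOURS_PER_DAY)
    return billable * CODE_EXEC_USD_PER_HOUR


if __name__ == "__main__":
    print(f"2,500 searches: ${web_search_cost(2_500):.2f}")
    print(f"70 execution hours in a day: ${code_execution_cost(70):.2f}")
```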
Bottom Line
For most developers building with the Claude API in February 2026, the play is clear: Sonnet 4.5 with prompt caching. It gives you excellent quality at an effective input rate that's cheaper than Haiku 4.5's standard pricing. Use Haiku 4.5 for high-volume, simple tasks where you don't need caching. And the new Opus 4.6 at $5/$25 is finally affordable enough for production — consider it for agent workflows and complex coding tasks.
Compared to OpenAI and Google, Claude's pricing is competitive. Opus 4.6's price cut makes it dramatically cheaper than GPT-5.2 Pro. At the mid tier, Sonnet 4.5 is slightly pricier than GPT-5.2 on input but comparable on output. Google's Gemini remains the budget king. The real differentiator isn't price — it's which model performs best on your specific tasks.