How to Reduce Claude Code Costs
Practical strategies to cut your Claude Code spending by up to 80% without sacrificing results. Token optimization, model selection, and smart usage patterns.
To reduce Claude Code costs, switch to Haiku for simple tasks with /model haiku, compress long conversations with /compact, and keep your CLAUDE.md file concise. These three changes alone can cut spending by 40-80%. You can track costs in real time with the /cost command.
Most Claude Code users overspend by 40-60% because they run every task on the default model and let context windows grow unchecked. This guide covers every optimization strategy, from model selection to token management.
Understanding Token Usage
Before optimizing, you need to understand what you're paying for. Claude Code charges based on tokens - the fundamental unit of text processing.
What's a Token?
- Roughly 4 characters or 0.75 words
- "Hello world" = approximately 2 tokens
- Code is typically more token-dense than prose
- You pay for both input tokens (what you send) and output tokens (what Claude generates)
Where Tokens Add Up
| Source | Token Impact | Hidden Cost? |
|---|---|---|
| Your prompts | Low | No |
| Claude's responses | Medium | No |
| Conversation history | High | Yes |
| CLAUDE.md file | Medium | Yes |
| File contents read | High | Yes |
The hidden costs are what kill most budgets. Every message you send includes your entire conversation history plus your CLAUDE.md file. A 50-message conversation means that context gets sent 50 times.
Tracking Your Costs: /cost and /usage
You can't optimize what you don't measure. Claude Code has built-in commands to track spending.
The /cost Command
Type /cost at any time to see your current session's usage:
/cost
Session Usage:
Input tokens: 45,230
Output tokens: 12,450
Estimated cost: $0.23
Breakdown:
- Context/history: 38,000 tokens
- Your messages: 4,230 tokens
- Files read: 3,000 tokens
This shows you exactly where tokens are going. If context/history is dominating, it's time to use /compact or /clear.
The /usage Command
For a broader view of your usage patterns:
/usage
This Week:
Total tokens: 1,245,000
Estimated cost: $5.87
Sessions: 23
Avg per session: $0.26
Model breakdown:
- Sonnet: 89% of usage
- Haiku: 11% of usage
If you're using Sonnet for everything, you're likely overspending. Many tasks can run on cheaper models.
Choosing the Right Model
Claude Code can use different Claude models. Each has different capabilities and costs:
Model Cost Comparison
| Model | Input Cost (per 1M) | Output Cost (per 1M) | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | $0.80 | $4.00 | Simple tasks, quick questions |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Complex coding, analysis (default) |
| Claude Opus 4.6 | $15.00 | $75.00 | Most demanding reasoning tasks |
When to Use Each Model
Use Haiku 4.5 ($0.80/1M input) for:
- File renaming and organization
- Simple text transformations
- Quick factual questions
- Basic script generation
- Formatting tasks
Use Sonnet 4.5 ($3/1M input) for:
- Complex code generation
- Debugging intricate issues
- Multi-file refactoring
- Architecture decisions
- Nuanced analysis
Use Opus 4.6 ($15/1M input) for:
- The most complex reasoning tasks
- Critical business logic
- When Sonnet's output isn't sufficient
- Tasks requiring deep multi-step reasoning
Switching Models
Use the /model command to switch:
/model haiku
# Now using Claude Haiku 4.5
/model sonnet
# Now using Claude Sonnet 4.5
/model opus
# Now using Claude Opus 4.6
Context Management: /compact and /clear
Context management is the single biggest lever for cost reduction. Here's how to use it effectively.
The /compact Command
/compact summarizes your conversation history into a condensed form. Instead of sending the full history with each message, Claude works with a summary.
# Before /compact
Context size: 45,000 tokens
/compact
# After /compact
Context size: 8,000 tokens
(Savings: 82%)
Use /compact when:
- Your session has gone on for 10+ exchanges
/costshows context dominating your usage- You're switching to a new subtask within the same project
The /clear Command
/clear wipes your conversation history entirely. Use it when:
- Starting a completely new task
- Previous context is no longer relevant
- You want a "fresh start" with Claude
/clear
Conversation cleared. Starting fresh.
/clear is permanent. If you need to reference previous work, use /compact instead, which preserves key information.
Strategic Context Breaks
Instead of one long session, break work into focused chunks:
- Task 1: Plan the feature (then /clear)
- Task 2: Implement component A (then /clear)
- Task 3: Implement component B (then /clear)
- Task 4: Integration and testing
Each chunk starts fresh, avoiding the compound cost of growing context.
CLAUDE.md Optimization
Your CLAUDE.md file is included with every single request. A bloated CLAUDE.md silently inflates every task.
Common CLAUDE.md Bloat
| Problem | Token Cost | Fix |
|---|---|---|
| Paragraph-style instructions | High | Use bullet points |
| Redundant examples | Medium | One example per concept |
| Full file contents | Very High | Reference file paths instead |
| Outdated instructions | Medium | Regular cleanup |
Before and After Example
Before (bloated - ~500 tokens):
# Project Guidelines
When working on this project, please always remember that we use
TypeScript for all of our frontend code. We have a specific coding
style that involves using functional components with hooks rather
than class components. All components should be placed in the
components directory and should have their own folder with an
index.ts file that exports the component...
[continues for several more paragraphs]
After (optimized - ~100 tokens):
# Project Guidelines
- TypeScript only
- Functional components with hooks
- Components in /components/{Name}/index.ts
- Tests in __tests__/
- Run `npm test` before committing
Prompt Caching and Batch API
Two powerful cost-reduction features that many users overlook:
Prompt caching: Cached content costs only 10% of the original price after the first request. If you're making repeated requests with the same context (like a large CLAUDE.md or repeated file contents), caching saves up to 90% on those cached reads.
Batch API: If you're running automated tasks or batch operations, the Batch API offers a flat 50% discount on all models. This is ideal for scheduled processing jobs, bulk code analysis, or any workflow where you don't need real-time responses.
Subscription vs. API: When to Switch
Your pricing model choice significantly impacts costs. Anthropic offers several subscription tiers for Claude Code users.
Subscription Plans
| Plan | Price | Claude Code Access | Best For |
|---|---|---|---|
| Claude Pro | $20/month | 5x Free capacity, all models including Opus 4.6 | Moderate users |
| Claude Max 5x | $100/month | 5x Pro capacity, Opus 4.6 with 1M context, priority access | Daily power users |
| Claude Max 20x | $200/month | 20x Pro capacity, effectively unlimited for most users | Professional developers |
API vs. Subscription Comparison
| Monthly Usage | API Cost (Sonnet 4.5) | Best Subscription | Winner |
|---|---|---|---|
| Light (500K tokens) | ~$3-5 | Pro ($20) | API |
| Moderate (2M tokens) | ~$15-20 | Pro ($20) | Tie |
| Heavy (5M+ tokens) | ~$40+ | Pro ($20) | Pro |
| Power (daily Opus usage) | ~$150+ | Max 5x ($100) | Max |
When API Makes Sense
- You use Claude Code sporadically (few times per month)
- You primarily use Haiku for simple tasks
- You have strict budget controls needed
- You're building automated systems (consider Batch API for 50% off)
When Max Subscription Makes Sense
- You use Claude Code daily for serious development
- You want predictable monthly costs with no surprises
- You need regular access to Opus 4.6 for complex tasks
- You also use Claude.ai web interface and need priority access
- Max 20x ($200/month) feels effectively unlimited for single-user scenarios
Practical Cost-Saving Strategies
1. Front-Load Your Thinking
Instead of iterating with Claude, think through your requirements first. One well-crafted prompt costs far less than five back-and-forth refinements.
# Expensive approach (5 exchanges):
"Create a script"
"Add error handling"
"Make it handle CSV files"
"Add logging"
"Fix the output format"
# Cheaper approach (1-2 exchanges):
"Create a Python script that:
- Processes CSV files from /data folder
- Handles missing values and malformed rows
- Logs progress to console
- Outputs cleaned data to /output
- Include error handling for file access issues"
2. Batch Related Tasks
Instead of separate sessions for related work, batch them together while context is fresh:
- Create the function AND its tests in one session
- Write the component AND its styles together
- Build the API endpoint AND its documentation at once
3. Use Specific File References
Instead of letting Claude read entire directories, point to specific files:
# Expensive:
"Look at my project and fix the authentication"
# Cheaper:
"Fix the login bug in src/auth/login.ts - the token validation fails on line 45"
4. Set Output Constraints
Ask for concise responses when appropriate:
"Give me a brief answer: what's wrong with this function?"
"List just the file names that need changes, not the full code"
"Explain in 2-3 sentences why this approach is better"
Common Mistakes to Avoid
1. Never Using /compact
Long sessions without compacting can cost 5-10x more than necessary. Get in the habit of compacting every 10-15 exchanges.
2. Using Opus for Everything
Opus is 5x more expensive than Sonnet and 60x more than Haiku. Reserve it for tasks that truly need it.
3. Ignoring CLAUDE.md Size
A 2,000-token CLAUDE.md adds $0.006 to every Sonnet request. Over 1,000 requests, that's $6 in hidden costs.
4. Not Tracking Costs
If you're not regularly checking /cost and /usage, you're flying blind. Make it a weekly habit.
5. Staying on Wrong Pricing Plan
Review your usage monthly. If API costs consistently exceed $20, switch to Pro. If Pro limits frustrate you, Claude Max ($100/month) provides 5x the capacity. For professional developers using Claude Code daily, Max 20x ($200/month) is often the best value.
6. Missing Out on Caching and Batch Discounts
Prompt caching can save up to 90% on repeated context, and the Batch API gives a flat 50% discount. If you're not using either, you're leaving money on the table.
Frequently Asked Questions
How do I check my Claude Code usage and costs?
Use the /cost command to see your current session's token usage and estimated cost. Use /usage for a broader view of your usage patterns over time. Both commands help you track spending in real-time.
What's the cheapest Claude model for simple tasks?
Claude Haiku 4.5 is the most cost-effective model at $0.80 per million input tokens and $4 per million output tokens. It's nearly 4x cheaper than Sonnet 4.5 for input and perfect for simple tasks like file renaming, basic scripts, and straightforward questions.
How does /compact reduce Claude Code costs?
The /compact command summarizes your conversation history into a condensed form, reducing the token count that gets sent with each new message. This can cut costs by 40-60% in long sessions while preserving essential context.
Is a Claude subscription cheaper than API?
For moderate users, Claude Pro ($20/month) is typically cheaper. The crossover point is around $15-20/month in API usage. For power users, Claude Max ($100/month for 5x capacity or $200/month for 20x capacity) provides predictable costs and priority access to Opus 4.6.
How do I optimize CLAUDE.md to reduce costs?
Keep your CLAUDE.md file concise and focused. Remove redundant instructions, use bullet points instead of paragraphs, and only include essential project context. A bloated CLAUDE.md adds to every single request's token count.
Like Claude Code? Meet Your Chief AI Officer
Watch our free training to see efficient Claude Code usage in action. Learn the workflows that minimize costs while maximizing output.
Get the Free Blueprint href="/blueprint" class="cta-btn">Watch the Free Setup Video →rarr;