Guide

How to Reduce Claude Code Costs

Practical strategies to cut your Claude Code spending by up to 80% without sacrificing results. Token optimization, model selection, and smart usage patterns.

Updated February 10, 2026 · 13 min read

To reduce Claude Code costs, switch to Haiku for simple tasks with /model haiku, compress long conversations with /compact, and keep your CLAUDE.md file concise. These three changes alone can cut spending by 40-80%. You can track costs in real time with the /cost command.

Most Claude Code users overspend by 40-60% because they run every task on the default model and let context windows grow unchecked. This guide covers every optimization strategy, from model selection to token management.


Understanding Token Usage

Before optimizing, you need to understand what you're paying for. Claude Code charges based on tokens - the fundamental unit of text processing.

What's a Token?

Where Tokens Add Up

| Source | Token Impact | Hidden Cost? |
| --- | --- | --- |
| Your prompts | Low | No |
| Claude's responses | Medium | No |
| Conversation history | High | Yes |
| CLAUDE.md file | Medium | Yes |
| File contents read | High | Yes |

The hidden costs are what kill most budgets. Every message you send includes your entire conversation history plus your CLAUDE.md file. A 50-message conversation means that context gets sent 50 times.

Key insight: Token costs compound. A 10,000-token conversation history gets resent with every message. After 20 exchanges, you've paid for 200,000 tokens of history alone.
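
The compounding effect can be checked with a few lines of arithmetic. The sketch below is a simplified model, not a description of how Claude Code bills internally: it assumes every exchange adds a fixed number of tokens to the history and that the full history so far is resent with each new message. The function name and its parameters are illustrative.

```python
def cumulative_history_tokens(exchanges: int, tokens_per_exchange: int) -> int:
    """Total history tokens resent across one session.

    Assumption: each exchange adds a fixed number of tokens, and the whole
    history accumulated so far is sent again with every new message.
    """
    total = 0
    history = 0
    for _ in range(exchanges):
        total += history                 # history so far is resent as input
        history += tokens_per_exchange   # this exchange joins the history
    return total


if __name__ == "__main__":
    # 20 exchanges of 500 tokens each: the history ends at 10,000 tokens.
    print(cumulative_history_tokens(20, 500))  # prints 95000
```

Under this growth model the session resends 95,000 history tokens. The 200,000 figure above corresponds to the simpler case where a fixed 10,000-token history is resent on all 20 messages.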

Tracking Your Costs: /cost and /usage

You can't optimize what you don't measure. Claude Code has built-in commands to track spending.

The /cost Command

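Type /cost at any time to see your current session's usage. The output below is illustrative; the exact fields depend on your Claude Code version and billing setup: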

/cost

Session Usage:
  Input tokens:  45,230
  Output tokens: 12,450
  Estimated cost: $0.23

Breakdown:
  - Context/history: 38,000 tokens
  - Your messages: 4,230 tokens
  - Files read: 3,000 tokens

This shows you exactly where tokens are going. If context/history is dominating, it's time to use /compact or /clear.

The /usage Command

For a broader view of your usage patterns (the output below is illustrative):

/usage

This Week:
  Total tokens: 1,245,000
  Estimated cost: $5.87
  Sessions: 23
  Avg per session: $0.26

Model breakdown:
  - Sonnet: 89% of usage
  - Haiku: 11% of usage

If you're using Sonnet for everything, you're likely overspending. Many tasks can run on cheaper models.

Choosing the Right Model

Claude Code can use different Claude models. Each has different capabilities and costs:

Model Cost Comparison

| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Best For |
| --- | --- | --- | --- |
| Claude Haiku 4.5 | $0.80 | $4.00 | Simple tasks, quick questions |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Complex coding, analysis (default) |
| Claude Opus 4.6 | $15.00 | $75.00 | Most demanding reasoning tasks |

Savings example: At these prices, Haiku costs 3.75x less than Sonnet for both input ($0.80 vs. $3.00) and output ($4.00 vs. $15.00). Switching simple tasks to Haiku cuts that portion of your bill by about 73%.
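
To compare models on a concrete workload, the per-million prices from the comparison table can be plugged into a small estimator. This is a minimal sketch: the `PRICES` dictionary simply copies the figures listed in this guide, and `estimate_cost` is a hypothetical helper, not part of Claude Code.

```python
# USD per 1M tokens, copied from the comparison table in this guide.
PRICES = {
    "haiku": {"input": 0.80, "output": 4.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload on the given model."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 100,000 input tokens and 20,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${estimate_cost(model, 100_000, 20_000):.2f}")
```

For this example workload the estimate comes to $0.16 on Haiku, $0.60 on Sonnet, and $3.00 on Opus.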

When to Use Each Model

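Use Haiku 4.5 ($0.80/1M input) for simple tasks and quick questions, such as file renaming, basic scripts, and straightforward questions.

Use Sonnet 4.5 ($3/1M input) for complex coding and analysis. It is the default model.

Use Opus 4.6 ($15/1M input) for the most demanding reasoning tasks.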

Switching Models

Use the /model command to switch:

/model haiku
# Now using Claude Haiku 4.5

/model sonnet
# Now using Claude Sonnet 4.5

/model opus
# Now using Claude Opus 4.6

Context Management: /compact and /clear

Context management is the single biggest lever for cost reduction. Here's how to use it effectively.

The /compact Command

/compact summarizes your conversation history into a condensed form. Instead of sending the full history with each message, Claude works with a summary.

# Before /compact
Context size: 45,000 tokens

/compact

# After /compact
Context size: 8,000 tokens
(Savings: 82%)

Use /compact when context/history dominates your /cost output, or every 10-15 exchanges in a long session.
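
As a rough illustration of what the smaller context is worth, the sketch below prices the next ten messages at the 45,000-token and 8,000-token context sizes from the example. It assumes Sonnet's $3 per 1M input tokens from this guide, holds the context size constant, and ignores the cost of the /compact call itself; `context_cost` is a hypothetical helper.

```python
SONNET_INPUT_PER_MILLION = 3.00  # USD, from the pricing table in this guide


def context_cost(context_tokens: int, messages: int) -> float:
    """Input cost of resending a fixed-size context with each message."""
    return context_tokens * messages * SONNET_INPUT_PER_MILLION / 1_000_000


if __name__ == "__main__":
    before = context_cost(45_000, 10)  # ten more messages without /compact
    after = context_cost(8_000, 10)    # the same ten messages after /compact
    print(f"before: ${before:.2f}  after: ${after:.2f}  saved: ${before - after:.2f}")
```

Under these assumptions the ten messages cost about $1.35 in context without compacting and about $0.24 with it.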

The /clear Command

/clear wipes your conversation history entirely. Use it when you are moving on to a new task and no longer need the previous context:

/clear

Conversation cleared. Starting fresh.

Caution: /clear is permanent. If you need to reference previous work, use /compact instead, which preserves key information.

Strategic Context Breaks

Instead of one long session, break work into focused chunks:

  1. Task 1: Plan the feature (then /clear)
  2. Task 2: Implement component A (then /clear)
  3. Task 3: Implement component B (then /clear)
  4. Task 4: Integration and testing

Each chunk starts fresh, avoiding the compound cost of growing context.
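
The benefit of chunking can be estimated with the closed-form sum of a growing history. The sketch assumes each exchange adds a fixed 500 tokens and that the full history is resent every time; both numbers are illustrative, and `session_history_tokens` is a hypothetical helper.

```python
def session_history_tokens(exchanges: int, tokens_per_exchange: int) -> int:
    """History tokens resent over one session.

    Closed form of tokens_per_exchange * (0 + 1 + ... + (exchanges - 1)).
    """
    return tokens_per_exchange * exchanges * (exchanges - 1) // 2


if __name__ == "__main__":
    one_long = session_history_tokens(40, 500)        # one 40-exchange session
    four_short = 4 * session_history_tokens(10, 500)  # four 10-exchange chunks
    print(one_long, four_short)  # prints 390000 90000
```

With these assumptions, four focused chunks resend less than a quarter of the history tokens of one long session, because the cost grows roughly with the square of the session length.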

CLAUDE.md Optimization

Your CLAUDE.md file is included with every single request. A bloated CLAUDE.md silently inflates every task.
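
A quick way to see how large your CLAUDE.md is: the sketch below approximates its token count from its character count. The four-characters-per-token ratio is a rough rule of thumb for English text, not the actual tokenizer, so treat the result as an order-of-magnitude estimate. The helper names are illustrative.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character count."""
    return len(text) // CHARS_PER_TOKEN


def estimate_file_tokens(path: str) -> int:
    """Approximate token count of a UTF-8 text file."""
    return estimate_tokens(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    target = Path("CLAUDE.md")
    if target.exists():
        print(f"CLAUDE.md: ~{estimate_file_tokens(str(target))} tokens")
    else:
        print("No CLAUDE.md found in the current directory")
```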

Common CLAUDE.md Bloat

| Problem | Token Cost | Fix |
| --- | --- | --- |
| Paragraph-style instructions | High | Use bullet points |
| Redundant examples | Medium | One example per concept |
| Full file contents | Very High | Reference file paths instead |
| Outdated instructions | Medium | Regular cleanup |

Before and After Example

Before (bloated - ~500 tokens):

# Project Guidelines

When working on this project, please always remember that we use
TypeScript for all of our frontend code. We have a specific coding
style that involves using functional components with hooks rather
than class components. All components should be placed in the
components directory and should have their own folder with an
index.ts file that exports the component...

[continues for several more paragraphs]

After (optimized - ~100 tokens):

# Project Guidelines

- TypeScript only
- Functional components with hooks
- Components in /components/{Name}/index.ts
- Tests in __tests__/
- Run `npm test` before committing

Impact: If you make 50 requests per day, a 400-token reduction saves 20,000 input tokens daily - about $0.06/day on Sonnet ($3 per 1M input tokens), or roughly $1.80/month.
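
The saving from a leaner CLAUDE.md is simple to reproduce. This sketch assumes Sonnet's $3 per 1M input tokens from this guide and a 30-day month; `daily_savings` is a hypothetical helper.

```python
def daily_savings(tokens_saved_per_request: int,
                  requests_per_day: int,
                  input_price_per_million: float = 3.00) -> float:
    """USD saved per day by trimming tokens that are sent with every request."""
    tokens_per_day = tokens_saved_per_request * requests_per_day
    return tokens_per_day * input_price_per_million / 1_000_000


if __name__ == "__main__":
    per_day = daily_savings(400, 50)  # 20,000 tokens saved per day
    print(f"${per_day:.2f}/day, ${per_day * 30:.2f}/month")
```

For 400 tokens saved across 50 daily requests, this works out to about $0.06 per day, or $1.80 over 30 days.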

Prompt Caching and Batch API

Two powerful cost-reduction features that many users overlook:

Prompt caching: Cached content costs only 10% of the original price after the first request. If you're making repeated requests with the same context (like a large CLAUDE.md or repeated file contents), caching saves up to 90% on those cached reads.

Batch API: If you're running automated tasks or batch operations, the Batch API offers a flat 50% discount on all models. This is ideal for scheduled processing jobs, bulk code analysis, or any workflow where you don't need real-time responses.
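
The two discounts can be compared on a repeated context with a short calculation. The sketch follows the simplified model described here: the first request pays the normal input price and later requests pay 10% of it. Actual cache-write pricing may differ, so check current rates. All function names are illustrative.

```python
def uncached_cost(tokens: int, requests: int, price_per_million: float) -> float:
    """Cost of sending the same context at full price on every request."""
    return tokens * requests * price_per_million / 1_000_000


def cached_cost(tokens: int, requests: int, price_per_million: float,
                cached_read_rate: float = 0.10) -> float:
    """First request at full price, later requests at the cached-read rate."""
    if requests <= 0:
        return 0.0
    first = tokens * price_per_million / 1_000_000
    return first + first * cached_read_rate * (requests - 1)


def batch_cost(cost: float, discount: float = 0.50) -> float:
    """Apply the flat batch discount to a cost."""
    return cost * (1 - discount)


if __name__ == "__main__":
    # A 2,000-token context sent 50 times at $3 per 1M input tokens.
    full = uncached_cost(2_000, 50, 3.00)
    print(f"uncached: ${full:.4f}")
    print(f"cached:   ${cached_cost(2_000, 50, 3.00):.4f}")
    print(f"batch:    ${batch_cost(full):.4f}")
```

In this example the uncached context costs $0.30, the cached version about $0.035, and the batch-discounted version $0.15.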

Subscription vs. API: When to Switch

Your pricing model choice significantly impacts costs. Anthropic offers several subscription tiers for Claude Code users.

Subscription Plans

| Plan | Price | Claude Code Access | Best For |
| --- | --- | --- | --- |
| Claude Pro | $20/month | 5x Free capacity, all models including Opus 4.6 | Moderate users |
| Claude Max 5x | $100/month | 5x Pro capacity, Opus 4.6 with 1M context, priority access | Daily power users |
| Claude Max 20x | $200/month | 20x Pro capacity, effectively unlimited for most users | Professional developers |

API vs. Subscription Comparison

| Monthly Usage | API Cost (Sonnet 4.5) | Best Subscription | Winner |
| --- | --- | --- | --- |
| Light (500K tokens) | ~$3-5 | Pro ($20) | API |
| Moderate (2M tokens) | ~$15-20 | Pro ($20) | Tie |
| Heavy (5M+ tokens) | ~$40+ | Pro ($20) | Pro |
| Power (daily Opus usage) | ~$150+ | Max 5x ($100) | Max |
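
A rough break-even point between API billing and a flat subscription can be estimated from a blended token price. The sketch assumes Sonnet's $3 input and $15 output prices from this guide and an even split between input and output tokens; the split is an assumption to adjust for your own usage. Subscriptions also have usage limits that this calculation ignores.

```python
def blended_price(input_share: float,
                  input_price: float = 3.00,
                  output_price: float = 15.00) -> float:
    """Average USD per 1M tokens for a given share of input tokens."""
    return input_share * input_price + (1 - input_share) * output_price


def break_even_tokens(subscription_price: float, input_share: float = 0.5) -> float:
    """Monthly tokens at which API cost equals the subscription price."""
    return subscription_price / blended_price(input_share) * 1_000_000


if __name__ == "__main__":
    print(f"Pro break-even: {break_even_tokens(20.0):,.0f} tokens/month")
```

With an even input/output split, the $20 plan breaks even at roughly 2.2M tokens per month, in line with the Moderate row in the comparison.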

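When API Makes Sense

API billing makes sense for light usage, where your monthly cost stays well under the $20 Pro price (the Light row in the comparison above).

When Max Subscription Makes Sense

A Max subscription makes sense when daily Opus usage would push API costs past the subscription price (the Power row in the comparison above).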

Practical Cost-Saving Strategies

1. Front-Load Your Thinking

Instead of iterating with Claude, think through your requirements first. One well-crafted prompt costs far less than five back-and-forth refinements.

# Expensive approach (5 exchanges):
"Create a script"
"Add error handling"
"Make it handle CSV files"
"Add logging"
"Fix the output format"

# Cheaper approach (1-2 exchanges):
"Create a Python script that:
- Processes CSV files from /data folder
- Handles missing values and malformed rows
- Logs progress to console
- Outputs cleaned data to /output
- Include error handling for file access issues"

2. Batch Related Tasks

Instead of opening separate sessions for closely related work, batch those tasks together while the context is fresh.

3. Use Specific File References

Instead of letting Claude read entire directories, point to specific files:

# Expensive:
"Look at my project and fix the authentication"

# Cheaper:
"Fix the login bug in src/auth/login.ts - the token validation fails on line 45"

4. Set Output Constraints

Ask for concise responses when appropriate:

"Give me a brief answer: what's wrong with this function?"

"List just the file names that need changes, not the full code"

"Explain in 2-3 sentences why this approach is better"

Common Mistakes to Avoid

1. Never Using /compact

Long sessions without compacting can cost 5-10x more than necessary. Get in the habit of compacting every 10-15 exchanges.

2. Using Opus for Everything

At the prices listed in this guide, Opus costs 5x as much as Sonnet and nearly 19x as much as Haiku ($15 vs. $3 and $0.80 per 1M input tokens). Reserve it for tasks that truly need it.

3. Ignoring CLAUDE.md Size

A 2,000-token CLAUDE.md adds $0.006 to every Sonnet request. Over 1,000 requests, that's $6 in hidden costs.

4. Not Tracking Costs

If you're not regularly checking /cost and /usage, you're flying blind. Make it a weekly habit.

5. Staying on Wrong Pricing Plan

Review your usage monthly. If API costs consistently exceed $20, switch to Pro. If Pro limits frustrate you, Claude Max ($100/month) provides 5x the capacity. For professional developers using Claude Code daily, Max 20x ($200/month) is often the best value.

6. Missing Out on Caching and Batch Discounts

Prompt caching can save up to 90% on repeated context, and the Batch API gives a flat 50% discount. If you're not using either, you're leaving money on the table.

Frequently Asked Questions

How do I check my Claude Code usage and costs?

Use the /cost command to see your current session's token usage and estimated cost. Use /usage for a broader view of your usage patterns over time. Both commands help you track spending in real time.

What's the cheapest Claude model for simple tasks?

Claude Haiku 4.5 is the most cost-effective model at $0.80 per million input tokens and $4 per million output tokens. It's nearly 4x cheaper than Sonnet 4.5 for input and perfect for simple tasks like file renaming, basic scripts, and straightforward questions.

How does /compact reduce Claude Code costs?

The /compact command summarizes your conversation history into a condensed form, reducing the token count that gets sent with each new message. This can cut costs by 40-60% in long sessions while preserving essential context.

Is a Claude subscription cheaper than API?

For moderate users, Claude Pro ($20/month) is typically cheaper. The crossover point is around $15-20/month in API usage. For power users, Claude Max ($100/month for 5x capacity or $200/month for 20x capacity) provides predictable costs and priority access to Opus 4.6.

How do I optimize CLAUDE.md to reduce costs?

Keep your CLAUDE.md file concise and focused. Remove redundant instructions, use bullet points instead of paragraphs, and only include essential project context. A bloated CLAUDE.md adds to every single request's token count.
