Claude Code compact usually refers to using Claude Code in a more token-efficient way so your coding session keeps the important context and drops or compresses the rest; if you are trying to save context tokens, the practical goal is to reduce what Claude has to carry forward on each turn. c-ai.chat is an independent guide to Claude, not Anthropic, and this page explains how compact-style workflows work, when to use them, and what trade-offs to expect. For the bigger picture, see our Claude Code guide.

- The short answer
- How it works
- What you’d actually do with it
- Vs. the alternatives
- Other questions readers ask
- The honest take
The short answer

Claude Code compact is a shorthand for trimming, summarising, or otherwise reducing the amount of conversation and code context Claude Code needs to keep in memory during a coding session. It is for developers who hit context limits, pay attention to token cost, or want Claude Code to stay focused on the current task instead of dragging along every previous file, error log, and dead-end idea.
- What it does · reduces carried context so sessions stay efficient
- Where it runs · in Claude Code workflows tied to Claude models and the Claude ecosystem
- What it costs · no separate compact fee; usage depends on your Claude plan or API token use
- Who it’s for · developers working on longer code sessions, larger repos, or multi-step debugging
There is no separate public Claude pricing line item called “compact.” The cost question is really about the model and access path you use. On the API side, Claude Sonnet 4.6 is priced at $3 per million input tokens and $15 per million output tokens, Claude Haiku 4.5 at $1 and $5, and Claude Opus 4.7 at $5 and $25. Prompt caching can cut cached input cost by 90%, which matters when you keep reusing the same codebase context. You can review official plan details on Claude pricing and broader options in our Claude pricing guide.
How it works

The basic mechanism is simple: Claude Code performs better when it sees the right context, not all context. In a long session, token usage grows because the model may need earlier instructions, file excerpts, diffs, stack traces, command output, and previous decisions. A compact workflow reduces that load by keeping the parts that still matter and compressing or discarding the rest.
In practice, developers do this in a few ways. They restate the current objective in one sentence, provide only the files or functions that matter, ask Claude to summarise previous work into a short handoff note, or start a fresh thread with a compact brief instead of continuing a sprawling session. If you are using API-based tooling, prompt caching can also help because repeated project context does not need to be paid for at the full rate every time. For related model and access details, see our Claude API overview and Claude features page.
Compact does not mean “remove all detail.” The goal is selective retention. Keep stable facts such as architecture, coding standards, open bugs, and the exact task. Drop bulky logs, repeated explanations, and code that Claude no longer needs to reason about the next step.
-
State the current task
Start with one tight instruction such as
Refactor the auth middleware to support role-based checks without changing the public API. -
Attach only the relevant context
Pass the specific files, functions, error messages, or test failures involved now. Avoid dumping the whole repository if only three files matter.
-
Ask for a compressed handoff
Use a prompt like
Summarise what we changed, what is still broken, and which files matter in under 200 words.That summary becomes your next-session context. -
Start fresh when the thread gets noisy
Open a clean session and paste the compact handoff instead of carrying every previous exchange forward.
This approach matters more as context windows get larger, not less. Claude supports large context on certain models, including up to 1,000,000 tokens on Claude Opus 4.7, Opus 4.6, and Sonnet 4.6 at standard rates according to Anthropic’s published pricing and model documentation. A big context window is useful, but bigger windows also make it easier to be sloppy. Compacting keeps the model attentive and your spend more predictable.
90% off
cached input tokens with prompt caching
If you are building your own Claude-powered coding workflow, this cost optimisation is often more important than chasing the largest model. A clean reusable project prompt plus caching is usually better than sending a bloated prompt on every call.
What you’d actually do with it
Here are concrete cases where a compact workflow helps. These examples are less about a hidden feature and more about how skilled users keep Claude Code useful across longer sessions.
1. Compress a debugging session before it sprawls
Say you have spent 30 minutes pasting stack traces, trying two fixes, and discussing edge cases. Instead of continuing in the same overloaded thread, ask for a compact handoff:
Summarise this debugging session for a fresh Claude Code run.
Include:
- root cause we suspect
- fixes already attempted
- files involved
- exact failing test
- next best step
Keep it under 180 words.
You then start a clean session with that summary plus the failing test output. The benefit is not only lower token use. It also reduces the chance that Claude keeps anchoring on abandoned hypotheses.
2. Refactor one module without dragging in the whole repo
Many developers overshare context. If the task is isolated, make the scope explicit:
Work only on these files:
- src/auth/middleware.ts
- src/auth/policies.ts
- tests/auth.middleware.spec.ts
Ignore unrelated services unless a dependency is required.
Goal: add role hierarchy checks and keep current route signatures unchanged.
This is compacting by boundary-setting. You are telling Claude what not to think about. That often improves answer quality as much as it lowers context growth.
3. Create a reusable project brief for repeat work
If you work on the same application every day, create a stable short brief and reuse it:
Project brief:
- Next.js app with TypeScript
- PostgreSQL via Prisma
- Stripe subscriptions
- We prefer small pure functions and explicit types
- Write tests with Vitest
- Do not change public API routes without asking
Current task:
Fix duplicate webhook handling in billing sync.
This keeps each new Claude Code session compact from the start. If you are using API workflows, repeated stable context can pair well with caching.
4. Estimate token cost for a longer coding loop
Suppose you repeatedly send a 200,000-token project context and get back 20,000 tokens of output on Claude Sonnet 4.6. The raw cost without caching would be straightforward to estimate from Anthropic’s published token pricing.
Worked example
Five long Sonnet 4.6 coding turns without caching
A compact workflow can lower repeated input, and prompt caching can reduce cached input cost much further.
The point is not that $4.50 is always expensive. The point is that repeated unnecessary context compounds quickly in real engineering work.
5. Hand off a session to yourself or a teammate
Compact summaries are useful even outside solo prompting. Ask Claude Code for a strict handoff note:
Create a handoff note for the next engineer.
Format:
1. Goal
2. Current status
3. Files changed
4. Open questions
5. Next command to run
Keep it concise and factual.
This is helpful on Team or Enterprise setups where several people may touch the same workspace. Anthropic’s team plans add shared workspace and admin features, but clear compact handoffs still matter more than tooling alone.
Pick when
- Your session is getting long and repetitive
- You want lower token use
- You need Claude to focus on the current bug or refactor
- You plan to restart with a clean prompt
Skip when
- The task depends on broad cross-repo context
- You have not yet identified which files matter
- The omitted history contains decisions Claude still needs
- You are compressing so aggressively that key constraints disappear
Vs. the alternatives
People searching for “claude code compact” are often comparing Claude-based coding workflows with tools like Cursor, GitHub Copilot, or Sourcegraph Cody. The real comparison is not just model quality. It is how well each option manages context, how much control you get, and whether you want a chat-first workflow or a heavily integrated editor experience.
| Option | Where it fits | Strengths | Trade-offs |
|---|---|---|---|
| Claude Code with compact workflow | Developers who want deliberate control over context | Strong reasoning, flexible summaries, works well for handoffs and long tasks | Requires more prompt discipline; compacting is partly a user habit, not magic |
| Cursor | Editor-centric coding with strong inline assistance | Tight IDE workflow, convenient codebase operations | May feel more opinionated; context handling is less transparent to some users |
| GitHub Copilot | Fast completions and common coding help inside popular editors | Low friction, broad adoption, good for short iterative work | Less suited to long structured reasoning unless paired with careful prompting |
| Sourcegraph Cody | Code search and repository-aware assistance | Useful for understanding larger codebases | Value depends heavily on your code search and repo workflow |
| Direct Claude API workflow | Teams building custom coding agents or internal tools | Fine control over prompts, caching, batching, and model choice | More engineering work to build and maintain |
The honest trade-off is this: Claude-style compact workflows reward explicit thinking. If you want the assistant to infer everything from a giant repository plus a vague request, a more IDE-opinionated tool may feel easier. If you want to control what context is in play and why, Claude is often a better fit.
Other questions readers ask
If you are deciding between consumer plans and developer usage, our pricing guide and API page cover the split in more detail. If you want the broader product view first, start from the c-ai.chat homepage.
The honest take
Claude Code compact is not a magic button that fixes bad prompts. It is a practical way of working: keep the current objective clear, keep the relevant files close, compress the history, and restart cleanly when a session gets noisy. If you regularly work on multi-step coding tasks, this habit can improve focus, reduce token waste, and make Claude more consistent from one session to the next.
For most developers, the best setup is simple: use Claude Sonnet 4.6 as the default, move to Opus 4.7 when the reasoning load is higher, and treat compact summaries as part of your normal workflow. If you need official access, use claude.ai for the product and Anthropic’s documentation for the technical details; if you want independent explanations, comparisons, and context, keep this page alongside our Claude Code guide.
Independent guide. Not affiliated with Anthropic. For the official Claude product, visit claude.ai.
Last updated: 2026-05-12






