Managing the Context Window
compact, clear, microcompact, and the 60% rule
- Explain the 200K window, what eats into it, and why ~150-170K is actually usable
- Read a context breakdown with /context [all] and a cost breakdown with /usage
- Choose correctly between /compact (summarize, keep instructions) and /clear (reset entirely)
- Distinguish microcompaction, auto-compaction, and a manual /compact — and when each fires
- Adopt the 60% rule and protect critical context by telling CLAUDE.md what to preserve
Context is the fundamental constraint in Claude Code: a 200K-token window that degrades in quality long before it runs out. This lesson teaches the visibility tools (/context, /usage), the compression tools (/compact, /clear, microcompaction, auto-compaction), and the single best habit — compact early at ~60%, not at 95%.
- 1The mental model: context is a budget, and quality drops before it's full
- 2What's actually in the 200K — and the 2026 buffer change
- 3Seeing the budget: /context and /usage
- 4Compact vs. clear: compress or reset
- 5Three kinds of compaction: micro, auto, and manual
- 6The 60% rule: compact early, on your terms
- 7Protecting critical context across compaction
The mental model: context is a budget, and quality drops before it's full
Everything Claude Code knows in a session lives in one place: the context window. It is a fixed-size budget of tokens that holds the system prompt, your tool definitions, every loaded CLAUDE.md and skill, all the files Claude has read, every command it has run, and the entire back-and-forth of your conversation. When that budget is gone, something has to be dropped or compressed.
The number to anchor on is 200K tokens — roughly 1.6 million characters. But you never get all of it for your conversation. The system prompt, tool schemas, your memory files, and a safety buffer all reserve space up front, so the usable window for actual work is closer to 150K-170K tokens.
Here is the part most people miss: the real enemy is quality, not size. Performance degrades as context fills, well before you hit the hard limit. Raw command logs, the same file read five times, a long-finished debugging tangent — all of it stays in the window competing for the model's attention with the three lines that actually matter right now. A window that is 70% full of noise is worse than one that is 40% full of signal. Managing context is really about keeping the signal-to-noise ratio high, not just avoiding the ceiling.
200K-TOKEN WINDOW (~1.6M chars)
├── System prompt + tool definitions ┐
├── CLAUDE.md / memory / skills ├─ reserved overhead
├── Safety buffer (~33K) ┘
└── YOUR WORKING SPACE (~150-170K usable)
files read · commands run · the conversation
↑ degrades in QUALITY as it fills, not just at the limitKey insight
Size is the ceiling; quality is the floor
You will feel context pressure as sloppier answers, forgotten earlier decisions, and re-reading of files long before /context shows you near 100%. Treat a window that is mostly stale logs and repeated reads as already 'full' for practical purposes, even if there's headroom on paper.
What's actually in the 200K — and the 2026 buffer change
The window is divided into fixed overhead plus your working space. Knowing the split tells you where your budget actually goes.
| Region | What lives here | Roughly |
|---|---|---|
| System prompt + tools | Claude Code's instructions and tool schemas | fixed overhead |
| CLAUDE.md / memory / skills | Your loaded instruction and memory files | fixed overhead |
| Safety buffer | Reserved headroom near the limit | ~33K (16.5%) |
| Working space | Files, command output, the conversation | ~150-170K usable |
A notable 2026 change: the reserved safety buffer was reduced to about 33K tokens (16.5%) of the window. The previous, larger buffer left less room for work; the smaller one gives you roughly 12K more usable tokens than the 2025 default — a free upgrade in working space without changing the 200K headline number.
Note that some configurations support a larger window (up to 1M tokens), but the 200K window is the default you should plan around. Don't assume the big window; design your workflow for 200K and treat anything larger as a bonus.
Note
The overhead is mostly yours to tune
You can't shrink the system prompt, but the CLAUDE.md / skills / memory slice is under your control. A bloated 400-line CLAUDE.md or a pile of always-loaded skills eats working space on every single turn. /context shows exactly how much each is costing — covered next.
Seeing the budget: /context and /usage
You can't manage what you can't see. Two commands give you the full picture — one for tokens, one for money.
/context [all] renders a colored grid of where your tokens are going right now: the system prompt, tool definitions, CLAUDE.md, skills, MCP servers, and the conversation itself, each as a slice of the 200K. It also surfaces optimization tips. Run it whenever a session starts feeling heavy — it's how you find the surprise 30K skill or the screenshot that quietly ate a quarter of your window. Pass all for the fully expanded breakdown.
/context # quick token breakdown + tips
/context all # fully expanded view of every contributor/usage (aliases /cost, /stats) reports the cost and consumption side: this session's spend, your plan usage against limits, and — critically — a breakdown by skill, subagent, plugin, and MCP server. That per-source breakdown is how you discover that one chatty MCP server or one runaway subagent is responsible for most of your token (and dollar) burn.
/usage # session cost + plan usage + per-source breakdownThink of it as two dashboards: /context answers "what is filling my window right now?" and /usage answers "what is this costing me, and which component is to blame?"
Tip
Make /context a reflex
Run /context at the start of a fresh task and again when answers start drifting. The colored grid makes outliers obvious instantly — an image-heavy section, an MCP server loading huge schemas, or a CLAUDE.md that has grown past its useful size. You fix what you can see.
Compact vs. clear: compress or reset
When the window gets heavy you have two fundamentally different tools. Picking the wrong one either loses context you needed or keeps clutter you wanted gone.
/compact [instructions] summarizes the conversation. Claude replaces the long history with a compressed summary that preserves the important parts — code patterns, file states, key decisions — and keeps your CLAUDE.md, skills, and memory intact. You can steer it with an optional focus, e.g. /compact focus on the auth refactor and the failing tests, which tells Claude what to prioritize keeping. Use compact when you want to continue the same task with a lighter, sharper context.
/clear (aliases /reset, /new) resets the context entirely — an empty window, as if you just started. It does not undo file edits (your code on disk is untouched) and the prior conversation is still resumable via /resume. Use clear when you're switching to an unrelated task; carrying the old conversation's residue into new work is the classic 'kitchen-sink' anti-pattern that bloats context for no benefit.
/compact [focus] | /clear | |
|---|---|---|
| Effect | Summarizes history into a compressed form | Empties the context window |
| Keeps CLAUDE.md / skills / memory? | Yes | Reloads fresh (they re-inject) |
| File edits on disk | Untouched | Untouched |
| Prior conversation | Replaced by summary (still on disk) | Resumable via /resume |
| Use when | Continuing the same task | Starting an unrelated task |
The rule of thumb: same task, getting heavy → compact. New, unrelated task → clear.
Example
Two everyday calls
Compact: You've spent an hour implementing a feature; the window is full of file reads and test output but you're not done. Run /compact focus on the remaining TODOs and the API contract to keep going lighter.
Clear: You finish the feature and now need to fix an unrelated bug in a different module. Run /clear and start clean — there's no reason for the new task to inherit the old one's history.
Watch out
Compaction can drop hard boundaries
A summary is lossy. Instructions you only said in chat — like 'don't push to main' — can be summarized away. For guarantees that must survive compaction, put them in a deny permission rule or a hook, not just a message. Compaction is for context, not for enforcing rules.
Three kinds of compaction: micro, auto, and manual
Compaction isn't one thing — three distinct mechanisms operate at different times, and understanding the differences is what separates fumbling from control.
Microcompaction clears stale tool results — old command output and file reads that are no longer relevant — without making a model call. It's cheap, automatic, and surgical: it trims the low-value clutter (the noise) while leaving your conversation and decisions fully intact. No summarization, no cost, no lost reasoning.
Auto-compaction kicks in automatically as you approach the limit. It summarizes the conversation — code patterns, file states, key decisions — into a compressed form so the session can keep going. It's the safety net that stops you from hitting a hard wall mid-task, but because it fires near 95% full and you don't control its timing, the summary is often coarser and the model is already operating in a degraded, crowded window.
Manual /compact is you choosing the moment. And the moment matters enormously.
| Mechanism | Trigger | Model call? | What it does |
|---|---|---|---|
| Microcompaction | Automatic, ongoing | No | Clears stale tool results (old logs, repeated reads) |
| Auto-compaction | Automatic, near the limit (~95%) | Yes | Summarizes the conversation to keep going |
Manual /compact | You run it (best at ~60%) | Yes | Summarizes on your terms, with optional focus |
Key insight
Microcompaction trims noise; auto/manual compaction summarizes signal
Microcompaction is free and conservative — it only removes tool results that have gone stale, so it never costs you a model call or your reasoning. Auto- and manual compaction are the heavier move: they replace conversation history with a summary. Knowing which is which tells you why some context shrinks silently (micro) and some shrinks with a noticeable 'compacting...' step (auto/manual).
The 60% rule: compact early, on your terms
This is the single most valuable habit in the lesson. Run /compact manually at around 60% full — not at 95%.
Why early beats late:
- Sharper summaries. At 60%, Claude has plenty of headroom to write a thorough, well-organized summary. At 95%, auto-compaction is summarizing under pressure in an already-degraded window — the result is coarser and loses more.
- Cheaper. Summarizing a 60%-full window costs less than summarizing a 95%-full one, and you avoid the quality tax of working in a crowded window in the meantime.
- You control the focus. A manual
/compact focus on Xlets you tell Claude exactly what to preserve. Auto-compaction makes that call for you, and it can guess wrong. - No surprise stalls. Waiting for auto-compaction means it fires mid-task at an unpredictable moment. Compacting proactively keeps you in command of when the pause happens.
The practical loop: keep /context handy, and when working space crosses roughly 60%, run /compact with a focus instruction. Treat auto-compaction as the emergency backstop you'd rather never hit — not your default behavior.
0% ────────── 60% ────────── 95% ───── 100%
▲ ▲
compact HERE auto-compaction
(manual, sharp) (backstop, coarse)Tip
Pair compacting with clearing
The two habits work together. Within a long single task, /compact at ~60% to stay sharp. Between unrelated tasks, /clear so the next task starts clean. Compact keeps the same work lean; clear stops cross-task contamination. Most heavy users do both, constantly.
Protecting critical context across compaction
Because any summary is lossy, you sometimes need to guarantee that a specific piece of context survives a compaction. The lightweight way to do this is to tell CLAUDE.md what to keep.
Add a line to your project's CLAUDE.md in the form 'when compacting, preserve X' — for example, naming the API contract, the migration plan, or the invariant the whole task depends on. Because the project-root CLAUDE.md is re-injected from disk after a /compact, that instruction is present when Claude builds the summary, nudging it to retain what you flagged.
## Context management
- When compacting, preserve: the database migration order, the
public API contract in api/contract.md, and any decision marked DECISION:.
- Do not summarize away open TODO items; keep them verbatim.This is advisory, not a hard guarantee — CLAUDE.md is delivered as context, so compliance is a strong nudge rather than a lock. For things that absolutely must not be lost or violated (like 'never push to main'), reach for a deny permission rule or a hook instead, which are enforced deterministically. Use the CLAUDE.md preserve-line for content you want carried forward, and rules/hooks for boundaries that must hold.
Watch out
Advisory, not enforced
A 'when compacting, preserve X' line increases the odds X survives — it does not guarantee it. Critical, non-negotiable constraints belong in deny rules or hooks. Keep the CLAUDE.md preserve-line for important information, not for safety boundaries.
Try it: Watch the budget, then take control of it
Practice seeing and shaping the context window on a real session.
- Baseline: start
claudein a real project and immediately run/context. Note the slices — system prompt, tools, CLAUDE.md, skills, MCP. Record roughly how much working space you have before doing anything. - Fill it: ask Claude to read several files and run a few commands (a build, a test run, a
git log). Re-run/contextand watch the conversation slice grow. Notice raw command output piling up. - Find the cost: run
/usageand read the per-source breakdown. Identify which skill, MCP server, or subagent (if any) is the biggest single contributor. - Compact early: once working space is around 60%, run
/compact focus on the current task and any open TODOs. Re-run/contextand confirm working space dropped — and that your CLAUDE.md/skills are still loaded. - Protect context: add a line to the project
CLAUDE.mdsuch asWhen compacting, preserve: the list of files we have changed and any DECISION: notes.Force another/compactand check whether that information survived in the summary. - Reset cleanly: when you switch to a different, unrelated task, run
/clearand confirm with/contextthat the window is fresh — then verify your earlier conversation is still reachable via/resume.
Deliverable: a short note recording (a) your usable working space at baseline, (b) the single biggest contributor /usage surfaced, (c) the before/after working-space numbers around your 60% compact, and (d) whether your 'when compacting, preserve X' line actually survived — and one sentence on when you'd reach for a hook or deny rule instead.
Key takeaways
- 1The context window is 200K tokens (~1.6M chars); after system prompt, tools, memory, and a buffer, ~150-170K is usable for actual work.
- 2Degradation is about quality, not just size — stale logs, repeated reads, and forgotten history compete with what matters, so high signal-to-noise beats raw headroom.
- 3`/context [all]` shows the token breakdown; `/usage` (`/cost`, `/stats`) shows cost plus a per-skill/subagent/plugin/MCP breakdown.
- 4`/compact [focus]` summarizes and keeps CLAUDE.md/skills/memory; `/clear` resets entirely (file edits preserved, prior conversation resumable) — compact for the same task, clear for a new one.
- 5Microcompaction clears stale tool results with no model call; auto-compaction summarizes near the limit; the best practice is a manual `/compact` at ~60%, not 95%.
- 6The 2026 buffer dropped to ~33K (16.5%), giving ~12K more usable; protect must-keep context with a 'when compacting, preserve X' line in CLAUDE.md (advisory — use deny rules/hooks for hard boundaries).
Quiz
Lock in what you learned
Check your understanding
0 / 4 answered
1.Your session is about 70% full, mostly with long command logs and the same three files read several times, and Claude is starting to forget earlier decisions. Why is this a problem even though /context shows headroom remaining?
2.You've just finished an authentication feature and now need to fix a completely unrelated bug in the billing module. Which command best prepares your context, and what happens to your work?
3.What is the key difference between microcompaction and auto-compaction?
4.Why does the best practice say to run /compact manually at around 60% rather than letting auto-compaction handle it at ~95%?
Go deeper
Hand-picked sources to keep learning
The 200K window, what fills it, /context, /compact, /clear, microcompaction, and auto-compaction.
How /usage reports session cost, plan usage, and the per-skill/subagent/plugin/MCP breakdown.
Deep dive on how compaction works in practice and why compacting early produces better results.
Where to add a 'when compacting, preserve X' line and how CLAUDE.md re-injects after /compact.
Exact syntax and aliases for /context, /usage (/cost, /stats), /compact, and /clear (/reset, /new).
Source, issues, and changelog tracking context-window and compaction behavior across versions.