Understanding Claude Code's Context Window
Every Claude Code session has a context window — roughly 200,000 tokens of memory that holds your entire conversation. Every message you send, every file Claude reads, every tool call result, and every response Claude gives goes into this window. Nothing gets erased between turns. Once it fills up, things start going wrong.
How the context window works
Think of the context window as a notepad with a fixed number of pages. At the start of a session, it's empty. As you work:
- You send a message — that's a few hundred tokens added to the notepad
- Claude reads a file — the entire file content goes onto the notepad (a 500-line file might be 3,000-5,000 tokens)
- Claude calls a tool — the tool's output goes onto the notepad
- Claude responds — the response is written onto the notepad
On every subsequent turn, Claude re-reads the entire notepad to generate its response. This is important: Claude doesn't just look at your latest message — it processes everything in the context window from the beginning, every single time.
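The accumulation described above can be sketched as a toy model (the token counts are illustrative, not measured; real sessions vary):

```python
# Toy model of how the context window accumulates tokens turn by turn.
context = []  # the "notepad": every entry stays for the whole session

def add(label, tokens):
    context.append((label, tokens))
    return sum(t for _, t in context)  # total tokens Claude must reprocess

add("user message", 300)
add("Read src/auth/login.ts", 4000)
add("grep results", 1200)
total = add("assistant response", 800)
# On the next turn, all 6,300 of these tokens are reprocessed again.
```

The key point the model captures: nothing is popped off the list between turns, so the reprocessing cost only grows.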
What takes the most space?
File reads are usually the biggest consumer. A single Read of a large file can add thousands of tokens. If Claude reads the same file multiple times across a session (common during iterative debugging), each read adds a fresh copy. Tool results from search operations, directory listings, and build outputs also accumulate quickly.
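For budgeting purposes, a common rule of thumb is roughly 4 characters per token for English prose and code. This is an approximation, not a real tokenizer, but it makes the cost of a file read easy to estimate:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. A real tokenizer
    # varies by content, but this is close enough for budgeting.
    return len(text) // 4

# A 500-line file at ~40 characters per line:
file_text = ("x" * 40 + "\n") * 500
cost = estimate_tokens(file_text)  # roughly 5,000 tokens for one Read
```

Read that same file three times during a debugging session and the cost triples, since each read adds a fresh copy.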
Why quality degrades as context fills
A packed context window causes two distinct problems:
Before auto-compaction: attention dilution
Claude uses an attention mechanism to decide which parts of the context to focus on. When the context is small and focused, Claude can easily find the relevant information. As the context fills with old file reads, stale tool results, and earlier conversation turns, Claude has to sift through thousands of irrelevant tokens to find what matters.
The result: responses get slower and less accurate. Claude might miss a constraint you mentioned 50 messages ago, or fail to connect two related pieces of information because they're buried in noise. You'll notice this as Claude asking for information it already has, or making mistakes it wouldn't make in a fresh session.
After auto-compaction: information loss
When the context window reaches roughly 80% capacity, Claude Code triggers auto-compaction. This is an automatic process that:
- Takes the entire conversation history
- Summarizes it into a few paragraphs
- Discards the original content
- Continues the session with only the summary
The summary is lossy by design. It can't preserve everything. What typically gets lost:
- Exact file contents — the summary might say "we edited utils.ts" but won't contain the actual code
- Specific instructions — detailed constraints or preferences you set early in the session
- Nuanced decisions — why you chose approach A over approach B
- Error details — exact stack traces or error messages from earlier debugging
After compaction, Claude continues working — but from a summary instead of the real thing. It might re-read files it already read, forget constraints you set, or redo work it already completed.
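The compaction process above can be sketched as a toy model. This is not Claude Code's actual implementation — the threshold, the token heuristic, and the summarization are all simplified for illustration — but it shows why the step is inherently lossy:

```python
# Toy model of auto-compaction. Tokens are approximated as characters / 4;
# "summarization" here is just truncation, standing in for a real summary.
def auto_compact(history, window=200_000, threshold=0.8):
    used = sum(len(msg) // 4 for msg in history)
    if used < threshold * window:
        return history                        # under the threshold: nothing happens
    digest = " | ".join(msg[:60] for msg in history)
    return ["Summary: " + digest]             # originals are discarded entirely

session = ["x" * 400_000, "y" * 300_000]      # ~175,000 tokens: over 80% of 200K
compacted = auto_compact(session)             # a single short summary remains
```

Whatever detail the summary fails to capture — exact code, exact error text — is gone for the rest of the session.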
What fills up context fastest
Not all actions consume context equally. Here's what tends to fill it up, ranked by impact:
- Large file reads — reading a 1,000-line file can cost 5,000-10,000 tokens. Reading it three times during a session costs that three times over.
- Build and test output — a failing build with verbose error output can dump thousands of tokens into context in one shot.
- Search results — grep/ripgrep results across many files add up fast, especially with context lines.
- Long Claude responses — when Claude writes multi-file changes or explains complex logic, the output tokens count too.
- Your messages — usually the smallest contributor, but pasting large code blocks or long instructions adds up.
Manual compaction with /compact
Claude Code provides a /compact command that lets you trigger compaction manually. This is better than waiting for auto-compaction because:
- You choose when — compact at a natural breakpoint, not mid-task
- You can add instructions — run /compact focus on the auth refactor to guide what the summary prioritizes
- You keep more headroom — compacting at 50% gives you a clean slate with room to work, versus auto-compacting at 80% when quality has already degraded
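At the Claude Code prompt, a focused manual compaction looks like this (the focus text is free-form; "the auth refactor" is just an illustrative topic):

```
/compact focus on the auth refactor
```

A specific focus instruction tells the summarizer what to preserve, which matters most when a session covers several threads of work.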
When to use /compact vs. starting fresh
Use /compact when you're mid-task and need to continue but have accumulated too much noise — old file reads, resolved errors, abandoned approaches. Start a fresh session when switching to a completely different task, or when the current session has been compacted already and quality still isn't great.
Practical strategies for managing context
Start focused sessions
Instead of using one long session for everything, start separate sessions for distinct tasks. A session for "fix the auth bug" and another for "add notification support" will each have clean context dedicated to their task.
Be selective about file reads
Point Claude at specific files rather than asking it to explore. "Read src/auth/login.ts" is cheaper than "look through the auth directory for the login logic." Every file Claude reads stays in context for the rest of the session.
Avoid repeated reads
If Claude already read a file earlier in the session, it's still in context — Claude can reference it without reading it again. If you notice Claude re-reading files, that's a sign the context is getting noisy and Claude is losing track.
Watch for quality drops
Signs that context is getting too full: Claude asks for information it already has, makes mistakes it didn't make earlier, gives longer but less precise responses, or starts re-reading files it already processed. These are signals to compact or start fresh.
Use AI Battery to monitor
AI Battery shows real-time context fullness for your 5 most recent Claude Code sessions. You can see at a glance which sessions are getting crowded and need attention — without guessing.
Understanding the color indicators
AI Battery uses a traffic-light system for context health:
- Green (under 60%) — plenty of room. Claude has full recall of everything in this session. Work freely.
- Orange (60-80%) — getting crowded. Quality starts to slip. Consider running /compact or starting a fresh session for new tasks.
- Red (over 80%) — auto-compaction is imminent. Start a new session to get the best results. Continuing here means working from a lossy summary.
Common questions
How many tokens is 200K in practice?
Roughly 150,000 words, or about 500 pages of text. That sounds like a lot, but file reads, tool outputs, and Claude's own responses consume tokens quickly. A busy coding session can fill half the window in 30-60 minutes.
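The conversion uses common rules of thumb — about 0.75 words per token and about 300 words per printed page:

```python
# Back-of-envelope: what does a 200K-token window hold?
tokens = 200_000
words = int(tokens * 0.75)  # ~0.75 words per token -> 150,000 words
pages = words // 300        # ~300 words per page   -> 500 pages
```

These ratios are approximations for English text; code and dense tool output tokenize less favorably.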
Does context reset between sessions?
Yes. Each Claude Code session starts with a fresh, empty context window. Closing a session and opening a new one gives you a full 200K tokens to work with. This is why starting fresh is often better than trying to salvage a packed session.
Can I increase the context window size?
No. The ~200K token limit is set by the model and can't be configured. It's the same across Pro, Max 5x, and Max 20x plans. The only way to get more effective context is to use it more efficiently.
Does /compact lose everything?
Not everything — it creates a summary of the conversation so far. But it is lossy. Exact code contents, specific error messages, and detailed instructions get compressed. The more focused your compaction prompt (/compact focus on X), the better the summary preserves what matters.
How does AI Battery measure context fullness?
AI Battery parses Claude Code's local JSONL conversation logs to count the tokens in each session. It tracks input tokens, output tokens, cache reads, and cache writes to calculate how full each session's context window is. Everything runs locally — no network requests, no account required.
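A measurement like this could be sketched as follows. The JSONL schema assumed here — each line holding a message.usage object with these field names — is an illustration of the approach, not AI Battery's actual code or a documented log format:

```python
import json

def context_fullness(path: str, window: int = 200_000) -> float:
    """Estimate how full a session's context window is from its JSONL log."""
    latest = {}
    with open(path) as f:
        for line in f:
            usage = json.loads(line).get("message", {}).get("usage")
            if usage:
                latest = usage  # the newest request reflects the full context
    # Input tokens plus cached tokens approximate the occupied context.
    used = sum(latest.get(k, 0) for k in (
        "input_tokens",
        "cache_read_input_tokens",
        "cache_creation_input_tokens",
    ))
    return used / window
```

The design choice worth noting: the most recent request's input-side token counts already include the whole conversation, so reading the last usage entry is enough — summing across all entries would count the same context many times over.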
What's the relationship between context health and rate limits?
They're separate concerns. Rate limits control how much total usage you get over time (5-hour and 7-day windows). Context health is about the memory within a single session. You can have plenty of rate limit headroom but a packed context, or vice versa. AI Battery tracks both independently.