Why Your Claude Sessions Keep Breaking (And How to Actually Fix It)

Madison
3 min read · Apr 21, 2026 · Summarizing Nate Herk

If you've ever been deep in a Claude session — code flying, ideas connecting — and then hit that wall where Claude starts forgetting things or the conversation breaks down, Nate Herk just dropped the explanation I didn't know I needed.

His latest video dives into exactly why this happens, and more importantly, how to never let it derail your workflow again.

Your Context Window Is Working Memory — Not Storage

Nate explains that Claude's context window is essentially its working memory for your conversation. While Claude supports up to 1 million tokens, the catch is that 8,000 to 62,000 tokens get consumed just on startup — before you've typed a single word.

Here's where it gets wild: every message you send doesn't just add a little to the conversation. It compounds. By the time you're on message 30, that message costs Claude 31x more processing than your very first one. And 98.5% of those tokens are Claude re-reading your entire conversation history just to answer you.

That compounding is why long sessions start to feel sluggish or context-confused. And it gets worse.
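The compounding above is easy to see with a toy model. The sketch below assumes each new message makes the model re-read the entire prior history (a simplification of how per-turn cost actually accrues; the token count per message is a made-up constant, not a real figure):

```python
# Illustrative model of context compounding: answering message N means
# processing the new message plus all N-1 earlier messages in history.

def tokens_processed(message_number, tokens_per_message=500):
    """Tokens read to answer message N under this simplified model."""
    return message_number * tokens_per_message

def relative_cost(message_number):
    """How much more work message N takes compared to message 1."""
    return tokens_processed(message_number) / tokens_processed(1)

# By message 30, the model re-reads 30 messages' worth of history just
# to answer — roughly the 31x figure once you count the reply it also
# has to generate in that same context.
print(relative_cost(30))  # 30.0
```

The exact multiplier depends on message sizes, but the shape is the point: per-message cost grows linearly, so total work across a session grows quadratically.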

Context Rot Is Real

Nate calls it "context rot" — essentially AI dementia. Research shows Claude's ability to accurately retrieve information from earlier in the conversation drops from 92% accuracy to 78% as the context fills up toward 1 million tokens. Your earlier instructions, decisions, and context? They start slipping.

And if you're relying on Claude's auto-compaction feature — where it summarizes the conversation when it hits capacity — Nate says you're waiting too long. Auto-compaction kicks in at 95% full. By then, you've already lost 70–80% of the fine detail in that compression.

The 60% Rule

His fix? Don't let it get that far. Nate recommends manually compacting at around 60% context usage. For Claude's standard 200K-token window, that sweet spot lands around 120K tokens.
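If you want the checkpoint as code, the 60% rule is one comparison. This is my own hypothetical helper, not part of any Claude tooling; the window size is an assumption you'd swap for your model's actual limit:

```python
# Hypothetical helper for the 60% rule: flag when context usage crosses
# the manual-compaction checkpoint before auto-compaction ever fires.

COMPACT_AT = 0.60  # Nate's suggested manual-compaction point

def should_compact(tokens_used, window_size=200_000, threshold=COMPACT_AT):
    """True once usage crosses threshold * window (120K for a 200K window)."""
    return tokens_used >= window_size * threshold

print(should_compact(110_000))  # False — still under the 120K checkpoint
print(should_compact(125_000))  # True  — time to compact or hand off
```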

Here's how he actually does it:

  • /re command — Rewinds to a previous message and drops everything after it. Good for catching a wrong turn before it compounds further.
  • /clear + paste a summary — Nate's personal go-to. Clear the context entirely, then paste in a clean, hand-written summary of where you are, what decisions have been made, and what files matter.
  • Custom /session-handoff skill — The most elegant version. This skill automatically generates a structured summary of the session state — key files, active decisions, next steps — so you can open a fresh conversation without losing continuity.
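To make the handoff idea concrete, here's a minimal sketch of what a handoff summary generator might produce — the field names and format are my guess at the "key files, active decisions, next steps" structure, not Nate's actual skill:

```python
# Hypothetical session-handoff generator: builds a structured summary
# you can paste into a fresh conversation to restore continuity.

def session_handoff(key_files, decisions, next_steps):
    """Return a markdown handoff block from the session's state."""
    lines = ["## Session Handoff"]
    lines.append("### Key files")
    lines += [f"- {f}" for f in key_files]
    lines.append("### Active decisions")
    lines += [f"- {d}" for d in decisions]
    lines.append("### Next steps")
    lines += [f"- {s}" for s in next_steps]
    return "\n".join(lines)

print(session_handoff(
    key_files=["api/server.py", "tests/test_auth.py"],
    decisions=["Use JWT, not session cookies"],
    next_steps=["Wire up refresh-token rotation"],
))
```

The value isn't the code — it's that the summary is structured and repeatable, so every fresh session starts from the same three questions.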

Sub-Agents Are the Real Power Move

What really clicked for me is the sub-agent model. When you spin up a sub-agent in Claude, it gets its own fresh context window — like handing off a research task to someone who hasn't been on your cluttered call all day.

This is exactly how I've been thinking about structuring longer AI workflows. Instead of one massive session that bloats and degrades, you orchestrate a main agent that delegates sub-tasks to fresh agents. Each one operates clean.

It's the difference between one person trying to hold everything in their head all day versus a small team where each person owns their lane.
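The orchestration pattern can be simulated in a few lines. The "agent" below is a stub standing in for a model call — in a real setup each sub-agent would be a new Claude session seeded only with its task brief — but it shows the key move: sub-agents never inherit the main context, and the orchestrator keeps only their compact results:

```python
# Toy illustration of the sub-agent pattern: fresh context per task,
# with only the result flowing back to the orchestrator's context.

def run_agent(task, context):
    # Stand-in for a model call; a real sub-agent would be a new
    # session whose context contains only this task.
    context.append(task)
    return f"result of: {task}", context

def orchestrate(tasks):
    main_context = ["project brief"]
    results = []
    for task in tasks:
        # Each sub-agent starts clean: only the task, never main_context.
        result, _sub_context = run_agent(task, context=[])
        results.append(result)
        main_context.append(result)  # keep the summary, drop the transcript
    return results, main_context

results, ctx = orchestrate(["research pricing", "analyze churn data"])
print(len(ctx))  # 3 — the brief plus two compact results, no transcripts
```

The main context grows by one summary per task instead of one full transcript per task, which is exactly what keeps it under the 60% line.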

Why This Changes How I Work

I'm not going to pretend I wasn't hitting this problem constantly. Long coding sessions, long content planning sessions — I kept noticing Claude drifting. Restating decisions I'd already made. Missing context from earlier in the same conversation.

Understanding that it's a mathematical certainty — not a Claude bug — changes how I plan sessions entirely. I'm setting a 60% checkpoint now, building clean handoff prompts, and using sub-agents far more aggressively for research and analysis tasks.

If you use Claude for anything beyond quick questions, Nate's framework is worth understanding. The session limits aren't the enemy. Running out of clean context without a recovery plan is.

ai · claude · context window · productivity · session limits · agents