Claude Code's Hidden Ultra Plan Mode Runs 4 Agents In Parallel In The Cloud

Anthropic quietly shipped Ultra Plan — a cloud-hosted Claude Code planning mode that spins up 3 exploration agents and 1 critique agent in parallel using Opus 4.6. Nate Herk's tests: ultra plan finished in 5 minutes what local plan still hadn't finished after 45.

The VIP Desk

5 min read·May 13, 2026·Summarizing Nate Herk


Anthropic dropped a feature into Claude Code with almost no announcement: Ultra Plan, a cloud-hosted planning mode that runs Opus 4.6 in a multi-agent orchestration on Anthropic's infrastructure, then ships the plan back to your terminal. Nate Herk just stress-tested it and the speed delta is hard to ignore.

Ultra Plan finished in ~5 minutes what local plan mode still hadn't finished after 45 minutes on the same prompt.

Not a typo. Nate had to sit down to record the reveal because he couldn't believe how long the local version took. Here's what's under the hood and why this is worth turning on tomorrow.

The architecture under the slash command

A reverse-engineering pass through the source code, docs, and community forums surfaced this comparison:

Dimension	Local plan	Ultra Plan
Where it runs	Your terminal	Anthropic's cloud container runtime
Model	Whatever your session is using	Always Opus 4.6
Planning approach	Single agent, linear thinking	3 parallel exploration agents + 1 critique agent
Max compute time	Bounded by your session	30-minute cap in the cloud
Terminal	Blocked while planning	Free — you can keep working
Review surface	Text in terminal	Doc-style web UI with comments, emojis, diagrams

The killer detail buried in there: multi-agent exploration. Local plan mode thinks linearly through a problem. Ultra Plan spawns three Opus 4.6 agents that explore the project in parallel, plus a critique agent that grades and merges the proposals. That's why it's faster and the plan is better.
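To make the fan-out and fan-in shape concrete, here is a minimal sketch of the pattern described above: three explorers run concurrently, then one critique step merges their proposals. Everything in it is hypothetical. The names `explore`, `critique_and_merge`, and `ultra_plan_sketch` are invented for illustration, the agents are stubs that never call a model, and nothing here reflects Anthropic's actual implementation.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Proposal:
    agent: str
    steps: list[str]


async def explore(agent: str, task: str) -> Proposal:
    """Hypothetical stand-in for one exploration agent (no model call)."""
    await asyncio.sleep(0.01)  # simulate independent work
    return Proposal(
        agent,
        [f"{agent}: inspect repo for '{task}'", f"{agent}: draft approach"],
    )


def critique_and_merge(proposals: list[Proposal]) -> list[str]:
    """Hypothetical critique step: keep each distinct step, in proposal order."""
    merged: list[str] = []
    for proposal in proposals:
        for step in proposal.steps:
            if step not in merged:
                merged.append(step)
    return merged


async def ultra_plan_sketch(task: str, explorers: int = 3) -> list[str]:
    # Fan out: the explorers run concurrently, so wall time is roughly one
    # explorer's latency rather than the sum of all of them.
    proposals = await asyncio.gather(
        *(explore(f"explorer-{i + 1}", task) for i in range(explorers))
    )
    # Fan in: a single critique pass reduces the proposals to one plan.
    return critique_and_merge(list(proposals))


if __name__ == "__main__":
    for line in asyncio.run(ultra_plan_sketch("revenue dashboard")):
        print(line)
```

The point of the sketch is the shape, not the stubs: `asyncio.gather` is what makes the exploration parallel, and the merge step is where a real critique agent would grade and reconcile competing proposals.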

The slash command and how it works

/ultraplan build me a dashboard with revenue, churn, customer breakdown...

Or just include the words "ultra plan" anywhere in your prompt — Claude Code highlights the phrase and asks if you want to send it to the cloud. The flow:

1. CLI sends prompt to cloud
2. Cloud spins up 3 Opus 4.6 exploration agents + 1 critique agent
3. They explore your synced Git repo, propose plans, critique, merge
4. Web UI opens with the final doc-style plan + optional diagrams
5. You comment on specific sections, add emoji reactions, send revisions back
6. Approve → "teleport back to terminal" → execution happens locally

The terminal stays free during the cloud planning step. You can keep working in the same session while Ultra Plan churns in the background, then catch the plan when it lands.

The execution speedup nobody can fully explain yet

Nate's dashboard build, side by side:

Ultra Plan version: plan in ~5 min, build in ~10 min = ~15 min total
Local plan version: plan took 15+ min, build took 30+ min = ~45 min total
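The two lines above reduce to simple ratios. This small calculation uses only the minutes reported in the article. Because the local figures were given as "15+" and "30+", each ratio is a lower bound rather than an exact measurement.

```python
# Minutes reported in the article. The local figures were "15+" and "30+",
# so every ratio computed below is a lower bound.
ULTRA = {"plan": 5, "build": 10}
LOCAL = {"plan": 15, "build": 30}


def speedup(local_min: float, ultra_min: float) -> float:
    """How many times faster the Ultra Plan run was for one phase."""
    return local_min / ultra_min


def summarize() -> dict[str, float]:
    ratios = {phase: speedup(LOCAL[phase], ULTRA[phase]) for phase in ULTRA}
    ratios["total"] = speedup(sum(LOCAL.values()), sum(ULTRA.values()))
    return ratios


if __name__ == "__main__":
    for phase, ratio in summarize().items():
        print(f"{phase}: at least {ratio:.1f}x faster")
```

On these figures the gain is at least 3x in planning, in building, and overall.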

Parallel exploration explains the faster planning. But Nate noticed something stranger: the local build phase was also faster when it ran from the Ultra Plan output than when it ran from the locally generated plan. Same execution model, same machine, different plan. The Ultra Plan version executed cleanly; the local version backtracked, asked clarifying questions, and seemed to grope its way through.

The hypothesis: a multi-agent-critique plan is structurally clearer (less ambiguity, less hedging, less "I'll figure that out when I get there") and that clarity propagates through the execution phase. You're paying for better planning and getting better execution as a free bundled improvement.

Token cost — about 1% of a Max 20× session per ultra plan

The planning compute happens in the cloud and doesn't show up in /cost. Nate ran an API-billing experiment to measure: a local plan cost ~1.5M input tokens + 23K output tokens. Ultra Plan eats roughly 1% of his Max 20× plan budget per invocation — measurable but not crazy.

The trade-off you should internalize: Ultra Plan trades 50K-ish extra cloud tokens for roughly 3× faster wall time in Nate's test (~15 minutes versus ~45) and a cleaner downstream build. For anything substantive (dashboards, refactors, multi-file features), that's the obvious win. For small tweaks where local plan would finish in 30 seconds, save the budget.
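If you want to put rough numbers on the budget side, the arithmetic is short. The token counts and the 1% figure below come from Nate's experiment as reported above. The per-million-token prices are placeholders chosen for illustration, not Anthropic's published rates, so substitute the current prices before trusting the dollar figure.

```python
def plan_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """API cost of one plan, given prices in USD per million tokens."""
    return (
        (input_tokens / 1_000_000) * input_price_per_m
        + (output_tokens / 1_000_000) * output_price_per_m
    )


def runs_per_budget(percent_per_run: float) -> float:
    """How many invocations fit in one budget at a given percent per run."""
    return 100 / percent_per_run


# Figures reported in the article for the local plan.
LOCAL_INPUT_TOKENS = 1_500_000
LOCAL_OUTPUT_TOKENS = 23_000

# PLACEHOLDER prices (assumption, not taken from the article).
ASSUMED_INPUT_PRICE = 5.0
ASSUMED_OUTPUT_PRICE = 25.0

if __name__ == "__main__":
    cost = plan_cost_usd(
        LOCAL_INPUT_TOKENS, LOCAL_OUTPUT_TOKENS,
        ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE,
    )
    print(f"local plan at placeholder prices: ${cost:.2f}")
    print(f"ultra plans per budget at 1% each: {runs_per_budget(1.0):.0f}")
```

At about 1% per invocation, a full budget covers on the order of a hundred Ultra Plan runs, which is why it only makes sense to ration it on trivial tasks.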

Three gotchas

Requires a Git/GitHub sync. Cloud agents need to see your codebase, and they pull it from the synced repo. Local-only projects → "You can't do this on the web yet." Init Git, push, retry.
CLI only. Ultra Plan doesn't fire in the desktop app or VS Code extension. Run from the terminal or it silently does nothing.
Skills don't always auto-invoke. Ultra Plan reads your project but sometimes ignores skills it doesn't know to look for. Nate had to tell it to use his visualization skill — the cloud agents didn't see it the first time. Fix: name the skill explicitly in your prompt ("use my visualizations skill") rather than expecting auto-discovery.
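The first gotcha is easy to check before you spend a prompt on it. This is a hedged preflight sketch: it uses two real Git commands (`git rev-parse --is-inside-work-tree` and `git remote`), but the function names and the "ready" labels are invented here. A configured remote does not prove your latest commits are pushed, and it is an assumption that this is the same condition Claude Code checks.

```python
import subprocess
from typing import Optional


def git_output(*args: str) -> Optional[str]:
    """Run a git command in the current directory; None if it fails."""
    try:
        result = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None
    return result.stdout.strip()


def sync_readiness(inside_repo: bool, remotes: list[str]) -> str:
    """Classify a project using hypothetical labels for this sketch."""
    if not inside_repo:
        return "no-repo"      # run `git init` first
    if not remotes:
        return "no-remote"    # add a remote and push
    return "ready"


def check_current_directory() -> str:
    inside = git_output("rev-parse", "--is-inside-work-tree") == "true"
    remotes = (git_output("remote") or "").split()
    return sync_readiness(inside, remotes)


if __name__ == "__main__":
    print(check_current_directory())
```

Keeping the classification in `sync_readiness` separate from the subprocess calls makes the rule easy to test without touching a real repository.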

The Abraham Lincoln rule for this

Nate quoted Lincoln on this one and it's the right frame:

"Give me 6 hours to chop down a tree, and I will spend the first four sharpening the axe."

Ultra Plan is sharpening the axe. The token cost up front is buying you better cuts downstream. The local-plan habit most people fall into — "I'll just describe the feature and let Claude figure it out" — is the equivalent of starting to chop with a dull axe and wondering why everything takes forever.

Where this product is going

Ultra Plan is currently a research preview. That status alone tells you Anthropic isn't done. The features that will likely show up next:

/cost visibility for cloud planning so you can see exactly how many tokens an Ultra Plan burned
Better skill auto-discovery so the cloud agents pull from your full skill library without prompting
Tighter integration with Managed Agents so you can hand a plan directly to a managed-agent runtime to execute on a schedule
The agent-team pattern made explicit — Anthropic exposing the critique agent's reasoning would help builders understand why a plan was structured a certain way

If any of those land, Ultra Plan moves from "interesting research preview" to default-on for serious work.

What to actually do this week

Try /ultraplan on your next non-trivial feature — a refactor, a dashboard, a multi-file change. Time it against your usual local plan and decide.
Stop using local plan for anything that takes more than 2 minutes to plan. That's the threshold where Ultra Plan's wall-time win pays off.
Init Git on any Claude Code project you haven't synced. Even if you don't need Ultra Plan yet, the cloud-mode features now require it as a baseline.
Name your skills explicitly in Ultra Plan prompts until skill auto-discovery improves. "Use my X skill" is one sentence and prevents a 5-minute do-over.
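If you script your prompts, the last tip is a one-liner to automate. The helper below is a hypothetical convenience, not part of Claude Code: it just assembles the `/ultraplan` prefix and the "Use my X skill" sentences from the article into one string you can paste into the terminal.

```python
from typing import Sequence


def build_ultraplan_prompt(task: str, skills: Sequence[str] = ()) -> str:
    """Assemble an /ultraplan prompt that names each skill explicitly."""
    parts = [f"/ultraplan {task.strip().rstrip('.')}."]
    parts += [f"Use my {name} skill." for name in skills]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_ultraplan_prompt(
        "build me a dashboard with revenue, churn, customer breakdown",
        ["visualizations"],
    ))
```

Naming the skill in the prompt costs one sentence and sidesteps the auto-discovery miss Nate hit on his first run.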

The Bottom Line

Ultra Plan is the cleanest demonstration yet that multi-agent planning + bigger compute beats single-agent + local compute for the work most Claude Code users actually do. It also shows that Anthropic is quietly shifting the high-value work from the terminal to the cloud: the session in your terminal stays the same, but the smart cousin lives in their datacenter. Spending about 1% of a Max 20× budget to turn a ~45-minute job into a ~15-minute one is the cheapest trade Claude Code has ever offered. Turn it on.
