Multi-Agent Orchestration at Scale / Lesson 1 of 6

Choosing an Orchestration Pattern

Subagents, workflows, agent teams, and background agents

Advanced · 13 min · Builder · Decision-maker

What you'll be able to do

Distinguish the four multi-agent paradigms — subagents, dynamic workflows, agent teams, and background agents / agent view — by communication model, context handling, scale, and cost
Apply a one-line decision rule to classify a real task to the right paradigm (deterministic fan-out, agents that must talk, run-while-away, or isolate one heavy read)
Reason about cost: subagents summarize back, teammates each hold a full window (most expensive per agent), workflows keep intermediate state in script variables
State each feature's availability honestly — what is GA, research preview, experimental/off-by-default, and cloud-only — and defer to /help and the live docs for your version
Use the correct official vocabulary (agent teams), and avoid non-terms like 'agent fleet' or 'managed agents'

At a glance

The subagents lesson taught you to delegate ONE worker. This lesson is the map of everything above it: the four ways Claude Code coordinates MANY agents. You will learn to tell apart subagents (in-session delegates that summarize back), dynamic workflows (a script that fans out up to 1,000 agents, 16 at a time), agent teams (peer SESSIONS that message each other and share a task list), and background agents / agent view (human-supervised independent sessions). A decision matrix across parallelism, communication, context, scale, cost, and status gives you one clear rule for picking the right paradigm — and an honest warning about which features are research preview or experimental.

1. Above a single subagent: the orchestration map
2. The four paradigms, side by side
3. The decision rule
4. Dynamic workflows: scripted fan-out at scale
5. Agent teams: peer sessions that talk
6. Background agents & agent view: run while you're away
7. Cost intuition and an honest status check

Above a single subagent: the orchestration map

The previous lesson covered the single delegate: spawn one subagent, it works in an isolated context, it hands back a summary. That is the floor. This lesson is the map of everything above it — the four distinct ways Claude Code coordinates many agents at once.

Learners conflate these constantly, so start with the mental model and hold it firmly: the paradigms differ along one main axis, namely how the agents communicate and, as a direct consequence, where their intermediate work lives. Get that axis right and the other properties (scale, cost, and how cautiously to treat the feature) become much easier to remember.

text

                     COORDINATING MANY AGENTS

  Subagents        Dynamic workflows     Agent teams      Background /
  (in-session)     (scripted fan-out)    (peer sessions)  agent view
  ───────────      ──────────────────    ─────────────    ───────────
  results to       results in script     mailbox +        human
  main context     variables             shared tasks     supervises
  a few            up to 1,000 / 16      3-5 sessions     a handful
  GA               research preview      EXPERIMENTAL     research preview

The single most useful sentence in this module: a subagent lives inside one session and reports a summary up; each of the three paradigms above it runs outside that session, as a separate script runtime, as separate peer sessions, or as separate background sessions. That separation is exactly why each scales, costs, and communicates differently.

Watch out

Vocabulary: there is no 'agent fleet' or 'managed agents'

Those are NOT official Claude Code terms. The official term for coordinated multi-session work is agent teams. If you see 'fleet' or 'managed agents' in a blog post, mentally translate to the real paradigm (usually agent teams or background agents) before you reason about it.

Watch out

These features move fast — defer to your live docs

Dynamic workflows, agent view, and routines are research preview; agent teams are experimental and off by default. Flags, caps, and even the scripting-API surface change week to week. Treat exact numbers here as point-in-time and check /help, claude --version, and code.claude.com/docs for your installed version.

The four paradigms, side by side

Here is the spine of the module — the one table to internalize. Read it down the Communication model column first; that is the axis everything else hangs on.

Paradigm	What it is	Communication model	Context	Scale	Status
Subagents (already taught)	Delegated workers inside ONE session that report a summary back	None peer-to-peer; results land in the main context	Isolated per agent; cannot nest	a few	GA
Dynamic workflows	A JavaScript script orchestrates subagents at scale; a separate runtime executes it	String concatenation / JSON only — no message bus	Intermediate results stay in script variables, not your context window	up to 1,000 agents/run, 16 concurrent	Research preview (v2.1.154+)
Agent teams	Multiple independent Claude Code sessions as coordinated peers with a lead	Inter-agent messaging (shared mailbox) + shared task list	Each teammate = a full, fresh context window	3-5 recommended	Experimental, OFF by default (v2.1.32+)
Agent view / background agents	Human-supervised independent background sessions	None — a human supervises	Each session full context, often in a git worktree	a handful	Research preview (v2.1.139+)

Notice how communication predicts everything:

Subagents don't talk to each other at all — they answer up to the parent. So they stay cheap (the parent only pays for summaries) but they can't divide a fuzzy problem among themselves.
Workflows also have no peer messaging; the script glues outputs together by string concatenation. That makes them deterministic and scriptable — and lets them go to 1,000 agents because the chatter never hits your context window.
Agent teams are the only paradigm with a real message bus (a mailbox) plus a shared task list. That is what lets peers self-coordinate — and it is why each one needs a full context window, making them the most expensive per agent.
Background agents don't coordinate with each other either; you are the coordinator, checking in via the agent view.

Key insight

Communication model is the master variable

Don't memorize the table cell by cell. Ask one question: do the agents need to talk to each other? If no, and the work is deterministic and large → workflow. If no, and you just want one heavy read isolated → subagent. If no, but you want them running while you're away → background agents. If yes — they must message and divide a fuzzy problem → agent team. Everything else follows.

The decision rule

Four paradigms, one rule. Match the shape of the work to the paradigm:

If the work is...	...use	Why
Deterministic, repeatable, large-scale fan-out (audit 500 files, cross-checked research, a big mechanical migration)	a workflow	A script gives you reproducibility and 1,000-agent scale, with intermediate state out of your context
A fuzzy problem agents must divide and discuss (debate a design, parallel feature work with handoffs)	an agent team	Only teams have a mailbox + shared task list, so peers can self-coordinate
Something to run while you're away and check back on	background agents / agent view, `/loop`, or routines	These run unattended in their own sessions; you supervise asynchronously
One heavy read you want isolated from the main thread	a subagent (already covered)	Cheapest path; isolate the token cost, get a summary back

The failure mode is reaching for the heavyweight tool by default. Most tasks are a subagent or a workflow. Agent teams are powerful but the most expensive and most experimental — save them for problems that genuinely require peers talking to each other. If you can express the work as "do the same thing to many items and combine the results," it is a workflow, not a team.

A worked classification:

text

Task: "Find every SQL query in the repo that lacks parameterization,
       verify each finding, and give me one report."
→ Deterministic, large-scale, no peer discussion needed.
→ WORKFLOW (fan-out finders, separate verifiers, concatenate survivors).

Task: "Three of us — a backend, a frontend, and a test writer — build
       this feature together, handing work back and forth."
→ Fuzzy, agents must coordinate and hand off.
→ AGENT TEAM (mailbox + shared task list).

Task: "Keep an eye on CI and fix the flaky test while I'm at lunch."
→ Run while away, human checks back.
→ BACKGROUND AGENT (claude --bg ...), monitored via `claude agents`.

Task: "Map everywhere we call the payments API so I can plan a change."
→ One heavy, self-contained read.
→ SUBAGENT.
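
The decision rule above can be written down as a few ordered checks. The following is a minimal Python sketch, not anything Claude Code ships: the `TaskShape` fields and the `choose_paradigm` function are names invented here for illustration, and the ordering (communication first) simply mirrors the "master variable" idea from the previous section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskShape:
    """Illustrative task traits; these field names are this sketch's own."""
    agents_must_talk: bool = False  # peers need to message or hand off work
    run_while_away: bool = False    # unattended; a human checks back later
    large_fanout: bool = False      # same deterministic step over many items


def choose_paradigm(task: TaskShape) -> str:
    """Apply the lesson's decision rule, asking the communication question first."""
    if task.agents_must_talk:
        return "agent team"
    if task.run_while_away:
        return "background agent"
    if task.large_fanout:
        return "workflow"
    # Nothing heavier is justified: isolate one read and get a summary back.
    return "subagent"


# The four worked tasks from above, expressed as shapes.
sql_audit = TaskShape(large_fanout=True)
feature_build = TaskShape(agents_must_talk=True)
ci_watch = TaskShape(run_while_away=True)
payments_map = TaskShape()
```

Note the default: when none of the flags is set, the function falls through to the cheapest option, which matches the "when in doubt, downgrade" advice.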

Tip

When in doubt, downgrade

If you're unsure between a team and a workflow, try the workflow first — it's more deterministic, cheaper per agent, and won't leave full context windows running. Reach up to an agent team only when you can name the conversation the agents need to have. Reserve background agents for genuinely unattended work.

Dynamic workflows: scripted fan-out at scale

A dynamic workflow is Claude Code's large-scale orchestrator (research preview, requires v2.1.154+ — check claude --version). You describe a task in natural language; Claude writes a JavaScript script, and a separate runtime executes it, coordinating tens to hundreds of subagents. Hard caps: 1,000 agent calls per run and 16 concurrent (bounded by CPU cores).

The defining behavioral difference from plain subagents: orchestration logic lives in a script, so intermediate results stay in script variables — outside your context window. Only the final, verified answer lands in your session. You don't pay context for every agent's chatter. That is the cost/quality lever.
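
To make "intermediate results stay in script variables" concrete, here is a small Python sketch of the idea. It is an analogy only: real workflow scripts are JavaScript written by Claude, and `fake_agent` and `run_fanout` are hypothetical stand-ins. The two constants are the caps quoted in this lesson and should be treated as point-in-time.

```python
import asyncio

MAX_CONCURRENT = 16       # concurrency cap quoted in this lesson
MAX_CALLS_PER_RUN = 1000  # per-run agent-call cap quoted in this lesson


async def fake_agent(prompt: str) -> str:
    """Stand-in for one subagent call; a real agent would do actual work."""
    await asyncio.sleep(0)
    return f"finding for {prompt}"


async def run_fanout(items: list[str]) -> str:
    """Fan out over items with bounded concurrency and return one short answer."""
    if len(items) > MAX_CALLS_PER_RUN:
        raise ValueError("run would exceed the per-run agent-call cap")
    gate = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(item: str) -> str:
        async with gate:  # at most MAX_CONCURRENT calls in flight
            return await fake_agent(item)

    # Intermediate results live in this local list, not in the caller's context.
    findings = await asyncio.gather(*(bounded(item) for item in items))
    # Only a compact final answer leaves the function.
    return f"{len(findings)} items checked"
```

The caller of `run_fanout` only ever sees the one-line result; the per-item findings are discarded with the local variable, which is the cost lever the paragraph above describes.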

Three ways to trigger one:

Put the keyword ultracode anywhere in a prompt (natural language like "use a workflow" also works).
Run a saved workflow command — a /<name> you saved earlier from a previous run.
Set /effort ultracode so Claude auto-deploys a workflow when a task warrants it (combines xhigh reasoning + workflow planning; session-only — reset with /effort high).

There's a built-in one worth knowing: /deep-research <question> fans web searches across angles, cross-checks sources, votes on claims adversarially, and returns a cited report with unverified claims filtered out.

The scripting primitives are conceptual — spawn one agent (returns a string), run a batch in parallel (a barrier: all must finish), stream items through a pipeline (no per-stage barrier), and phase/log for the progress display. Data flow between agents is pure string concatenation / JSON — there is no message bus and agents never talk to each other; the script parses and stitches outputs.
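
The difference between a barrier and a pipeline is easier to see in code. This is a hedged Python sketch of the two concepts using `asyncio`; it is not the workflow scripting API, and `find`, `verify`, `batch_with_barrier`, `pipeline`, and `report` are all hypothetical names. Data passes between stages as JSON strings, as the paragraph above describes.

```python
import asyncio
import json


async def find(item: str) -> str:
    """Hypothetical stage-one agent: returns its result as a JSON string."""
    await asyncio.sleep(0)
    return json.dumps({"item": item, "suspect": item.endswith(".sql")})


async def verify(raw: str) -> str:
    """Hypothetical stage-two agent: parses the string it was handed."""
    await asyncio.sleep(0)
    data = json.loads(raw)
    return data["item"] if data["suspect"] else ""


async def batch_with_barrier(items):
    # Barrier: every finder finishes before any verifier starts.
    found = await asyncio.gather(*(find(item) for item in items))
    verified = await asyncio.gather(*(verify(raw) for raw in found))
    return [v for v in verified if v]


async def pipeline(items):
    # No per-stage barrier: each item is verified as soon as its own find ends.
    async def one(item):
        return await verify(await find(item))

    verified = await asyncio.gather(*(one(item) for item in items))
    return [v for v in verified if v]


def report(survivors):
    # The script, not the agents, stitches outputs together as plain text.
    return "\n".join(survivors)
```

Both functions produce the same survivors; what changes is scheduling. With a barrier, one slow finder delays every verifier, whereas the pipeline lets fast items finish early.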

Monitor and manage runs with /workflows: arrow keys to select, Enter to drill into a phase (agent count, tokens, elapsed), p pause/resume, x stop, r restart an agent, s to save the run's script as a reusable /<name> command (in .claude/workflows/ for the project or ~/.claude/workflows/ for you). Pause/resume returns cached results for finished agents — but only within the same session; exiting Claude Code restarts the workflow fresh next time.

Note

Don't memorize the API — read the script Claude writes

The official workflows docs describe spawn(definition, prompt, args); third-party writeups describe higher-level helpers like agent(), parallel(), pipeline(), phase(), log(). The exact names differ by source and version. Learn the concepts (spawn-one, run-in-parallel-with-a-barrier, pipeline-without-barriers) and read the generated script before approving it, rather than memorizing a surface that may change.

Watch out

Workflow guardrails (these bite)

No mid-run user input (only permission prompts can pause). The script has no direct filesystem/shell access — all I/O is delegated to subagents. No nested workflows. Subagents spawned by a workflow always run in acceptEdits mode with the parent's allowlist, so file edits are auto-approved. And runaway loops can silently hit the 16-concurrent / 1,000-total caps. Workflows ARE available on Bedrock/Vertex/Foundry.

Agent teams: peer sessions that talk

An agent team is the only paradigm where agents are full peer sessions that message each other. It is experimental and disabled by default (requires v2.1.32+). Enable it explicitly:

bash

export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
# or in settings.json:
# { "env": { "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1" } }

Your main session becomes the lead; it spawns independent teammate sessions, each with its own full, fresh context window. They self-coordinate through two shared structures:

A shared mailbox for inter-agent messaging (the message bus that no other paradigm has).
A shared task list at ~/.claude/tasks/{team-name}/ — states pending → in-progress → completed, dependencies tracked automatically, file locking to prevent races.
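
The task-list behavior described above (three states, dependencies gating what can start) can be modeled in a few lines. This is a purely illustrative in-memory Python sketch: the on-disk format under ~/.claude/tasks/ is not described in this lesson, file locking is omitted, and `TeamTask` and `TaskList` are invented names.

```python
from dataclasses import dataclass, field

STATES = ("pending", "in-progress", "completed")


@dataclass
class TeamTask:
    name: str
    depends_on: list = field(default_factory=list)
    state: str = "pending"


class TaskList:
    """Toy model of a shared task list with dependency tracking."""

    def __init__(self):
        self.tasks = {}

    def add(self, name, depends_on=()):
        self.tasks[name] = TeamTask(name, list(depends_on))

    def ready(self):
        """Pending tasks whose dependencies are all completed."""
        return [
            task.name
            for task in self.tasks.values()
            if task.state == "pending"
            and all(self.tasks[dep].state == "completed" for dep in task.depends_on)
        ]

    def advance(self, name):
        """Move a task one step along pending -> in-progress -> completed."""
        task = self.tasks[name]
        if task.state == "completed":
            raise RuntimeError(f"{name} is already completed")
        if task.state == "pending" and name not in self.ready():
            raise RuntimeError(f"{name} is blocked by unfinished dependencies")
        task.state = STATES[STATES.index(task.state) + 1]
        return task.state
```

In a real team, several sessions would read and write this structure concurrently, which is why the actual feature needs the file locking this sketch leaves out.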

You drive a team by instructing the lead in natural language: "Create an agent team to build this feature. Spawn a backend teammate, a frontend teammate, and a test-writer teammate." You can spawn a teammate from a subagent definition by name (its system prompt, tools allowlist, and model carry over), but note that its skills and mcpServers frontmatter are NOT applied; teammates load skills/MCP from project/user settings. Talk to a teammate with Shift+Down (in-process) or by clicking its pane (tmux/iTerm2 split).

Quality-gate hooks are team-specific: TaskCompleted (exit 2 blocks a 'done' that fails your bar), TaskCreated, and TeammateIdle (exit 2 keeps an idle teammate working).

Because each teammate is a full context window, cost scales linearly with the number of teammates — this is the most expensive per agent of all four paradigms. Size to 3-5 teammates, ~5-6 tasks each, and don't leave a team running unattended.

Watch out

Experimental — and clean up from the LEAD

/resume and /rewind do not restore in-process teammates; task status can lag (nudge teammates to mark work complete); shutdown can be slow; only one team per lead at a time; no nested teams; the lead role can't be transferred. When done, run cleanup from the lead ("Clean up the team") so shared resources are removed consistently — never from a teammate.

Example

Teams really do scale (with a real bill)

Anthropic's engineering team built a ~100,000-line C compiler, one that compiles the Linux kernel for x86, ARM, and RISC-V, across ~2,000 coordinated sessions over ~2 weeks, for roughly $20,000 in API spend. The lesson isn't the dollar figure (point-in-time); it's that peer coordination unlocks problems a single session can't hold, at a cost that scales with the agents.

Background agents & agent view: run while you're away

The fourth paradigm is about time, not coordination: independent sessions that run unattended while you do something else, with a human supervising asynchronously. There is no peer messaging — you are the coordinator.

Agent view (research preview, v2.1.139+) is the control room. Open it with claude agents: a unified screen grouped by state (Needs input / Working / Completed) with live counts. Navigate with ↑/↓, Space to peek, Enter to attach, Ctrl+T to pin (pinned sessions survive; unpinned idle ones stop after ~1 hour), Ctrl+X to delete. Dispatch a background session with claude --bg "<prompt>" (optional --name, --agent, --model, --permission-mode); manage sessions from the shell with claude attach <id>, claude logs <id>, claude stop <id>, claude rm <id>. Background sessions isolate into git worktrees under .claude/worktrees/ before their first edit, so they don't disturb your working tree.
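
If you dispatch background sessions from your own tooling, it helps to assemble the command in one place. The Python sketch below builds an argument list using only the flags named in the paragraph above; the `build_bg_command` helper is hypothetical, and the placement of the optional flags after the prompt is an assumption to confirm against `claude --help` for your version.

```python
def build_bg_command(prompt, name=None, agent=None, model=None, permission_mode=None):
    """Assemble an argv list for `claude --bg` from flags named in this lesson.

    Flag placement after the prompt is an assumption of this sketch.
    """
    argv = ["claude", "--bg", prompt]
    optional = {
        "--name": name,
        "--agent": agent,
        "--model": model,
        "--permission-mode": permission_mode,
    }
    for flag, value in optional.items():
        if value is not None:  # only emit flags the caller actually set
            argv += [flag, value]
    return argv
```

Passing a list like this to `subprocess.run` avoids shell-quoting problems with prompts that contain spaces or quotes. The function itself never launches anything, so it is safe to test.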

For scheduling there are lighter tools (covered in detail elsewhere in the module):

/loop [interval] [prompt] — session-scoped recurring task (e.g. /loop 30m run the tests); omit the interval to let Claude self-pace; 7-day expiry, max 50 concurrent.
/goal <condition> — Claude pursues a goal across turns until a small fast model judges the condition met from the transcript (so the condition must be provable from Claude's own output).
/schedule (routines) — durable cloud schedules on Anthropic infrastructure, minimum 1-hour interval, with schedule/API/GitHub triggers; these run autonomously with no permission prompts.

Monitoring helpers: the Monitor tool streams a background process's output into the conversation so Claude reacts without a polling loop; PushNotification pings your desktop/phone at a milestone; /tasks lists and stops background streams.

Tip

Match the unattended tool to the trigger

Need it tied to this session and gone in a week → /loop. Need Claude to grind toward a checkable condition → /goal. Need it to survive restarts and fire on a schedule or a GitHub event from the cloud → routines (/schedule). Need to launch heavy work now and walk away → claude --bg, watched in claude agents.
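
The tip above reads naturally as a small lookup. The following Python sketch encodes it, including the 1-hour minimum interval for routines quoted earlier; `pick_unattended_tool` and its keyword names are invented for illustration, and the limits are point-in-time values from this lesson.

```python
def pick_unattended_tool(*, survives_restart=False, checkable_goal=False,
                         recurring=False, interval_minutes=None):
    """Map a trigger shape to the unattended tool this lesson recommends."""
    if survives_restart:
        # Routines are durable cloud schedules with a 1-hour minimum interval.
        if interval_minutes is not None and interval_minutes < 60:
            raise ValueError("routines have a 1-hour minimum interval per this lesson")
        return "/schedule"
    if checkable_goal:
        return "/goal"
    if recurring:
        return "/loop"
    # Launch heavy work now and walk away.
    return "claude --bg"
```

For example, a 30-minute test run tied to the current session maps to /loop, while the same interval requested as a durable schedule is rejected rather than silently rounded up.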

Cost intuition and an honest status check

Two cross-cutting facts thread through every choice: what it costs and whether it's even stable yet.

Cost mental model. Cost scales roughly linearly with active agents, but where the tokens go differs by paradigm:

Subagents preserve context — they summarize back, so the parent pays for summaries, not raw exploration. Cheapest.
Workflows hold intermediate state in script variables, outside your context window. Per-agent context is cheap, but they can spawn hundreds, so the total can be large. Test on a small slice first — one directory, a narrow question — then scale, watching per-phase tokens in /workflows.
Agent teams give each teammate a full context window — the most expensive per agent. Keep teams small (3-5) and don't leave them running.
Background agents each hold a full session too; cost accrues for as long as they run, which is why idle ones auto-stop after ~1 hour.

Track spend with /usage (per skill/subagent/plugin/MCP) and /insights (history and trends). The universal rule: prove the pattern on a small slice before you fan out wide.
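
The "small slice first" rule is ultimately arithmetic. Below is a rough Python sketch of that arithmetic under the linear-scaling assumption stated above; every token figure in the usage is hypothetical, and the function names are this sketch's own.

```python
def total_spend(n_agents, tokens_per_agent):
    """Total tokens grow roughly linearly with the number of active agents."""
    return n_agents * tokens_per_agent


def parent_context_cost(n_subagents, summary_tokens):
    """With subagents, the parent pays for one summary per delegate."""
    return n_subagents * summary_tokens


def extrapolate_from_slice(slice_items, slice_tokens, full_items):
    """Estimate a full run from a trial slice, assuming cost per item is constant."""
    return slice_tokens / slice_items * full_items


# Hypothetical numbers: a 10-file trial used 30,000 tokens; the full job is 500 files.
estimate = extrapolate_from_slice(10, 30_000, 500)
```

The estimate is only as good as the linearity assumption, so treat it as a budget sanity check before fanning out, not a forecast.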

Status & availability — say it plainly to yourself before you rely on a feature.

Feature	Status	Availability note
Subagents	GA	Everywhere
Dynamic workflows	Research preview (v2.1.154+)	Available on Bedrock/Vertex/Foundry too
Agent teams	Experimental, OFF by default (v2.1.32+)	Local; enable via env var
Agent view / background agents	Research preview (v2.1.139+)	Local
Ultraplan / ultrareview / routines	Research preview	Cloud-only — NOT on Bedrock / Vertex / Foundry / ZDR
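
The version floors in the table can be checked mechanically. This Python sketch compares a version string against the minimums quoted in this lesson; the exact text printed by `claude --version` is an assumption here, so the parser only looks for the first x.y.z triple, and `MIN_VERSION`, `parse_version`, and `meets_floor` are illustrative names.

```python
import re

# Minimum versions quoted in this lesson (point-in-time).
MIN_VERSION = {
    "dynamic workflows": (2, 1, 154),
    "agent teams": (2, 1, 32),
    "agent view": (2, 1, 139),
}


def parse_version(text):
    """Pull the first x.y.z triple out of a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_floor(version_text):
    """Report, per feature, whether the installed version reaches the floor."""
    installed = parse_version(version_text)
    # Tuples compare numerically, so 2.1.32 is correctly below 2.1.154.
    return {feature: installed >= floor for feature, floor in MIN_VERSION.items()}
```

Meeting the floor only means the feature may exist in your build; agent teams still need the environment variable, and any preview feature can change or be withdrawn.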

Note that ultracode and ultrathink are current keywords, not deprecated: ultracode triggers a workflow / xhigh effort, and ultrathink requests deeper reasoning for one prompt.

Watch out

Research preview / experimental means: verify before you depend on it

Several of these can change behavior, flags, or even disappear between versions. Before building anything important on workflows, teams, agent view, or routines: confirm your version (claude --version), check /help and the live docs, and have a fallback. Treat the caps (1,000 / 16 / 3-5) and cost figures as point-in-time.

Try it: Classify five tasks, then run the cheapest paradigm that fits

Practice the decision rule, then validate one choice hands-on. No experimental features are required to learn the skill — classifying correctly is the point.

Classify (paper exercise). For each task below, write down the paradigm (subagent / workflow / agent team / background agent) AND the one-line reason, keyed off the decision rule:
- a. "Find every place we still import the deprecated logger across the whole repo and give me one report."
- b. "A backend, a frontend, and a test-writer build this feature together, handing work back and forth."
- c. "While I'm in a meeting, keep retrying the flaky integration test and tell me when it's green."
- d. "Map the call graph of the billing module so I can plan a refactor."
- e. "Audit all 300 API handlers for missing auth checks, verify each finding, and filter out false positives."
Predict the cost ordering. Rank your five tasks from cheapest to most expensive per active agent, and justify the ranking using the context model (summary-back vs. script-variables vs. full-window-per-teammate).
Run the safe one. Pick the task you classified as a subagent (likely d) and actually do it in a repo: ask Claude to spawn a subagent that maps the call graph and reports a summary. Run /context afterward and note how little raw exploration landed in your main window.
Check availability honestly. Run claude --version and /help. For each of workflows, agent teams, and agent view, write down: is it present in your version? Is it research preview / experimental / off-by-default? Cloud-only or local?
(Optional, only if safe to experiment.) If your version supports it and you're not on Bedrock/Vertex/Foundry, try /deep-research <a small, bounded question> and watch /workflows — observe that intermediate agent work stays out of your main context and only the cited report lands.

Deliverable: your five classifications with reasons, your cost ranking with justification, and a two-sentence note on which features were actually available in your installed version (and their status) versus what this lesson described.

Key takeaways

1. There are FOUR multi-agent paradigms: subagents (in-session delegates, summary back), dynamic workflows (scripted fan-out, up to 1,000 agents / 16 concurrent, state in script variables), agent teams (peer SESSIONS with a mailbox + shared task list), and background agents / agent view (human-supervised independent sessions).
2. The decision rule: deterministic large-scale fan-out → workflow; agents must talk / divide a fuzzy problem → agent team; run while you're away → background agents, /loop, or routines; isolate one heavy read → subagent.
3. Communication is the master variable: subagents and background agents have no peer messaging; workflows glue outputs with string concatenation (no message bus); only agent teams have a real mailbox + shared task list.
4. Cost scales ~linearly with active agents: teammates each hold a full context window (most expensive per agent), workflows keep intermediate state in script variables, subagents summarize back; always test on a small slice first.
5. 'Agent fleet' and 'managed agents' are NOT official terms; the official term for coordinated multi-session work is 'agent teams'.
6. Status honesty: workflows and agent view are research preview, agent teams are experimental (off by default), and ultraplan/ultrareview/routines are cloud-only (no Bedrock/Vertex/Foundry/ZDR); defer to /help and the live docs for your version.

Quiz

Lock in what you learned

Check your understanding

1. You need to audit 500 source files for a specific insecure pattern, verify each hit, and get a single combined report. No agent needs to talk to any other; it's the same check applied many times. Which paradigm fits best?

2. A colleague says they used an 'agent fleet' to coordinate several Claude Code sessions that messaged each other and shared a task list. What's the correct framing?

3. Across the paradigms, which statement about context and cost is accurate?

4. Before relying on agent teams for important work, what's the most accurate thing to keep in mind about their status?

Go deeper

Hand-picked sources to keep learning

Dynamic workflows (official docs)

The scripting orchestrator: triggers (ultracode), the 1,000 / 16 caps, /workflows monitoring, /deep-research, and the conceptual primitives.

Agent teams (official docs)

The experimental coordinated-session paradigm: enabling it, the shared mailbox + task list, display modes, and team hooks.

Agent view / background agents (official docs)

claude agents control room, claude --bg dispatch, git-worktree isolation, and supervising unattended sessions by state.

Subagents (official docs)

The single-delegate floor this lesson builds above — isolated context and summary-back behavior.

Building a C compiler with Claude Code (Anthropic engineering)

Real-world agent-team scale: ~2,000 sessions over ~2 weeks producing a ~100k-line compiler. Treat the cost figure as point-in-time.

claude-code on GitHub

Changelog and issues — the place to confirm caps, flags, and which features are still research preview for your version.