Planning & Task Decomposition

Turning a big goal into doable steps

Intermediate · 13 min · Builder

What you'll be able to do

Decompose a complex goal into subgoals using sequential, parallel, and as-needed strategies
Compare reactive (ReAct) loops with deliberative plan-and-execute and pick the right one per task
Model a task as a dependency graph (DAG) and explain why graphs enable parallelism and partial recovery
Implement dynamic, localized re-planning that fixes a failed step without restarting the whole task
Judge when planning helps and when it adds latency and cost without benefit

At a glance

A single language model call cannot reliably book a five-leg trip, refactor a codebase, or research fifty sources — the goal is too big to hold in one thought. Planning and decomposition are how an agent breaks a goal into ordered, doable subgoals, then executes them with the ability to re-plan when reality disagrees. This lesson contrasts reactive loops with deliberative planning, shows how modern systems model tasks as dependency graphs, and teaches you when an upfront plan earns its cost — and when it is pure overhead.

1. Big goals don't fit in one thought
2. Two paradigms: react vs. plan-ahead
3. Sequential, parallel, and as-needed
4. Plans are graphs, not just lists
5. When a step fails: re-plan, don't restart
6. When planning earns its cost

Big goals don't fit in one thought

Think about how you'd tackle "plan a three-city research trip and book it under $2,000." You don't do it in one move — you pick cities, check flight prices, compare hotels, sequence the route, then reserve. An agent faces the same wall: there is no single action that satisfies the whole goal, so the goal has to be split. That splitting is task decomposition — turning one large, vague goal into a set of smaller subgoals, each small enough that the agent can actually execute it.

The mental model is a manager and a to-do list. The manager (the planning step) doesn't do the work; it figures out what work exists and in what order. Each item on the list is concrete enough that a worker — a tool call, a sub-agent, or a single focused model turn — can finish it.

Decomposition matters for two reasons. First, reliability: models reason far better over a sequence of small, well-scoped steps than over one sprawling instruction. Second, structure: once a goal is a list of subgoals, you can track progress, parallelize independent work, retry a single failed step, and inspect the agent's strategy before it spends money acting. Without decomposition, a complex task is an opaque, all-or-nothing gamble.

Two paradigms: react vs. plan-ahead

There are two fundamentally different ways an agent can sequence its work, and the difference is basically improvise-as-you-go versus map-it-out-first.

Reactive (ReAct-style) decides one step at a time. The loop is Thought → Action → Observation, repeated: the model reasons, takes a single action, sees the result, and only then decides the next step. It never commits to a full plan — it improvises with constant feedback, like a driver taking each turn as the road reveals it.
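The loop can be sketched in a few lines. This is a minimal illustration, not any framework's API: `scripted_policy` stands in for the model call so the example runs offline, and `TOOLS`, `weather_c`, `c_to_f`, and the `finish` action are hypothetical names.

```python
from typing import Callable

# Hypothetical tools; a real agent would call external APIs here.
TOOLS: dict[str, Callable[[str], str]] = {
    "weather_c": lambda city: "20",  # canned lookup, degrees Celsius
    "c_to_f": lambda c: str(round(float(c) * 9 / 5 + 32)),
}

def scripted_policy(goal: str, history: list[tuple[str, str, str]]) -> tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, argument) for the next step only."""
    if not history:
        return ("I need the temperature first.", "weather_c", "Tokyo")
    last_observation = history[-1][2]
    if len(history) == 1:
        return ("Now convert it to Fahrenheit.", "c_to_f", last_observation)
    return ("I have the answer.", "finish", last_observation)

def react(goal: str, policy=scripted_policy, max_steps: int = 5) -> str:
    history: list[tuple[str, str, str]] = []
    for _ in range(max_steps):
        thought, action, arg = policy(goal, history)  # one model call per step
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)              # act, then observe
        history.append((thought, action, observation))
    raise RuntimeError("step budget exhausted without finishing")
```

Note that the policy is consulted once per action and sees only the history so far; nothing in the loop knows how many steps remain.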

Deliberative (plan-and-execute) writes the whole plan first. One LLM call produces an ordered list of steps; then an executor — often a smaller, cheaper model or plain tool runner — works through them, calling back to the planner only when needed. This is like printing turn-by-turn directions before leaving the house.
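A minimal sketch of that split, assuming the planner returns its plan as JSON. `fake_planner` stands in for the single planning call, and the action names and `EXECUTORS` table are hypothetical.

```python
import json

def fake_planner(goal: str) -> str:
    """Stand-in for the single planning LLM call; returns the plan as JSON text."""
    return json.dumps([
        {"step": 1, "action": "search_flights", "arg": "Tokyo"},
        {"step": 2, "action": "search_hotels", "arg": "Tokyo"},
        {"step": 3, "action": "summarize", "arg": "itinerary"},
    ])

# Hypothetical executors: plain functions, no model involved.
EXECUTORS = {
    "search_flights": lambda arg: f"flights to {arg}",
    "search_hotels": lambda arg: f"hotels in {arg}",
    "summarize": lambda arg: f"summary of {arg}",
}

def plan_and_execute(goal: str) -> list[str]:
    plan = json.loads(fake_planner(goal))  # one planning call up front
    results = []
    for step in plan:                      # the executor walks the list in order
        results.append(EXECUTORS[step["action"]](step["arg"]))
    return results
```

Because the whole plan exists before anything runs, it can be logged, reviewed, or rejected before the first tool call is made.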

Aspect	Reactive (ReAct)	Deliberative (Plan-and-Execute)
Decides	One step at a time	Whole plan upfront
Adapts to surprises	Immediately	Needs explicit re-planning
LLM calls	One per action (expensive on long tasks)	One plan + cheap execution
Global view	None — no overall map	Yes — inspectable strategy
Best for	Short, unpredictable tasks	Long, structured, multi-tool tasks

Neither wins universally. ReAct is robust and cheap on short tasks but bleeds tokens over long chains and has no map of where it's going. Plan-and-execute is cheaper at scale and inspectable, but a plan written before the first observation can be wrong from step one.

Key insight

They aren't mutually exclusive

The mainstream 2025–2026 approach is hybrid: high-level deliberative planning for the overall route, plus a small reactive loop inside each subtask to handle local surprises. You get the global map of plan-and-execute and the adaptability of ReAct at the same time.

Sequential, parallel, and as-needed

Once you decide to split a goal, the next question is how the pieces relate to each other — and that relationship decides your architecture. Decomposition is not one technique; subgoals fit three patterns, and recognizing which you have changes how you run them.

Sequential: each subgoal depends on the previous one's output. Book the flight, then book the hotel near the arrival airport, then schedule meetings around the flight times. You must run these in order.
Parallel: subgoals are independent and can run simultaneously. "Get the weather for three cities" is three calls that don't wait on each other, so running them in parallel is a major latency win.
Asynchronous fan-out / fan-in: independent branches launch together, but a later phase must wait for all of them. Research five competitors in parallel, then write one comparison once every branch returns.
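The difference between the patterns shows up directly in code. A minimal sketch using `asyncio`; `get_weather` is a hypothetical tool whose short sleep stands in for network latency.

```python
import asyncio

async def get_weather(city: str) -> str:
    """Hypothetical tool; the sleep stands in for a network round trip."""
    await asyncio.sleep(0.01)
    return f"{city}: sunny"

async def sequential(cities: list[str]) -> list[str]:
    # Each call waits for the previous one to finish.
    return [await get_weather(c) for c in cities]

async def fan_out_fan_in(cities: list[str]) -> str:
    # Fan-out: launch every independent lookup at once.
    reports = await asyncio.gather(*(get_weather(c) for c in cities))
    # Fan-in: the summary step runs only after all branches have returned.
    return " | ".join(reports)
```

With three cities, `sequential` takes roughly three round trips while `fan_out_fan_in` takes roughly one, since `asyncio.gather` runs the lookups concurrently and returns results in input order.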

A fourth idea governs when you decompose at all. The intuition: don't pre-chop a task you might be able to do in one bite. ADaPT (As-Needed Decomposition and Planning, NAACL 2024) showed exactly this: the agent tries a subtask directly and decomposes it recursively only when it fails to execute. This demand-driven approach beat strong baselines, including static plan-and-execute, on ALFWorld, WebShop, and TextCraft (the paper reports success rates up to 28.3%, 27%, and 33% higher, respectively), because it spends decomposition effort only where the task is actually hard.
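The try-first, decompose-on-failure idea can be sketched as a short recursion. This is an illustration of the principle, not the ADaPT implementation: it treats every subtask as required, and `try_execute`, `decompose`, `EASY`, and `SPLITS` are hypothetical stand-ins for the executor and planner.

```python
# Hypothetical world: which tasks the executor can do directly,
# and how the planner would split the ones it cannot.
EASY = {"gather wood", "craft planks", "craft table"}
SPLITS = {"build table": ["gather wood", "craft planks", "craft table"]}
attempts: list[str] = []  # records every direct attempt, for inspection

def try_execute(task: str) -> bool:
    attempts.append(task)
    return task in EASY

def decompose(task: str) -> list[str]:
    return SPLITS.get(task, [])

def solve(task: str, depth: int = 0, max_depth: int = 3) -> bool:
    """Try the task directly; decompose only if direct execution fails."""
    if try_execute(task):
        return True
    if depth >= max_depth:
        return False  # stop recursing on tasks that keep failing
    subtasks = decompose(task)
    if not subtasks:
        return False  # nothing left to split
    # Each subtask gets the same try-first treatment.
    return all(solve(s, depth + 1, max_depth) for s in subtasks)
```

An easy task costs one attempt and no planning at all; only the task that fails pays for a decomposition.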

Watch out

Match granularity to the executor

Over-decomposition (trivial micro-steps like "open the browser," "focus the search box") creates coordination overhead with no benefit. Under-decomposition leaves the executor a vague instruction it can't act on. The right grain is the smallest step your executor can reliably complete in one shot — no finer.

Plans are graphs, not just lists

A flat numbered list tells you the steps but hides which ones actually depend on each other — so you end up running everything in a slow single-file line. Modern planners fix this by modeling a task as a directed acyclic graph (DAG): nodes are subtasks, edges are dependencies, and "acyclic" just means the arrows never loop back on themselves. Sequential subtasks form a chain; independent ones fan out as parallel branches. The graph makes three things possible that a list can't: scheduling independent nodes in parallel, re-running only a failed subtree instead of the whole task, and reasoning explicitly about what blocks what.
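Python's standard library can check the "acyclic" property and produce a valid execution order. A small sketch using `graphlib.TopologicalSorter` (Python 3.9+); the task names in `deps` are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps a task to the set of tasks it depends on (hypothetical names).
deps = {
    "pick_cities": set(),
    "price_flights": {"pick_cities"},
    "price_hotels": {"pick_cities"},
    "book": {"price_flights", "price_hotels"},
}

def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one valid order; raises CycleError if the plan is not acyclic."""
    return list(TopologicalSorter(graph).static_order())

def is_valid_plan(graph: dict[str, set[str]]) -> bool:
    try:
        execution_order(graph)
        return True
    except CycleError:
        return False
```

Validating a model-generated plan this way before running it catches circular dependencies that would otherwise leave the scheduler with nothing runnable.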

LLMCompiler (arXiv 2312.04511) made this concrete. A planner emits a dependency graph where steps reference prior outputs with symbols like $1, $2; a Task Fetching Unit dispatches every ready (unblocked) task in parallel; executors run them. The paper reports up to a 3.7× latency speedup over ReAct, which issues tool calls one at a time, by exploiting the parallelism the DAG exposes.

```python
# A plan as a DAG. Each task lists the tasks it depends on.
plan = {
    "t1": {"action": "flight_price", "args": {"city": "Tokyo"},  "deps": []},
    "t2": {"action": "flight_price", "args": {"city": "Seoul"},  "deps": []},
    "t3": {"action": "flight_price", "args": {"city": "Taipei"}, "deps": []},
    # t5 fans in: it needs all three prices before it can compare.
    "t5": {"action": "pick_cheapest", "args": {"of": ["$t1", "$t2", "$t3"]}, "deps": ["t1", "t2", "t3"]},
}

def ready(tasks, done):
    """Return the ids of tasks that are not done and whose dependencies are all done."""
    return [tid for tid, t in tasks.items()
            if tid not in done and all(d in done for d in t["deps"])]

# t1, t2, t3 are ready immediately and run in parallel; t5 waits for the fan-in.
```

Frameworks like LangGraph turn this idea into production infrastructure: a stateful graph of nodes and edges with checkpoints, human-in-the-loop interrupts, and dedicated re-planning nodes.
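Independent of any framework, the scheduling loop itself is small. The sketch below runs a plan in waves: every ready task is dispatched concurrently, then the next wave is computed. The tools and their canned prices are hypothetical, and `resolve`, `run_task`, and `run_plan` are illustrative names. A finer-grained scheduler would dispatch each task the moment its own dependencies finish rather than waiting for the whole wave.

```python
import asyncio

async def flight_price(city: str) -> int:
    """Hypothetical tool with canned prices."""
    await asyncio.sleep(0.01)
    return {"Tokyo": 900, "Seoul": 750, "Taipei": 820}[city]

async def pick_cheapest(of: list[int]) -> int:
    return min(of)

ACTIONS = {"flight_price": flight_price, "pick_cheapest": pick_cheapest}

PLAN = {
    "t1": {"action": "flight_price", "args": {"city": "Tokyo"}, "deps": []},
    "t2": {"action": "flight_price", "args": {"city": "Seoul"}, "deps": []},
    "t3": {"action": "flight_price", "args": {"city": "Taipei"}, "deps": []},
    "t5": {"action": "pick_cheapest", "args": {"of": ["$t1", "$t2", "$t3"]},
           "deps": ["t1", "t2", "t3"]},
}

def ready(tasks, done):
    return [tid for tid, t in tasks.items()
            if tid not in done and all(d in done for d in t["deps"])]

def resolve(value, results):
    """Replace "$tid" references with that task's result."""
    if isinstance(value, str) and value.startswith("$"):
        return results[value[1:]]
    if isinstance(value, list):
        return [resolve(v, results) for v in value]
    return value

async def run_task(task, results):
    kwargs = {k: resolve(v, results) for k, v in task["args"].items()}
    return await ACTIONS[task["action"]](**kwargs)

async def run_plan(tasks):
    results = {}
    while len(results) < len(tasks):
        wave = ready(tasks, results)
        if not wave:
            raise ValueError("no runnable task: cycle or missing dependency")
        # All tasks in a wave are independent, so they run concurrently.
        outputs = await asyncio.gather(*(run_task(tasks[tid], results) for tid in wave))
        results.update(zip(wave, outputs))
    return results
```

Calling `asyncio.run(run_plan(PLAN))` executes the three price lookups in one wave and the comparison in a second.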

When a step fails: re-plan, don't restart

Any plan written before the agent takes its first action is really just a guess about how the world will behave — and the world doesn't always cooperate. A flight sells out, a page won't load, a tool returns garbage. Dynamic (closed-loop) re-planning is the agent watching its own execution and revising the plan the moment an observation contradicts it.

The critical design choice is how much to re-plan. The PlanGenLLMs survey (Feb 2025) splits closed-loop planning into two modes:

Implicit / localized — fix only the failed step (or its subtree), keeping every completed node intact. Cheap, and the default best practice.
Explicit / full — regenerate the entire plan from scratch. Prevents error accumulation when a failure invalidates everything downstream, but burns far more tokens.

A naïve agent that restarts from the beginning on every failure wastes all prior work. Confining re-planning to the failed sub-task node — localized re-planning — has been shown to cut token consumption dramatically (one 2025 task-graph paper reports up to 82%). The graph structure is what makes this possible: completed nodes stay done; only the broken branch is regenerated.
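A sketch of what "only the broken branch" means on a task graph: find the failed node and everything that transitively depends on it, swap in a replacement for the failed node, and keep every result outside that set. `downstream`, `replan_locally`, and the sample `TASKS` are hypothetical; in practice the replacement node would come from a planner call.

```python
TASKS = {
    "t1": {"action": "flight_price", "args": {"city": "Tokyo"}, "deps": []},
    "t2": {"action": "flight_price", "args": {"city": "Seoul"}, "deps": []},
    "t3": {"action": "pick_cheapest", "args": {"of": ["$t1", "$t2"]}, "deps": ["t1", "t2"]},
    "t4": {"action": "book", "args": {"flight": "$t3"}, "deps": ["t3"]},
}

def downstream(tasks, failed):
    """Return the failed task plus every task that transitively depends on it."""
    affected = {failed}
    changed = True
    while changed:
        changed = False
        for tid, task in tasks.items():
            if tid not in affected and affected.intersection(task["deps"]):
                affected.add(tid)
                changed = True
    return affected

def replan_locally(tasks, results, failed, replacement):
    """Swap in a replacement for the failed node; keep results outside its subtree."""
    stale = downstream(tasks, failed)
    new_tasks = {**tasks, failed: replacement}  # original plan is left untouched
    kept = {tid: r for tid, r in results.items() if tid not in stale}
    return new_tasks, kept
```

If `t2` fails after `t1` has finished, the Tokyo price is kept, `t2` is replaced (for example with a lookup for a different city), and only `t2`, `t3`, and `t4` remain to run.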

Watch out

Guard against failure loops

Re-planning that never makes progress — fix, fail, re-plan, fail again — is a classic agent death spiral. Defend with a max re-plan count, immutable plan versions so you can diff and detect non-progress, and explicit backtracking rules that abandon a dead branch instead of retrying it forever.
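Two of those defenses, the re-plan cap and non-progress detection, fit in one small class. A minimal sketch with hypothetical names: `ReplanGuard` fingerprints each proposed plan and refuses both repeats and anything past the budget.

```python
import hashlib
import json

class ReplanGuard:
    """Refuses re-plans once the budget is spent or when a plan repeats."""

    def __init__(self, max_replans: int = 3):
        self.max_replans = max_replans
        self.count = 0
        self.seen: set[str] = set()

    def allow(self, plan: dict) -> bool:
        # Identical plans produce identical fingerprints, so a repeat
        # means the planner is not making progress.
        fingerprint = hashlib.sha256(
            json.dumps(plan, sort_keys=True).encode()
        ).hexdigest()
        if self.count >= self.max_replans or fingerprint in self.seen:
            return False
        self.seen.add(fingerprint)
        self.count += 1
        return True
```

When `allow` returns `False`, the caller should backtrack (drop the branch or escalate to a human) rather than ask the planner again.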

When planning earns its cost

Here's the catch nobody mentions in the demos: planning is not free. Before the agent does anything useful, an upfront planning phase adds an extra LLM call, latency while you wait for the plan, and the risk that the plan is stale the moment it's written. So treat planning as a cost you have to justify, not a default. That overhead pays off only for complex, multi-step, multi-tool tasks with a fairly predictable structure — the kind where a global map genuinely helps.

For short, simple, or highly unpredictable tasks, skip it. A one-or-two-tool query ("what's the weather, then convert to Fahrenheit") is faster and cheaper with a plain ReAct loop, or even direct prompting. Anthropic's Building Effective Agents makes this the headline rule: start simple, and add planning layers only when simpler solutions demonstrably fall short. Their orchestrator-worker pattern — a central LLM decomposes a task, delegates subtasks to workers, and synthesizes the results — is exactly the deliberative pattern, recommended only when the task warrants it.

A practical decision rule:

≤ 2 tools, predictable path → direct call or ReAct.
Many steps, clear dependencies, parallelism available → plan-and-execute over a DAG.
Long-horizon and messy → hybrid: plan the skeleton, react within each subtask, re-plan locally on failure.

When in doubt, reach for the least planning that reliably solves the task.
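The rule can be written down as a function, which makes the branch order explicit. This is one possible encoding under stated assumptions: the parameter names are hypothetical, and "predictable" and "clear dependencies" are judgment calls the caller has to make.

```python
def choose_strategy(num_tools: int, predictable: bool, clear_dependencies: bool) -> str:
    """Pick the lightest approach that fits the task, checked from simplest to heaviest."""
    if num_tools <= 2 and predictable:
        return "direct call or ReAct"
    if predictable and clear_dependencies:
        return "plan-and-execute over a DAG"
    # Everything else is treated as long-horizon or messy.
    return "hybrid: plan the skeleton, react within subtasks, re-plan locally"
```

For example, `choose_strategy(2, True, False)` selects the direct or ReAct option, while `choose_strategy(8, False, True)` falls through to the hybrid.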

Try it: Build a plan-and-execute trip planner with localized re-planning

Write a small Python agent that plans a 3-city trip and survives a failure without restarting.

Plan. Prompt an LLM to emit a JSON plan as a list of tasks, each with an id, an action, args, and a deps list (a DAG). Include at least one fan-out (three independent flight_price lookups) and one fan-in (pick_cheapest).
Schedule. Write a ready(tasks, done) function that returns every task whose dependencies are all complete, and run those tasks — execute the independent ones concurrently (e.g., asyncio.gather).
Inject a failure. Make one flight_price tool raise on its first call. Implement localized re-planning: re-plan ONLY the failed node (retry or substitute an alternate city), keep every completed node's result, and continue. Do NOT restart the whole plan.
Guard the loop. Add a max_replans cap so a persistently failing node can't spin forever; on exceeding it, backtrack — drop that branch and proceed with what you have.
Reflect (2–3 sentences). How many LLM calls did localized re-planning save versus a full restart? For this task, would a plain ReAct loop have been simpler? Why or why not?

Key takeaways

1. Decomposition turns one opaque, all-or-nothing goal into small, executable subgoals you can track, parallelize, retry, and inspect.
2. Reactive (ReAct) loops adapt one step at a time; deliberative plan-and-execute writes the whole plan upfront; hybrids combine both and dominate in practice.
3. Modeling a plan as a dependency DAG unlocks parallel execution and partial recovery; LLMCompiler reported up to a 3.7× latency speedup over ReAct.
4. On failure, re-plan locally (regenerate only the failed subtree) instead of restarting, and guard against failure loops with re-plan caps and backtracking.
5. Planning only pays off for complex, multi-step, structured tasks; for short or unpredictable ones it just adds latency and token cost.

Quiz

Lock in what you learned

Check your understanding

1. What is the defining difference between reactive (ReAct) and deliberative (plan-and-execute) agents?

2. Why do modern planners model tasks as directed acyclic graphs (DAGs) instead of flat lists?

3. A plan step fails midway through a long task. What is the recommended best practice as of 2024–2025?

4. When does adding an upfront planning phase typically NOT pay off?

Go deeper

Hand-picked sources to keep learning

Anthropic — Building Effective Agents

Canonical guidance on orchestrator-worker decomposition, prompt chaining, and the start-simple rule for adding planning.

ADaPT: As-Needed Decomposition and Planning with Language Models (NAACL 2024)

Recursive, demand-driven decomposition — decompose a subtask only when the model fails to execute it directly.

An LLM Compiler for Parallel Function Calling (LLMCompiler)

DAG-based task planning with a Task Fetching Unit for parallel execution; reports up to a 3.7× latency speedup compared with ReAct.

PlanGenLLMs: A Modern Survey of LLM Planning Capabilities (Feb 2025)

Comprehensive 2025 survey of open-loop vs closed-loop (implicit/explicit) planning, decomposition modes, and evaluation.

LangChain — Plan-and-Execute Agents

Practical walkthrough of the planner / executor / replanner architecture, DAG advantages, and LangGraph patterns.

Understanding the Planning of LLM Agents: A Survey (arXiv 2402.02716)

Taxonomy of agent planning approaches: decomposition, multi-plan selection, external modules, reflection, and memory.