
Workflows & orchestration

Why is reliability the core problem in multi-step AI workflows?

A workflow connects models and tools along defined paths so a multi-step task stays predictable. Reliability is the core problem because it compounds: per-step success multiplies, so a long chain of even good steps decays fast.

Last updated 2026-06-15 · Physea Labs

A workflow wires models and tools together along defined paths. It is the structured, predictable cousin of an agent: instead of letting the model decide every step, a workflow lays the steps out in advance. Anthropic frames the choice as exactly this, between workflows that follow predefined code paths and agents that direct themselves, and advises starting with the simplest approach that works.[1]

The reason orchestration exists comes down to multiplication. If a single step succeeds 95% of the time, a chain of twenty such steps does not succeed 95% of the time. Assuming the steps fail independently, the whole chain succeeds with probability 0.95²⁰, which is roughly 36%. Push the per-step reliability to 99% and twenty steps still reach only about 82%.
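The arithmetic is easy to check directly. The sketch below is illustrative only: the function names `chain_success` and `max_steps` are hypothetical, and both assume every step succeeds independently with the same probability, which real pipelines only approximate.

```python
import math


def chain_success(p: float, n: int) -> float:
    """Probability that all n steps succeed.

    Assumption: each step succeeds independently with probability p.
    """
    return p ** n


def max_steps(p: float, target: float) -> int:
    """Largest chain length n for which p**n is still at least target.

    Assumption: same independence model as chain_success.
    """
    if not 0 < p < 1 or not 0 < target <= 1:
        raise ValueError("need 0 < p < 1 and 0 < target <= 1")
    # p**n >= target  <=>  n <= ln(target) / ln(p), since ln(p) is negative.
    return math.floor(math.log(target) / math.log(p))


# Compare twenty-step chains at three per-step reliabilities.
for p in (0.90, 0.95, 0.99):
    print(f"p={p:.2f}: 20 steps succeed {chain_success(p, 20):.1%} of the time")
```

Run as is, the loop reproduces the roughly 36% and 82% figures quoted above. Under the same assumption, `max_steps(0.95, 0.5)` shows how short a chain has to be before its success rate stays above a coin flip.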

[Figure: chain success (0% to 100%) versus number of steps in the chain (0 to 25), with curves for p=0.99, p=0.95, and p=0.90. Annotated point: 0.95²⁰ ≈ 36%.]
Success is per-step reliability raised to the number of steps. Long chains decay fast, which is why orchestration exists.

The curve is unforgiving, and it is why a capable model can still produce an unreliable system once you string enough calls together. Every structural pattern below is, at heart, a way to bend that curve.

Orchestration frameworks

  • LangGraph

    Graph-based orchestration for stateful, multi-step pipelines.

  • CrewAI

    Coordinates multiple role-based agents toward one goal.

  • OpenAI Agents SDK

    Building blocks for sequential and handoff-style orchestration.

  • Temporal

    Durable execution: persists workflow state so a failed step can be retried without restarting the whole workflow.
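Step-level retries are one way such tooling changes the multiplication. The sketch below is a back-of-the-envelope model, not a description of how any of the frameworks above behave internally: the function names are hypothetical, and it assumes a failed step is always detected and that each retry is an independent attempt with the same success probability.

```python
def step_with_retries(p: float, attempts: int) -> float:
    """Probability that a step succeeds within the given number of attempts.

    Assumptions: failures are always detected, and every attempt succeeds
    independently with probability p.
    """
    return 1 - (1 - p) ** attempts


def chain_success_with_retries(p: float, n: int, attempts: int) -> float:
    """Probability that an n-step chain succeeds when each step may retry."""
    return step_with_retries(p, attempts) ** n


# A twenty-step chain at p = 0.95, with one, two, and three attempts per step.
for attempts in (1, 2, 3):
    rate = chain_success_with_retries(0.95, 20, attempts)
    print(f"{attempts} attempt(s) per step: chain succeeds {rate:.1%} of the time")
```

In this model, a single retry per step lifts the twenty-step chain from roughly 36% to roughly 95%. The detection assumption carries most of the weight: a step that fails silently gets no benefit from a retry.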

References

  1. Building Effective Agents — Anthropic