Agent Architecture

agentic

architecture

Published

May 3, 2026

What Is an Agent?

The term “agent” has been heavily hyped and people use it to mean many different things. There is no single universally agreed definition, but a crisp one comes from Hugging Face’s SmolAgents project:

AI agents are programs where LLM outputs control the workflow.

In other words, the output of one LLM call is able to decide what tasks are carried out and in what sequence. That is the core idea: the model is not just answering a question, it is steering what happens next.

Hallmarks of Agentic AI

More broadly, people tend to use the term “agentic AI” when any one of the following hallmarks is present:

Multiple LLM calls - Any solution that chains together more than one LLM invocation. Even a simple pipeline where one call generates content and another evaluates it could be called agentic.
Tool use - An LLM with the ability to call external tools (APIs, databases, code execution, device control). For many people this is the litmus test for whether something qualifies as agentic.
Multi-agent coordination - An environment where different LLMs send information to each other, a kind of orchestration layer that allows models to collaborate.
Planning - A process (itself an LLM) that coordinates activities, decides on ordering, and delegates subtasks. This is closely related to multi-agent coordination but emphasizes the planning step.
Autonomy - Giving an LLM the ability to control what order things happen in, to “choose its own adventure.” This is the hallmark that captures the essence of agentic AI for many practitioners. We are giving the model agency: the freedom to decide how future actions will be carried out.

Autonomy sounds dramatic, but in practice it can be as simple as letting a model choose which business sector to analyze, or letting it decide which tool to call next. Any time an LLM’s output determines what happens downstream, you could describe that as a degree of autonomy.

Workflows vs. Agents

Anthropic’s Building Effective Agents blog post introduces a helpful distinction. They use the umbrella term agentic systems and split it into two categories:

Workflows - Systems where models and tools are orchestrated through predefined paths. The developer decides the sequence of steps ahead of time. The LLM executes within that structure but does not choose the structure itself.

Agents - Systems where models dynamically direct their own processes and tools, maintaining control over how tasks get accomplished. The model decides what to do, in what order, and when to stop.

Many things that people casually call “agents” are, by this definition, actually workflows. The distinction matters because the design patterns, failure modes, and debugging strategies are quite different for each.

Both workflows and agents fall under the “agentic systems” umbrella. Whether a workflow is truly “agentic” is a bit of wordplay, but the framework is useful for thinking about how much control you are handing to the model versus keeping in your own code.

Workflow Design Patterns

Anthropic identifies five common design patterns for workflows. Each one represents a different way to orchestrate LLMs through predefined paths. It is worth noting that the line between workflows and agents is blurry. Several of these patterns give the LLM some discretion, even though they are technically classified as workflows.

Prompt Chaining

Prompt chaining decomposes a task into a sequence of steps. Each LLM call processes the output of the previous one. You can optionally add programmatic checks (gates) between steps to ensure the process stays on track.

flowchart LR
  Input((Input)) --> LLM1[LLM1] --> Gate[[Gate]] --> LLM2[LLM2] --> LLM3[LLM3] --> Output((Output))

Prompt chaining. Each LLM call feeds into the next, with optional code gates between them.

This pattern works well when a task can be cleanly decomposed into fixed subtasks. The benefit is that you can frame each LLM call very precisely, getting the most effective response at each step. It keeps the whole process on guardrails by taking it step by step through a sequence of well-defined tasks.

An element of autonomy can still exist here. The first LLM might choose a topic, and that choice determines what the subsequent LLMs work on. So the boundary between “workflow” and “agent” is not always sharp.

Routing

Routing uses an LLM to classify an input and direct it to the appropriate specialist model. Each specialist is optimized for a different type of task.

flowchart LR
  Input((Input)) --> Router[Router LLM]
  Router --> LLM1[Specialist 1]
  Router --> LLM2[Specialist 2]
  Router --> LLM3[Specialist 3]
  LLM1 --> Output((Output))
  LLM2 --> Output((Output))
  LLM3 --> Output((Output))

Routing. A router LLM classifies the input and sends it to the appropriate specialist.

This pattern allows separation of concerns. Different LLMs can have different levels of expertise, and the router decides which expert is best equipped for the current task. It is very common in production systems, for example routing easy questions to a smaller model and hard questions to a more capable one.

Parallelization

Parallelization uses code (not an LLM) to split a task into multiple pieces that run concurrently. The results are then aggregated by code.

flowchart LR
  Input((Input)) --> Splitter[[Code: Split]]
  Splitter --> LLM1[LLM1]
  Splitter --> LLM2[LLM2]
  Splitter --> LLM3[LLM3]
  LLM1 --> Aggregator[[Code: Aggregate]]
  LLM2 --> Aggregator
  LLM3 --> Aggregator
  Aggregator --> Output((Output))

Parallelization. Code splits the task, LLMs work in parallel, code aggregates the results.

The key difference from routing is that code does the orchestration, not an LLM. The parallel tasks can be different subtasks, or they can be the same task run multiple times (voting) to get diverse outputs and pick the best one.

Orchestrator-Workers

This pattern looks similar to parallelization, but with one critical difference. An LLM (not code) breaks down the task and an LLM synthesizes the results.

flowchart LR
  Input((Input)) --> Orchestrator[Orchestrator LLM]
  Orchestrator --> W1[Worker LLM1]
  Orchestrator --> W2[Worker LLM2]
  Orchestrator --> W3[Worker LLM3]
  W1 --> Synthesizer[Synthesizer LLM]
  W2 --> Synthesizer
  W3 --> Synthesizer
  Synthesizer --> Output((Output))

Orchestrator-Workers. An LLM breaks down the task, worker LLMs execute subtasks, and an LLM synthesizes the results.

Because an LLM is doing the orchestration, this is a much more dynamic system. The orchestrator can choose how to divide the task and how many workers to assign. This makes it arguably closer to an agent pattern than a workflow, since the orchestrator has real discretion. But it is still classified as a workflow because the overall structure (break down, execute, synthesize) is predefined.

Evaluator-Optimizer

This is a feedback loop pattern. One LLM generates a solution, and a second LLM evaluates it. If the evaluator accepts the work, it goes to the output. If it rejects it, the rejection and the reason go back to the generator for another attempt.

flowchart LR
  Input((Input)) --> Generator[Generator LLM]
  Generator --> |Solution| Evaluator[Evaluator LLM]
  Evaluator -->|accept| Output((Output))
  Evaluator -->|reject + reason| Generator

Evaluator-Optimizer. The generator produces output, the evaluator checks it, and rejects it back with feedback if needed.

This is one of the most powerful patterns for building production systems. It directly addresses the concern of accuracy and robustness. There are never full guarantees with LLMs, but having an evaluator in the loop builds a higher level of confidence in the quality of the final output.

The Agents

By contrast with workflow patterns, the agent pattern is open-ended. It has feedback loops. Information comes back and is processed multiple times. There is no fixed path through the design. It is fluid and dynamic.

The core loop looks like this:

flowchart LR
  Human -->|task| LLM
  LLM -->|call| Tools[Tools / Environment]
  Tools -->|result| LLM
  LLM --> Stop([Stop])
  Stop -->|output| Human

The autonomous agent loop. The LLM repeatedly calls tools and processes feedback from the environment until it decides to stop.

A human makes a request. The LLM takes an action on the environment (calling a tool, querying a database, controlling a device). The environment returns feedback. The LLM processes that feedback and decides whether to take another action or stop. This loop repeats as many times as the model deems necessary.

There are no more specific sub-patterns here because the agent pattern is itself a meta-design: the LLM gets to choose its own approach to solving the problem. That is what distinguishes it from workflows, where the developer prescribes the path.

This flexibility means agents can take on much larger, harder problems than workflows. But it also means less predictability. With a workflow, there is certainty about what is happening and why. With an agent, the process is fluid by nature.

Challenges of Agentic Systems

Giving LLMs autonomy is powerful, but it introduces uncertainty. The same flexibility that allows agents to tackle complex, open-ended problems also means you lose some of the predictability you get with workflows. Here are the main challenges.

Unpredictable path - You do not know what order tasks will happen in, or even what tasks will happen. There are no guarantees about the quality of the output.

Unpredictable cost - Because you do not know how long the agent will run, you do not know how much it will cost in API calls. An agent could loop many times before completing (or failing to complete) its task.

Unpredictable output - There are no guarantees about the quality of the final result. The same task run twice might produce different outputs of varying quality.

Completion uncertainty - There is no guarantee the agent will finish at all, or finish within a reasonable time.

Mitigations

These challenges are real but manageable. Two key practices help keep agentic systems under control.

Monitoring and Observability

You need visibility into what is happening behind the scenes: what calls are being made, what tools are being used, how agents are interacting with each other. Tools like OpenAI’s tracing (in the Agents SDK) and LangSmith (for LangGraph) provide this kind of observability. Without it, debugging multi-agent systems is nearly impossible.

Guardrails

Guardrails are protections written in software that ensure models behave safely, consistently, and within the boundaries you define. They prevent agents from leaving the rails you put in place. The OpenAI Agents SDK, for example, has dedicated functionality for guardrails, ensuring agents “behave safely, consistently and within the boundaries that you wish.”

Guardrails can include:

Input validation (rejecting malformed or dangerous requests)
Output validation (checking that responses meet quality or safety criteria)
Cost limits (stopping execution after a budget is exceeded)
Time limits (forcing termination after a maximum duration)
Scope limits (restricting which tools an agent is allowed to call)