Multi-Agent Architecture Patterns

Beyond Single Agents

Single-agent systems hit limits. One LLM cannot research, plan, code, test, and deploy all at once. Multi-agent architectures decompose complex tasks across specialised agents that communicate, coordinate, and sometimes compete. The result is a system that can handle complexity a single model cannot.

The analogy is a team, not a person. A single brilliant person can do many things, but a team of specialists can do more.

The single-agent approach works for simple tasks: answer a question, generate text, classify content. But complex tasks require multiple capabilities: research, reasoning, planning, execution, and verification. A single model can do all of these, but not simultaneously and not at scale. Multi-agent systems assign each capability to a specialised agent, then coordinate the agents to achieve the overall goal.

Pattern 1: Hierarchical Control

A director agent decomposes high-level goals into subtasks, assigns them to worker agents, and synthesises the results. Workers do not communicate with each other, only with the director. This makes the pattern predictable and debuggable, which suits regulated environments.

Best for: Task decomposition with clear dependencies, audit requirements, and situations where a single entity must remain accountable.

Bottleneck Risk: The director becomes a bottleneck. Complex workflows with many parallel tasks suffer from coordination overhead, because every assignment and every result passes through the director. Model choice follows the same split: we typically use the most capable model (GPT-4, Claude 3 Opus) for the director and smaller, faster models (GPT-3.5, Claude 3 Haiku) for workers.
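The control flow described above can be sketched as follows. This is a minimal illustration, not a production orchestrator: the Director class, the Subtask type, the fixed two-step plan in decompose, and the worker names are all hypothetical, and the workers are plain functions standing in for model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

Worker = Callable[[str], str]  # stand-in for a call to a smaller model


@dataclass
class Subtask:
    worker: str   # name of the worker that should handle this subtask
    payload: str  # input passed to that worker


class Director:
    def __init__(self, workers: dict[str, Worker]):
        self.workers = workers

    def decompose(self, goal: str) -> list[Subtask]:
        # Assumption: a real director would ask its model for a plan.
        # Here the plan is hardcoded to keep the sketch self-contained.
        return [Subtask("research", goal), Subtask("draft", goal)]

    def synthesise(self, results: list[str]) -> str:
        return "\n".join(results)

    def run(self, goal: str) -> str:
        subtasks = self.decompose(goal)
        # Workers run in parallel but never talk to each other;
        # every result flows back through the director.
        with ThreadPoolExecutor() as pool:
            results = list(
                pool.map(lambda t: self.workers[t.worker](t.payload), subtasks)
            )
        return self.synthesise(results)


workers = {
    "research": lambda goal: f"notes on {goal}",
    "draft": lambda goal: f"draft for {goal}",
}
director = Director(workers)
report = director.run("pricing page")
```

Because all traffic passes through run, it is also the natural place to add logging and validation, which is what makes this pattern easy to audit.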

Pattern 2: Collaborative Networks

Agents communicate through a shared message bus or blackboard, broadcasting partial results and subscribing to relevant updates. There is no central controller: agents self-organise around shared goals.

Best for: Open-ended research, creative tasks, and problems where the solution path is not known in advance.

Unpredictability: Agents can loop, contradict each other, or converge on suboptimal solutions. The pattern requires careful termination conditions; we typically set a maximum of 20 interactions per task.

A blackboard is the concrete implementation of this pattern. A shared data structure stores hypotheses, evidence, and conclusions. Agents read the current state, contribute new information, and update their beliefs. The blackboard enforces structure: each contribution has a type (hypothesis, evidence, conclusion), a source (which agent), and a confidence score.
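A blackboard with typed contributions might look like the sketch below. The class and method names are illustrative, and the confidence range of 0 to 1 is an assumption; the text above only says that a confidence score exists.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    HYPOTHESIS = "hypothesis"
    EVIDENCE = "evidence"
    CONCLUSION = "conclusion"


@dataclass(frozen=True)
class Contribution:
    kind: Kind         # type of contribution
    source: str        # which agent posted it
    content: str
    confidence: float  # assumed to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


@dataclass
class Blackboard:
    entries: list[Contribution] = field(default_factory=list)

    def post(self, contribution: Contribution) -> None:
        self.entries.append(contribution)

    def read(self, kind: Kind, min_confidence: float = 0.0) -> list[Contribution]:
        # Agents filter by type and confidence instead of reading everything.
        return [
            e for e in self.entries
            if e.kind is kind and e.confidence >= min_confidence
        ]


board = Blackboard()
board.post(Contribution(Kind.HYPOTHESIS, "analyst", "placeholder hypothesis", 0.6))
board.post(Contribution(Kind.EVIDENCE, "researcher", "placeholder evidence", 0.8))
board.post(Contribution(Kind.EVIDENCE, "scraper", "weaker evidence", 0.3))
strong_evidence = board.read(Kind.EVIDENCE, min_confidence=0.7)
```

Rejecting malformed contributions at construction time keeps the shared state clean, so agents reading the board do not each need their own defensive checks.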

Pattern 3: Competitive Markets

Multiple agents propose solutions to the same problem. A judge agent evaluates the proposals against criteria and selects the best. This harnesses diversity of approach, since different agents may use different reasoning strategies.

Best for: Optimisation problems, creative generation with quality thresholds, and situations where solution quality matters more than speed.

Cost Multiplication: Running multiple agents for every task multiplies compute costs, so unpromising branches need to be pruned early. Use competitive markets sparingly.

Code Generation Example: One agent writes functional code, another writes efficient code, a third writes readable code. The judge evaluates each proposal against test cases, performance benchmarks, and style guidelines. The winning code passes all tests, runs within 10% of the fastest proposal, and follows the style guidelines.
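The selection rule in this example can be written down as a small judge function. The sketch below makes two assumptions that the text does not state: "fastest" means the fastest proposal that passes all tests, and ties among eligible proposals are broken by runtime. The Proposal fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proposal:
    agent: str
    passes_tests: bool
    runtime_ms: float
    follows_style: bool


def judge(proposals: list[Proposal], tolerance: float = 0.10) -> Optional[Proposal]:
    # Prune early: proposals that fail the tests are never benchmarked further.
    passing = [p for p in proposals if p.passes_tests]
    if not passing:
        return None
    fastest = min(p.runtime_ms for p in passing)
    eligible = [
        p for p in passing
        if p.follows_style and p.runtime_ms <= fastest * (1 + tolerance)
    ]
    # Assumed tie-break: among eligible proposals, prefer the fastest.
    return min(eligible, key=lambda p: p.runtime_ms) if eligible else None


proposals = [
    Proposal("functional", passes_tests=True, runtime_ms=120.0, follows_style=True),
    Proposal("efficient", passes_tests=True, runtime_ms=100.0, follows_style=False),
    Proposal("readable", passes_tests=True, runtime_ms=108.0, follows_style=True),
]
winner = judge(proposals)
```

Here the efficient proposal sets the speed baseline but is rejected on style, and the functional proposal is too slow, so the readable proposal wins.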

Communication Protocols

Agents need structured ways to communicate:

Message schemas. Define standard message formats with sender, recipient, intent, content, and confidence. A fixed format prevents ambiguity. The schema should be versioned: as agents evolve, the message format may need to change, and a version field lets receivers recognise older formats and stay backward compatible.
Shared state. A central store for facts, hypotheses, and conclusions. Agents read from and write to shared state rather than passing messages directly. Shared state reduces message volume: agents do not need to broadcast updates; they write to the store and others read it.
Capability registration. Agents advertise their capabilities. The orchestrator assigns tasks based on registered skills rather than hardcoded routing. This enables dynamic teams: add a new agent with new capabilities, and the orchestrator automatically uses it.
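Capability registration can be as simple as a lookup table from skill to agents. The following is a minimal sketch; the CapabilityRegistry class and the agent and skill names are hypothetical.

```python
from collections import defaultdict


class CapabilityRegistry:
    def __init__(self) -> None:
        self._by_skill: dict[str, list[str]] = defaultdict(list)

    def register(self, agent: str, skills: list[str]) -> None:
        # Agents advertise what they can do; duplicates are ignored.
        for skill in skills:
            if agent not in self._by_skill[skill]:
                self._by_skill[skill].append(agent)

    def agents_for(self, skill: str) -> list[str]:
        # Return a copy so callers cannot mutate the registry.
        return list(self._by_skill.get(skill, []))


registry = CapabilityRegistry()
registry.register("coder", ["write_code"])
registry.register("tester", ["run_tests"])
# A new agent joins later; no routing code has to change.
registry.register("reviewer", ["write_code", "review_code"])
candidates = registry.agents_for("write_code")
```

The orchestrator asks the registry who can handle a skill instead of naming agents directly, which is what allows teams to change at runtime.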

The communication layer is often the weak point. Agents that cannot communicate clearly cannot coordinate effectively. We invest heavily in message schemas: well-defined types, validation rules, and error handling. A message that fails validation should be rejected, not processed with assumptions. Ambiguity in communication leads to errors in execution.
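A strict parser for the message schema could look like the sketch below. The five field names come from the schema described above; the version field default, the set of intents, the 0 to 1 confidence range, and the InvalidMessage exception are assumptions made for illustration.

```python
from dataclasses import dataclass

SUPPORTED_VERSIONS = {1}                         # assumed version numbers
KNOWN_INTENTS = {"request", "inform", "propose"}  # hypothetical intents
REQUIRED_FIELDS = {"sender", "recipient", "intent", "content", "confidence"}


class InvalidMessage(ValueError):
    """Raised when a message fails schema validation."""


@dataclass(frozen=True)
class Message:
    sender: str
    recipient: str
    intent: str
    content: str
    confidence: float
    version: int = 1


def parse_message(raw: dict) -> Message:
    missing = REQUIRED_FIELDS - raw.keys()
    if missing:
        raise InvalidMessage(f"missing fields: {sorted(missing)}")
    unknown = raw.keys() - REQUIRED_FIELDS - {"version"}
    if unknown:
        raise InvalidMessage(f"unknown fields: {sorted(unknown)}")
    version = raw.get("version", 1)
    if not isinstance(version, int) or version not in SUPPORTED_VERSIONS:
        raise InvalidMessage(f"unsupported version: {version!r}")
    for name in ("sender", "recipient", "intent", "content"):
        if not isinstance(raw[name], str) or not raw[name]:
            raise InvalidMessage(f"{name} must be a non-empty string")
    if raw["intent"] not in KNOWN_INTENTS:
        raise InvalidMessage(f"unknown intent: {raw['intent']!r}")
    confidence = raw["confidence"]
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise InvalidMessage("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise InvalidMessage("confidence must be between 0 and 1")
    return Message(**raw)


ok = parse_message({
    "sender": "researcher", "recipient": "director",
    "intent": "inform", "content": "summary ready", "confidence": 0.9,
})
```

The parser never fills in a missing field or coerces a bad value. A message either matches the schema or raises, so the sender learns about the problem instead of the receiver guessing.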

Failure Modes

Infinite loops. Agents ping-pong without converging. Fix: maximum interaction limits, convergence criteria, and timeout mechanisms. We typically set a maximum of 20 interactions per task. If the task is not complete after 20 interactions, the director intervenes or the task fails.
Cascading errors. One agent's incorrect output propagates through the system. Fix: validation layers between agents, confidence thresholds for acceptance. Each agent's output is validated before being passed to the next agent.
Deadlock. Agents wait for each other. Fix: asynchronous messaging with timeouts, fallback behaviours when dependencies fail. If an agent does not respond within 30 seconds, the orchestrator assigns the task to another agent or fails the task.
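Two of these fixes, the interaction limit and the per-agent timeout with reassignment, can be sketched with asyncio. The limits of 20 interactions and 30 seconds are the values given above; the function names, the TaskFailed exception, and the toy agents are hypothetical.

```python
import asyncio

MAX_INTERACTIONS = 20   # interaction budget per task
AGENT_TIMEOUT_S = 30.0  # seconds to wait for one agent


class TaskFailed(RuntimeError):
    """Raised when a task cannot be completed within its limits."""


async def call_with_fallback(agents, task, timeout=AGENT_TIMEOUT_S):
    # Try each agent in turn; a slow agent is cancelled and the
    # task is reassigned to the next one.
    for agent in agents:
        try:
            return await asyncio.wait_for(agent(task), timeout=timeout)
        except asyncio.TimeoutError:
            continue
    raise TaskFailed(f"no agent answered within {timeout}s")


async def run_until_done(step, is_done, state, max_interactions=MAX_INTERACTIONS):
    # Bounded loop: stops on convergence or when the budget runs out.
    for _ in range(max_interactions):
        state = await step(state)
        if is_done(state):
            return state
    raise TaskFailed(f"not converged after {max_interactions} interactions")


async def slow_agent(task):
    await asyncio.sleep(1)
    return f"slow: {task}"


async def fast_agent(task):
    return f"fast: {task}"


# A short timeout is used here only to keep the demonstration quick.
result = asyncio.run(call_with_fallback([slow_agent, fast_agent], "summarise", timeout=0.05))
```

Both helpers fail loudly with TaskFailed instead of hanging, which gives the director a clear signal to intervene.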

Monitoring is Essential: Each agent logs its inputs, outputs, and errors. The orchestrator logs task assignments, completions, and failures. We aggregate these into a single view: task ID, agent assignments, timeline, and outcome. When a task fails, we can trace the exact sequence of agent interactions and identify the failure point.
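A single aggregated view can start as an append-only event log keyed by task ID. The sketch below is one possible shape, not a description of any particular tool: the Event fields, the event kinds, and the TraceLog class are assumptions.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    task_id: str
    agent: str
    kind: str     # assumed kinds: "assigned", "completed", "failed"
    detail: str
    timestamp: float = field(default_factory=time.time)


class TraceLog:
    def __init__(self) -> None:
        self.events: list[Event] = []

    def record(self, task_id: str, agent: str, kind: str, detail: str = "") -> None:
        self.events.append(Event(task_id, agent, kind, detail))

    def timeline(self, task_id: str) -> list[Event]:
        # Events are appended in order, so filtering preserves the sequence.
        return [e for e in self.events if e.task_id == task_id]

    def failure_point(self, task_id: str) -> Optional[Event]:
        # First failure in the task's timeline, if any.
        return next((e for e in self.timeline(task_id) if e.kind == "failed"), None)


log = TraceLog()
log.record("t1", "coder", "assigned", "write parser")
log.record("t1", "coder", "completed", "parser written")
log.record("t2", "coder", "assigned", "unrelated task")
log.record("t1", "tester", "failed", "2 tests failing")
failure = log.failure_point("t1")
```

In practice the events would go to a shared store rather than an in-memory list, but the query stays the same: filter by task ID, read in order, and stop at the first failure.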

Our Recommendation

Start with hierarchical control. It is the most predictable, easiest to debug, and simplest to explain to stakeholders. Move to collaborative networks only when hierarchical control becomes a bottleneck. Use competitive markets sparingly: the cost is high and the benefit is situational.

The tooling landscape is evolving. LangChain, LlamaIndex, and CrewAI provide multi-agent frameworks, but they are immature. We often build custom orchestration because the frameworks do not yet handle failure modes, monitoring, and debugging well.