Agentic Systems
In early 2024, a fintech startup announced they'd built a "fully autonomous customer service department"—a swarm of 12 specialized AI agents handling everything from account inquiries to fraud detection. The architecture diagram looked impressive: a Triage Agent routing to a Billing Agent, which could escalate to a Fraud Agent, which could loop back to a Resolution Agent. Six months later, they quietly replaced it with two agents and a human escalation queue. What went wrong?
The agents worked fine in isolation. But together, they created a game of telephone. Context got lost between handoffs. The Billing Agent would ask customers to re-explain problems they'd already described to the Triage Agent. The Fraud Agent flagged legitimate transactions because it never received the customer's purchase history. Worst of all, errors compounded—a small misunderstanding in one agent snowballed into completely wrong conclusions three agents later.
This chapter is about avoiding that fate. We'll learn how to compose agents into systems that actually work—when to add complexity, how to share context, and why the best agentic systems often look simpler than you'd expect.
The Human Analogy Trap
When we think about multi-agent systems, our minds naturally reach for organizational metaphors. "I'll have a Marketing Agent, a Sales Agent, and a Support Agent—just like departments in a company!" This intuition is a trap.
Human organizations work despite coordination overhead because humans have:
- Shared context by default: Everyone knows the company, the product, the culture
- Implicit communication: Body language, hallway conversations, shared documents
- Self-correction: A confused employee asks for clarification
AI agents have none of this. Each agent starts with a blank slate. They only know what you explicitly tell them. They can't sense when they're confused—they confidently produce wrong outputs.
The right mental model isn't "building a team." It's software architecture. You're designing a system where each component has clear inputs, outputs, and responsibilities. The question isn't "What role should this agent play?" but rather:
- What specific capability does this component need?
- What information does it require to function?
- How do we verify its output before passing it downstream?
Think less like an HR manager staffing departments, and more like a systems engineer designing microservices.
The Evolution of a System
Let's return to our Zenith Motors sales agent from the previous chapter. We built an agent that could:
- Understand customer preferences
- Search the product database
- Follow sales procedures
- Remember customer context
It works well for straightforward conversations. But as we deploy it to more customers, we discover limitations. Let's watch how a simple agent evolves into a multi-agent system—not through grand design, but through solving real problems.
Problem 1: The Emails Are Terrible
Our sales agent can chat, but when customers request a follow-up email summarizing the conversation, the results are... underwhelming. Generic, poorly structured, missing key details. We have two options:
Option A is simpler—just add more instructions to the system prompt. But the prompt is already long, and cramming in email-writing expertise dilutes the agent's focus on sales conversations.
Option B separates concerns. The sales agent focuses on selling. When an email is needed, it delegates to a specialist. This is the agent-as-tool pattern—the Email Writer Agent becomes a tool the Sales Agent can call.
But we can do better. Remember the Evaluator-Optimizer loop from the Workflows chapter? We can apply it here:
The Sales Agent requests an email. The Writer drafts it. The Evaluator scores it on clarity, personalization, and call-to-action. If it's below threshold, the Writer revises. This loop runs until quality is acceptable—then the polished email returns to the Sales Agent.
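This loop is straightforward to sketch in code. In the sketch below, `draft_email` and `score_email` are stand-ins for real model calls, and the scoring rule, threshold, and round limit are illustrative assumptions, not a prescribed implementation:

```python
def draft_email(context: dict, feedback: str = "") -> str:
    """Stub for the Email Writer agent (an LLM call in practice)."""
    email = f"Hi {context['name']}, following up on the {context['vehicle']}."
    if feedback:
        email += " " + feedback  # a real writer would revise, not append
    return email

def score_email(email: str) -> tuple[float, str]:
    """Stub for the Evaluator agent: returns (score, feedback)."""
    score = 1.0 if "next step" in email.lower() else 0.5
    return score, "Add a clear next step (call-to-action)."

def write_email(context: dict, threshold: float = 0.9, max_rounds: int = 3) -> str:
    """Draft, evaluate, and revise until quality clears the threshold."""
    feedback = ""
    for _ in range(max_rounds):
        email = draft_email(context, feedback)
        score, feedback = score_email(email)
        if score >= threshold:
            break
    return email  # best effort after max_rounds
```

The round limit matters: without it, a Writer that can never satisfy the Evaluator loops forever. Bounding iteration and returning the best attempt is the usual escape hatch.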
Extract a capability into a separate agent when: (1) It requires specialized expertise that clutters your main prompt, (2) You want independent quality control, or (3) You might reuse it elsewhere. Don't extract for the sake of architecture—extract when you have evidence of a problem.
Problem 2: Financial Questions Are Out of Scope
Customers start asking about financing: "What's the interest rate?" "Can I lease instead of buy?" "What's the monthly payment on $950K over 5 years?"
Our sales agent wasn't trained for this. It either hallucinates numbers or awkwardly deflects. We need a Financial Advisor Agent:
When the Sales Agent detects a financing question, it delegates to the Financial Agent—passing along the relevant context (vehicle price, customer budget). The Financial Agent has access to current rate tables and payment calculators. It returns a structured answer that the Sales Agent incorporates into the conversation.
The customer experiences one seamless conversation. Behind the scenes, two specialists collaborated.
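The agent-as-tool delegation above can be sketched as follows. The keyword-based intent check, the 6% rate, and the 5-year term are illustrative assumptions; a real Sales Agent would classify intent with the model, and the Financial Agent would read current rate tables:

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Standard amortized loan payment formula."""
    r = annual_rate / 12
    n = years * 12
    return principal * r / (1 - (1 + r) ** -n)

def financial_agent(question: dict) -> dict:
    """Stub specialist: answers payment questions with a calculator tool."""
    payment = monthly_payment(question["price"], question["rate"], question["years"])
    return {"monthly_payment": round(payment, 2),
            "assumptions": f"{question['rate']:.1%} APR over {question['years']} years"}

def sales_agent_turn(message: str, context: dict) -> str:
    """The Sales Agent delegates financing questions, passing relevant context."""
    if "monthly payment" in message.lower():
        answer = financial_agent({"price": context["price"], "rate": 0.06, "years": 5})
        return f"Roughly ${answer['monthly_payment']:,} per month at {answer['assumptions']}."
    return "Happy to help. Tell me more about what you're looking for."
```

Note that the Sales Agent passes only the context the specialist needs (the vehicle price), and the specialist returns a structured answer, including its assumptions, rather than free-form text.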
Problem 3: We Need a Front Door
Zenith Motors expands. Now they need to handle:
- Sales inquiries (our existing agent)
- Service appointments (different agent)
- Roadside assistance (yet another agent)
We can't have customers figure out which agent to talk to. We need a router—a Concierge Agent that greets customers and directs them to the right specialist:
The Concierge Agent's job is simple: understand intent and route. It doesn't need product knowledge or policy access—just good classification and a graceful handoff.
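A router this simple can be sketched with plain classification logic. The keyword lists here are hypothetical; a production Concierge would use an LLM with a constrained output schema, but the shape of the component is the same: message in, department name out:

```python
ROUTES = {
    "sales": ["buy", "price", "vehicle", "test drive"],
    "service": ["appointment", "maintenance", "repair"],
    "roadside": ["breakdown", "flat tire", "stranded", "tow"],
}

def concierge_route(message: str) -> str:
    """Classify intent and return the department to hand off to."""
    text = message.lower()
    scores = {dept: sum(kw in text for kw in kws) for dept, kws in ROUTES.items()}
    best = max(scores, key=scores.get)
    # No keyword matched: stay with the concierge and ask a clarifying question.
    return best if scores[best] > 0 else "concierge"
```

The graceful fallback is the important design choice: an ambiguous message should trigger a clarifying question, not a coin-flip handoff to the wrong specialist.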
The Pattern: Organic Growth
Notice how our system evolved:
- Started simple: One agent with tools
- Hit a limitation: Email quality was poor
- Added capability: Email Writer + Evaluator loop
- Hit another limitation: Couldn't handle financing
- Added specialization: Financial Agent as a tool
- Scaled scope: Multiple customer needs
- Added routing: Concierge Agent as front door
Each addition was driven by evidence of a problem, not architectural ambition. This is how good agentic systems grow—organically, in response to real failures.
The Shared Context Problem
Here's the uncomfortable truth about multi-agent systems: every handoff is an opportunity for failure.
When a human sales team works together, they share an office, overhear conversations, read the same Slack channels. Context flows naturally. When Agent A hands off to Agent B, context doesn't flow at all—unless you explicitly engineer it.
Let's look at three failure modes that plague multi-agent systems:
Failure Mode 1: Context Loss
A customer chats with the Concierge:
Customer: "Hi, I'm looking for an eco-friendly hypercar. I currently drive a Tesla Model S but want something more exciting. Budget is flexible but ideally under $1.5M."
The Concierge routes to Sales with the message: "Customer interested in purchasing a vehicle."
The Sales Agent responds: "Welcome to Zenith Motors! What kind of vehicle are you looking for?"
The customer, frustrated, has to repeat everything. The handoff lost context.
Failure Mode 2: Compound Errors
A legal research system uses multiple agents:
- Query Agent: Interprets the user's question
- Search Agent: Finds relevant cases
- Analysis Agent: Synthesizes findings
A lawyer asks: "What are the precedents for breach of fiduciary duty in California?"
The Query Agent interprets this and passes to Search: "Find cases about breach of fiduciary duty."
Notice what's missing? The jurisdiction. The Search Agent dutifully finds cases—from Delaware, New York, the UK. The Analysis Agent synthesizes these into a confident-sounding answer about "established precedents."
The lawyer relies on this analysis. But California has specific statutes and case law that differ significantly from Delaware corporate law. The advice is wrong—not because any single agent failed, but because a small omission compounded into a completely incorrect conclusion.
Every agent handoff is like a game of telephone. Information degrades. Nuance disappears. Assumptions get baked in. The more handoffs in your system, the more likely the final output diverges from the original intent.
Failure Mode 3: Goal Drift
A customer support system handles a complex case:
- Triage Agent: "Customer is upset about a billing error"
- Billing Agent: Investigates, finds the error, issues a refund
- Follow-up Agent: "Check if customer is satisfied"
But the original conversation also mentioned: "And I want to cancel my subscription if this isn't fixed."
The Billing Agent fixed the error but never addressed the cancellation threat—it wasn't passed along. The Follow-up Agent cheerfully confirms the refund. The customer, still wanting to cancel, feels unheard.
The goal drifted. What started as "fix billing AND address cancellation concern" became just "fix billing." Each agent optimized for its narrow task, losing sight of the broader customer intent.
Mitigating Context Problems
These failures aren't inevitable. Here's how to engineer systems that maintain context integrity.
Strategy 1: Shared State
Instead of passing context from agent to agent (like a relay baton that can be dropped), maintain a central state store that all agents can read and write.
Every agent reads from the same source of truth. When the Concierge learns the customer wants an eco-friendly car under $1.5M, it writes that to shared state. When the Sales Agent activates, it reads shared state first—no information lost.
The shared state acts as the system's "working memory." It might include:
- Customer profile: Preferences, history, budget
- Conversation transcript: Full context, not summaries
- Current goal: What are we trying to accomplish?
- Gathered facts: What have we learned?
- Pending actions: What still needs to happen?
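The shared state above can be modeled as a single object that every agent reads before acting and writes after learning something new. The field names below follow the list above; the specific values are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class SharedState:
    """Central working memory shared by all agents in the system."""
    customer_profile: dict = field(default_factory=dict)
    transcript: list[str] = field(default_factory=list)
    current_goal: str = ""
    gathered_facts: list[str] = field(default_factory=list)
    pending_actions: list[str] = field(default_factory=list)

# The Concierge writes what it learned...
state = SharedState(current_goal="Find an eco-friendly hypercar")
state.customer_profile.update({"budget_max": 1_500_000, "trade_in": "Tesla Model S"})
state.transcript.append("Customer: eco-friendly hypercar, ideally under $1.5M")

# ...and the Sales Agent reads the same object before its first turn.
briefing = f"Goal: {state.current_goal}. Budget: ${state.customer_profile['budget_max']:,}."
```

In production this object would live in a database or session store rather than memory, but the principle is identical: one source of truth, no relay baton to drop.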
Strategy 2: Explicit Handoff Protocols
When one agent transfers control to another, enforce a structured handoff—a "briefing document" that explicitly captures everything the receiving agent needs.
The handoff brief isn't free-form text—it's a structured format that forces the sending agent to be explicit about key information. This catches omissions before they cause problems.
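One way to sketch such a protocol is a typed brief plus a validation gate. The field names and checks below are assumptions about what a brief might require, not a standard schema:

```python
from dataclasses import dataclass

@dataclass
class HandoffBrief:
    """Structured handoff: required fields force the sender to be explicit."""
    customer_goal: str
    key_facts: list[str]
    open_questions: list[str]
    constraints: list[str]

def validate_brief(brief: HandoffBrief) -> list[str]:
    """Reject handoffs that would lose context downstream."""
    problems = []
    if not brief.customer_goal:
        problems.append("missing customer goal")
    if not brief.key_facts:
        problems.append("no key facts: receiving agent would start blind")
    return problems
```

A handoff only proceeds when `validate_brief` returns an empty list; otherwise control stays with the sending agent until the gaps are filled.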
Strategy 3: Supervisor Oversight
For high-stakes systems, add a Supervisor Agent that monitors handoffs and catches context loss before it reaches the customer.
The Supervisor reviews handoffs: "Does the Sales Agent have everything it needs? Is the customer's budget included? Their preferences?" If something's missing, it loops back for clarification rather than letting a degraded handoff proceed.
Collaboration Patterns
Now that we understand the context problem, let's look at how agents can work together effectively. The pattern you choose should match the task structure.
Pattern 1: Delegation (Orchestrator-Workers)
One agent acts as the coordinator, delegating subtasks to specialists and synthesizing their results.
Best for: Tasks that decompose into independent subtasks. Research, analysis, content generation.
Example: A coding agent delegates planning to a Planner, implementation to a Coder, and verification to a Tester. The orchestrator coordinates the workflow and handles iteration when tests fail.
Context strategy: The orchestrator maintains the master context. Workers receive focused briefs and return structured results. The orchestrator is responsible for context integrity.
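The coding example above can be sketched as a small control loop. The `workers` stubs stand in for model-backed agents, and the retry bound of three is an illustrative choice:

```python
def orchestrate(task: str, workers: dict) -> dict:
    """Minimal orchestrator: plan, delegate, iterate when tests fail."""
    plan = workers["planner"](task)
    code = workers["coder"](plan)
    for _ in range(3):  # bounded retry loop
        ok, feedback = workers["tester"](code)
        if ok:
            break
        code = workers["coder"](plan + " | fix: " + feedback)
    return {"plan": plan, "code": code, "passed": ok}

# Stub workers standing in for model-backed agents:
workers = {
    "planner": lambda task: f"steps for: {task}",
    "coder": lambda plan: f"code({plan})",
    "tester": lambda code: ("fix" in code, "handle empty input"),
}
result = orchestrate("parse CSV", workers)
```

Notice who owns the context: the orchestrator holds the master task and threads feedback back to the Coder. The workers never talk to each other directly.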
Pattern 2: Handoff (Agent-to-Agent Transfer)
Control passes from one agent to another, like a relay race. The first agent completes its role, then hands off to the next.
Best for: Conversations that evolve through distinct phases. Customer journeys, multi-stage processes.
Example: Customer support where a Triage Agent hands off to a Specialist, who might hand off to a Resolution Agent.
Context strategy: Critical to pass full context at each handoff. Use the briefing document pattern. Consider keeping the previous agent "on call" for questions.
When Agent A hands off to Agent B, keep Agent A in the loop—like CC'ing someone on an email. Agent A can catch if Agent B misunderstands something and intervene before the customer notices.
Pattern 3: Parallel (Simultaneous Execution)
Multiple agents work on the same problem simultaneously, and their results are aggregated.
Best for: Tasks where subtasks are truly independent and can be combined at the end. Research, data gathering, bulk processing.
Example: Competitive analysis where each agent researches a different company, then a synthesizer combines findings.
Context strategy: Each parallel worker needs the same base context (the overall goal, output format, quality criteria). Results must be aggregated thoughtfully—not just concatenated.
Not everything can be parallelized. If you're generating a slide deck and each slide is created by a different agent in parallel, you'll get a disjointed mess—slide 5 won't know what slide 4 said. Tasks with sequential dependencies must be sequential. Only parallelize truly independent work.
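The competitive-analysis example can be sketched with `asyncio.gather`. The `research` stub stands in for an agent with search tools, and the join-based synthesizer is a placeholder for a real aggregation step:

```python
import asyncio

async def research(company: str, goal: str) -> str:
    """Stub worker; in practice an agent with search tools."""
    await asyncio.sleep(0)  # stand-in for real network I/O
    return f"{company}: findings relevant to '{goal}'"

async def competitive_analysis(companies: list[str], goal: str) -> str:
    # Every worker gets the same base context (the goal); they run concurrently.
    findings = await asyncio.gather(*(research(c, goal) for c in companies))
    # Aggregate thoughtfully; a real synthesizer would do more than join sections.
    return "\n".join(findings)

report = asyncio.run(competitive_analysis(["Acme", "Globex"], "pricing strategy"))
```

The structure makes the independence requirement visible: each `research` call takes only the shared goal and its own company, so no worker depends on another's output.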
Choosing the Right Pattern
| Pattern | Use When | Watch Out For |
|---|---|---|
| Delegation | Clear subtasks, need synthesis | Orchestrator becoming a bottleneck |
| Handoff | Distinct phases, evolving conversations | Context loss at each transfer |
| Parallel | Independent subtasks, latency-sensitive | Coherence issues, wasted work on duplicates |
Beyond Basic Orchestration
Once you have a working multi-agent system, you can add sophistication. These patterns take agentic systems from reactive to proactive.
Proactive Agents
Most agents wait for input. Proactive agents initiate action based on conditions or schedules.
Examples:
- Sales agent that follows up with leads who haven't responded in 48 hours
- Analytics agent that generates weekly performance reports every Monday
- Monitoring agent that alerts when metrics cross thresholds
Proactive agents transform AI from "tool you use" to "colleague who takes initiative."
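The follow-up example reduces to a condition check that a scheduler (cron, a task queue, etc.) runs periodically; each hit triggers an agent run. The lead fields and 48-hour window below are illustrative:

```python
from datetime import datetime, timedelta

def leads_needing_followup(leads: list[dict], now: datetime,
                           hours: int = 48) -> list[dict]:
    """Return leads last contacted more than `hours` ago with no reply."""
    cutoff = now - timedelta(hours=hours)
    return [lead for lead in leads
            if lead["last_contact"] < cutoff and not lead["responded"]]

now = datetime(2024, 6, 1, 9, 0)
leads = [
    {"name": "Ana", "last_contact": now - timedelta(hours=72), "responded": False},
    {"name": "Ben", "last_contact": now - timedelta(hours=12), "responded": False},
]
stale = leads_needing_followup(leads, now)  # each hit triggers a follow-up agent
```

The design point: the trigger logic is ordinary deterministic code. Only the follow-up message itself needs an agent, which keeps the proactive behavior cheap and auditable.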
Supervisor Agents
A Supervisor monitors other agents and provides guidance when they struggle.
Example: A coding Supervisor reviews a junior Coder's output. If the code has issues, the Supervisor provides specific feedback: "The error handling is incomplete—add try/catch around the API call." The Coder revises until the Supervisor approves.
This is the Evaluator-Optimizer loop applied to agent management. The Supervisor doesn't do the work—it ensures quality.
Event-Driven Systems
Instead of direct agent-to-agent calls, use an event bus. Agents publish events; other agents subscribe to events they care about.
Example: When Sales Agent closes a deal, it publishes "DealClosed" event. The Onboarding Agent sees this and starts the customer onboarding flow. The Finance Agent processes the payment. The Analytics Agent updates dashboards. No agent needs to know about the others—they just react to events.
Event-driven architectures scale better than direct orchestration. Adding a new capability (e.g., "send welcome gift when deal closes") means adding a new subscriber—no changes to existing agents.
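A minimal event bus makes the decoupling concrete. The "DealClosed" event and the subscriber actions below follow the example above; the bus itself is a bare-bones sketch, not a production message broker:

```python
from collections import defaultdict

class EventBus:
    """Minimal pub/sub: publishers never need to know who is listening."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event: str, handler):
        self.subscribers[event].append(handler)

    def publish(self, event: str, payload: dict):
        for handler in self.subscribers[event]:
            handler(payload)

bus = EventBus()
log = []
bus.subscribe("DealClosed", lambda p: log.append(f"onboarding {p['customer']}"))
bus.subscribe("DealClosed", lambda p: log.append(f"invoice {p['amount']}"))
# Adding a welcome-gift agent later is just one more subscribe() call.
bus.publish("DealClosed", {"customer": "Ana", "amount": 950_000})
```

In practice you'd swap this class for a real queue or broker, but the interface stays the same: agents publish facts about what happened, and subscribers decide what to do about them.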
Building Agentic Systems: A Practical Guide
After all this theory, how do you actually build these systems? Here's a framework.
Start With One Agent
Seriously. Build a single agent that handles your core use case. Make it as capable as possible with tools, retrieval, and memory. Only when you have evidence that one agent can't handle something should you consider adding another.
The "Should I Add an Agent?" Checklist
Before adding a new agent, ask:
- Is this a prompt problem? Could better instructions solve this without a new agent?
- Is this a tool problem? Could a new tool (not agent) handle this capability?
- What context does the new agent need? How will you ensure it gets that context?
- How will you verify its output? What happens if it fails?
- Does the added complexity justify the benefit? Be honest.
If you can't answer all five questions clearly, you're not ready to add the agent.
The Single Capable Agent vs. Agent Swarm Tradeoff
There's a seductive appeal to swarms of specialized agents. But consider: one highly capable agent often outperforms a swarm of mediocre ones.
Devin, the AI software engineer, isn't a swarm of Planner, Coder, Tester, and Reviewer agents. It's one agent that can do all those things, switching contexts fluidly. This avoids handoff overhead, context loss, and coordination complexity.
The tradeoff:
| Single Capable Agent | Multi-Agent Swarm |
|---|---|
| ✅ No context loss | ✅ Specialized expertise per domain |
| ✅ No coordination overhead | ✅ Parallel execution possible |
| ✅ Easier to debug | ✅ Independent scaling |
| ❌ Prompt complexity grows | ❌ Handoff failures |
| ❌ Single point of failure | ❌ Harder to debug emergent behavior |
Rule of thumb: Prefer a single capable agent until you have clear evidence that task decomposition would help. Then add agents surgically, one at a time, with clear context-sharing strategies.
Common Anti-Patterns to Avoid
🚫 Agent Explosion: Adding a new agent for every capability. Soon you have 15 agents and no one understands how they interact.
🚫 Chatty Agents: Agents that constantly communicate, creating coordination overhead. If two agents need to talk constantly, they should probably be one agent.
🚫 Vague Handoffs: "Here, deal with this customer" instead of structured briefs with clear context.
🚫 No Verification: Trusting agent outputs without quality checks. Add evaluators, supervisors, or human review for high-stakes outputs.
🚫 Premature Optimization: Designing a complex multi-agent architecture before you've built and tested a simple one.
Summary
Agentic systems are powerful—but power demands discipline.
Key takeaways:
- Resist the human analogy: Agents aren't employees. They're software components. Design them like systems, not organizations.
- Evolve organically: Start with one capable agent. Add others only when you have evidence of specific limitations.
- Context is everything: Every handoff risks context loss. Use shared state, structured handoffs, and supervisor oversight to maintain integrity.
- Match patterns to tasks: Delegation for decomposable work, handoffs for sequential phases, parallel for independent subtasks.
- Simpler is usually better: One capable agent often beats a complex swarm. Add complexity only when it provides clear value.
The fintech startup we mentioned at the start? Their mistake wasn't building a multi-agent system. It was building one before they understood where the complexity was actually needed. When they rebuilt with two agents (one for triage, one for resolution) plus human escalation, they got 90% of the benefit with 20% of the complexity.
That's the art of agentic systems: knowing when not to add another agent.
In the next chapter, we'll zoom out from architecture to explore Context Engineering—the art of managing what goes into your agents' limited attention windows, and how to maximize signal while minimizing noise.