Agent Architecture Patterns: A Production Guide for System Architects
Agent Architecture Patterns: A Production Guide for System Architects
Core Philosophy: Agents Are Software, Not Magic
The fundamental insight from analyzing 100+ production agent implementations: The most successful agents aren't the most "agentic". They're well-engineered software systems that leverage LLMs for specific, controlled transformations.
This document distills the patterns that separate production-ready agents from impressive demos.
The 12 Factors at a Glance
1. JSON Extraction as Foundation
- The most powerful LLM capability is converting natural language to structured JSON
- Everything else builds on this simple transformation
- Focus on schema design and validation over complex orchestration
2. Own Your Prompts
- Framework abstractions get you to 80%, but production requires hand-crafted prompts
- Every token matters for reliability
- Version control and test your prompts like code
3. Manage Context Windows Explicitly
- Don't blindly append to context—actively manage what the LLM sees
- Implement summarization and pruning strategies
- Context window size directly impacts reliability
4. Tools Are Just JSON and Code
- "Tool use" is a harmful abstraction—it's just JSON routing to functions
- Makes debugging trivial when you think this way
- No magic, just switch statements
5. Small, Focused Agents Over Monoliths
- Micro-agents handling 3-10 steps are far more reliable
- Compose them into larger workflows
- Each agent has clear boundaries and responsibilities
6. Own Your Control Flow
- Agents are prompts + routers + context + loops
- Don't let frameworks hide this—you need direct control
- Explicit is better than magical
7. Stateless Agent Design
- Agents shouldn't manage state—your application should
- Enables pause/resume, better testing, and production reliability
- Think functional programming principles
8. Contact Humans as First-Class Operations
- Human interaction isn't an edge case—it's core functionality
- Build it into your agent's vocabulary from day one
- "contact_human" should be as natural as any other action
9. Meet Users Where They Are
- Email, Slack, Discord, SMS—don't force new interfaces
- Agents should integrate into existing workflows
- Multi-channel from the start, not as an afterthought
10. Explicit Error Handling
- Don't blindly append errors to context
- Implement intelligent retry strategies
- Clear errors on success, summarize repeated failures
11. Separate Business State from Execution State
- Execution state: how the agent runs (steps, retries, context)
- Business state: what the agent does (user data, approvals, tasks)
- Keep them separate for clarity and maintainability
12. Find the Bleeding Edge
- Push models to their limits, then engineer reliability
- The magic happens at the boundary of capability
- Create value others can't easily replicate
Visual Overview of the 12 Factors
The 12-Factor Agent Framework
The 12 factors are organized into four key categories:
🏗️ Foundations
- Factor 1: JSON Extraction - Transform natural language to structured data
- Factor 4: Tools Are Code - Demystify "tool use" as simple JSON routing
- Factor 6: Own Control Flow - Master the prompt-switch-context-loop pattern
📊 State & Context
- Factor 3: Manage Context - Prevent context window explosion
- Factor 7: Stateless Design - Enable pause, resume, and horizontal scaling
- Factor 11: Separate States - Distinguish business from execution state
👥 Human Integration
- Factor 8: Contact Humans - Make human interaction a first-class operation
- Factor 9: Multi-Channel - Meet users where they are (email, Slack, etc.)
🚀 Production Excellence
- Factor 2: Own Your Prompts - Hand-craft every token for production quality
- Factor 5: Micro-Agents - Build small, focused agents (3-10 steps max)
- Factor 10: Error Handling - Process errors intelligently, not blindly
- Factor 12: Bleeding Edge - Find what models almost do well, engineer the rest
Implementation Patterns
The Micro-Agent Architecture
Micro-Agent Composition Pattern
🎯 Orchestrator
Request Router - Directs incoming requests to appropriate micro-agents
🤖 Micro-Agents (3-10 steps each)
- • Intent Classifier (3-5 steps)
- • Data Retriever (4-6 steps)
- • Action Executor (5-8 steps)
- • Response Generator (3-4 steps)
🔧 Deterministic Layer
- • Business Logic
- • Database Operations
- • External APIs
- • Validation Rules
👥 Human Layer (Optional)
Dotted connections indicate on-demand human involvement:
- • Approval Required - When Action Executor needs authorization
- • Expert Input - When Data Retriever encounters complex cases
Flow: Request Router → Intent Classifier → (Data Retriever and/or Action Executor) → Response Generator
Key Benefits:
- Bounded context windows
- Clear failure modes
- Easy to test and debug
- Composable into larger systems
State Management Strategy
Keep them separate for easier debugging, testing, and maintenance.
State Management Flow
Agent State Lifecycle
From Processing Request:
- → Executing Action (when action required)
- ✓ Action Complete → Update Business State → Continue Flow
- 🤝 Human Input Needed → Waiting for Human
- ❌ Error Occurred → Error Handling
Pause/Resume Flow:
- Waiting for Human → Serialize State → Agent Paused
- Agent can be completely shut down
- Hours/Days Later: Resume → Load State → Continue Processing
Error Handling:
- Error → Retry (if under limit) → Back to Processing
- Max Retries Exceeded → Failed State
📊 Execution State
- • Current step: 3
- • Retry count: 1
- • Context tokens: 4521
- • Waiting for: approval
💼 Business State
- • Order status: pending
- • Customer data: updated
- • Approval: received
- • Actions taken: [...]
Integration Architecture
Business Value
Reliability Improvements
- 70% → 95%+ success rates by implementing proper context management
- 10x reduction in debugging time through explicit control flow
- Predictable failure modes instead of mysterious agent behaviors
Maintainability Benefits
- New team members productive in days, not weeks
- Test coverage possible with stateless design
- Version control and rollback for prompts and flows
Team Velocity Gains
- Rapid iteration on prompts without framework fighting
- Parallel development with micro-agent architecture
- Reusable components across different use cases
Action Items for Teams
1. Audit Your Existing Agents
- [ ] Map current architecture to the 12 factors
- [ ] Identify which factors you're violating
- [ ] Prioritize fixes based on pain points
2. Start Small with High-Impact Changes
- [ ] Factor 1: Implement explicit JSON extraction
- [ ] Factor 2: Take ownership of critical prompts
- [ ] Factor 4: Replace "tool use" with explicit routing
3. Refactoring Strategy
4. Measurement Framework
Track before and after:
- Success rates by operation type
- Average context window size
- Time to debug failures
- Developer velocity metrics
5. Building New Agents
For new development:
- Start with JSON extraction, not frameworks
- Build the smallest useful agent first
- Add complexity only when proven necessary
- Design for human collaboration from day one
Common Anti-Patterns to Avoid
❌ The Kitchen Sink Agent
Trying to build one agent that does everything. Instead: compose micro-agents.
❌ Context Window Hoarding
Appending everything to context. Instead: actively manage and summarize.
❌ Framework Lock-in
Depending on framework magic. Instead: own your core abstractions.
❌ Ignoring Humans
Treating human interaction as an edge case. Instead: first-class operation.
❌ State Soup
Mixing execution and business state. Instead: clear separation of concerns.
The Path Forward
The future of agent development isn't more magical frameworks—it's better software engineering applied to LLM capabilities. The teams succeeding today understand this.
Your agents are software. Treat them as such, and they'll reward you with reliability, maintainability, and capabilities your competitors can't match.
Resources for Deep Dives
- Learning Path: "12-Factor Agent Development" (10-hour comprehensive course)
- Blog Post: "The 12-Factor Agent: Building Reliable LLM Applications Without the Magic"
Get Started Today
- Pick your highest-pain agent
- Apply factors 1, 2, and 4
- Measure the improvement
- Share your results with the community
The best time to rethink your agent architecture was when you started. The second best time is now.