## Design Principles

### Start Simple, Scale Deliberately

- Begin with single-agent solutions
- Add complexity only when measurements prove it helps
- Avoid premature framework adoption
- Validate that each addition delivers value

Progress through the stages in order:

1. **Add Memory**: persistence when needed
2. **Add Tools**: capabilities when needed
3. **Multiple Agents**: scale when needed
4. **Squads**: organize when needed

Only progress when the current stage is insufficient.
### Optimize for Trust, Not Capability

The bottleneck is rarely what agents can do, but whether humans trust what they do.

- **Transparency**: show reasoning, not just results
- **Auditability**: log decisions and actions
- **Predictability**: consistent behavior patterns
- **Reversibility**: easy to undo mistakes
## Agent Design

### Prompt Engineering

Do:

```
# Clear objective
Analyze the API response times and identify bottlenecks.

# Specific constraints
- Focus on p95 latency
- Ignore requests < 100ms
- Output max 5 recommendations

# Structured output
Return JSON: {"bottlenecks": [ ... ], "recommendations": [ ... ]}
```

Don’t:

```
Look at the performance stuff and let me know what you think
we should do to make things better. Be thorough!
```
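A structured output contract like the one in the "Do" example can be enforced mechanically on the agent's reply. A minimal sketch; the key names come from the example prompt, while the validation rules and sample reply are assumptions for illustration:

```python
import json

def parse_agent_output(raw: str) -> dict:
    """Validate the JSON contract from the prompt: both keys present,
    list-valued, and at most 5 recommendations. Raises ValueError otherwise."""
    data = json.loads(raw)
    for key in ("bottlenecks", "recommendations"):
        if not isinstance(data.get(key), list):
            raise ValueError(f"missing or non-list field: {key}")
    if len(data["recommendations"]) > 5:
        raise ValueError("more than 5 recommendations returned")
    return data

# A well-formed reply (invented content) passes validation.
reply = '{"bottlenecks": ["/search p95 1.2s"], "recommendations": ["add cache"]}'
print(parse_agent_output(reply)["recommendations"])  # ['add cache']
```

Rejecting malformed output at the boundary keeps downstream steps from silently consuming garbage.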
### Single Responsibility

Each agent should do one thing well:

```
# Bad: Swiss Army Knife Agent
- Researches competitors
- Writes blog posts
- Deploys infrastructure
- Manages database

# Good: Focused Agents
- competitor-researcher → Research output
- content-writer → Blog posts
- deploy-agent → Infrastructure
- db-admin → Database
```
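One way to keep responsibilities narrow is an explicit task-to-agent routing table that refuses work no single agent owns. A sketch; the agent names come from the example above, the dispatch mechanism itself is an assumption:

```python
# Map each task type to exactly one focused agent (names from the example above).
AGENT_FOR_TASK = {
    "research": "competitor-researcher",
    "write": "content-writer",
    "deploy": "deploy-agent",
    "database": "db-admin",
}

def route(task_type: str) -> str:
    """Return the single agent that owns this task type.
    Raising on unknown types prevents a catch-all agent from forming."""
    agent = AGENT_FOR_TASK.get(task_type)
    if agent is None:
        raise ValueError(f"no agent owns task type {task_type!r}")
    return agent

print(route("deploy"))  # deploy-agent
```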
### Fail Gracefully

```
## Error Handling
If you encounter an error:
1. Log the specific error with context
2. Attempt one retry with adjusted parameters
3. If still failing, report clearly and stop
4. Never silently swallow errors
5. Never retry indefinitely
```
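The retry policy above can be sketched in code: one attempt, one retry with adjusted parameters, then a clear report and a stop. The `run` callable and the example task are placeholders, not a real agent API:

```python
import logging

logger = logging.getLogger("agent")

def run_with_retry(run, params: dict, adjusted: dict):
    """Try once, retry once with adjusted parameters, then report and stop.
    Errors are logged with context, never swallowed, never retried forever."""
    for attempt, p in enumerate((params, adjusted), start=1):
        try:
            return run(**p)
        except Exception as exc:
            logger.error("attempt %d failed with params %r: %s", attempt, p, exc)
            last = exc
    # Both attempts failed: report clearly and stop.
    raise RuntimeError("task failed after one retry") from last

# Placeholder task that only succeeds with a smaller batch size.
def task(batch_size):
    if batch_size > 100:
        raise ValueError("batch too large")
    return f"processed {batch_size}"

print(run_with_retry(task, {"batch_size": 500}, {"batch_size": 50}))  # processed 50
```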
## System Architecture

### Memory Hierarchy

| Layer | Type | Purpose |
|---|---|---|
| Project config | Static | Project knowledge (CLAUDE.md) |
| Squad memory | Persistent | Cross-session state |
| Conversation | Session | Current task context |
| Working memory | Ephemeral | In-progress data |
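One way to model this hierarchy is a layered lookup where the most ephemeral layer shadows the more persistent ones, e.g. with Python's `collections.ChainMap`. The layer contents below are invented for illustration:

```python
from collections import ChainMap

# Most ephemeral layer first: working memory shadows conversation,
# which shadows squad memory, which shadows static project config.
project_config = {"language": "python", "style": "pep8"}
squad_memory = {"owner": "platform-team"}
conversation = {"task": "fix auth bug"}
working = {"current_file": "src/auth/login.py"}

memory = ChainMap(working, conversation, squad_memory, project_config)

print(memory["task"])      # found in the conversation layer
print(memory["language"])  # falls through to project config
```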
### Communication Patterns

| Pattern | When to Use |
|---|---|
| Direct handoff | Agent A completes, passes to Agent B |
| Shared state file | Multiple agents read/write same doc |
| Message queue | Async, decoupled agents |
| Orchestrator | Central coordinator delegates tasks |
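The message-queue pattern decouples the producing agent from the consuming one. A minimal in-process sketch using Python's standard `queue` module; a real deployment would use a durable broker, and the task strings are invented:

```python
import queue
import threading

tasks = queue.Queue()
results = []

def consumer():
    # Consumer agent: drains tasks until it sees the shutdown sentinel.
    while True:
        task = tasks.get()
        if task is None:
            break
        results.append(f"done: {task}")
        tasks.task_done()

worker = threading.Thread(target=consumer)
worker.start()

# Producer agent enqueues work without knowing who will handle it.
for t in ("triage issue #12", "summarize logs"):
    tasks.put(t)
tasks.put(None)  # sentinel: no more work
worker.join()

print(results)
```

Because neither side holds a reference to the other, either agent can be replaced or scaled out without touching its counterpart.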
### Isolation Boundaries

**Good isolation:**

- Agent A → `src/auth/` only
- Agent B → `src/api/` only
- Clear boundaries

**Poor isolation:**

- All agents modify all files
- No boundaries
- Conflicts and overwrites
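Boundaries like these can be checked before an agent writes a file. A sketch; the per-agent scopes mirror the example above, everything else is an assumption:

```python
from pathlib import PurePosixPath

# Each agent may write only inside its declared scope (from the example above).
SCOPES = {
    "agent-a": "src/auth",
    "agent-b": "src/api",
}

def may_write(agent: str, path: str) -> bool:
    """True only if `path` sits inside the agent's declared directory."""
    scope = SCOPES.get(agent)
    if scope is None:
        return False  # undeclared agents get no write access
    return PurePosixPath(scope) in PurePosixPath(path).parents

print(may_write("agent-a", "src/auth/login.py"))  # True
print(may_write("agent-a", "src/api/routes.py"))  # False
```

Rejecting out-of-scope writes up front is what turns the boundary from a convention into a guarantee.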
## Quality Assurance

### Review Before Merge

Even automated PRs need review:

```
## PR Checklist
- [ ] Changes match the issue scope
- [ ] No unintended side effects
- [ ] Tests pass (or added)
- [ ] No secrets committed
- [ ] Follows project conventions
```
### Validation Gates

1. **Syntax check**: valid? Continue. Invalid? Reject.
2. **Tests**: pass? Continue. Fail? Reject.
3. **Linter**: clean? Continue. Issues? Auto-fix and retry.
4. **Review**: human approval → merge.
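The gate sequence can be expressed as a short pipeline that stops at the first failing gate. A sketch where each check is a placeholder callable; the check functions stand in for a real syntax/test/lint/review toolchain:

```python
def run_gates(change, gates):
    """Run each (name, check) gate in order; stop at the first failure.
    Returns (True, None) if every gate passes, else (False, failed_gate)."""
    for name, check in gates:
        if not check(change):
            return False, name
    return True, None

# Placeholder checks in the same order as the gates above.
gates = [
    ("syntax", lambda c: c["parses"]),
    ("tests", lambda c: c["tests_pass"]),
    ("linter", lambda c: c["lint_clean"]),
    ("review", lambda c: c["approved"]),
]

change = {"parses": True, "tests_pass": True, "lint_clean": True, "approved": False}
print(run_gates(change, gates))  # (False, 'review')
```

Ordering cheap checks first means a syntax error never burns a test run or a reviewer's time.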
### Feedback Loops

```
# After each significant agent run
squads feedback add engineering

# Track quality over time
squads feedback stats
```
## Operational Excellence

### Monitoring

Track these metrics:

| Metric | Target | Red Flag |
|---|---|---|
| Task completion rate | > 90% | < 70% |
| Token efficiency | > 80% | < 50% |
| Error rate | < 5% | > 15% |
| Avg task duration | < 10 min | > 30 min |
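The thresholds above can be checked automatically on each reporting interval. A sketch; the thresholds come from the table, the metric values and the 'warn' band between target and red flag are assumptions:

```python
# From the table: (target, red_flag, higher_is_better) per metric.
CHECKS = {
    "task_completion_rate": (0.90, 0.70, True),
    "token_efficiency": (0.80, 0.50, True),
    "error_rate": (0.05, 0.15, False),
    "avg_task_duration_min": (10, 30, False),
}

def triage(metrics: dict) -> dict:
    """Classify each metric: 'ok' past target, 'red' past the red flag,
    'warn' in between."""
    status = {}
    for name, value in metrics.items():
        target, red, higher_is_better = CHECKS[name]
        if higher_is_better:
            status[name] = "ok" if value > target else "red" if value < red else "warn"
        else:
            status[name] = "ok" if value < target else "red" if value > red else "warn"
    return status

print(triage({"task_completion_rate": 0.95, "error_rate": 0.20}))
```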
### Cost Control

```
## Budget Rules
- Max $X per agent per day
- Alert at 80% of budget
- Hard stop at 100%
- Weekly cost review
```
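The 80% alert and 100% hard stop can be a small guard wrapped around each billable call. A sketch; the daily limit, costs, and in-memory tracking are illustrative, not a real billing API:

```python
class BudgetGuard:
    """Tracks one agent's daily spend: alert once at 80% of the limit,
    refuse any charge that would cross 100%."""

    def __init__(self, daily_limit_usd: float):
        self.limit = daily_limit_usd
        self.spent = 0.0
        self.alerted = False

    def charge(self, cost_usd: float) -> None:
        if self.spent + cost_usd > self.limit:
            raise RuntimeError("daily budget exhausted: hard stop")
        self.spent += cost_usd
        if not self.alerted and self.spent >= 0.8 * self.limit:
            self.alerted = True
            print(f"ALERT: {self.spent:.2f} of {self.limit:.2f} USD used")

guard = BudgetGuard(daily_limit_usd=10.0)
guard.charge(7.0)
guard.charge(2.0)  # crosses 80%: prints the alert
```

Checking *before* adding the cost is what makes the 100% line a hard stop rather than an after-the-fact report.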
### Incident Response

When agents fail:

1. **Stop**: prevent further damage
2. **Assess**: what happened, what’s affected
3. **Fix**: resolve the immediate issue
4. **Learn**: update prompts/guardrails
5. **Document**: record in squad memory
## Anti-Patterns

### Avoid These

**Over-Engineering**

- Don’t build orchestration for 2 agents
- Don’t add caching until you have latency problems
- Don’t create abstractions for one-time tasks

**Under-Specification**

- Don’t say “make it better”
- Don’t assume agents know your preferences
- Don’t skip output format requirements

**Blind Trust**

- Don’t auto-merge without review
- Don’t skip testing “because it’s simple”
- Don’t ignore agent errors

**Scope Creep**

- Don’t let agents add features unprompted
- Don’t refactor code you didn’t touch
- Don’t improve things that aren’t broken
## Checklists

### New Agent Checklist

### Production Readiness

### Daily Operations