Design Principles

Start Simple, Scale Deliberately

  • Begin with single-agent solutions
  • Add complexity only when measurements prove it helps
  • Avoid premature framework adoption
  • Validate that each addition delivers value

  1. Single Agent - Start simple
  2. Add Memory - Persistence when needed
  3. Add Tools - Capabilities when needed
  4. Multiple Agents - Scale when needed
  5. Squads - Organize when needed

Only progress to the next stage when the current one is insufficient.

Optimize for Trust, Not Capability

The bottleneck is rarely what agents can do; it is whether humans trust what they do.
  • Transparency - Show reasoning, not just results
  • Auditability - Log decisions and actions
  • Predictability - Consistent behavior patterns
  • Reversibility - Easy to undo mistakes

Agent Design

Prompt Engineering

Do:
# Clear objective
Analyze the API response times and identify bottlenecks.

# Specific constraints
- Focus on p95 latency
- Ignore requests < 100ms
- Output max 5 recommendations

# Structured output
Return JSON: {"bottlenecks": [...], "recommendations": [...]}

Don’t:
Look at the performance stuff and let me know what you think
we should do to make things better. Be thorough!
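
Because the "Do" prompt fixes the output shape, the reply can be checked in code before anything acts on it. The following is a minimal sketch, assuming the agent's reply arrives as a plain string; `parse_agent_reply` and `MAX_RECOMMENDATIONS` are hypothetical names introduced here for illustration.

```python
import json

MAX_RECOMMENDATIONS = 5  # mirrors "Output max 5 recommendations" in the prompt


def parse_agent_reply(raw: str) -> dict:
    """Parse the reply and reject anything that ignores the requested format."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ("bottlenecks", "recommendations"):
        if not isinstance(data.get(key), list):
            raise ValueError(f"missing or non-list field: {key}")
    if len(data["recommendations"]) > MAX_RECOMMENDATIONS:
        raise ValueError("too many recommendations")
    return data
```

A vague prompt like the "Don't" example gives no such contract, so there is nothing to validate against.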

Single Responsibility

Each agent should do one thing well:
# Bad: Swiss Army Knife Agent
- Researches competitors
- Writes blog posts
- Deploys infrastructure
- Manages database

# Good: Focused Agents
- competitor-researcher → Research output
- content-writer → Blog posts
- deploy-agent → Infrastructure
- db-admin → Database

Fail Gracefully

## Error Handling

If you encounter an error:
1. Log the specific error with context
2. Attempt one retry with adjusted parameters
3. If still failing, report clearly and stop
4. Never silently swallow errors
5. Never retry indefinitely
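
The same rules can also be enforced in the code that wraps an agent call. This is a rough sketch under stated assumptions: `run_with_one_retry` is a hypothetical helper, `task` is any callable that takes a parameter dict, and `adjusted` holds the overrides applied on the single retry.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("agent")


def run_with_one_retry(task: Callable[[dict], T], params: dict, adjusted: dict) -> T:
    """Run task(params); on failure, retry exactly once with adjusted params."""
    try:
        return task(params)
    except Exception as first_error:
        # Rule 1: log the specific error with context
        log.warning("first attempt failed with params=%r: %r", params, first_error)
    try:
        # Rule 2: one retry with adjusted parameters
        return task({**params, **adjusted})
    except Exception as second_error:
        # Rules 3-5: report clearly and stop; nothing is swallowed or retried again
        log.error("retry failed: %r", second_error)
        raise
```

Re-raising the second error keeps the failure visible to the caller instead of hiding it behind a default value.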

System Architecture

Memory Hierarchy

| Layer | Type | Purpose |
|---|---|---|
| Project config | Static | Project knowledge (CLAUDE.md) |
| Squad memory | Persistent | Cross-session state |
| Conversation | Session | Current task context |
| Working memory | Ephemeral | In-progress data |

Communication Patterns

| Pattern | When to Use |
|---|---|
| Direct handoff | Agent A completes, passes to Agent B |
| Shared state file | Multiple agents read/write same doc |
| Message queue | Async, decoupled agents |
| Orchestrator | Central coordinator delegates tasks |
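
As an illustration of the orchestrator pattern, here is a minimal sketch in which a central function delegates tasks to named agents. It assumes each agent can be modeled as a plain function from a string payload to a string result; `orchestrate` and the `Agent` alias are hypothetical and not part of any tool described here.

```python
from typing import Callable, Dict, List, Tuple

Agent = Callable[[str], str]  # assumption: an agent maps a payload to a result


def orchestrate(tasks: List[Tuple[str, str]], agents: Dict[str, Agent]) -> List[str]:
    """Delegate each (agent_name, payload) task to the named agent, in order."""
    results = []
    for name, payload in tasks:
        if name not in agents:
            # Fail loudly instead of silently dropping the task
            raise KeyError(f"no agent registered for {name!r}")
        results.append(agents[name](payload))
    return results
```

With only two agents, a direct handoff is usually simpler than a coordinator like this.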

Isolation Boundaries

Good Isolation

  • Agent A → src/auth/ only
  • Agent B → src/api/ only
  • Clear boundaries

Poor Isolation

  • All agents modify all files
  • No boundaries
  • Conflicts and overwrites
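
One way to make such boundaries enforceable is to check every write path against the directory an agent owns. The sketch below assumes repository-relative POSIX paths; the `ALLOWED_ROOTS` map and the `may_write` helper are hypothetical names used only for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical ownership map: each agent may write only under its own directory.
ALLOWED_ROOTS = {
    "agent-a": PurePosixPath("src/auth"),
    "agent-b": PurePosixPath("src/api"),
}


def may_write(agent: str, path: str) -> bool:
    """Return True if `path` lies inside the directory assigned to `agent`."""
    root = ALLOWED_ROOTS.get(agent)
    if root is None:
        return False  # unknown agents get no write access
    target = PurePosixPath(path)
    if ".." in target.parts:
        return False  # refuse paths that could climb out of the root
    return target == root or root in target.parents
```

Comparing path components rather than string prefixes keeps `src/authz/` from being mistaken for `src/auth/`.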

Quality Assurance

Review Before Merge

Even automated PRs need review:
## PR Checklist
- [ ] Changes match the issue scope
- [ ] No unintended side effects
- [ ] Tests pass (new tests added where needed)
- [ ] No secrets committed
- [ ] Follows project conventions

Validation Gates

  1. Syntax check - Valid? Continue. Invalid? Reject.
  2. Tests - Pass? Continue. Fail? Reject.
  3. Linter - Clean? Continue. Issues? Auto-fix and retry.
  4. Review - Human approval → Merge
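
The first three gates can be chained in code, with the fourth left to a person. This sketch makes two assumptions that the list above does not spell out: "retry" means re-running every gate after a single auto-fix pass, and a change that is still not lint-clean after that pass is rejected. All names (`run_gates`, `syntax_ok`, `tests_pass`, `lint_clean`, `auto_fix`) are hypothetical.

```python
from typing import Callable, Tuple

Check = Callable[[str], bool]


def run_gates(change: str, syntax_ok: Check, tests_pass: Check,
              lint_clean: Check, auto_fix: Callable[[str], str]) -> Tuple[str, str]:
    """Return (status, change); status is 'rejected' or 'needs_review'."""
    fixed_once = False
    while True:
        if not syntax_ok(change):   # gate 1
            return "rejected", change
        if not tests_pass(change):  # gate 2
            return "rejected", change
        if lint_clean(change):      # gate 3
            # Gate 4 is a human decision, so automation stops here.
            return "needs_review", change
        if fixed_once:
            return "rejected", change  # assumption: only one auto-fix attempt
        change = auto_fix(change)
        fixed_once = True
```

Re-running the earlier gates after the auto-fix guards against a formatter change that breaks syntax or tests.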

Feedback Loops

# After each significant agent run
squads feedback add engineering

# Track quality over time
squads feedback stats

Operational Excellence

Monitoring

Track these metrics:
| Metric | Target | Red Flag |
|---|---|---|
| Task completion rate | > 90% | < 70% |
| Token efficiency | > 80% | < 50% |
| Error rate | < 5% | > 15% |
| Avg task duration | < 10 min | > 30 min |
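
These thresholds can be turned into a simple status check. In this sketch the metric keys and the middle "watch" band (readings between the target and the red flag) are assumptions added for illustration; only the numeric thresholds come from the table.

```python
# (target, red_flag, higher_is_better), with thresholds taken from the table above
THRESHOLDS = {
    "task_completion_rate": (0.90, 0.70, True),
    "token_efficiency": (0.80, 0.50, True),
    "error_rate": (0.05, 0.15, False),
    "avg_task_duration_min": (10.0, 30.0, False),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'watch', or 'red_flag' for one metric reading."""
    target, red, higher_is_better = THRESHOLDS[metric]
    if not higher_is_better:
        # Flip signs so the comparisons below always mean "higher is better".
        value, target, red = -value, -target, -red
    if value > target:
        return "ok"
    if value < red:
        return "red_flag"
    return "watch"
```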

Cost Control

## Budget Rules
- Max $X per agent per day
- Alert at 80% of budget
- Hard stop at 100%
- Weekly cost review
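
The alert and hard-stop rules could be enforced with a small per-agent tracker. This is a sketch, not a description of any built-in feature: `DailyBudget` and `BudgetExceeded` are hypothetical, and the daily reset and the weekly review are left out.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the daily limit."""


class DailyBudget:
    def __init__(self, limit_usd: float, alert_ratio: float = 0.8):
        self.limit_usd = limit_usd
        self.alert_ratio = alert_ratio  # "Alert at 80% of budget"
        self.spent_usd = 0.0
        self.alerted = False

    def charge(self, cost_usd: float) -> bool:
        """Record a cost. Returns True the first time the alert threshold is reached."""
        if self.spent_usd + cost_usd > self.limit_usd:
            # "Hard stop at 100%": refuse the charge instead of overspending
            raise BudgetExceeded(f"limit of ${self.limit_usd:.2f} would be exceeded")
        self.spent_usd += cost_usd
        if not self.alerted and self.spent_usd >= self.alert_ratio * self.limit_usd:
            self.alerted = True
            return True
        return False
```

Checking the limit before recording the charge means the agent is stopped before the overspend happens rather than after.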

Incident Response

When agents fail:
  1. Stop - Prevent further damage
  2. Assess - What happened, what’s affected
  3. Fix - Resolve immediate issue
  4. Learn - Update prompts/guardrails
  5. Document - Record in squad memory

Anti-Patterns

Avoid These

Over-Engineering
  • Don’t build orchestration for 2 agents
  • Don’t add caching until you have latency problems
  • Don’t create abstractions for one-time tasks
Under-Specification
  • Don’t say “make it better”
  • Don’t assume agents know your preferences
  • Don’t skip output format requirements
Blind Trust
  • Don’t auto-merge without review
  • Don’t skip testing “because it’s simple”
  • Don’t ignore agent errors
Scope Creep
  • Don’t let agents add features unprompted
  • Don’t let agents refactor code outside the task’s scope
  • Don’t let agents improve things that aren’t broken

Checklists

New Agent Checklist

  • Clear, single-purpose objective
  • Specific constraints and boundaries
  • Defined output format
  • Error handling instructions
  • Anti-slop rules included
  • Tested on representative inputs

Production Readiness

  • Monitoring in place
  • Budget limits configured
  • Error alerting enabled
  • Rollback plan documented
  • Feedback loop established
  • Review process defined

Daily Operations

  • Check squads status for issues
  • Review any failed tasks
  • Monitor cost trends
  • Update memory with learnings
  • Clear completed todos