Deployment Options
| Option | Best For | Complexity |
|---|---|---|
| Local | Development, testing | Low |
| Background agent | Long-running tasks | Low |
| GitHub Actions | CI/CD integration | Medium |
| Vertex AI | Google Cloud native | Medium |
| AWS Bedrock | AWS native | Medium |
| On-premises | Air-gapped, compliance | High |
Local Deployment
Interactive Mode
Run agents directly in your terminal with Claude Code, Gemini CLI, or OpenCode.

Claude Code:

# Start interactive session
claude

# Run specific agent
squads run engineering/code-reviewer

Gemini CLI:

# Start interactive session
gemini

# Run with specific task
gemini "Review the code in src/"

OpenCode:

# Start interactive session
opencode

# Run agent
opencode run code-reviewer
Background Agent
Run agents as long-running processes:
# Start agent in background
nohup squads run intelligence/market-monitor --continuous &
# Check running agents
squads status
# View agent logs
tail -f ~/.agents/logs/market-monitor.log
# Stop background agent
squads stop market-monitor
Daemon Mode
# Start agent daemon
squads daemon start
# Agent daemon runs continuously, executing scheduled tasks
# Configure schedules in .agents/schedules.yml
# .agents/schedules.yml
schedules:
  - agent: intelligence/market-monitor
    cron: "0 9 * * *"      # Daily at 9 AM
    timeout: 30m
  - agent: engineering/dependency-checker
    cron: "0 0 * * 0"      # Weekly on Sunday
    timeout: 1h
  - agent: customer/lead-scorer
    cron: "*/15 * * * *"   # Every 15 minutes
    timeout: 5m
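Conceptually, the daemon reads this file, waits for each cron expression to fire, and invokes the agent with the configured timeout. A rough sketch of such a loop, for illustration only (it assumes PyYAML and the croniter package, and shells out to the `squads` CLI shown above; the real daemon's internals may differ):

```python
# scheduler_sketch.py - illustrative scheduling loop, not the actual daemon
import subprocess
import time
from datetime import datetime

import yaml                    # pip install pyyaml
from croniter import croniter  # pip install croniter

def parse_timeout(value: str) -> int:
    """Convert '30m' / '1h' / '5m' style timeouts to seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(value[:-1]) * units[value[-1]]

with open(".agents/schedules.yml") as f:
    schedules = yaml.safe_load(f)["schedules"]

# Track the next fire time for every scheduled agent
next_runs = {
    s["agent"]: croniter(s["cron"], datetime.now()).get_next(datetime)
    for s in schedules
}

while True:
    now = datetime.now()
    for entry in schedules:
        agent = entry["agent"]
        if now >= next_runs[agent]:
            try:
                # Invoke the agent and enforce the configured timeout
                subprocess.run(
                    ["squads", "run", agent],
                    timeout=parse_timeout(entry["timeout"]),
                )
            except subprocess.TimeoutExpired:
                print(f"{agent} exceeded its timeout")
            next_runs[agent] = croniter(entry["cron"], now).get_next(datetime)
    time.sleep(30)
```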
Plan Optimization
Before deployment, optimize agent prompts and configuration.
Prompt Optimization
# Before: Verbose, unclear
You are a helpful assistant that helps with code review.
Please look at the code and tell me if there are any issues.
Be thorough and check everything.
# After: Specific, structured
## Role
Code reviewer for TypeScript/React projects.
## Task
Review the PR diff for:
1. Security vulnerabilities (injection, XSS)
2. Performance issues (N+1 queries, memory leaks)
3. Code style violations (per .eslintrc)
## Output
JSON array of findings:
{"file": "", "line": 0, "severity": "", "issue": "", "suggestion": ""}
Return empty array if no issues found.
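Structured output like this is easy to consume downstream; for example, a CI step can parse the findings array and fail the build on serious issues. A minimal sketch (the `review.json` filename and the severity values are assumptions, not part of the agent contract):

```python
# check_findings.py - fail CI when the reviewer reports serious issues
import json
import sys

with open("review.json") as f:
    findings = json.load(f)   # expects the JSON array format shown above

serious = [item for item in findings if item.get("severity") in ("high", "critical")]

for finding in serious:
    print(f'{finding["file"]}:{finding["line"]} [{finding["severity"]}] {finding["issue"]}')

# Non-zero exit fails the pipeline; an empty array means a clean pass
sys.exit(1 if serious else 0)
```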
Configuration Optimization
# .agents/optimization.yml
agents:
  code-reviewer:
    # Model selection
    model: claude-sonnet     # Not opus - balance cost/quality

    # Context limits
    max_context: 50000       # Prevent runaway token usage
    max_output: 5000

    # Timeouts
    timeout: 5m
    retry_count: 2

    # Caching
    cache_responses: true
    cache_ttl: 1h
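cache_responses and cache_ttl trade freshness for cost: identical requests within the TTL reuse the stored answer instead of calling the model again. A toy illustration of the idea (in-memory only; the CLI's actual cache behaviour may differ):

```python
# cache_sketch.py - illustrate TTL-based response caching
import hashlib
import time

CACHE_TTL = 3600  # seconds; 1h, matching cache_ttl above
_cache: dict[str, tuple[float, str]] = {}

def cached_completion(prompt: str, call_model) -> str:
    """Return a cached response if the same prompt was answered within the TTL."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    hit = _cache.get(key)
    if hit and time.time() - hit[0] < CACHE_TTL:
        return hit[1]                     # cache hit: no model call, no cost
    response = call_model(prompt)         # cache miss: pay for one model call
    _cache[key] = (time.time(), response)
    return response
```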
A/B Testing
# Test two prompt versions
squads run code-reviewer --variant=A --task="Review PR #123"
squads run code-reviewer --variant=B --task="Review PR #123"
# Compare results
squads compare --metric=quality --variants=A,B
GitHub Actions
Basic Workflow
# .github/workflows/agent-review.yml
name: Agent Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run code review agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ github.token }}  # required for `gh pr diff`
        run: |
          squads run engineering/code-reviewer \
            --input="$(gh pr diff ${{ github.event.number }})" \
            --output=review.json

      - name: Post review comment
        uses: actions/github-script@v7
        with:
          script: |
            const review = require('./review.json');
            if (review.findings.length > 0) {
              await github.rest.pulls.createReview({
                owner: context.repo.owner,
                repo: context.repo.repo,
                pull_number: context.issue.number,
                body: review.summary,
                event: 'COMMENT'
              });
            }
Scheduled Agent
# .github/workflows/daily-analysis.yml
name: Daily Market Analysis

on:
  schedule:
    - cron: '0 9 * * *'  # 9 AM daily
  workflow_dispatch:     # Manual trigger

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run market analysis
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          squads run intelligence/market-analyzer \
            --output=analysis.md

      - name: Create issue with results
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const analysis = fs.readFileSync('analysis.md', 'utf8');
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Market Analysis - ${new Date().toISOString().split('T')[0]}`,
              body: analysis,
              labels: ['analysis', 'automated']
            });
Matrix Deployment
# Run multiple agents in parallel
jobs:
  agents:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        agent:
          - engineering/linter
          - engineering/security-scanner
          - engineering/dependency-checker
    steps:
      - uses: actions/checkout@v4
      - name: Run ${{ matrix.agent }}
        run: squads run ${{ matrix.agent }}
Google Cloud (Vertex AI)
Setup
# Enable APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable run.googleapis.com
# Create service account
gcloud iam service-accounts create agent-runner \
--display-name="Agent Runner"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:agent-runner@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
Cloud Run Deployment
# Dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["npx", "squads", "daemon", "start"]
# cloudbuild.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/agent-runner', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/agent-runner']
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'agent-runner'
      - '--image=gcr.io/$PROJECT_ID/agent-runner'
      - '--region=us-central1'
      - '--memory=2Gi'
      - '--timeout=3600'
      - '--set-secrets=ANTHROPIC_API_KEY=anthropic-key:latest'
Cloud Functions (Event-Driven)
# main.py
import base64
import json

import functions_framework
from anthropic import Anthropic

@functions_framework.cloud_event
def run_agent(cloud_event):
    """Triggered by a Pub/Sub message carrying a JSON task payload."""
    # Pub/Sub delivers the payload base64-encoded under message.data
    payload = json.loads(base64.b64decode(cloud_event.data["message"]["data"]))

    client = Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": payload["task"]}],
    )
    return response.content[0].text
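Because the function is wired to a Pub/Sub topic, triggering an agent is just publishing a message whose payload carries the task. A sketch of the publishing side (the topic name agent-tasks and the project ID are assumptions):

```python
# publish_task.py - send a task to the agent function (pip install google-cloud-pubsub)
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "agent-tasks")  # placeholders

# The function base64-decodes message.data, parses the JSON, and forwards payload["task"]
future = publisher.publish(
    topic_path,
    json.dumps({"task": "Summarize today's signups"}).encode("utf-8"),
)
print(f"Published message {future.result()}")
```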
AWS Bedrock
Setup
# Configure AWS CLI
aws configure
# Enable Bedrock model access (console or CLI)
aws bedrock put-model-invocation-logging-configuration \
--logging-config s3Config={bucketName=agent-logs}
Lambda Deployment
# lambda_function.py
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def lambda_handler(event, context):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4096,
            "messages": [
                {"role": "user", "content": event['task']}
            ]
        })
    )
    result = json.loads(response['body'].read())
    return result['content'][0]['text']
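The handler expects the task directly in the invocation event, so callers pass it in the payload. A sketch of invoking the deployed function with boto3 (the function name is a placeholder for whatever you deploy it as):

```python
# invoke_agent.py - call the deployed Lambda with a task payload
import json

import boto3

lambda_client = boto3.client("lambda")

response = lambda_client.invoke(
    FunctionName="AgentFunction",  # placeholder name
    Payload=json.dumps({"task": "Triage open support tickets"}).encode("utf-8"),
)

# lambda_handler returns the model's text, serialized as JSON
print(json.loads(response["Payload"].read()))
```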
Copy
# template.yaml (SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
AgentFunction:
Type: AWS::Serverless::Function
Properties:
Handler: lambda_function.lambda_handler
Runtime: python3.11
Timeout: 900
MemorySize: 1024
Policies:
- Statement:
- Effect: Allow
Action:
- bedrock:InvokeModel
Resource: '*'
Step Functions (Orchestration)
{
  "Comment": "Agent Squad Workflow",
  "StartAt": "Research",
  "States": {
    "Research": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:researcher",
      "Next": "Analyze"
    },
    "Analyze": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:analyst",
      "Next": "Report"
    },
    "Report": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:reporter",
      "End": true
    }
  }
}
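Each state passes its output as the next state's input, so the researcher's result flows into the analyst and then the reporter. Kicking off the chain is a single API call; a sketch with boto3 (the state machine ARN is a placeholder):

```python
# start_workflow.py - launch the Research -> Analyze -> Report chain
import json

import boto3

sfn = boto3.client("stepfunctions")

execution = sfn.start_execution(
    stateMachineArn="arn:aws:states:us-east-1:123:stateMachine:agent-squad",  # placeholder
    input=json.dumps({"task": "Assess the competitive landscape for Q3"}),
)

# Poll describe_execution(executionArn=...) for status and final output
print(execution["executionArn"])
```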
On-Premises
Air-Gapped Deployment
For environments without internet access, every component runs inside the isolated network and no external API calls are made:
- Agents: squads CLI
- Local LLM: Ollama / vLLM
- Local DB: PostgreSQL
Self-Hosted LLM
# docker-compose.yml
version: '3.8'
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

  agents:
    build: .
    environment:
      - LLM_BASE_URL=http://ollama:11434
      - LLM_MODEL=llama3.1:70b
    depends_on:
      - ollama

volumes:
  ollama_data:
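Inside the agents container, the runner reaches the local model through LLM_BASE_URL rather than any hosted API. A sketch of that call against Ollama's chat endpoint (the model name matches LLM_MODEL above; the runner's actual client code may differ):

```python
# local_llm_sketch.py - call the self-hosted model over the Ollama HTTP API
import os

import requests

base_url = os.environ.get("LLM_BASE_URL", "http://ollama:11434")
model = os.environ.get("LLM_MODEL", "llama3.1:70b")

resp = requests.post(
    f"{base_url}/api/chat",
    json={
        "model": model,
        "messages": [{"role": "user", "content": "Review this deployment plan"}],
        "stream": False,  # return a single JSON response instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```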
Kubernetes Deployment
# k8s/agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-squad
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent-squad
  template:
    metadata:
      labels:
        app: agent-squad
    spec:
      containers:
        - name: agent
          image: agents-squads/runner:latest
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
Best Practices
- Start local, graduate to cloud as needs grow
- Use secrets management (never hardcode keys)
- Set timeouts to prevent runaway costs
- Enable logging and monitoring
- Test in staging before production
- Use infrastructure-as-code
- Implement health checks (see the sketch below)
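For the Kubernetes deployment above, a health check can be as small as an HTTP endpoint the kubelet probes. A minimal sketch (the port and path are arbitrary choices; wire them to a livenessProbe in the Deployment):

```python
# healthz.py - tiny liveness endpoint for the agent container
from http.server import BaseHTTPRequestHandler, HTTPServer

class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # Point the container's livenessProbe at http://<pod>:8080/healthz
    HTTPServer(("", 8080), Health).serve_forever()
```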
Common deployment mistakes:
- Hardcoded API keys in code/config
- Missing timeouts (agents run forever)
- No error handling or retries
- Insufficient logging
- Over-provisioned resources (wasted cost)