Deployment Options

Option             Best For                Complexity
Local              Development, testing    Low
Background agent   Long-running tasks      Low
GitHub Actions     CI/CD integration       Medium
Vertex AI          Google Cloud native     Medium
AWS Bedrock        AWS native              Medium
On-premises        Air-gapped, compliance  High

Local Deployment

Interactive Mode

Run agents directly in your terminal:
# Start interactive session
claude

# Run specific agent
squads run engineering/code-reviewer

Background Agent

Run agents as long-running processes:
# Start agent in background
nohup squads run intelligence/market-monitor --continuous &

# Check running agents
squads status

# View agent logs
tail -f ~/.agents/logs/market-monitor.log

# Stop background agent
squads stop market-monitor

Daemon Mode

# Start agent daemon
squads daemon start

# The daemon runs continuously, executing tasks on the schedules
# defined in .agents/schedules.yml

# .agents/schedules.yml
schedules:
  - agent: intelligence/market-monitor
    cron: "0 9 * * *"  # Daily at 9 AM
    timeout: 30m

  - agent: engineering/dependency-checker
    cron: "0 0 * * 0"  # Weekly on Sunday
    timeout: 1h

  - agent: customer/lead-scorer
    cron: "*/15 * * * *"  # Every 15 minutes
    timeout: 5m
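
If you script around schedules.yml, note the duration shorthand used for `timeout` ("30m", "1h"). A minimal conversion sketch, assuming s/m/h units only (check the squads docs for the exact grammar):

```python
import re

def parse_timeout(value: str) -> int:
    """Convert a schedules.yml timeout like '30m' or '1h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized timeout: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * {"s": 1, "m": 60, "h": 3600}[unit]
```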

Plan Optimization

Before deployment, optimize agent prompts and configuration:

Prompt Optimization

# Before: Verbose, unclear
You are a helpful assistant that helps with code review.
Please look at the code and tell me if there are any issues.
Be thorough and check everything.

# After: Specific, structured
## Role
Code reviewer for TypeScript/React projects.

## Task
Review the PR diff for:
1. Security vulnerabilities (injection, XSS)
2. Performance issues (N+1 queries, memory leaks)
3. Code style violations (per .eslintrc)

## Output
JSON array of findings:
{"file": "", "line": 0, "severity": "", "issue": "", "suggestion": ""}

Return empty array if no issues found.
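
Downstream tooling should verify the agent actually returned the promised shape before posting results anywhere. A small validator for the findings schema above (the field types are assumptions inferred from the example object):

```python
def validate_findings(findings: list) -> list:
    """Return a list of problems found in a code-review findings array.

    Field names mirror the output schema in the prompt above; the
    expected types are inferred from the example object.
    """
    required = {"file": str, "line": int, "severity": str,
                "issue": str, "suggestion": str}
    problems = []
    for i, finding in enumerate(findings):
        for field, ftype in required.items():
            if field not in finding:
                problems.append(f"finding {i}: missing {field!r}")
            elif not isinstance(finding[field], ftype):
                problems.append(f"finding {i}: {field!r} should be {ftype.__name__}")
    return problems
```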

Configuration Optimization

# .agents/optimization.yml
agents:
  code-reviewer:
    # Model selection
    model: claude-sonnet  # Not opus - balance cost/quality

    # Context limits
    max_context: 50000    # Prevent runaway token usage
    max_output: 5000

    # Timeouts
    timeout: 5m
    retry_count: 2

    # Caching
    cache_responses: true
    cache_ttl: 1h
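
The cache settings above can be pictured as a prompt-keyed store with expiry: identical prompts within the TTL reuse the stored response. An illustrative sketch of that behaviour, not the squads implementation:

```python
import time

class ResponseCache:
    """Prompt-keyed cache with TTL expiry (cf. cache_responses / cache_ttl)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, prompt: str):
        entry = self._store.get(prompt)
        if entry is None:
            return None
        stored_at, response = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[prompt]  # entry expired; force a fresh call
            return None
        return response

    def put(self, prompt: str, response: str):
        self._store[prompt] = (time.monotonic(), response)
```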

A/B Testing

# Test two prompt versions
squads run code-reviewer --variant=A --task="Review PR #123"
squads run code-reviewer --variant=B --task="Review PR #123"

# Compare results
squads compare --metric=quality --variants=A,B
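
Whatever `squads compare` measures internally, you can also diff variant outputs yourself. A deliberately naive sketch that scores variants by finding count; treating "more findings" as better is a stand-in for a real quality metric such as precision against known bugs:

```python
def compare_variants(result_a: dict, result_b: dict) -> str:
    """Pick a winner between two review outputs by raw finding count."""
    a = len(result_a["findings"])
    b = len(result_b["findings"])
    if a == b:
        return "tie"
    return "A" if a > b else "B"
```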

GitHub Actions

Basic Workflow

# .github/workflows/agent-review.yml
name: Agent Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run code review agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ github.token }}  # required by gh pr diff
        run: |
          squads run engineering/code-reviewer \
            --input="$(gh pr diff ${{ github.event.number }})" \
            --output=review.json

      - name: Post review comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = JSON.parse(fs.readFileSync('review.json', 'utf8'));
            if (review.findings.length > 0) {
              await github.rest.pulls.createReview({
                owner: context.repo.owner,
                repo: context.repo.repo,
                pull_number: context.issue.number,
                body: review.summary,
                event: 'COMMENT'
              });
            }

Scheduled Agent

# .github/workflows/daily-analysis.yml
name: Daily Market Analysis

on:
  schedule:
    - cron: '0 9 * * *'  # 9 AM daily
  workflow_dispatch:      # Manual trigger

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run market analysis
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          squads run intelligence/market-analyzer \
            --output=analysis.md

      - name: Create issue with results
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const analysis = fs.readFileSync('analysis.md', 'utf8');
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Market Analysis - ${new Date().toISOString().split('T')[0]}`,
              body: analysis,
              labels: ['analysis', 'automated']
            });

Matrix Deployment

# Run multiple agents in parallel
jobs:
  agents:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        agent:
          - engineering/linter
          - engineering/security-scanner
          - engineering/dependency-checker
    steps:
      - uses: actions/checkout@v4
      - name: Run ${{ matrix.agent }}
        run: squads run ${{ matrix.agent }}

Google Cloud (Vertex AI)

Setup

# Enable APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable run.googleapis.com

# Create service account
gcloud iam service-accounts create agent-runner \
  --display-name="Agent Runner"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:agent-runner@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

Cloud Run Deployment

# Dockerfile
FROM node:20-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install

COPY . .

CMD ["npx", "squads", "daemon", "start"]

# cloudbuild.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/agent-runner', '.']

  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/agent-runner']

  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'agent-runner'
      - '--image=gcr.io/$PROJECT_ID/agent-runner'
      - '--region=us-central1'
      - '--memory=2Gi'
      - '--timeout=3600'
      - '--set-secrets=ANTHROPIC_API_KEY=anthropic-key:latest'

Cloud Functions (Event-Driven)

# main.py
import base64
import json

import functions_framework
from anthropic import Anthropic

@functions_framework.cloud_event
def run_agent(cloud_event):
    """Triggered by a Pub/Sub message."""
    # Pub/Sub delivers the payload base64-encoded in message.data
    payload = json.loads(
        base64.b64decode(cloud_event.data["message"]["data"])
    )

    client = Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": payload["task"]}]
    )

    # Event-driven functions have no caller to return to; log the result
    print(response.content[0].text)

AWS Bedrock

Setup

# Configure AWS CLI
aws configure

# Model access is granted per-model in the Bedrock console; optionally
# enable invocation logging for auditability
aws bedrock put-model-invocation-logging-configuration \
  --logging-config 's3Config={bucketName=agent-logs}'

Lambda Deployment

# lambda_function.py
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def lambda_handler(event, context):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4096,
            "messages": [
                {"role": "user", "content": event['task']}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']

# template.yaml (SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  AgentFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: lambda_function.lambda_handler
      Runtime: python3.11
      Timeout: 900
      MemorySize: 1024
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - bedrock:InvokeModel
              Resource: '*'

Step Functions (Orchestration)

{
  "Comment": "Agent Squad Workflow",
  "StartAt": "Research",
  "States": {
    "Research": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:researcher",
      "Next": "Analyze"
    },
    "Analyze": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:analyst",
      "Next": "Report"
    },
    "Report": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:reporter",
      "End": true
    }
  }
}
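
Before deploying a definition, it is cheap to sanity-check the state graph in a unit test. A minimal sketch that walks a linear chain of Task states from `StartAt` via `Next` to `End`; the `workflow` dict mirrors the definition above, trimmed to its routing fields:

```python
def execution_order(definition: dict) -> list:
    """Walk a Step Functions definition from StartAt, following Next
    links until a state marks End. Handles only linear chains, which
    is all the workflow above uses."""
    order = []
    state_name = definition["StartAt"]
    while True:
        order.append(state_name)
        state = definition["States"][state_name]
        if state.get("End"):
            return order
        state_name = state["Next"]

workflow = {
    "StartAt": "Research",
    "States": {
        "Research": {"Type": "Task", "Next": "Analyze"},
        "Analyze": {"Type": "Task", "Next": "Report"},
        "Report": {"Type": "Task", "End": True},
    },
}
```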

On-Premises

Air-Gapped Deployment

For environments without internet access, all components run inside the isolated network and no external API calls are made:

  • Agents: squads CLI
  • Local LLM: Ollama / vLLM
  • Local DB: PostgreSQL

Self-Hosted LLM

# docker-compose.yml
version: '3.8'

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

  agents:
    build: .
    environment:
      - LLM_BASE_URL=http://ollama:11434
      - LLM_MODEL=llama3.1:70b
    depends_on:
      - ollama

volumes:
  ollama_data:
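
The agents service only needs HTTP to reach Ollama. A sketch of building a request against Ollama's /api/generate endpoint using the same environment variables as the compose file (error handling omitted; `stream: false` asks for a single JSON response):

```python
import json
import os
import urllib.request

def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build a POST to Ollama's /api/generate using the compose env vars."""
    base_url = os.environ.get("LLM_BASE_URL", "http://ollama:11434")
    model = os.environ.get("LLM_MODEL", "llama3.1:70b")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
    )

# Inside the network:
#   with urllib.request.urlopen(build_generate_request("...")) as resp:
#       print(json.loads(resp.read())["response"])
```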

Kubernetes Deployment

# k8s/agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-squad
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent-squad
  template:
    metadata:
      labels:
        app: agent-squad
    spec:
      containers:
        - name: agent
          image: agents-squads/runner:latest
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"

Best Practices

  • Start local, graduate to cloud as needs grow
  • Use secrets management (never hardcode keys)
  • Set timeouts to prevent runaway costs
  • Enable logging and monitoring
  • Test in staging before production
  • Use infrastructure-as-code
  • Implement health checks

Common deployment mistakes:
  • Hardcoded API keys in code/config
  • Missing timeouts (agents run forever)
  • No error handling or retries
  • Insufficient logging
  • Over-provisioned resources (wasted cost)
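
Two of the mistakes above, missing timeouts and missing retries, can be addressed with a small wrapper around whatever invokes your agent. A sketch; `task` is a placeholder for your own invocation function, which is assumed to accept a timeout in seconds:

```python
import time

def run_with_retries(task, retries=2, timeout_s=300, backoff_s=5):
    """Call task(timeout_s), retrying on failure with a fixed backoff.

    Guards against runaway agents (timeout is passed through to the
    invocation) and transient failures (bounded retries).
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return task(timeout_s)
        except Exception as error:
            last_error = error
            if attempt < retries:
                time.sleep(backoff_s)  # fixed backoff; exponential also works
    raise RuntimeError(f"agent failed after {retries + 1} attempts") from last_error
```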