Deployment Options

Option             Best For                Complexity
Local              Development, testing    Low
Background agent   Long-running tasks      Low
GitHub Actions     CI/CD integration       Medium
Vertex AI          Google Cloud native     Medium
AWS Bedrock        AWS native              Medium
On-premises        Air-gapped, compliance  High

Local Deployment

Interactive Mode

Run agents directly in your terminal:
# Start interactive session
claude

# Run specific agent
squads run engineering/code-reviewer

Background Agent

Run agents as long-running processes:
# Start agent in background
nohup squads run intelligence/market-monitor --continuous &

# Check running agents
squads status

# View agent logs
tail -f ~/.agents/logs/market-monitor.log

# Stop background agent
squads stop market-monitor

Daemon Mode

# Start agent daemon
squads daemon start

# The daemon runs continuously, executing tasks on the schedules
# defined in .agents/schedules.yml

# .agents/schedules.yml
schedules:
  - agent: intelligence/market-monitor
    cron: "0 9 * * *"  # Daily at 9 AM
    timeout: 30m

  - agent: engineering/dependency-checker
    cron: "0 0 * * 0"  # Weekly on Sunday
    timeout: 1h

  - agent: customer/lead-scorer
    cron: "*/15 * * * *"  # Every 15 minutes
    timeout: 5m
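
If you script around schedules.yml, note the duration shorthand used for `timeout` ("30m", "1h"). A minimal conversion sketch, assuming s/m/h units only (check the squads docs for the exact grammar):

```python
import re

def parse_timeout(value: str) -> int:
    """Convert a schedules.yml timeout like '30m' or '1h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized timeout: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * {"s": 1, "m": 60, "h": 3600}[unit]
```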

Plan Optimization

Before deployment, optimize agent prompts and configuration:

Prompt Optimization

# Before: Verbose, unclear
You are a helpful assistant that helps with code review.
Please look at the code and tell me if there are any issues.
Be thorough and check everything.

# After: Specific, structured
## Role
Code reviewer for TypeScript/React projects.

## Task
Review the PR diff for:
1. Security vulnerabilities (injection, XSS)
2. Performance issues (N+1 queries, memory leaks)
3. Code style violations (per .eslintrc)

## Output
JSON array of findings:
{"file": "", "line": 0, "severity": "", "issue": "", "suggestion": ""}

Return empty array if no issues found.
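
Downstream tooling should verify the agent actually returned the promised shape before posting results anywhere. A small validator for the findings schema above (the field types are assumptions inferred from the example object):

```python
def validate_findings(findings: list) -> list:
    """Return a list of problems found in a code-review findings array.

    Field names mirror the output schema in the prompt above; the
    expected types are inferred from the example object.
    """
    required = {"file": str, "line": int, "severity": str,
                "issue": str, "suggestion": str}
    problems = []
    for i, finding in enumerate(findings):
        for field, ftype in required.items():
            if field not in finding:
                problems.append(f"finding {i}: missing {field!r}")
            elif not isinstance(finding[field], ftype):
                problems.append(f"finding {i}: {field!r} should be {ftype.__name__}")
    return problems
```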

Configuration Optimization

# .agents/optimization.yml
agents:
  code-reviewer:
    # Model selection
    model: claude-sonnet  # Not opus - balance cost/quality

    # Context limits
    max_context: 50000    # Prevent runaway token usage
    max_output: 5000

    # Timeouts
    timeout: 5m
    retry_count: 2

    # Caching
    cache_responses: true
    cache_ttl: 1h
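
The cache settings above can be pictured as a prompt-keyed store with expiry: identical prompts within the TTL reuse the stored response. An illustrative sketch of that behaviour, not the squads implementation:

```python
import time

class ResponseCache:
    """Prompt-keyed cache with TTL expiry (cf. cache_responses / cache_ttl)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, prompt: str):
        entry = self._store.get(prompt)
        if entry is None:
            return None
        stored_at, response = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[prompt]  # entry expired; force a fresh call
            return None
        return response

    def put(self, prompt: str, response: str):
        self._store[prompt] = (time.monotonic(), response)
```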

A/B Testing

# Test two prompt versions
squads run code-reviewer --variant=A --task="Review PR #123"
squads run code-reviewer --variant=B --task="Review PR #123"

# Compare results
squads compare --metric=quality --variants=A,B
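
Whatever `squads compare` measures internally, you can also diff variant outputs yourself. A deliberately naive sketch that scores variants by finding count; treating "more findings" as better is a stand-in for a real quality metric such as precision against known bugs:

```python
def compare_variants(result_a: dict, result_b: dict) -> str:
    """Pick a winner between two review outputs by raw finding count."""
    a = len(result_a["findings"])
    b = len(result_b["findings"])
    if a == b:
        return "tie"
    return "A" if a > b else "B"
```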

GitHub Actions

Basic Workflow

# .github/workflows/agent-review.yml
name: Agent Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run code review agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ github.token }}  # required by gh pr diff
        run: |
          squads run engineering/code-reviewer \
            --input="$(gh pr diff ${{ github.event.number }})" \
            --output=review.json

      - name: Post review comment
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const review = JSON.parse(fs.readFileSync('review.json', 'utf8'));
            if (review.findings.length > 0) {
              await github.rest.pulls.createReview({
                owner: context.repo.owner,
                repo: context.repo.repo,
                pull_number: context.issue.number,
                body: review.summary,
                event: 'COMMENT'
              });
            }

Scheduled Agent

# .github/workflows/daily-analysis.yml
name: Daily Market Analysis

on:
  schedule:
    - cron: '0 9 * * *'  # 9 AM daily
  workflow_dispatch:      # Manual trigger

jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install squads CLI
        run: npm install -g @agents-squads/cli

      - name: Run market analysis
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          squads run intelligence/market-analyzer \
            --output=analysis.md

      - name: Create issue with results
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const analysis = fs.readFileSync('analysis.md', 'utf8');
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Market Analysis - ${new Date().toISOString().split('T')[0]}`,
              body: analysis,
              labels: ['analysis', 'automated']
            });

Matrix Deployment

# Run multiple agents in parallel
jobs:
  agents:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        agent:
          - engineering/linter
          - engineering/security-scanner
          - engineering/dependency-checker
    steps:
      - uses: actions/checkout@v4
      - name: Run ${{ matrix.agent }}
        run: squads run ${{ matrix.agent }}

Google Cloud (Vertex AI)

Setup

# Enable APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable run.googleapis.com

# Create service account
gcloud iam service-accounts create agent-runner \
  --display-name="Agent Runner"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:agent-runner@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

Cloud Run Deployment

# Dockerfile
FROM node:20-alpine

WORKDIR /app
COPY package*.json ./
RUN npm install

COPY . .

CMD ["npx", "squads", "daemon", "start"]

# cloudbuild.yaml
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/agent-runner', '.']

  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/agent-runner']

  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'agent-runner'
      - '--image=gcr.io/$PROJECT_ID/agent-runner'
      - '--region=us-central1'
      - '--memory=2Gi'
      - '--timeout=3600'
      - '--set-secrets=ANTHROPIC_API_KEY=anthropic-key:latest'

Cloud Functions (Event-Driven)

# main.py
import base64
import json

import functions_framework
from anthropic import Anthropic

@functions_framework.cloud_event
def run_agent(cloud_event):
    """Triggered by a Pub/Sub message."""
    # Pub/Sub delivers the payload base64-encoded in message.data
    payload = json.loads(
        base64.b64decode(cloud_event.data["message"]["data"])
    )

    client = Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=4096,
        messages=[{"role": "user", "content": payload["task"]}]
    )

    # Event-driven functions have no caller to return to; log the result
    print(response.content[0].text)

AWS Bedrock

Setup

# Configure AWS CLI
aws configure

# Model access is granted per-model in the Bedrock console; optionally
# enable invocation logging for auditability
aws bedrock put-model-invocation-logging-configuration \
  --logging-config 's3Config={bucketName=agent-logs}'

Lambda Deployment

# lambda_function.py
import boto3
import json

bedrock = boto3.client('bedrock-runtime')

def lambda_handler(event, context):
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-sonnet-20240229-v1:0',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 4096,
            "messages": [
                {"role": "user", "content": event['task']}
            ]
        })
    )

    result = json.loads(response['body'].read())
    return result['content'][0]['text']

# template.yaml (SAM)
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Resources:
  AgentFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: lambda_function.lambda_handler
      Runtime: python3.11
      Timeout: 900
      MemorySize: 1024
      Policies:
        - Statement:
            - Effect: Allow
              Action:
                - bedrock:InvokeModel
              Resource: '*'

Step Functions (Orchestration)

{
  "Comment": "Agent Squad Workflow",
  "StartAt": "Research",
  "States": {
    "Research": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:researcher",
      "Next": "Analyze"
    },
    "Analyze": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:analyst",
      "Next": "Report"
    },
    "Report": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:reporter",
      "End": true
    }
  }
}
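
Before deploying a definition, it is cheap to sanity-check the state graph in a unit test. A minimal sketch that walks a linear chain of Task states from `StartAt` via `Next` to `End`; the `workflow` dict mirrors the definition above, trimmed to its routing fields:

```python
def execution_order(definition: dict) -> list:
    """Walk a Step Functions definition from StartAt, following Next
    links until a state marks End. Handles only linear chains, which
    is all the workflow above uses."""
    order = []
    state_name = definition["StartAt"]
    while True:
        order.append(state_name)
        state = definition["States"][state_name]
        if state.get("End"):
            return order
        state_name = state["Next"]

workflow = {
    "StartAt": "Research",
    "States": {
        "Research": {"Type": "Task", "Next": "Analyze"},
        "Analyze": {"Type": "Task", "Next": "Report"},
        "Report": {"Type": "Task", "End": True},
    },
}
```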

On-Premises

Air-Gapped Deployment

For environments without internet access, all components run inside the isolated network and no external API calls are made:

  • Agents: squads CLI
  • Local LLM: Ollama / vLLM
  • Local DB: PostgreSQL

Self-Hosted LLM

# docker-compose.yml
version: '3.8'

services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

  agents:
    build: .
    environment:
      - LLM_BASE_URL=http://ollama:11434
      - LLM_MODEL=llama3.1:70b
    depends_on:
      - ollama

volumes:
  ollama_data:
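
The agents service only needs HTTP to reach Ollama. A sketch of building a request against Ollama's /api/generate endpoint using the same environment variables as the compose file (error handling omitted; `stream: false` asks for a single JSON response):

```python
import json
import os
import urllib.request

def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build a POST to Ollama's /api/generate using the compose env vars."""
    base_url = os.environ.get("LLM_BASE_URL", "http://ollama:11434")
    model = os.environ.get("LLM_MODEL", "llama3.1:70b")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
    )

# Inside the network:
#   with urllib.request.urlopen(build_generate_request("...")) as resp:
#       print(json.loads(resp.read())["response"])
```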

Kubernetes Deployment

# k8s/agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-squad
spec:
  replicas: 3
  selector:
    matchLabels:
      app: agent-squad
  template:
    metadata:
      labels:
        app: agent-squad
    spec:
      containers:
        - name: agent
          image: agents-squads/runner:latest
          env:
            - name: ANTHROPIC_API_KEY
              valueFrom:
                secretKeyRef:
                  name: api-keys
                  key: anthropic
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"

Best Practices

  • Start local, graduate to cloud as needs grow
  • Use secrets management (never hardcode keys)
  • Set timeouts to prevent runaway costs
  • Enable logging and monitoring
  • Test in staging before production
  • Use infrastructure-as-code
  • Implement health checks

Common deployment mistakes:
  • Hardcoded API keys in code/config
  • Missing timeouts (agents run forever)
  • No error handling or retries
  • Insufficient logging
  • Over-provisioned resources (wasted cost)
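
Two of the mistakes above, missing timeouts and missing retries, can be addressed with a small wrapper around whatever invokes your agent. A sketch; `task` is a placeholder for your own invocation function, which is assumed to accept a timeout in seconds:

```python
import time

def run_with_retries(task, retries=2, timeout_s=300, backoff_s=5):
    """Call task(timeout_s), retrying on failure with a fixed backoff.

    Guards against runaway agents (timeout is passed through to the
    invocation) and transient failures (bounded retries).
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return task(timeout_s)
        except Exception as error:
            last_error = error
            if attempt < retries:
                time.sleep(backoff_s)  # fixed backoff; exponential also works
    raise RuntimeError(f"agent failed after {retries + 1} attempts") from last_error
```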