# CLI Delegation

Agents Squads enables multi-LLM support by delegating to native provider CLIs. No SDK integrations are needed: the runner simply shells out to `claude`, `gemini`, `codex`, `grok`, and others.
## Check Available Providers
```
Provider CLI Status
────────────────────────────────────────
Anthropic   claude   ✓ ready
Google      gemini   ✓ ready
OpenAI      codex    ✓ ready
Mistral     vibe     ✓ ready
xAI         grok     ✓ ready
Aider       aider    ✓ ready
Ollama      ollama   ✓ ready

✓ All providers ready
```
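A status check like the one above boils down to asking whether each provider's CLI binary is on `PATH`. A minimal sketch, assuming the CLI names from this page (the `PROVIDER_CLIS` mapping and `check_providers` helper are illustrative, not part of the tool):

```python
import shutil

# Map providers to the CLI binary each one delegates to.
PROVIDER_CLIS = {
    "anthropic": "claude",
    "google": "gemini",
    "openai": "codex",
    "xai": "grok",
}

def check_providers(clis=PROVIDER_CLIS):
    """Return {provider: bool} depending on whether the CLI is on PATH."""
    return {provider: shutil.which(cli) is not None for provider, cli in clis.items()}

status = check_providers()
for provider, ready in status.items():
    print(f"{provider:10} {'✓ ready' if ready else '✗ missing'}")
```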
## Run with a Specific Provider

```bash
# Override provider for this run
squads run research/analyst --execute --provider=google
# Uses Gemini CLI instead of Claude
```
## Provider Resolution

The CLI resolves providers in this order:

1. **Agent file**: `provider:` in frontmatter or a `## Provider` header
2. **CLI flag**: `--provider=google`
3. **Squad default**: `providers.default` in `SQUAD.md`
4. **Fallback**: `anthropic`
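As a hedged sketch, the first-match-wins order above could be implemented like this (`resolve_provider` and its parameters are hypothetical names, not the tool's actual API):

```python
def resolve_provider(agent_frontmatter=None, cli_flag=None, squad_default=None):
    """First match wins: agent file, then CLI flag, then squad default, then fallback."""
    for candidate in (
        (agent_frontmatter or {}).get("provider"),  # 1. agent file
        cli_flag,                                   # 2. --provider flag
        squad_default,                              # 3. providers.default in SQUAD.md
    ):
        if candidate:
            return candidate
    return "anthropic"                              # 4. fallback

assert resolve_provider({"provider": "xai"}, cli_flag="google") == "xai"
assert resolve_provider(cli_flag="google") == "google"
assert resolve_provider() == "anthropic"
```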
## Configuration

### Squad-Level Providers

Configure default providers in `SQUAD.md` frontmatter:

```yaml
---
name: intelligence
mission: Market research and competitive analysis
providers:
  default: anthropic   # CLI: claude
  vision: openai       # CLI: codex (for image analysis)
  realtime: xai        # CLI: grok (for real-time data)
  cheap: google        # CLI: gemini (for high-volume)
---
```
Agent-Level Override
Override the provider for specific agents:
---
provider : xai
---
# Social Monitor
## Purpose
Real-time X/Twitter trend detection using Grok.
## Supported CLIs

| Provider | CLI | Install | Non-Interactive Flag |
|----------|-----|---------|----------------------|
| Anthropic | `claude` | `npm i -g @anthropic-ai/claude-code` | `--print` |
| Google | `gemini` | `npm i -g @google/gemini-cli` | `--prompt` |
| OpenAI | `codex` | `npm i -g @openai/codex` | `exec` subcommand |
| Mistral | `vibe` | `curl -LsSf https://mistral.ai/vibe/install.sh \| bash` | `--prompt --auto-approve` |
| xAI | `grok` | `bun add -g @vibe-kit/grok-cli` | `--prompt` |
| Multi | `aider` | `pip install aider-install && aider-install` | `--message --yes` |
| Local | `ollama` | `brew install ollama` | `run <model> "<prompt>"` |
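Delegation itself is just a subprocess call: pass the non-interactive flag, then the prompt, and capture stdout. A minimal sketch assuming the flags listed above (the `delegate` helper and `NON_INTERACTIVE` table are illustrative):

```python
import subprocess

# Non-interactive flags from the table above; anything not listed takes the
# prompt as a plain positional argument.
NON_INTERACTIVE = {
    "claude": ["--print"],
    "gemini": ["--prompt"],
    "grok": ["--prompt"],
}

def delegate(cli: str, prompt: str) -> str:
    """Shell out to a provider CLI and return its captured stdout."""
    cmd = [cli, *NON_INTERACTIVE.get(cli, []), prompt]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

`check=True` turns a non-zero exit status into an exception, which is usually what you want when a downstream agent depends on the output.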
## Environment Variables

Each CLI reads its own API keys:

```bash
# .env (or shell profile)
ANTHROPIC_API_KEY=sk-ant-...   # claude
GOOGLE_API_KEY=AIza...         # gemini (or OAuth)
OPENAI_API_KEY=sk-...          # codex
MISTRAL_API_KEY=...            # vibe
XAI_API_KEY=...                # grok
```
## Why Use Multiple LLMs?

Different LLMs excel at different tasks. A well-designed agent system can leverage:

- **Claude**: Complex reasoning, nuanced analysis, long context
- **GPT-4**: General purpose, wide knowledge, tool use
- **Gemini**: Multimodal, Google ecosystem integration
- **Grok**: Real-time data, X/Twitter integration
- **Llama/Open models**: Privacy, self-hosting, cost control
## Provider Comparison

| Provider | Strengths | Best For |
|----------|-----------|----------|
| Claude (Anthropic) | Reasoning, safety, long context | Complex analysis, code review |
| GPT-4 (OpenAI) | Versatility, ecosystem | General tasks, plugins |
| Gemini (Google) | Multimodal, speed | Vision tasks, Google integration |
| Grok (xAI) | Real-time, humor | Social media, current events |
| Llama (Meta) | Open source, self-host | Privacy-sensitive, offline |
| Mistral | European, efficient | EU compliance, edge deployment |
## Model Tiers (Within Providers)

Each provider offers different capability tiers:

### Anthropic (Claude)

| Model | Use Case | Cost |
|-------|----------|------|
| Claude Opus | Complex reasoning | $$$ |
| Claude Sonnet | Balanced default | $$ |
| Claude Haiku | Fast, simple tasks | $ |

### OpenAI

| Model | Use Case | Cost |
|-------|----------|------|
| GPT-4o | Multimodal, flagship | $$$ |
| GPT-4o-mini | Fast, efficient | $ |
| o1/o3 | Deep reasoning | $$$$ |

### Google

| Model | Use Case | Cost |
|-------|----------|------|
| Gemini Ultra | Most capable | $$$ |
| Gemini Pro | Balanced | $$ |
| Gemini Flash | Speed optimized | $ |

### xAI

| Model | Use Case | Cost |
|-------|----------|------|
| Grok-2 | Full capability | $$$ |
| Grok-2-mini | Faster responses | $$ |
## Squad Configuration

### Agent-Level Provider Selection

Assign different providers to different agents:

```markdown
# SQUAD.md - Intelligence Squad

## Agents

### market-researcher
**Provider**: Claude Sonnet
**Purpose**: Deep market analysis requiring nuanced reasoning

### social-monitor
**Provider**: Grok
**Purpose**: Real-time X/Twitter monitoring and trend detection

### data-analyst
**Provider**: GPT-4o
**Purpose**: Spreadsheet analysis with vision capabilities

### summarizer
**Provider**: Gemini Flash
**Purpose**: Fast summarization of research findings
```
### Environment Configuration

Set up API keys for each provider:

```bash
# .env
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...
XAI_API_KEY=xai-...
```
## Provider Selection in Code

### Claude Code

```markdown
# Agent: Market Researcher
**Model**: claude-sonnet-4-20250514

## Instructions
Analyze market trends using deep reasoning...
```

### OpenAI

```python
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
```

### Gemini

```python
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(prompt)
```

### Grok

```python
import os

from openai import OpenAI  # Grok exposes an OpenAI-compatible API

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)
response = client.chat.completions.create(
    model="grok-2",
    messages=[{"role": "user", "content": prompt}],
)
```
## Routing Patterns

### Task-Based Routing

Route tasks to the best provider:

| Task Requirement | Best Provider |
|------------------|---------------|
| Real-time data | Grok |
| Image analysis | GPT-4o / Gemini |
| Deep reasoning | Claude Opus |
| Google integration | Gemini |
| General tasks | Claude Sonnet / GPT-4o |
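In code, a routing table like the one above can be a plain dictionary lookup with a general-purpose default. A minimal sketch (the `ROUTES` mapping and `route` function are illustrative):

```python
# Hypothetical routing table mirroring the matrix above.
ROUTES = {
    "realtime": "grok",
    "vision": "gpt-4o",
    "reasoning": "claude-opus",
    "google": "gemini",
}

def route(task_kind: str, default: str = "claude-sonnet") -> str:
    """Pick the preferred model for a task kind, falling back to a general default."""
    return ROUTES.get(task_kind, default)

route("realtime")   # → "grok"
route("unknown")    # → "claude-sonnet"
```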
### Cascade Pattern

Start cheap, escalate when needed:

1. **Start cheap**: Gemini Flash (fastest, cheapest)
2. **If insufficient**: escalate to Claude Sonnet
3. **If still insufficient**: escalate to Claude Opus
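A sketch of the cascade, assuming a caller supplies the model client and a quality check (`call_model` and `is_sufficient` are hypothetical callables, not part of the tool):

```python
def cascade(prompt, tiers, call_model, is_sufficient):
    """Try models cheapest-first; escalate while the answer is judged insufficient."""
    answer = None
    for model in tiers:
        answer = call_model(model, prompt)
        if is_sufficient(answer):
            return model, answer
    return tiers[-1], answer  # best effort from the most capable tier

# Usage with stub callables:
tiers = ["gemini-flash", "claude-sonnet", "claude-opus"]
fake_call = lambda model, prompt: f"{model} answer"
model, answer = cascade("summarize", tiers, fake_call, lambda a: "sonnet" in a)
# model == "claude-sonnet"
```

The quality check is the hard part in practice; a common choice is a cheap model grading its own answer against a rubric before escalating.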
### Consensus Pattern

Use multiple providers for critical decisions:

```
Critical Decision → Run in parallel across Claude, GPT-4o, and Gemini → Voting/Synthesis → Final Answer
```
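The voting step can be as simple as a majority count over the answers. A minimal sketch (`consensus` and `call_model` are hypothetical; a real implementation would also fan the calls out concurrently rather than sequentially):

```python
from collections import Counter

def consensus(prompt, call_model, providers=("claude", "gpt-4o", "gemini")):
    """Ask every provider, then take the majority answer (ties go to first seen)."""
    answers = [call_model(p, prompt) for p in providers]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes

# Stub: two providers say "4", one says "5" → majority wins.
winner, votes = consensus("2+2?", lambda p, q: "4" if p != "gemini" else "5")
# winner == "4", votes == 2
```

Exact-string voting only works when answers are constrained (labels, yes/no); free-form answers need a synthesis step instead.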
## Cost Optimization

### Price Comparison (approximate, per 1M tokens)

| Provider | Input | Output |
|----------|-------|--------|
| Claude Haiku | $0.25 | $1.25 |
| GPT-4o-mini | $0.15 | $0.60 |
| Gemini Flash | $0.075 | $0.30 |
| Claude Sonnet | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
| Gemini Pro | $1.25 | $5.00 |
| Claude Opus | $15.00 | $75.00 |

Prices change frequently. Check provider pricing pages for current rates.
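Per-call cost is just `tokens × price / 1M` on each side. A small estimator using a few rows from the table above (illustrative values only, since prices drift):

```python
# Prices from the table above (USD per 1M tokens); treat as illustrative only.
PRICES = {
    "claude-haiku": (0.25, 1.25),
    "gemini-flash": (0.075, 0.30),
    "claude-sonnet": (3.00, 15.00),
}

def cost(model, input_tokens, output_tokens):
    """Estimated USD cost for one call."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

cost("claude-sonnet", 10_000, 2_000)  # 10k in + 2k out → $0.06
```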
### Cost Strategy

```markdown
## Budget Allocation Example
- 60% → Gemini Flash / GPT-4o-mini (high-volume tasks)
- 30% → Claude Sonnet / GPT-4o (standard tasks)
- 10% → Claude Opus / o1 (complex reasoning)
```
## Implementation Examples

### Multi-Provider Squad

```yaml
# .agents/squads/intelligence/SQUAD.md
name: Intelligence Squad
description: Multi-provider research and analysis
agents:
  - name: trend-scanner
    provider: grok
    model: grok-2-mini
    purpose: Real-time social trend detection
  - name: deep-researcher
    provider: anthropic
    model: claude-sonnet-4-20250514
    purpose: In-depth analysis and synthesis
  - name: data-visualizer
    provider: openai
    model: gpt-4o
    purpose: Chart and image generation
  - name: fast-summarizer
    provider: google
    model: gemini-1.5-flash
    purpose: Quick summaries and translations
```
### Provider Abstraction

Create a unified interface:

```typescript
// lib/llm.ts
type Provider = 'anthropic' | 'openai' | 'google' | 'xai';

interface LLMConfig {
  provider: Provider;
  model: string;
  temperature?: number;
}

async function query(config: LLMConfig, prompt: string) {
  switch (config.provider) {
    case 'anthropic':
      return queryAnthropic(config.model, prompt);
    case 'openai':
      return queryOpenAI(config.model, prompt);
    case 'google':
      return queryGemini(config.model, prompt);
    case 'xai':
      return queryGrok(config.model, prompt);
  }
}
```
## Best Practices

- Match provider strengths to task requirements
- Use cheaper models for high-volume, simple tasks
- Reserve expensive models for complex reasoning
- Implement fallbacks across providers for reliability
- Monitor costs per provider weekly
- Abstract provider selection for easy switching

Avoid:

- Using one provider for everything (you miss optimizations)
- Ignoring rate limits (each provider has different limits)
- Hardcoding the provider choice (make it configurable)
- Forgetting about latency differences
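The fallback practice above can be sketched as an ordered retry loop: try each provider in turn and move on when one fails (rate limit, outage). The `query_with_fallback` helper and the `flaky` stub are hypothetical, shown only to illustrate the shape:

```python
def query_with_fallback(prompt, providers, call_model):
    """Try providers in order; skip to the next one on failure."""
    last_error = None
    for provider in providers:
        try:
            return provider, call_model(provider, prompt)
        except Exception as err:  # in practice, catch provider-specific error types
            last_error = err
    raise RuntimeError("all providers failed") from last_error

# Stub: the first provider is rate-limited, the second answers.
def flaky(provider, prompt):
    if provider == "anthropic":
        raise TimeoutError("rate limited")
    return f"{provider}: ok"

provider, answer = query_with_fallback("hi", ["anthropic", "openai"], flaky)
# provider == "openai"
```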
## Related

- **Token Economics**: Optimize costs across providers
- **Agent Parallelization**: Run multi-provider agents concurrently