Documentation Index
Fetch the complete documentation index at: https://docs.agents-squads.com/llms.txt
Use this file to discover all available pages before exploring further.
CLI Delegation
Agents Squads enables multi-LLM support by delegating to native provider CLIs. No SDK integrations needed—just shell out toclaude, gemini, codex, grok, etc.
Check Available Providers
Run with Specific Provider
Provider Resolution
The CLI resolves providers in this order:- Agent file -
provider:in frontmatter or## Providerheader - CLI flag -
--provider=google - Squad default -
providers.defaultin SQUAD.md - Fallback -
anthropic
Configuration
Squad-Level Providers
Configure default providers in SQUAD.md frontmatter:Agent-Level Override
Override the provider for specific agents:- Frontmatter
- Header
Supported CLIs
| Provider | CLI | Install | Non-Interactive Flag |
|---|---|---|---|
| Anthropic | claude | npm i -g @anthropic-ai/claude-code | --print |
gemini | npm i -g @google/gemini-cli | --prompt | |
| OpenAI | codex | npm i -g @openai/codex | exec subcommand |
| Mistral | vibe | curl -LsSf https://mistral.ai/vibe/install.sh | bash | --prompt --auto-approve |
| xAI | grok | bun add -g @vibe-kit/grok-cli | --prompt |
| Multi | aider | pip install aider-install && aider-install | --message --yes |
| Local | ollama | brew install ollama | run <model> "<prompt>" |
Environment Variables
Each CLI reads its own API keys:Why Use Multiple LLMs?
Different LLMs excel at different tasks. A well-designed agent system can leverage:- Claude - Complex reasoning, nuanced analysis, long context
- GPT-4 - General purpose, wide knowledge, tool use
- Gemini - Multimodal, Google ecosystem integration
- Grok - Real-time data, X/Twitter integration
- Llama/Open models - Privacy, self-hosting, cost control
Provider Comparison
| Provider | Strengths | Best For |
|---|---|---|
| Claude (Anthropic) | Reasoning, safety, long context | Complex analysis, code review |
| GPT-4 (OpenAI) | Versatility, ecosystem | General tasks, plugins |
| Gemini (Google) | Multimodal, speed | Vision tasks, Google integration |
| Grok (xAI) | Real-time, humor | Social media, current events |
| Llama (Meta) | Open source, self-host | Privacy-sensitive, offline |
| Mistral | European, efficient | EU compliance, edge deployment |
Model Tiers (Within Providers)
Each provider offers different capability tiers:- Anthropic (Claude)
- OpenAI
- Google
- xAI
| Model | Use Case | Cost |
|---|---|---|
| Claude Opus | Complex reasoning | $$$ |
| Claude Sonnet | Balanced default | $$ |
| Claude Haiku | Fast, simple tasks | $ |
Squad Configuration
Agent-Level Provider Selection
Assign different providers to different agents:Environment Configuration
Set up API keys for each provider:Provider Selection in Code
- Claude Code
- OpenAI
- Gemini
- Grok
Routing Patterns
Task-Based Routing
Route tasks to the best provider:| Task Requirement | Best Provider |
|---|---|
| Real-time data | Grok |
| Image analysis | GPT-4o / Gemini |
| Deep reasoning | Claude Opus |
| Google integration | Gemini |
| General tasks | Claude Sonnet / GPT-4o |
Cascade Pattern
Start cheap, escalate when needed:Consensus Pattern
Use multiple providers for critical decisions:Critical Decision → Run in parallel across Claude, GPT-4o, and Gemini → Voting/Synthesis → Final Answer
Cost Optimization
Price Comparison (approximate per 1M tokens)
| Provider | Input | Output |
|---|---|---|
| Claude Haiku | $0.25 | $1.25 |
| GPT-4o-mini | $0.15 | $0.60 |
| Gemini Flash | $0.075 | $0.30 |
| Claude Sonnet | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
| Gemini Pro | $1.25 | $5.00 |
| Claude Opus | $15.00 | $75.00 |
Prices change frequently. Check provider pricing pages for current rates.
Cost Strategy
Implementation Examples
Multi-Provider Squad
Provider Abstraction
Create a unified interface:Best Practices
- Match provider strengths to task requirements
- Use cheaper models for high-volume, simple tasks
- Reserve expensive models for complex reasoning
- Implement fallbacks across providers for reliability
- Monitor costs per provider weekly
- Abstract provider selection for easy switching
Related
Token Economics
Optimize costs across providers
Agent Parallelization
Run multi-provider agents concurrently