| Tier 1 — Frontier: Complex reasoning · Strategy · Planning · External dev only |
| Claude Opus 4.6 · Anthropic · Feb 2026 | $5 in · $25 out | 1M | — | Chain-of-thought, expert level | | Complex terminal coding, multi-step planning | Highest-tier reasoning for the hardest agent plans and code paths. |
| GPT-5.4 · OpenAI · Mar 2026 | $2.50 in · $12.50 out | 1.28M | — | Dynamic MCT, superhuman desktop control | | Autonomous execution, high-level agency | Built for end-to-end autonomy with strong tool and desktop control. |
| GLM-5.1 · Zhipu AI · Apr 2026 | $1.40 in · $4.40 out | 200K | — | 7b–60 total / 400 active MoE, Huawei chips | | Long-horizon agentic routing | MoE scale plus hardware-aware routing for sustained agent runs. |
| Tier 2 — Execution: Agent execution · Tool calls · Long task chains · Multi-step pipelines |
| MiniMax N2.7 · MiniMax | $0.30 in · $1.20 out | 200K | — | Self-evolving CoT, multi-agent loops | | OpenClaw execution backbone | Reliable execution layer for chained tools and agent loops. |
| Kimi K2.5 · Moonshot | $0.60 in · $3.00 out | 256K | — | 31 experts, 384 active, parallel agentic vision | | Multi-source browsing | Wide context and parallel vision for research-heavy agents. |
| Grok 4.20 · xAI | $2.00 in · $6.00 out | 2M | — | 8-agent parallel system, real-time X data | | Real-time research | Massive context plus live signal for attention-sensitive research. |
| DeepSeek V3.2 · DeepSeek | $0.27 in · $0.41 out | 164K | — | Optimized multi-head latent attention (MLA) | | Open-source power-user | Efficient attention stack for heavy execution without frontier cost. |
| Tier 3 — Balanced: Context · Code · Research · Day-to-day tasks |
| Claude Sonnet 4.6 · Anthropic | $3 in · $15 out | 1M | — | Adaptive thinking, 40–60 active | | Daily coding, content automation | Default “always on” balance of quality, speed, and cost. |
| GPT-5.4 mini · OpenAI | $0.15 in · $4.50 out | 400K | — | Native vision, sub-agent optimized | | High-speed chat, layer-2 chains | Fast passes and sub-agents where full GPT-5.4 is overkill. |
| Gemini 1.1 Pro · Google | $2 in · $12 out | 1M | — | Native multimodal, video-audio-action | | Multi-modal agents, video/audio analysis | First-class media understanding for multimodal agent stacks. |
| Qwen 3.6 Plus · Alibaba · OpenRouter | $0 in · $0 out | 1M | via OpenRouter | Hybrid MoE, 3.5 on steroids | | Agent routing, Tier 3 tasking | Free-tier routing workhorse with strong MoE throughput. |
| Llama 4 Maverick · Meta | $0.15 – $0.45 | 1M | provider-dependent | 400B total, 1.2T parameters | | Self-hosted Tier 3 | On-prem option that still feels like a mid-tier frontier model. |
| Mistral Small 4 · Mistral | $0.15 in · $0.60 out | 256K | — | Apache 2.0, reasoning-gated | | Modern commerce, scaling | License-friendly, low-latency scaling for product workloads. |
| Tier 4 — Local / Micro: Summaries · Routing · Classification · Always-on loops · $0 cost |
| Qwen 3.6-8B · Local | $0.00 | 252K | Local | Thinking toggle, multimodal | | Summarization, routing | Tiny footprint for 24/7 summarization and intent routing. |
| Qwen 3.6-27B · Local | $0.00 | 252K | Local | 32B dense, 255 languages | | Local reasoning, micro-classification | Step up in logic depth while staying entirely on-device. |
| Gemma 4 (31B) · Google · local | $0.00 | 256K | Local | 31B dense, Gemini 2.0 — QFT quantized | | Local agentic sub-tasks | Gemini-family behavior in a compact, quant-friendly package. |
| DeepSeek R1 Distill · DeepSeek · local | $0.00 | 128K | Local | 32B dense distilled from R1 | | Reasoning-heavy, logic-based tasks | Distilled reasoning traces without calling the full R1 endpoint. |
| GLM-4.5-Air · Zhipu · SiliconFlow | Low | 128K | via SiliconFlow | Multi-purpose, agent-focused | | Lightweight agentic sub-tasks | Near-free edge tier for browser helpers and micro-tools. |
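
The "in / out" prices in the table can be turned into a rough per-request cost. The sketch below is illustrative only: it assumes the prices are USD per 1M tokens (the table does not state the unit), copies a handful of rows as listed, and uses hypothetical helper names (`ModelPrice`, `estimate_cost`, `cheapest_in_tier`) that are not part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    tier: int
    usd_in_per_m: float   # ASSUMPTION: USD per 1M input tokens
    usd_out_per_m: float  # ASSUMPTION: USD per 1M output tokens


# A few rows copied from the table; prices are as listed there.
PRICES = [
    ModelPrice("Claude Opus 4.6", 1, 5.00, 25.00),
    ModelPrice("MiniMax N2.7", 2, 0.30, 1.20),
    ModelPrice("DeepSeek V3.2", 2, 0.27, 0.41),
    ModelPrice("Claude Sonnet 4.6", 3, 3.00, 15.00),
    ModelPrice("Mistral Small 4", 3, 0.15, 0.60),
]


def estimate_cost(model: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request under the per-1M-token assumption."""
    total = input_tokens * model.usd_in_per_m + output_tokens * model.usd_out_per_m
    return total / 1_000_000


def cheapest_in_tier(tier: int, input_tokens: int, output_tokens: int) -> ModelPrice:
    """Pick the lowest-cost model in a tier for a given token budget."""
    candidates = [m for m in PRICES if m.tier == tier]
    if not candidates:
        raise ValueError(f"no models listed for tier {tier}")
    return min(candidates, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example: a Tier 2 agent step with 100K input tokens and 20K output tokens.
    pick = cheapest_in_tier(2, 100_000, 20_000)
    print(pick.name, round(estimate_cost(pick, 100_000, 20_000), 4))
```

Cost is only one axis here. Context window and the "best for" column would normally filter the candidate list before a price comparison like this is applied.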