Models Reference

Last updated: 2026-05-20 | Auto-synced daily

API pricing, context windows, and SWE-Bench scores for coding AI models.
Compiled May 2026.

[Figure: SWE-Bench Verified vs Input Price (May 2026). Y-axis: SWE-Bench Verified. X-axis: input price per 1M tokens (log₂ scale). Series: OpenAI, Anthropic, Google, DeepSeek, Others. Models plotted: DeepSeek V4 Flash, MiniMax M2.5, DeepSeek V3.2, Qwen3.6 Plus, Kimi K2.5, GLM-4.7, Kimi K2.6, GLM-5, MiMo-V2-Pro, DeepSeek V4 Pro, Gemini 3.1 Pro, Claude Sonnet 4.6, GPT-5.4, Claude Opus 4.5, Claude Opus 4.6, Claude Opus 4.7, GPT-5.5. Source: marc0.dev leaderboard, updated May 20 2026.]

[Figure: SWE-Bench Pro vs Input Price (May 2026). Harder benchmark that tests multi-language, multi-step repo tasks. Y-axis: SWE-Bench Pro. X-axis: input price per 1M tokens (log₂ scale). Series: OpenAI, Anthropic, Google, Z.ai. Models plotted: Gemini 3.1 Pro, Claude Opus 4.5, Claude Opus 4.6, GPT-5.4, GLM-5.1, Claude Opus 4.7, GPT-5.3 Codex, GPT-5.5, Kimi K2.6. Source: marc0.dev + Scale SEAL, updated May 20 2026.]

Anthropic (Claude)

Current as of May 2026. Source: platform.claude.com

Model Input /1M Output /1M Cache Writes (5m) /1M Cache Hits /1M
Opus 4.7 $5.00 $25.00 $6.25 $0.50
Opus 4.6 $5.00 $25.00 $6.25 $0.50
Opus 4.5 $5.00 $25.00 $6.25 $0.50
Sonnet 4.6 $3.00 $15.00 $3.75 $0.30
Haiku 4.5 $1.00 $5.00 $1.25 $0.10

Batch: 50% off.

Opus 4.7: 87.6% SWE-Bench Verified (#2). Opus 4.5: 80.9%. Opus 4.6: 80.8%. Sonnet 4.6: 79.6%.
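
As a rough illustration of how the rates above combine, the following sketch estimates the cost of a single request. It assumes every input token is billed in exactly one of three buckets (uncached, written to cache, or read from cache). The function name and the dictionary keys are hypothetical labels chosen for this example, not official API model IDs.

```python
# Per-1M-token USD rates copied from the Anthropic table above.
# The keys are illustrative labels, not official API model IDs.
CLAUDE_RATES = {
    "opus-4.7": {"input": 5.00, "output": 25.00, "cache_write": 6.25, "cache_hit": 0.50},
    "sonnet-4.6": {"input": 3.00, "output": 15.00, "cache_write": 3.75, "cache_hit": 0.30},
    "haiku-4.5": {"input": 1.00, "output": 5.00, "cache_write": 1.25, "cache_hit": 0.10},
}


def claude_request_cost(model, uncached_in, cache_write_in, cache_hit_in, output_tokens):
    """Return the estimated USD cost of one request.

    Assumption: each input token is counted in exactly one bucket
    (uncached, written to cache, or read from cache).
    """
    r = CLAUDE_RATES[model]
    weighted = (
        uncached_in * r["input"]
        + cache_write_in * r["cache_write"]
        + cache_hit_in * r["cache_hit"]
        + output_tokens * r["output"]
    )
    return weighted / 1_000_000


# 10K fresh input tokens, 90K tokens read from cache, 2K output tokens.
print(f"${claude_request_cost('sonnet-4.6', 10_000, 0, 90_000, 2_000):.4f}")
```

Under these assumptions, most of the saving comes from the cache-hit bucket, which is billed at one tenth of the base input rate in every row of the table.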


Google Gemini

Current as of May 2026. Source: ai.google.dev, SWE-Bench

Current Models

Model Input /1M Output /1M Context Max Output SWE-Bench Verified Notes
Gemini 3.1 Pro Preview $2.00 ($4.00 >200K) $12.00 ($18.00 >200K) 2M 16K 80.6% Preview. Top-tier reasoning. 2M ctx
Gemini 3.1 Flash-Lite Preview $0.25 $1.50 1M 64K Fast, high-volume agentic tasks
Gemini 2.5 Pro $1.25 ($2.50 >200K) $10.00 ($15.00 >200K) 2M 64K Complex reasoning, coding, long docs
Gemini 2.5 Flash $0.30 $2.50 1M 64K Balanced cost and capability
Gemini 2.5 Flash-Lite $0.10 $0.40 1M 64K Lowest-cost current Gemini route

Batch / Flex Pricing (50% off)

Model Batch Input /1M Batch Output /1M
Gemini 3.1 Pro (≤200K) $1.00 $6.00
Gemini 3.1 Flash-Lite $0.125 $0.75
Gemini 2.5 Pro (≤200K) $0.625 $5.00
Gemini 2.5 Flash $0.15 $1.25
Gemini 2.5 Flash-Lite $0.05 $0.20

Deprecated

Model Input /1M Output /1M Note
Gemini 2.0 Flash $0.10 $0.40 Shutdown Jun 1 2026

Gemini 3.1 Pro is a preview model (restrictive rate limits). Free tier available for development and small projects. Gemini 3.1 Pro scores 80.6% on SWE-Bench Verified — competitive with Claude Opus 4.6 (80.8%) and DeepSeek V4 Flash (79%).
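
The ">200K" prices in the Current Models table imply a two-tier scheme. The sketch below shows one way to estimate a request's cost under that scheme. It assumes that a request whose prompt exceeds 200K tokens is billed entirely at the higher rate, for both input and output; check the provider's pricing page before relying on this. The function name and dictionary keys are hypothetical.

```python
# Per-1M-token USD rates from the Current Models table: (<=200K rate, >200K rate).
# Keys are illustrative labels, not official API model IDs.
GEMINI_TIERS = {
    "gemini-3.1-pro-preview": {"input": (2.00, 4.00), "output": (12.00, 18.00)},
    "gemini-2.5-pro": {"input": (1.25, 2.50), "output": (10.00, 15.00)},
}

LONG_PROMPT_THRESHOLD = 200_000  # tokens


def gemini_request_cost(model, prompt_tokens, output_tokens):
    """Estimate USD cost of one request under two-tier pricing.

    Assumption: the tier is chosen by prompt length alone and applies
    to the whole request (input and output).
    """
    rates = GEMINI_TIERS[model]
    tier = 1 if prompt_tokens > LONG_PROMPT_THRESHOLD else 0
    weighted = prompt_tokens * rates["input"][tier] + output_tokens * rates["output"][tier]
    return weighted / 1_000_000


print(gemini_request_cost("gemini-2.5-pro", 100_000, 10_000))  # short-prompt tier
print(gemini_request_cost("gemini-2.5-pro", 300_000, 10_000))  # long-prompt tier
```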


DeepSeek

Current as of May 2026. Source: api-docs.deepseek.com

DeepSeek V4 is the current flagship, launched March 2026. 671B total params, 37B active MoE, 1M context. SWE-Bench Verified: V4 Pro Max / V4 Pro 80.6%, V4 Flash 79%. V4 Flash is the default workhorse; V4 Pro is premium (75% off until May 31 2026).

New: DeepSeek V4 Pro Max

Released Apr 24 2026. 1.6T params, 49B active MoE, 1M context, open-weight on HuggingFace. 80.6% SWE-Bench Verified. Available at V4 Pro pricing (same API endpoint).

Model Cache Hit Input /1M Cache Miss Input /1M Output /1M Context Notes
deepseek-v4-flash $0.0028 $0.14 $0.28 1M Default route. 384K max output
deepseek-v4-pro (promo) $0.003625 $0.435 $0.87 1M 75% off until May 31 2026 15:59 UTC
deepseek-v4-pro (full) $0.0145 $1.74 $3.48 1M Full price after promo ends

Cache hit prices reduced to 1/10 of launch price from Apr 26 2026. Older aliases deepseek-chat and deepseek-reasoner map to V4 Flash (non-thinking / thinking) and retire after Jul 24 2026. New accounts get 5M free tokens.
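
Because cache-hit and cache-miss input are priced so differently, the effective input rate depends heavily on the hit ratio. The following sketch blends the two rates for a given hit ratio. The function name, the `hit_ratio` parameter, and the `-promo` / `-full` key suffixes are hypothetical and exist only to distinguish the two deepseek-v4-pro rows of the table.

```python
# Per-1M-token USD rates copied from the DeepSeek table above.
# The "-promo" and "-full" suffixes are illustrative, not API model names.
DEEPSEEK_RATES = {
    "deepseek-v4-flash": {"hit": 0.0028, "miss": 0.14, "output": 0.28},
    "deepseek-v4-pro-promo": {"hit": 0.003625, "miss": 0.435, "output": 0.87},
    "deepseek-v4-pro-full": {"hit": 0.0145, "miss": 1.74, "output": 3.48},
}


def deepseek_cost(model, input_tokens, output_tokens, hit_ratio):
    """Estimate USD cost when a fraction `hit_ratio` of input tokens hits the cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    r = DEEPSEEK_RATES[model]
    blended_input = hit_ratio * r["hit"] + (1.0 - hit_ratio) * r["miss"]
    return (input_tokens * blended_input + output_tokens * r["output"]) / 1_000_000


# 1M input tokens (half served from cache) plus 1M output tokens.
for name in DEEPSEEK_RATES:
    print(name, round(deepseek_cost(name, 1_000_000, 1_000_000, 0.5), 4))
```

With the table's figures, the promo row works out to exactly one quarter of the full-price row, which matches the stated 75% discount.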

Legacy Models

Model Input /1M Output /1M Cache Hit Context Notes
DeepSeek V3.2 (Chat) $0.28 $0.42 $0.028 128K Previous gen, still available
DeepSeek R1 $0.55 $2.19 $0.14 128K Dedicated reasoning model

DeepSeek V3.2: 73.0% SWE-Bench Verified. R1: chain-of-thought reasoning, ~96% cheaper than OpenAI o1. DeepSeek web chat at chat.deepseek.com is free for individual users.


OpenAI (ChatGPT)

Current as of May 2026. Source: openai.com/api/pricing

GPT-5 Family (Current Flagship)

Model Input /1M Output /1M Cached Input Context Notes
GPT-5.5 (≤272K) $5.00 $30.00 $0.50 1M 88.7% SWE-Bench (#1), 58.6% SWE-Bench Pro. Flagship reasoning + coding
GPT-5.5 (>272K) $10.00 $45.00 $1.00 1M Long context tier >272K tokens
GPT-5.5 Pro $30.00 $180.00 1M Premium tier for research-grade problems
GPT-5.4 (≤272K) $2.50 $15.00 $0.25 1M ~80% SWE-Bench Verified. 59.1% SWE-Bench Pro
GPT-5.4 (>272K) $5.00 $22.50 $0.50 1M Long context tier >272K tokens
GPT-5.4 Mini $0.75 $4.50 $0.075 400K Affordable reasoning. Supports reasoning effort control
GPT-5.4 Nano $0.20 $1.25 400K Fastest, cheapest 5.4 tier. Ideal for summaries, classification
GPT-5.3 Codex $1.75 $14.00 400K 85.0% SWE-Bench Verified (#3). 56.8% SWE-Bench Pro. Coding specialist

GPT-4.1 Family (Production Workhorse)

Model Input /1M Output /1M Cached Input Context Notes
GPT-4.1 $2.00 $8.00 $0.50 1M Recommended production model. Strong coding + long context
GPT-4.1 Mini $0.40 $1.60 $0.10 1M Good balance of power and affordability
GPT-4.1 Nano $0.10 $0.40 1M Cheapest OpenAI model. Classification, extraction, routing

o-Series (Reasoning Models)

Model Input /1M Output /1M Cached Input Context Notes
o4-mini $1.10 $4.40 $0.275 200K Best-value reasoning. Math, science, complex logic
o3 $2.00 $8.00 Flagship reasoning. Chain-of-thought built in

Batch API saves 50% on all models. Prompt caching discounts: up to 90% off (GPT-5.5), 75% off (GPT-4.1). GPT-5.5 scores 88.7% SWE-Bench Verified and 58.6% SWE-Bench Pro. GPT-5.4 scores ~80% SWE-Bench Verified and 59.1% SWE-Bench Pro. GPT-5.5 Pro tier ($30/$180) is available for research-grade problems. GPT-4.1 is OpenAI’s recommended production default for most workloads.
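
The caching percentages quoted above can be derived from the Input and Cached Input columns. The sketch below does that arithmetic and also applies the 50% batch discount to a listed rate. The helper names are hypothetical, and whether batch and caching discounts can be combined is not stated here, so the two are computed separately.

```python
def cache_discount(input_rate, cached_rate):
    """Fraction saved on input tokens that are served from the prompt cache."""
    return 1.0 - cached_rate / input_rate


def batch_rate(rate, discount=0.50):
    """Per-1M rate after the Batch API discount (50% per the note above)."""
    return rate * (1.0 - discount)


# (input /1M, cached input /1M) pairs copied from the OpenAI tables above.
OPENAI_INPUT_RATES = {
    "GPT-5.5 (<=272K)": (5.00, 0.50),
    "GPT-5.4 (<=272K)": (2.50, 0.25),
    "GPT-4.1": (2.00, 0.50),
    "GPT-4.1 Mini": (0.40, 0.10),
    "o4-mini": (1.10, 0.275),
}

for name, (input_rate, cached_rate) in OPENAI_INPUT_RATES.items():
    saved = cache_discount(input_rate, cached_rate)
    print(f"{name}: {saved:.0%} off cached input, batch input ${batch_rate(input_rate):.3f}/1M")
```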


MiniMax

Current as of May 2026. Source: platform.minimax.io, OpenRouter

Coding Models

Model Input /1M Output /1M Context Max Output SWE-Bench Speed
M2.7 $0.279 $1.20 205K 131K Released Mar 18 2026
M2.5 Standard $0.15 $1.20 256K 80.2% ~50 TPS
M2.5 Lightning $0.30 $2.40 256K 80.2% ~100 TPS

M2.5 Standard: 80.2% SWE-Bench at $0.15 input / $1.20 output per 1M tokens, close to Claude Opus 4.6 (80.8%). Caching is automatic (no configuration needed). OpenCode Go estimates: M2.5 ~6,300 req/5h, M2.7 ~3,400 req/5h.

Subscription Plans

Plan Price Description
Token Plan Subscription Quotas for individual builders and Teams
Credits Prepaid Same resource coverage as Token Plan
Pay-as-you-go Per-token Standard API endpoint billing

Qwen (Alibaba)

Current as of May 2026. Source: DashScope direct pricing

Current Gen (Qwen3.6)

Model Input /1M Output /1M Context SWE-Bench Notes
Qwen3.6 Plus $0.325 $1.95 1M 78.8% Verified Apr 2 2026. Hybrid attention + MoE. Reasoning by default
Qwen3.6 Flash $0.25 $1.50 1M Cost-optimized tier
Qwen3.6 Max Preview $1.30 $7.80 256K SWE-Bench Pro #1 Apr 20 2026. Closed-weights flagship. Leads SWE-Bench Pro, Terminal-Bench 2.0, SkillsBench, SciCode

Qwen3.6 Plus: 2 points behind Claude Opus 4.6 on SWE-Bench Verified (78.8% vs 80.8%) at roughly 1/15th the input price ($0.325 vs $5.00 per 1M). 1M native context, 65K max output. Reasoning is enabled by default (no mode toggle).

Qwen3.6-27B (dense, Apache 2.0): 77.2% SWE-Bench Verified, a strong self-hosting option.

Qwen3.6-Max-Preview (Apr 20 2026): first closed-weights Qwen flagship. $1.30/$7.80 per MTok, 256K context. Topped SWE-Bench Pro and 5 other coding benchmarks at launch.

Previous Gen (Qwen3.5)

Model Input /1M Output /1M Context Notes
Qwen3.5 Plus $0.26 $1.56 1M Feb 2026 release. 65K max output
Qwen3.5 397B A17B Free Free 262K Open-weight MoE flagship

Qwen-Max (Legacy Flagship)

Model Input /1M Output /1M Context
qwen3-max (0-32K) $1.20 $6.00 252K
qwen3-max (32K-128K) $2.40 $12.00 252K
qwen3-max (128K-252K) $3.00 $15.00 252K
qwen-max (older) $1.60 $6.40
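
The qwen3-max rows above describe banded pricing. The sketch below shows one way to look up the band for a request. It assumes that "K" means 1,000 tokens, that the band is selected by input token count with an inclusive upper bound, and that the whole request is billed at that band's rates. None of this is confirmed by the table itself, and the function name is hypothetical.

```python
# (upper bound in input tokens, input USD /1M, output USD /1M), from the table above.
# Assumption: "K" = 1,000 tokens and each upper bound is inclusive.
QWEN3_MAX_BANDS = [
    (32_000, 1.20, 6.00),
    (128_000, 2.40, 12.00),
    (252_000, 3.00, 15.00),
]


def qwen3_max_cost(input_tokens, output_tokens):
    """Estimate USD cost of one qwen3-max request under banded pricing."""
    for upper, input_rate, output_rate in QWEN3_MAX_BANDS:
        if input_tokens <= upper:
            return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    raise ValueError("input exceeds the 252K context listed for qwen3-max")


print(round(qwen3_max_cost(20_000, 1_000), 4))   # first band
print(round(qwen3_max_cost(100_000, 2_000), 4))  # second band
```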

All Qwen models support native tool-calling, JSON-mode, and OpenAI-compatible API shapes. Batch calling: 50% off. Context caching discounts available on supported models.


GLM / Z.ai

Current as of May 2026. Source: docs.z.ai

Flagship Models (GLM-5 Series)

Model Context SWE-Bench Input /1M Output /1M Cached Input License
GLM-5.1 203K Pro 58.4% (best-in-class) $1.40 $4.40 $0.26 MIT, 754B params
GLM-5 202K Verified 77.8% $1.00 $3.20 $0.20 MIT, 744B/40B MoE
GLM-5-Turbo 202K $1.20 $4.00 $0.24 Proprietary

GLM-5.1 (Apr 7 2026): 8-hour autonomous runs, 1,700 agentic steps. Surpasses GPT-5.4 and Claude Opus 4.6 on SWE-Bench Pro. GLM-5: 744B params, 40B active MoE, 28.5T token pretraining.

Previous Gen (GLM-4 Series)

Model Context Input /1M Output /1M Cached Input Notes
GLM-4.7 128K $0.60 $2.20 $0.11 73.8% SWE-Bench Verified
GLM-4.7-FlashX 203K $0.07 $0.40 $0.01 Fast inference variant
GLM-4.6 128K $0.60 $2.20 $0.11 Previous generation
GLM-4.5-X 128K $2.20 $8.90 $0.45 Premium tier
GLM-4.5 128K $0.60 $2.20 $0.11 Standard tier
GLM-4.5-Air 128K $0.20 $1.10 $0.03 Lightweight, Haiku-class
GLM-4.5-AirX 128K $1.10 $4.50 $0.22 Fast Air variant
GLM-4-32B-0414-128K 128K $0.10 $0.10 Budget open-weight

Free Models

Model Context Input Output
GLM-4.7-Flash 203K Free Free
GLM-4.5-Flash Free Free

Vision Models

Model Input /1M Output /1M Cached Input
GLM-5V-Turbo $1.20 $4.00 $0.24
GLM-4.6V $0.30 $0.90 $0.05
GLM-4.6V-FlashX $0.04 $0.40 $0.004
GLM-OCR $0.03 $0.03
GLM-4.6V-Flash Free Free Free

Xiaomi MiMo

Current as of May 2026. V2 launched Mar 18 2026, V2.5 launched Apr 22 2026. Source: mimo-v2.com

Model Input /1M Output /1M Context Modalities Notes
MiMo-V2-Pro (≤256K) $1.00 $3.00 1M Text 78.0% SWE-Bench. 1T params, 42B active
MiMo-V2-Pro (256K–1M) $2.00 $6.00 1M Text Long-context tier
MiMo-V2.5-Pro (≤256K) $1.00 ($0.20 cached) $3.00 1M Text Apr 22 2026. MIT license. 1T params. 57.2% SWE-Bench Pro
MiMo-V2.5-Pro (256K–1M) $2.00 $6.00 1M Text Long-context tier
MiMo-V2-Omni ~$1.00 ~$3.00 256K Text, Image, Audio, Video Multimodal flagship
MiMo-V2-Flash $0.10 $0.30 256K Text Open-source foundation model
MiMo-V2-TTS Free Free Audio Limited time promo

API at platform.xiaomimimo.com. OpenAI-compatible. Credit plans available: Lite $6/mo, Standard $16/mo, Pro $50/mo, Max $100/mo.


Kimi / Moonshot AI (K2.6)

Current as of May 2026. Source: kimi.com, OpenRouter

Both models: 1T params, 32B active MoE, 384 experts, MIT license.

Model Cache Hit /1M Cache Miss /1M Output /1M Context SWE-Bench
kimi-k2.6 $0.16 $0.95 $4.00 262K Verified 80.2%, Pro 58.6%, BrowseComp 83.2%
kimi-k2.5 $0.40 $1.90 256K Verified 76.8%, BrowseComp 78.4%

K2.6: 300 parallel sub-agents, 4,000+ tool calls, 12+ hr continuous execution. K2.5: 100 parallel sub-agents.

Membership Plans

Plan Price/mo Agent Usage
Adagio Free 6
Moderato $15 60
Allegretto $31 150
Allegro $79 360
Vivace $159 720

OpenCode Go

Source: docs.openclaw.ai. Usage limits are set by dollar value: $12 per 5 hours, $30 per week, $60 per month.

Available Models

Model Ref Name
opencode-go/glm-5 GLM-5
opencode-go/glm-5.1 GLM-5.1
opencode-go/kimi-k2.5 Kimi K2.5
opencode-go/kimi-k2.6 Kimi K2.6 (3x limits)
opencode-go/deepseek-v4-pro DeepSeek V4 Pro
opencode-go/deepseek-v4-flash DeepSeek V4 Flash
opencode-go/mimo-v2-omni MiMo V2 Omni
opencode-go/mimo-v2-pro MiMo V2 Pro
opencode-go/mimo-v2.5 MiMo V2.5
opencode-go/mimo-v2.5-pro MiMo V2.5 Pro
opencode-go/minimax-m2.5 MiniMax M2.5
opencode-go/minimax-m2.7 MiniMax M2.7
opencode-go/qwen3.5-plus Qwen3.5 Plus
opencode-go/qwen3.6-plus Qwen3.6 Plus
opencode-go/qwen3.6-max-preview Qwen3.6 Max Preview

Request Estimates (May 19 2026)

Model Per 5h Per Week Per Month
GLM-5.1 880 2,150 4,300
GLM-5 1,150 2,880 5,750
Kimi K2.5 1,850 4,630 9,250
MiMo-V2-Pro 1,290 3,225 6,450
MiMo-V2.5-Pro 1,290 3,225 6,450
MiMo-V2-Omni 2,150 5,450 10,900
Qwen3.6 Plus 3,300 8,200 16,300
Qwen3.6 Max Preview 820 2,050 4,100
MiniMax M2.7 3,400 8,500 17,000
MiniMax M2.5 6,300 15,900 31,800
Qwen3.5 Plus 10,200 25,200 50,500
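
Dividing each dollar-value limit by the matching request estimate gives a rough implied cost per request. The sketch below does this for a few rows of the table. It assumes the estimates were produced by dividing the dollar limit by an average per-request cost, which the source does not state. The function name is hypothetical.

```python
# Dollar-value limits per window, as stated for OpenCode Go.
LIMITS_USD = {"5h": 12.0, "week": 30.0, "month": 60.0}

# Request estimates copied from the table above (subset of rows).
REQUEST_ESTIMATES = {
    "GLM-5.1": {"5h": 880, "week": 2_150, "month": 4_300},
    "Kimi K2.5": {"5h": 1_850, "week": 4_630, "month": 9_250},
    "MiniMax M2.5": {"5h": 6_300, "week": 15_900, "month": 31_800},
    "Qwen3.5 Plus": {"5h": 10_200, "week": 25_200, "month": 50_500},
}


def implied_cost_per_request(model, window):
    """USD per request implied by the limit and the estimate for one window.

    Assumption: estimate = dollar limit / average cost per request.
    """
    return LIMITS_USD[window] / REQUEST_ESTIMATES[model][window]


for model in REQUEST_ESTIMATES:
    per_request = implied_cost_per_request(model, "5h")
    print(f"{model}: ~${per_request:.4f} per request (5h window)")
```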

MiniMax M2.5: 80.2% SWE-Bench — near Claude Opus 4.6 (80.8%).
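
Comparisons like the one above can be made mechanical. The sketch below picks the cheapest model that meets a minimum SWE-Bench Verified score, using input prices and scores listed elsewhere on this page. It compares base-tier input price only (cache-miss price for DeepSeek and Kimi, the ≤272K tier for GPT-5.5, the ≤200K tier for Gemini) and ignores output price, so treat it as a first-pass filter. All names in the code are hypothetical.

```python
from typing import NamedTuple, Optional


class Model(NamedTuple):
    name: str
    input_price: float   # USD per 1M input tokens, base tier
    swe_verified: float  # SWE-Bench Verified score, percent


# Figures copied from the provider sections of this page.
MODELS = [
    Model("GPT-5.5", 5.00, 88.7),
    Model("Claude Opus 4.7", 5.00, 87.6),
    Model("Claude Opus 4.6", 5.00, 80.8),
    Model("Gemini 3.1 Pro", 2.00, 80.6),
    Model("Kimi K2.6", 0.95, 80.2),
    Model("MiniMax M2.5", 0.15, 80.2),
    Model("DeepSeek V4 Flash", 0.14, 79.0),
    Model("Qwen3.6 Plus", 0.325, 78.8),
]


def cheapest_at_least(min_score, models=MODELS) -> Optional[Model]:
    """Return the lowest-input-price model scoring at least `min_score`.

    Ties on price are broken in favor of the higher score.
    Returns None when no model qualifies.
    """
    eligible = [m for m in models if m.swe_verified >= min_score]
    if not eligible:
        return None
    return min(eligible, key=lambda m: (m.input_price, -m.swe_verified))


for threshold in (79, 80, 85):
    print(threshold, cheapest_at_least(threshold))
```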


Notes