Model pricing database

AI Model Pricing Comparison

Compare prices for leading AI models and choose the right cost structure for each workload.

Important pricing note

Disclaimer: All prices, token counts, forecasts, comparisons, and cost calculations are estimates for general planning only. They are not financial, tax, accounting, procurement, purchasing, or legal advice. AI providers may change pricing, billing units, model names, discounts, and terms at any time. Always verify current pricing on the provider's official pricing page. The official provider bill, billing dashboard, and invoice are the final source of truth.

Manual price maintenance

Model prices on this page are manually maintained planning data. They are not a live billing feed and may change after publication.

All rows last updated: Jun 18, 2026.

| Model | Provider | Status | Input / 1M tokens | Output / 1M tokens | Context | Source | Best for | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPT-5.5 | OpenAI | latest (Standard) | $5.00 (cached input: $0.50) | $30.00 | Short / long | Official pricing | Frontier agents, advanced product workflows, high-quality reasoning, and premium user experiences | Standard short context. Input $5.00 / cached $0.50 / output $30.00 per 1M tokens. Long-context standard pricing is higher: $10 input, $1 cached input, $45 output per 1M tokens. |
| GPT-5.5 Pro | OpenAI | latest (Premium) | $30.00 | $180.00 | Short / long | Official pricing | Premium reasoning, complex agent execution, deep analysis, and high-value enterprise workflows | Standard short context. Input $30.00 / output $180.00 per 1M tokens. Long-context standard pricing is higher: $60 input and $270 output per 1M tokens. Cached input is not listed for this tier. |
| GPT-5.4 | OpenAI | latest (Standard) | $2.50 (cached input: $0.25) | $15.00 | Short / long | Official pricing | Balanced frontier quality for assistants, coding, product features, and agent workflows | Standard short context. Input $2.50 / cached $0.25 / output $15.00 per 1M tokens. Long-context standard pricing is higher: $5 input, $0.50 cached input, $22.50 output per 1M tokens. |
| GPT-5.4 mini | OpenAI | latest (Standard) | $0.75 (cached input: $0.075) | $4.50 | Short | Official pricing | High-volume SaaS features, support assistants, routing, coding helpers, and everyday AI workflows | Standard short context. Input $0.75 / cached $0.075 / output $4.50 per 1M tokens. Priority pricing is higher. Batch and Flex pricing are lower than Standard for eligible workloads. |
| GPT-5.4 nano | OpenAI | latest (Standard) | $0.20 (cached input: $0.02) | $1.25 | Short | Official pricing | Very low-cost classification, extraction, lightweight summaries, and large-volume utility tasks | Standard short context. Input $0.20 / cached $0.02 / output $1.25 per 1M tokens. Lowest-cost current OpenAI text tier in this model set; use for routing, extraction, classification, and simple transformations. |
| GPT-5.4 Pro | OpenAI | latest (Premium) | $30.00 | $180.00 | Short / long | Official pricing | Premium quality planning, coding, enterprise assistants, and high-stakes AI product workflows | Standard short context. Input $30.00 / output $180.00 per 1M tokens. Long-context standard pricing is higher: $60 input and $270 output per 1M tokens. Cached input is not listed for this tier. |
| chat-latest | OpenAI | specialized (ChatGPT category) | $5.00 (cached input: $0.50) | $30.00 | Varies | Official pricing | Planning ChatGPT-style assistant costs when the product uses the chat-latest category | Specialized ChatGPT category. Input $5.00 / cached $0.50 / output $30.00 per 1M tokens. Use this for ChatGPT-style chat-latest estimates only. Production API model IDs may differ. |
| GPT-5.3 Codex | OpenAI | specialized (Codex) | $1.75 (cached input: $0.175) | $14.00 | Varies | Official pricing | Coding agents, repository automation, developer tools, and code-heavy workflows | Specialized Codex category. Input $1.75 / cached $0.175 / output $14.00 per 1M tokens. Specialized coding model. Use for code-agent budgeting rather than general chat budgeting. |
| GPT-4.1 | OpenAI | legacy | $2.00 | $8.00 | 1M | Official pricing | Legacy forecasts, compatibility comparisons, and older GPT-4.1-era budget plans | Legacy reference. Input $2.00 / output $8.00 per 1M tokens. Kept for users comparing older forecasts against newer GPT-5.x tiers. |
| GPT-4.1 mini | OpenAI | legacy | $0.40 | $1.60 | 1M | Official pricing | Legacy support copilots and older cost forecasts | Legacy reference. Input $0.40 / output $1.60 per 1M tokens. Kept for backwards comparison; GPT-5.4 mini should usually be tested for new forecasts. |
| GPT-4.1 nano | OpenAI | legacy | $0.10 | $0.40 | 1M | Official pricing | Legacy low-cost classification, extraction, summarization, and batch processing | Legacy reference. Input $0.10 / output $0.40 per 1M tokens. Kept for backwards comparison with older low-cost OpenAI forecasts. |
| Claude Fable 5 | Anthropic | latest (Standard) | $10.00 (cached input: $1.00) | $50.00 | 1M | Official pricing | Premium frontier reasoning, agentic work, writing, coding, and complex enterprise assistants | Base input / cache hit / output. Input $10.00 / cached $1.00 / output $50.00 per 1M tokens. Anthropic also lists 5-minute and 1-hour cache write prices. This calculator uses base input, cache hit, and output prices. |
| Claude Mythos 5 | Anthropic | limited (Limited availability) | $10.00 (cached input: $1.00) | $50.00 | 1M | Official pricing | Premium workflows where Mythos access is available and quality matters more than unit cost | Base input / cache hit / output. Input $10.00 / cached $1.00 / output $50.00 per 1M tokens. Limited availability. Verify account access before using this in production budgets. |
| Claude Opus 4.8 | Anthropic | latest (Standard) | $5.00 (cached input: $0.50) | $25.00 | 1M | Official pricing | Premium reasoning, writing, coding, and complex agent workflows with lower unit price than older Opus 4.1 | Base input / cache hit / output. Input $5.00 / cached $0.50 / output $25.00 per 1M tokens. Anthropic notes Opus 4.7 and later use a new tokenizer that may use more tokens for the same fixed text. |
| Claude Sonnet 4.6 | Anthropic | latest (Standard) | $3.00 (cached input: $0.30) | $15.00 | 1M | Official pricing | Balanced coding, knowledge work, analysis, and production assistants | Base input / cache hit / output. Input $3.00 / cached $0.30 / output $15.00 per 1M tokens. Good balanced Claude tier for production assistants, coding, and analysis. |
| Claude Haiku 4.5 | Anthropic | latest (Standard) | $1.00 (cached input: $0.10) | $5.00 | 200K | Official pricing | Fast assistants, support triage, short responses, lightweight automation, and cost-sensitive Claude workloads | Base input / cache hit / output. Input $1.00 / cached $0.10 / output $5.00 per 1M tokens. Lowest-cost current Claude tier in this dataset. |
| Claude Opus 4.1 | Anthropic | deprecated | $15.00 (cached input: $1.50) | $75.00 | 200K | Official pricing | Historical comparison against older premium Claude forecasts | Deprecated reference. Input $15.00 / cached $1.50 / output $75.00 per 1M tokens. Anthropic marks Claude Opus 4.1 as deprecated. Kept only for old forecast comparison. |
| Claude Sonnet 4 | Anthropic | retired (Retired except partner clouds) | $3.00 (cached input: $0.30) | $15.00 | 200K | Official pricing | Historical comparison for teams migrating older Sonnet 4 workloads | Retired reference. Input $3.00 / cached $0.30 / output $15.00 per 1M tokens. Anthropic marks Sonnet 4 as retired except on Bedrock and Vertex AI. |
| Gemini 3.1 Pro Preview | Google | preview | $2.00 (cached input: $0.20) | $12.00 | Up to 1M | Official pricing | Latest Gemini Pro planning, multimodal understanding, agentic workflows, and coding experiments | Standard, prompts <= 200K tokens. Input $2.00 / cached $0.20 / output $12.00 per 1M tokens. Prompts over 200K tokens cost more: $4 input, $0.40 cached input, $18 output per 1M tokens. |
| Gemini 3 Flash Preview | Google | preview | $0.50 (cached input: $0.05) | $3.00 | Up to 1M | Official pricing | Fast frontier-quality workflows, search-grounded experiences, and high-throughput product features | Standard text / image / video. Input $0.50 / cached $0.05 / output $3.00 per 1M tokens. Preview model pricing may change before stable release. |
| Gemini 3.1 Flash-Lite | Google | latest (Standard) | $0.25 (cached input: $0.025) | $1.50 | Up to 1M | Official pricing | Cost-efficient high-volume agentic tasks, translation, routing, and simple data processing | Standard text / image / video. Input $0.25 / cached $0.025 / output $1.50 per 1M tokens. Google also lists Batch, Flex, and Priority pricing for this model. |
| Gemini 2.5 Pro | Google | legacy (Legacy reference) | $1.25 | $10.00 | 1M | Official pricing | Legacy long-context reasoning, research, multimodal analysis, and older workflows | Legacy reference. Input $1.25 / output $10.00 per 1M tokens. Kept for teams comparing older Gemini 2.5 forecasts with current Gemini 3.x models. |
| Gemini 2.5 Flash | Google | legacy (Legacy reference) | $0.30 | $2.50 | 1M | Official pricing | Legacy fast multimodal apps, high-volume tasks, and general product workflows | Legacy reference. Input $0.30 / output $2.50 per 1M tokens. Kept for backwards comparison with Gemini 2.5-era forecasts. |
| DeepSeek V4 Flash | DeepSeek | latest (Standard) | $0.14 (cached input: $0.003) | $0.28 | 1M | Official pricing | Very low-cost chat, coding, reasoning-style workflows, and cost-sensitive routing | Cache miss input / cache hit input / output. Input $0.14 / cached $0.003 / output $0.28 per 1M tokens. Supports non-thinking and thinking mode. Legacy deepseek-chat and deepseek-reasoner aliases map to this model family until deprecation. |
| DeepSeek V4 Pro | DeepSeek | latest (Standard) | $0.44 (cached input: $0.004) | $0.87 | 1M | Official pricing | Higher-capability low-cost reasoning, analysis, coding assistance, and long-context workflows | Cache miss input / cache hit input / output. Input $0.435 / cached $0.004 / output $0.87 per 1M tokens. Higher-capability DeepSeek tier with 1M context and 384K max output. |
| DeepSeek Chat | DeepSeek | deprecated (Alias until 2026-07-24 15:59 UTC) | $0.14 (cached input: $0.003) | $0.28 | 1M | Official pricing | Compatibility with old chat integrations only; use DeepSeek V4 Flash for new forecasts | Compatibility alias. Input $0.14 / cached $0.003 / output $0.28 per 1M tokens. DeepSeek says deepseek-chat will be deprecated on 2026-07-24 15:59 UTC and maps to DeepSeek V4 Flash non-thinking mode. |
| DeepSeek Reasoner | DeepSeek | deprecated (Alias until 2026-07-24 15:59 UTC) | $0.14 (cached input: $0.003) | $0.28 | 1M | Official pricing | Compatibility with old reasoning integrations only; use DeepSeek V4 Flash for new forecasts | Compatibility alias. Input $0.14 / cached $0.003 / output $0.28 per 1M tokens. DeepSeek says deepseek-reasoner will be deprecated on 2026-07-24 15:59 UTC and maps to DeepSeek V4 Flash thinking mode. |
| Grok 4.3 | xAI | latest (Standard) | $1.25 | $2.50 | 1M | Official pricing | General chat, agentic tool calling, reasoning-enabled workflows, and product assistants | Text model. Input $1.25 / output $2.50 per 1M tokens. xAI recommends Grok 4.3 for most text use cases and notes model aliases can move to newer versions. |
| Grok Build 0.1 | xAI | latest (Coding) | $1.00 | $2.00 | 256K | Official pricing | Coding agents, build tools, repository automation, and developer-focused AI products | Coding model. Input $1.00 / output $2.00 per 1M tokens. Fast coding model trained for agentic coding workflows. |
| Grok 3 | xAI | legacy (Legacy reference) | $3.00 | $15.00 | 131K | Official pricing | Legacy Grok forecasts and older product budgets | Legacy reference. Input $3.00 / output $15.00 per 1M tokens. Kept for teams comparing older Grok 3 forecasts with current Grok 4.3. |
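
Several rows above list a cached input price and a higher long-context tier. The sketch below shows one way to fold both into an estimate, using the Gemini 3.1 Pro Preview figures from the table. It assumes the tier is chosen by total prompt size and applies to the whole request, which may not match how a provider actually bills; the names `Tier`, `estimate_cost`, and the token counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tier:
    """Prices in dollars per 1M tokens."""
    input_price: float
    cached_price: float
    output_price: float

# Gemini 3.1 Pro Preview figures from the table above.
SHORT_TIER = Tier(input_price=2.00, cached_price=0.20, output_price=12.00)
LONG_TIER = Tier(input_price=4.00, cached_price=0.40, output_price=18.00)
LONG_PROMPT_THRESHOLD = 200_000  # prompts over 200K tokens cost more

def estimate_cost(prompt_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost.

    Assumption: the tier depends on total prompt tokens (cached plus
    uncached) and applies to every token in the request.
    """
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    tier = LONG_TIER if prompt_tokens > LONG_PROMPT_THRESHOLD else SHORT_TIER
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * tier.input_price
        + cached_tokens * tier.cached_price
        + output_tokens * tier.output_price
    ) / 1_000_000

# Hypothetical requests: same cached prefix, different prompt sizes.
short_request = estimate_cost(prompt_tokens=150_000, cached_tokens=100_000, output_tokens=2_000)
long_request = estimate_cost(prompt_tokens=300_000, cached_tokens=100_000, output_tokens=2_000)
print(f"short tier: ${short_request:.3f}, long tier: ${long_request:.3f}")
```

Under these assumptions the shorter request comes to about $0.144 and the longer one to about $0.876, so crossing the threshold matters more than the token increase alone would suggest.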

How to read model pricing

Most AI providers list separate prices for input tokens and output tokens. Input tokens are the instructions, context, documents, and user messages you send. Output tokens are generated by the model and are often the more expensive part of a workflow.
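
As a rough illustration of how the two prices combine, the sketch below estimates the cost of a single request from per-1M-token prices. The function name `request_cost` and the token counts are hypothetical; the prices are the Claude Sonnet 4.6 figures from the table above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the cost in dollars of one request from per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical request: 2,000 input tokens and 500 output tokens,
# priced at $3.00 input / $15.00 output per 1M tokens.
input_part = request_cost(2_000, 0, 3.00, 15.00)
output_part = request_cost(0, 500, 3.00, 15.00)
total = request_cost(2_000, 500, 3.00, 15.00)
print(f"input ${input_part:.4f} + output ${output_part:.4f} = ${total:.4f}")
```

In this example the output side costs more than the input side even though the request contains four times as many input tokens.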

Example: cheap input can still become expensive

A model with low input pricing can still be costly if it generates long answers, requires retries, or needs long prompts to achieve acceptable quality. Compare the total workflow cost, not just the lowest price in one column.
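
A small sketch of that comparison, using the GPT-5.4 nano and GPT-5.4 mini prices from the table. The token counts and failure rates are invented for illustration, and the retry model is an assumption: each failed attempt is retried until it succeeds, attempts fail independently, and every attempt costs the same.

```python
def expected_task_cost(input_tokens: int, output_tokens: int,
                       input_price_per_m: float, output_price_per_m: float,
                       failure_rate: float) -> float:
    """Expected cost of one successful task, retrying until success.

    Assumption: attempts fail independently with probability failure_rate,
    so the expected number of attempts is 1 / (1 - failure_rate).
    """
    if not 0 <= failure_rate < 1:
        raise ValueError("failure_rate must be in [0, 1)")
    per_attempt = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    expected_attempts = 1 / (1 - failure_rate)
    return per_attempt * expected_attempts

# Hypothetical workload: the cheaper model needs a long few-shot prompt,
# writes longer answers, and fails more often.
nano_cost = expected_task_cost(6_000, 800, 0.20, 1.25, failure_rate=0.30)
mini_cost = expected_task_cost(1_000, 400, 0.75, 4.50, failure_rate=0.05)

print(f"GPT-5.4 nano: ${nano_cost:.5f} per completed task")
print(f"GPT-5.4 mini: ${mini_cost:.5f} per completed task")
```

With these made-up inputs the model with the lower list price ends up costing more per completed task. Real token counts and failure rates have to come from measurement on the actual workload.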

When to use premium models

Premium models make sense for workflows where a better answer saves human review time, reduces failed tasks, or supports high-value customers. For routing, extraction, classification, and simple transformations, a smaller model is often enough.
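
One way to test that trade-off is to add review time to the model cost. The sketch below does this with the GPT-5.5 and GPT-5.4 mini prices from the table; the token counts, review minutes, and reviewer hourly rate are hypothetical placeholders, not measured values.

```python
def model_cost(input_tokens: int, output_tokens: int,
               input_price_per_m: float, output_price_per_m: float) -> float:
    """Model cost in dollars for one task."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

def cost_per_task(model_dollars: float, review_minutes: float,
                  reviewer_hourly_rate: float) -> float:
    """Total cost of one task: model spend plus human review time."""
    return model_dollars + review_minutes / 60 * reviewer_hourly_rate

# Hypothetical task: 3,000 input tokens, 1,000 output tokens,
# reviewed by someone costing $30 per hour.
premium_total = cost_per_task(
    model_cost(3_000, 1_000, 5.00, 30.00),   # GPT-5.5 prices
    review_minutes=1.0, reviewer_hourly_rate=30.0)
small_total = cost_per_task(
    model_cost(3_000, 1_000, 0.75, 4.50),    # GPT-5.4 mini prices
    review_minutes=2.0, reviewer_hourly_rate=30.0)

print(f"premium: ${premium_total:.3f}, small: ${small_total:.3f}")
```

Here one extra minute of review outweighs the difference in model price. When review time is the same for both models, the comparison flips and the smaller model is cheaper.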

Frequently asked questions

Which AI model is cheapest?

The cheapest model depends on the task and output length. Small or lite models are usually best for classification, extraction, and high-volume simple tasks.

Should I always choose the lowest token price?

No. A cheaper model can cost more if it needs retries, produces lower quality, or requires longer prompts. Compare total workflow cost, not only list price.

How often should model prices be checked?

Check official provider pricing before major launches, pricing changes, and budget reviews. AI model prices and names can change quickly.