Model pricing database

AI Model Pricing Comparison

Q: Which AI model is cheapest?

The cheapest model depends on the task and output length. Small or lite models are usually best for classification, extraction, and high-volume simple tasks.

Q: Should I always choose the lowest token price?

No. A cheaper model can cost more if it needs retries, produces lower quality, or requires longer prompts. Compare total workflow cost, not only list price.

Q: How often should model prices be checked?

Check official provider pricing before major launches and budget reviews, and again whenever a provider announces a pricing change. AI model prices and names can change quickly.

Compare prices across leading models and choose the right cost structure for each workload.

Important pricing note

Disclaimer: All prices, token counts, forecasts, comparisons, and cost calculations are estimates for general planning only. They are not financial, tax, accounting, procurement, purchasing, or legal advice. AI providers may change pricing, billing units, model names, discounts, and terms at any time. Always verify current pricing on the provider's official pricing page. The official provider bill, billing dashboard, and invoice are the final source of truth.

Manual price maintenance

Model prices on this page are manually maintained planning data. They are not a live billing feed and may change after publication, so verify them against each provider's official pricing page.

All prices last updated: Jun 18, 2026.

Model	Provider	Status	Input / 1M tokens	Cached input / 1M tokens	Output / 1M tokens	Context	Source	Best for	Pricing basis	Notes
GPT-5.5	OpenAI	latest (Standard)	$5.00	$0.50	$30.00	Short / long	Official pricing	Frontier agents, advanced product workflows, high-quality reasoning, and premium user experiences	Standard short context	Long-context standard pricing is higher: $10 input, $1 cached input, $45 output per 1M tokens.
GPT-5.5 Pro	OpenAI	latest (Premium)	$30.00	Not listed	$180.00	Short / long	Official pricing	Premium reasoning, complex agent execution, deep analysis, and high-value enterprise workflows	Standard short context	Long-context standard pricing is higher: $60 input and $270 output per 1M tokens. Cached input is not listed for this tier.
GPT-5.4	OpenAI	latest (Standard)	$2.50	$0.25	$15.00	Short / long	Official pricing	Balanced frontier quality for assistants, coding, product features, and agent workflows	Standard short context	Long-context standard pricing is higher: $5 input, $0.50 cached input, $22.50 output per 1M tokens.
GPT-5.4 mini	OpenAI	latest (Standard)	$0.75	$0.075	$4.50	Short	Official pricing	High-volume SaaS features, support assistants, routing, coding helpers, and everyday AI workflows	Standard short context	Priority pricing is higher. Batch and Flex pricing are lower than Standard for eligible workloads.
GPT-5.4 nano	OpenAI	latest (Standard)	$0.20	$0.02	$1.25	Short	Official pricing	Very low-cost classification, extraction, lightweight summaries, and large-volume utility tasks	Standard short context	Lowest-cost current OpenAI text tier in this model set; use for routing, extraction, classification, and simple transformations.
GPT-5.4 Pro	OpenAI	latest (Premium)	$30.00	Not listed	$180.00	Short / long	Official pricing	Premium quality planning, coding, enterprise assistants, and high-stakes AI product workflows	Standard short context	Long-context standard pricing is higher: $60 input and $270 output per 1M tokens. Cached input is not listed for this tier.
chat-latest	OpenAI	specialized (ChatGPT category)	$5.00	$0.50	$30.00	Varies	Official pricing	Planning ChatGPT-style assistant costs when the product uses the chat-latest category	Specialized ChatGPT category	Use this for ChatGPT-style chat-latest estimates only. Production API model IDs may differ.
GPT-5.3 Codex	OpenAI	specialized (Codex)	$1.75	$0.175	$14.00	Varies	Official pricing	Coding agents, repository automation, developer tools, and code-heavy workflows	Specialized Codex category	Specialized coding model. Use for code-agent budgeting rather than general chat budgeting.
GPT-4.1	OpenAI	legacy	$2.00	Not listed	$8.00	1M	Official pricing	Legacy forecasts, compatibility comparisons, and older GPT-4.1-era budget plans	Legacy reference	Kept for users comparing older forecasts against newer GPT-5.x tiers.
GPT-4.1 mini	OpenAI	legacy	$0.40	Not listed	$1.60	1M	Official pricing	Legacy support copilots and older cost forecasts	Legacy reference	Kept for backwards comparison; GPT-5.4 mini should usually be tested for new forecasts.
GPT-4.1 nano	OpenAI	legacy	$0.10	Not listed	$0.40	1M	Official pricing	Legacy low-cost classification, extraction, summarization, and batch processing	Legacy reference	Kept for backwards comparison with older low-cost OpenAI forecasts.
Claude Fable 5	Anthropic	latest (Standard)	$10.00	$1.00	$50.00	1M	Official pricing	Premium frontier reasoning, agentic work, writing, coding, and complex enterprise assistants	Base input / cache hit / output	Anthropic also lists 5-minute and 1-hour cache write prices. This calculator uses base input, cache hit, and output prices.
Claude Mythos 5	Anthropic	limited availability	$10.00	$1.00	$50.00	1M	Official pricing	Premium workflows where Mythos access is available and quality matters more than unit cost	Base input / cache hit / output	Limited availability. Verify account access before using this in production budgets.
Claude Opus 4.8	Anthropic	latest (Standard)	$5.00	$0.50	$25.00	1M	Official pricing	Premium reasoning, writing, coding, and complex agent workflows with lower unit price than older Opus 4.1	Base input / cache hit / output	Anthropic notes Opus 4.7 and later use a new tokenizer that may use more tokens for the same fixed text.
Claude Sonnet 4.6	Anthropic	latest (Standard)	$3.00	$0.30	$15.00	1M	Official pricing	Balanced coding, knowledge work, analysis, and production assistants	Base input / cache hit / output	Good balanced Claude tier for production assistants, coding, and analysis.
Claude Haiku 4.5	Anthropic	latest (Standard)	$1.00	$0.10	$5.00	200K	Official pricing	Fast assistants, support triage, short responses, lightweight automation, and cost-sensitive Claude workloads	Base input / cache hit / output	Lowest-cost current Claude tier in this dataset.
Claude Opus 4.1	Anthropic	deprecated	$15.00	$1.50	$75.00	200K	Official pricing	Historical comparison against older premium Claude forecasts	Deprecated reference	Anthropic marks Claude Opus 4.1 as deprecated. Kept only for old forecast comparison.
Claude Sonnet 4	Anthropic	retired (except partner clouds)	$3.00	$0.30	$15.00	200K	Official pricing	Historical comparison for teams migrating older Sonnet 4 workloads	Retired reference	Anthropic marks Sonnet 4 as retired except on Bedrock and Vertex AI.
Gemini 3.1 Pro Preview	Google	preview	$2.00	$0.20	$12.00	Up to 1M	Official pricing	Latest Gemini Pro planning, multimodal understanding, agentic workflows, and coding experiments	Standard, prompts <= 200K tokens	Prompts over 200K tokens cost more: $4 input, $0.40 cached input, $18 output per 1M tokens.
Gemini 3 Flash Preview	Google	preview	$0.50	$0.05	$3.00	Up to 1M	Official pricing	Fast frontier-quality workflows, search-grounded experiences, and high-throughput product features	Standard text / image / video	Preview model pricing may change before stable release.
Gemini 3.1 Flash-Lite	Google	latest (Standard)	$0.25	$0.025	$1.50	Up to 1M	Official pricing	Cost-efficient high-volume agentic tasks, translation, routing, and simple data processing	Standard text / image / video	Google also lists Batch, Flex, and Priority pricing for this model.
Gemini 2.5 Pro	Google	legacy (reference)	$1.25	Not listed	$10.00	1M	Official pricing	Legacy long-context reasoning, research, multimodal analysis, and older workflows	Legacy reference	Kept for teams comparing older Gemini 2.5 forecasts with current Gemini 3.x models.
Gemini 2.5 Flash	Google	legacy (reference)	$0.30	Not listed	$2.50	1M	Official pricing	Legacy fast multimodal apps, high-volume tasks, and general product workflows	Legacy reference	Kept for backwards comparison with Gemini 2.5-era forecasts.
DeepSeek V4 Flash	DeepSeek	latest (Standard)	$0.14	$0.003	$0.28	1M	Official pricing	Very low-cost chat, coding, reasoning-style workflows, and cost-sensitive routing	Cache miss input / cache hit input / output	Supports non-thinking and thinking mode. Legacy deepseek-chat and deepseek-reasoner aliases map to this model family until deprecation.
DeepSeek V4 Pro	DeepSeek	latest (Standard)	$0.44	$0.004	$0.87	1M	Official pricing	Higher-capability low-cost reasoning, analysis, coding assistance, and long-context workflows	Cache miss input / cache hit input / output	The detailed listing gives the input price as $0.435 per 1M tokens. Higher-capability DeepSeek tier with 1M context and 384K max output.
DeepSeek Chat	DeepSeek	deprecated (alias until 2026-07-24 15:59 UTC)	$0.14	$0.003	$0.28	1M	Official pricing	Compatibility with old chat integrations only; use DeepSeek V4 Flash for new forecasts	Compatibility alias	DeepSeek says deepseek-chat will be deprecated on 2026-07-24 15:59 UTC and maps to DeepSeek V4 Flash non-thinking mode.
DeepSeek Reasoner	DeepSeek	deprecated (alias until 2026-07-24 15:59 UTC)	$0.14	$0.003	$0.28	1M	Official pricing	Compatibility with old reasoning integrations only; use DeepSeek V4 Flash for new forecasts	Compatibility alias	DeepSeek says deepseek-reasoner will be deprecated on 2026-07-24 15:59 UTC and maps to DeepSeek V4 Flash thinking mode.
Grok 4.3	xAI	latest (Standard)	$1.25	Not listed	$2.50	1M	Official pricing	General chat, agentic tool calling, reasoning-enabled workflows, and product assistants	Text model	xAI recommends Grok 4.3 for most text use cases and notes model aliases can move to newer versions.
Grok Build 0.1	xAI	latest (Coding)	$1.00	Not listed	$2.00	256K	Official pricing	Coding agents, build tools, repository automation, and developer-focused AI products	Coding model	Fast coding model trained for agentic coding workflows.
Grok 3	xAI	legacy (reference)	$3.00	Not listed	$15.00	131K	Official pricing	Legacy Grok forecasts and older product budgets	Legacy reference	Kept for teams comparing older Grok 3 forecasts with current Grok 4.3.
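
Several rows above carry a cached-input price and a higher price for long prompts. The sketch below shows one way those two details might enter an estimate, using the Gemini 3.1 Pro Preview figures from the table. The names `Price` and `request_cost` are hypothetical. The code also assumes the higher rate applies to the whole request once the prompt passes 200K tokens, which should be checked against the provider's billing rules.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Price:
    """Dollar prices per 1M tokens, copied by hand from the table."""
    input_per_m: float
    cached_per_m: float
    output_per_m: float


# Gemini 3.1 Pro Preview: standard pricing and the over-200K pricing.
STANDARD = Price(input_per_m=2.00, cached_per_m=0.20, output_per_m=12.00)
LONG_PROMPT = Price(input_per_m=4.00, cached_per_m=0.40, output_per_m=18.00)
LONG_PROMPT_THRESHOLD = 200_000


def request_cost(prompt_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate one request, billing cached prompt tokens at the cached rate.

    Assumption: a prompt over the threshold moves the entire request
    to the higher price tier.
    """
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    price = LONG_PROMPT if prompt_tokens > LONG_PROMPT_THRESHOLD else STANDARD
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * price.input_per_m
        + cached_tokens * price.cached_per_m
        + output_tokens * price.output_per_m
    )
    return total / TOKENS_PER_MILLION


short = request_cost(prompt_tokens=10_000, cached_tokens=8_000, output_tokens=1_000)
long = request_cost(prompt_tokens=250_000, cached_tokens=0, output_tokens=2_000)
print(f"short prompt, mostly cached: ${short:.4f}")
print(f"long prompt, no cache:       ${long:.4f}")
```

With these inputs the short request comes to about $0.0176 and the long one to about $1.036. The token counts are illustrative, not measurements.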

How to read model pricing

Most AI providers list separate prices for input tokens and output tokens. Input tokens are the instructions, context, documents, and user messages you send. Output tokens are generated by the model and are often the more expensive part of a workflow.
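
As a rough illustration of the input/output split, the sketch below prices one request using the Claude Sonnet 4.6 list prices from the table on this page ($3.00 input, $15.00 output per 1M tokens). The function name `split_cost` and the token counts are hypothetical.

```python
def split_cost(input_tokens, output_tokens, input_price, output_price):
    """Return (input_cost, output_cost) in dollars; prices are per 1M tokens."""
    input_cost = input_tokens * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    return input_cost, output_cost


# Hypothetical request: 1,500 tokens sent, 500 tokens generated.
input_cost, output_cost = split_cost(1_500, 500, 3.00, 15.00)
total = input_cost + output_cost
print(f"input ${input_cost:.4f} + output ${output_cost:.4f} = ${total:.4f}")
print(f"output share of cost: {output_cost / total:.1%}")
```

In this example the output is a quarter of the tokens but well over half of the cost, because each output token costs five times as much as an input token.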

Example: cheap input can still become expensive

A model with low input pricing can still be costly if it generates long answers, requires retries, or needs long prompts to achieve acceptable quality. Compare the total workflow cost, not just the lowest price in one column.
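
One way to put a number on total workflow cost is the expected cost per accepted result. The sketch below assumes every attempt costs the same and that failed attempts are retried until one succeeds, so the expected number of attempts is 1 / success rate. The function name, token counts, and success rates are all hypothetical; only the prices come from the table (the GPT-5.4 nano and GPT-5.4 mini rows), and nothing here describes how either model actually behaves.

```python
def cost_per_successful_task(input_tokens, output_tokens,
                             input_price, output_price, success_rate):
    """Expected dollars per accepted result; prices are per 1M tokens.

    Assumes identical, independent attempts retried until success,
    giving an expected attempt count of 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    per_attempt = (input_tokens * input_price
                   + output_tokens * output_price) / 1_000_000
    return per_attempt / success_rate


# Made-up scenario: the lower-priced tier needs a longer prompt, writes a
# longer answer, and passes review only half the time.
low_price = cost_per_successful_task(8_000, 2_000, 0.20, 1.25, success_rate=0.5)
higher_price = cost_per_successful_task(2_000, 500, 0.75, 4.50, success_rate=0.95)

print(f"lower list price:  ${low_price:.4f} per accepted result")
print(f"higher list price: ${higher_price:.4f} per accepted result")
```

Under these invented numbers the lower-priced tier costs roughly twice as much per accepted result. With real measurements the comparison can go either way, which is why it helps to measure prompt length, output length, and retry rate per workload.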

When to use premium models

Premium models make sense for workflows where a better answer saves human review time, reduces failed tasks, or supports high-value customers. For routing, extraction, classification, and simple transformations, a smaller model is often enough.
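
The rule of thumb above could be written as a small routing function. This is only a sketch: the tier labels "small" and "premium", the task names, and the `high_value` flag are hypothetical, and sending every unlisted task to the premium tier is an assumption to tune against real quality and cost data.

```python
# Task types the text describes as a good fit for smaller models.
SIMPLE_TASKS = {"routing", "extraction", "classification", "simple_transformation"}


def pick_tier(task_type: str, high_value: bool = False) -> str:
    """Return "small" or "premium" for a task (hypothetical tier labels)."""
    if high_value:
        # High-value customers or costly failures justify the premium tier.
        return "premium"
    if task_type in SIMPLE_TASKS:
        return "small"
    # Assumption: anything not known to be simple goes to the premium tier.
    return "premium"


for task in ("classification", "deep_analysis"):
    print(task, "->", pick_tier(task))
print("classification (high value) ->", pick_tier("classification", high_value=True))
```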

Frequently asked questions

Which AI model is cheapest?