Models

A Claude AI model is one of Anthropic’s Opus, Sonnet, or Haiku systems: start with Sonnet 4.6 for most work, move simple high-volume tasks to Haiku 4.5, and reserve Opus 4.7 for the hardest reasoning, coding, and long-context analysis; use our Claude AI guide for the broader ecosystem.

Claude AI Models — Opus, Sonnet & Haiku Compared — hero illustration.
Claude AI Models — Opus, Sonnet & Haiku Compared

Table of contents

  • Independent guide · c-ai.chat is not Anthropic.
  • Official product · use claude.ai for Claude itself.
  • Three main families · Opus, Sonnet, and Haiku.
  • API billing · priced per million input and output tokens.

The current Claude lineup

The lineup has three main choices. Opus is the flagship tier. Sonnet is the default for most users and applications. Haiku is the fast, low-cost tier for simple tasks. Anthropic lists official model details in its model overview and publishes pricing at claude.com/pricing.

Claude Sonnet 4.6

$3/M input · $15/M output tokens

Best default for writing, analysis, coding, agents, and everyday professional work.

  • Role: best balance of quality, speed, and cost
  • Context window: 1,000,000 tokens
  • Max output: 128,000 tokens

Claude Haiku 4.5

$1/M input · $5/M output tokens

Fast, low-cost model for routing, extraction, classification, and short automation.

  • Role: speed and cost control
  • Best for clear, repeated, low-risk tasks
  • Use when long context is not the main requirement

Model choice affects quality, latency, and cost. It also changes how you design prompts, how much context you send, and whether you route tasks between models. If you are choosing Claude by product capability rather than model capability, use our Claude features guide with this page.

Family Current model Primary role API price Context and output note
Opus Claude Opus 4.7 Flagship quality for difficult work $5/M input · $25/M output tokens 1,000,000-token context
Sonnet Claude Sonnet 4.6 Recommended default $3/M input · $15/M output tokens 1,000,000-token context · 128,000-token max output
Haiku Claude Haiku 4.5 Fast, low-cost automation $1/M input · $5/M output tokens Best when the task is short, structured, and repeated

How the families differ

Concept diagram for claude ai model
Concept diagram for claude ai model

Opus, Sonnet, and Haiku are separate capability and cost tiers. Opus gives the most headroom for hard reasoning, nuanced writing, complex code work, and multi-document synthesis. Test Opus when the inputs are messy, the answer must be careful, or the cost of a wrong answer is high.

Sonnet is the practical center of the lineup. It is strong enough for most business writing, research support, software development, data analysis, and agent workflows, while costing less than Opus in the API. Most teams should start with Sonnet, measure failures, and move only the hard cases upward.

Haiku is the speed and cost model. Use it for classification, extraction, short summaries, routing, tagging, support triage, and other repeated tasks with clear instructions. Do not use Haiku as the first choice for subtle analysis, difficult architecture decisions, or prompts with many competing constraints.

Good fit by family

  • Opus 4.7: hard reasoning, complex code review, high-stakes synthesis, long documents, ambiguous instructions.
  • Sonnet 4.6: everyday chat, writing, analysis, coding, agents, document work, product integrations.
  • Haiku 4.5: fast extraction, classification, routing, tagging, short summaries, high-volume background jobs.

Change model when

  • Opus 4.7: the task is simple enough that the extra quality does not justify the higher API cost.
  • Sonnet 4.6: repeated failures show that the task needs more reasoning headroom or a cheaper batch route.
  • Haiku 4.5: the work needs nuance, long-context analysis, or careful multi-step judgment.

The common mistake is treating the flagship model as the default for every task. That can hide weak prompt design and raise costs without improving outcomes. A better pattern is Sonnet for the main workflow, Haiku for simple sub-tasks, and Opus for escalation.

Context windows and output limits

Abstract scene of using Claude AI
Abstract scene of using Claude AI

A context window is the amount of material Claude can consider in one request: instructions, chat history, file text, retrieved passages, tool output, and response budget. In the current lineup covered here, Opus 4.7 and Sonnet 4.6 have 1,000,000-token context. Sonnet 4.6 also lists a 128,000-token max output.

1,000,000 tokens

context for Opus 4.7 and Sonnet 4.6

128,000 tokens

max output for Sonnet 4.6

Long context helps with large codebases, contracts, research sets, logs, and document collections. It is not a substitute for structure. If you paste an entire archive into a prompt, Claude may still miss details, over-weight irrelevant sections, or spend tokens on material that should have been filtered first.

  • Use long context when the model must compare many sections at once.
  • Use retrieval when only a small part of a large corpus is relevant.
  • Ask for citations or section references when accuracy depends on source text.
  • Keep output instructions tight, because output tokens are the expensive side of most requests.
  • Test shorter prompts against long-context prompts before assuming bigger is better.

What each model costs

Claude API pricing is based on input tokens and output tokens. Input tokens are the text, files, retrieved context, tool results, and instructions you send to the model. Output tokens are the text the model generates. Anthropic publishes official pricing at claude.com/pricing and in the Claude platform pricing docs.

For subscriptions, annual billing, team plans, and API billing basics, use our Claude pricing guide. If you are building with model IDs, request bodies, prompt caching, or streaming, use our Claude API docs guide with Anthropic’s official docs.

Model Input price Output price Cost profile Practical use
Claude Opus 4.7 $5/M tokens $25/M tokens Highest cost in the current lineup Reserve for difficult work, final review, high-stakes reasoning, and complex code tasks.
Claude Sonnet 4.6 $3/M tokens $15/M tokens Middle cost, strong quality Use as the default for most professional and developer workflows.
Claude Haiku 4.5 $1/M tokens $5/M tokens Lowest cost in the current lineup Use for fast, repeated, simple, or background tasks.

Claude web plans and API rates are separate. A claude.ai subscription controls product access and usage allowances. API usage is billed by model and token volume unless your account terms say otherwise.

Plan Published price Model-choice note
Free $0 Good for trying Claude, not for estimating API spend.
Pro $20/mo or $17/mo with annual billing Personal plan pricing is separate from API token pricing.
Max From $100/mo Higher product access does not remove API token costs.
Team Standard $25/seat or $20/seat with annual billing Team plan pricing is separate from model API rates.
Team Premium $125/seat or $100/seat with annual billing Use official pricing for plan features and limits.
Enterprise $20/seat base + API rates Budget for both the seat base and model usage.

90% off

cached input tokens with prompt caching

50% off

input and output tokens with the Batch API

Prompt caching is most useful when many requests share the same long instructions, reference documents, schemas, or tool descriptions. It applies to cached input tokens, not every token. Batch API pricing can reduce costs for eligible non-urgent workloads, but it is not a fit for real-time chat.

Optimization lever What it reduces When to use it Limitation
Prompt caching Cached input tokens Repeated prompts with shared instructions, policies, reference text, or schemas It applies to cached input, not all tokens.
Batch API Input and output tokens on eligible batch jobs Offline jobs, bulk summarization, data processing, and non-urgent automation It is not for real-time chat or interactive flows.
Model routing Unnecessary use of larger models Workflows with easy cases, hard cases, and clear escalation rules It requires evaluation so routing does not lower quality.
Shorter outputs Output tokens Reports, summaries, extraction, and structured answers Too much compression can remove useful detail.

Older versions and version history

Abstract ecosystem illustration
Abstract ecosystem illustration

Claude model names combine a family name with a version. Older docs, saved workflows, vendor integrations, and example code may still mention previous model IDs. Check Anthropic’s official model overview before copying an old ID into production.

Reference How to treat it Action for users
Opus 4.7 Current flagship model Evaluate for the hardest tasks, long-context work, and final review.
Opus 4.6 Previous flagship at $5/M input and $25/M output tokens Keep only when a pinned workflow still performs well; compare with Opus 4.7 before new work.
Sonnet 4.6 Current default model for most work Start here for most API and claude.ai workflows.
Haiku 4.5 Current speed and cost model Use for clear, repeated, low-risk automation.
Claude 4.0 or 3.x references Older examples or legacy integrations Check deprecation notes, migration guidance, and test results before relying on them.

Deprecation policy matters for production systems. Anthropic can retire older model IDs, change aliases, or recommend migration paths. Avoid silent model changes. Pin deliberate versions where appropriate, monitor official notices, and test migrations before switching high-value workflows.

Which model should you use?

The safest rule is simple: start with Claude Sonnet 4.6, move simple repeated work down to Claude Haiku 4.5, and move difficult or high-stakes work up to Claude Opus 4.7.

  1. Start with Sonnet.

    Use Sonnet 4.6 for the first version of a workflow. Measure answer quality, latency, and cost before changing models.

  2. Move simple sub-tasks to Haiku.

    Route classification, tagging, extraction, and short summaries to Haiku 4.5 when the instructions are clear and the risk is low.

  3. Escalate hard cases to Opus.

    Send prompts to Opus 4.7 when Sonnet struggles with reasoning, ambiguity, multi-file code review, or high-stakes synthesis.

  4. Use real prompts for testing.

    Benchmarks help, but your own data matters more. Test the prompts, files, tools, and edge cases your users will actually send.

  5. Review the model mix regularly.

    Usage patterns change. Recheck routing rules, cached prompts, output length, and model quality after major workflow changes.

Example routing setup

Support inbox triage

Use Haiku 4.5 to classify incoming tickets, Sonnet 4.6 to draft replies, and Opus 4.7 to review edge cases involving refunds, policy exceptions, or legal risk.

Pick when

  • Opus 4.7: legal-style review, complex debugging, architecture trade-offs, long research synthesis, difficult planning.
  • Sonnet 4.6: product chat, internal assistants, writing, data analysis, agents, most coding, document workflows.
  • Haiku 4.5: routing, classification, extraction, short summaries, support triage, bulk low-risk processing.

Skip when

  • Opus 4.7: the task is routine, the budget is tight, or faster cheaper routing works.
  • Sonnet 4.6: the task is either trivial enough for Haiku or difficult enough to justify Opus.
  • Haiku 4.5: the answer must handle nuance, long context, complex coding, or costly judgment calls.
Workload Start with Move up when Move down when
General business writing Sonnet 4.6 The output needs more nuance, strategy, or final-review quality. The task is template-based or only needs short transformations.
Software development Sonnet 4.6 The task involves multi-file reasoning, difficult bugs, or architecture trade-offs. The task is simple formatting, extraction, or code classification.
Claude Code workflows Sonnet 4.6 The session needs more reasoning across a large codebase. The task is simple explanation or repeated boilerplate support.
Document analysis Sonnet 4.6 The corpus is large, conflicting, or high risk. The task is extraction into a fixed schema.
Support triage Haiku 4.5 Customer intent is ambiguous or escalation quality matters. The task is simple tagging or routing.
Research synthesis Sonnet 4.6 The source set is long, dense, or decision-critical. The request only needs short summaries of known material.

For coding-heavy work, model choice is only part of the decision. Tool access, repository context, terminal behavior, and review flow also matter. Use our Claude resources for practical workflows and implementation notes.

Honest take

Do not start with the largest Claude AI model by default. Ask which model is reliable enough for the task. For most professional use, that answer is Sonnet 4.6. Use Haiku 4.5 for simple volume work. Use Opus 4.7 when extra capability changes the outcome.

The lineup is clear, but model names do not replace testing. Real prompts, real files, and real failure cases will tell you more than a label. For quick questions about accounts, limits, and plan choices, use our Claude FAQ.

Use the official product — Open Claude at claude.ai when you want to test model behavior directly.

Open Claude

FAQ

What is a Claude AI model?

A Claude AI model is one of Anthropic’s Claude model options. The main families are Opus, Sonnet, and Haiku. Each has a different balance of capability, speed, and API cost.

Which Claude model is best for most users?

Claude Sonnet 4.6 is the best default for most users. It balances reasoning quality, writing ability, coding support, speed, and cost better than starting every task on Opus or Haiku.

Is Claude Opus 4.7 better than Sonnet 4.6?

Opus 4.7 is the flagship model and is the stronger choice for the hardest prompts. Sonnet 4.6 is the better default for most work because it costs less in the API and is strong enough for many professional tasks.

What is Claude Haiku 4.5 for?

Haiku 4.5 is for fast, low-cost tasks. Common examples include classification, extraction, routing, tagging, support triage, and short summaries. Use Sonnet or Opus when the task needs more reasoning or nuance.

How much do Claude models cost in the API?

Claude Opus 4.7 costs $5/M input tokens and $25/M output tokens. Claude Sonnet 4.6 costs $3/M input tokens and $15/M output tokens. Claude Haiku 4.5 costs $1/M input tokens and $5/M output tokens. Check the official pricing page before budgeting.

Which Claude models support 1,000,000-token context?

This guide lists 1,000,000-token context for Claude Opus 4.7 and Claude Sonnet 4.6. Sonnet 4.6 also lists a 128,000-token max output. Use Anthropic’s official model overview for current model IDs, limits, and availability notes.

Are older Claude model IDs safe to use in production?

They can be, but only if you track Anthropic’s deprecation policy and test migrations. Do not rely on an old model ID just because an example or integration still shows it.

Are Claude plans and API pricing the same?

No. Claude plans control product access and usage allowances. API pricing bills by input and output tokens. Enterprise pricing includes a $20/seat base plus API rates.

Sources

Independent guide. Not affiliated with Anthropic. For the official Claude product, visit claude.ai.