Grok API Pricing in 2026: xAI's 2x Output Rule Changes the Math

Model	Input / 1M	Output / 1M	Context	Output multiplier
Grok 4.20	$1.25	$2.50	2M	2x
Grok 4.3	$1.25	$2.50	1M	2x
Grok Build 0.1	$1.00	$2.00	256K	2x

The pricing quirk everyone misses

Every provider charges more for output tokens than for input tokens, but the ratio varies: Claude holds 5x across its lineup, OpenAI runs 6x on its current models, and Gemini also sits at 6x. xAI charges 2x.

That single number rearranges the rankings for any output-heavy workload. The intuition for why is in the output multiplier piece, but here's the concrete version: generate a 5,000-token report from a 500-token brief, and on GPT-5.4 you pay $0.00125 input + $0.075 output, so the output is 98% of the bill. The same job on Grok 4.3: $0.000625 + $0.0125. Grok ends up roughly 6x cheaper on that task (about $0.013 versus $0.076) while being only 2x cheaper on input price.
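For anyone who wants to rerun that arithmetic with their own token counts, here is a minimal sketch. The Grok 4.3 rates come from the pricing table; the GPT-5.4 rates ($2.50 in, $15.00 out per 1M) are not quoted anywhere in this piece and are back-derived from the dollar figures above, so treat them as assumptions. `request_cost` is a hypothetical helper name, not part of any SDK.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Return (input_cost, output_cost) in dollars for a single request."""
    input_cost = input_tokens * input_per_m / 1_000_000
    output_cost = output_tokens * output_per_m / 1_000_000
    return input_cost, output_cost


# (input $/1M, output $/1M)
PRICES = {
    "GPT-5.4": (2.50, 15.00),  # assumption: inferred from the figures in the text
    "Grok 4.3": (1.25, 2.50),  # from the pricing table
}

totals = {}
for model, (in_rate, out_rate) in PRICES.items():
    # 500-token brief in, 5,000-token report out
    in_cost, out_cost = request_cost(500, 5_000, in_rate, out_rate)
    totals[model] = in_cost + out_cost
    share = out_cost / totals[model]
    print(f"{model}: ${in_cost:.6f} + ${out_cost:.4f} = ${totals[model]:.6f} "
          f"({share:.0%} output)")

ratio = totals["GPT-5.4"] / totals["Grok 4.3"]
print(f"GPT-5.4 costs {ratio:.1f}x as much as Grok 4.3 on this job")
```

Swapping in a different input:output split is a quick way to see how fast the gap narrows as prompts get longer relative to responses.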

The lineup: 4.3, 4.20, and Build

Grok 4.3 ($1.25 / $2.50, 1M context) is the general-purpose flagship at a mid-tier price — it sits between Gemini 3.5 Flash and GPT-5.4 on cost.

Grok 4.20 (same $1.25 / $2.50) is the interesting one: a 2M-token context window, the largest I track on a mainstream API, and an Arena score of 72 that puts it within three points of the frontier trio. Two million tokens is roughly 1.5 million words — a small codebase or a year of meeting transcripts in a single request.

Grok Build 0.1 ($1 / $2, 256K context) is the coding specialist. I compared it head-to-head against Claude's fast tier in the Grok Build vs Opus 4.8 Fast piece — the short version is that Build's economics shine for code generation volume, while Opus wins on the hardest debugging.

What the 2M context window really costs

A window that size deserves the same sober math I apply to every long-context model: filling 2M tokens of context costs $2.50 per request on Grok 4.20, every request, because input bills on every send. A 20-turn conversation that keeps 500K tokens in context pays for roughly 10M input tokens of resends — $12.50.
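The resend math is easy to sketch. This version assumes the context size stays constant across turns, that the full context is resent every turn, and that no prompt-caching discount applies (this article does not cover caching, so that is my simplification). The function names are hypothetical.

```python
GROK_420_INPUT_PER_M = 1.25  # dollars per 1M input tokens, from the pricing table


def input_cost(tokens, rate_per_m=GROK_420_INPUT_PER_M):
    """Dollar cost of sending `tokens` input tokens once."""
    return tokens * rate_per_m / 1_000_000


def conversation_input_cost(context_tokens, turns, rate_per_m=GROK_420_INPUT_PER_M):
    """Input cost of a conversation that resends a fixed-size context every turn.

    Simplification: ignores context growth and any caching discount.
    """
    return input_cost(context_tokens * turns, rate_per_m)


full_window = input_cost(2_000_000)                 # one request, window filled
long_chat = conversation_input_cost(500_000, 20)    # 20 turns at 500K each

print(f"Full 2M window, one request: ${full_window:.2f}")
print(f"20 turns with 500K in context: ${long_chat:.2f}")
```

In a real conversation the context usually grows turn by turn, so the true figure depends on where the 500K average falls; the sketch only reproduces the flat-context estimate.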

The honest use case for giant windows isn't conversation; it's single-shot analysis. One request, the entire corpus, one comprehensive answer: $2.50 of input is a bargain next to building a retrieval pipeline. For iterative work, retrieval still wins, as covered in the 1M-token context piece.

Where Grok wins on real workloads

Content generation at scale: drafting 10,000 product descriptions a month (200 tokens in, 600 out each, so 2M input and 6M output tokens in total) costs $2.50 input + $15 output = $17.50 on Grok 4.3, versus $3 + $54 = $57 on Gemini 3.5 Flash and $1.50 + $27 = $28.50 on GPT-5.4-mini. The 2x multiplier does the work.
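The same comparison as a batch calculation, ranked cheapest first. The Gemini 3.5 Flash and GPT-5.4-mini rates below are not listed in this article; I inferred them from the totals in the paragraph above, so they are assumptions to replace with current list prices. `batch_cost` is a hypothetical helper.

```python
# (input $/1M, output $/1M)
RATES = {
    "Grok 4.3": (1.25, 2.50),          # from the pricing table
    "Gemini 3.5 Flash": (1.50, 9.00),  # assumption: inferred from the totals in the text
    "GPT-5.4-mini": (0.75, 4.50),      # assumption: inferred from the totals in the text
}


def batch_cost(items, in_tokens_each, out_tokens_each, in_rate, out_rate):
    """Return (input_cost, output_cost) in dollars for a batch of identical jobs."""
    in_cost = items * in_tokens_each * in_rate / 1_000_000
    out_cost = items * out_tokens_each * out_rate / 1_000_000
    return in_cost, out_cost


# 10,000 product descriptions a month, 200 tokens in and 600 out each
monthly = {
    model: batch_cost(10_000, 200, 600, in_rate, out_rate)
    for model, (in_rate, out_rate) in RATES.items()
}

# Rank by total monthly spend, cheapest first.
for model, (in_cost, out_cost) in sorted(monthly.items(), key=lambda kv: sum(kv[1])):
    print(f"{model}: ${in_cost:.2f} + ${out_cost:.2f} = ${in_cost + out_cost:.2f}")
```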

Report and email generation, code scaffolding, data-to-text pipelines — anywhere the output is several times longer than the input, Grok's effective price lands near the budget tier while its quality score sits near the mid tier.

Where it doesn't win: input-dominated workloads. RAG with fat retrieved context, document classification, summarization of long inputs — there the $1.25 input rate faces Gemini Flash-Lite at $0.25 and DeepSeek V4 Flash at $0.10, and Grok is the expensive option. Match the multiplier to your input:output ratio; the calculator does this arithmetic for you.
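One way to make "match the multiplier to your ratio" concrete is to solve for the break-even point between two models. If r is the number of output tokens per input token, the cost per input token is proportional to (input rate + r × output rate), so two models tie at r = (in_a − in_b) / (out_b − out_a). The sketch below is my own illustration rather than the calculator's actual logic, and the non-Grok rates are inferred from figures earlier in this piece, so treat them as assumptions.

```python
def breakeven_output_ratio(a, b):
    """Output-to-input token ratio at which models a and b cost the same.

    a, b: (input $/1M, output $/1M).
    Returns None when there is no positive break-even, meaning one model
    is cheaper at every input:output mix.
    """
    (in_a, out_a), (in_b, out_b) = a, b
    if out_a == out_b:
        return None  # same output rate: the cheaper input rate always wins
    r = (in_a - in_b) / (out_b - out_a)
    return r if r > 0 else None


GROK_43 = (1.25, 2.50)           # from the pricing table
GPT_54_MINI = (0.75, 4.50)       # assumption: inferred from the workload totals
GEMINI_35_FLASH = (1.50, 9.00)   # assumption: inferred from the workload totals

r = breakeven_output_ratio(GROK_43, GPT_54_MINI)
print(f"Grok 4.3 vs GPT-5.4-mini: break-even at {r:.2f} output tokens per input token")

if breakeven_output_ratio(GROK_43, GEMINI_35_FLASH) is None:
    print("Grok 4.3 vs Gemini 3.5 Flash: no break-even, one model is cheaper at every mix")
```

Above the break-even ratio, the model with the lower output rate is cheaper; below it, the model with the lower input rate is. Under these assumed rates, Grok 4.3 overtakes GPT-5.4-mini once output exceeds a quarter of the input token count.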

Caveats and the bottom line

The caveats: xAI iterates in public, with beta-suffixed models appearing and graduating quickly — pin versions and keep a fallback. The ecosystem around tooling and enterprise compliance is younger than OpenAI's or Google's. And quality at the very top end still trails the frontier trio by a few Arena points, consistent with its mid-tier price.

The bottom line I'd give a friend: if your workload writes more than it reads, Grok is probably the cheapest credible option you're not considering, and Grok 4.20's quality score of 72 makes it more than a curiosity — see where it lands against Sonnet and the live pricing table for current numbers.

Frequently Asked Questions

How much does the Grok API cost?

As of June 2026: Grok 4.3 and Grok 4.20 both cost $1.25 per million input tokens and $2.50 per million output tokens; Grok Build 0.1 (coding) costs $1 / $2. A typical chat message costs about three-tenths of a cent.

Why does Grok's 2x output multiplier matter?

Most providers charge 5-6x more for output than input; xAI charges only 2x. For output-heavy work — long generations from short prompts — this makes Grok several times cheaper than nominally similar-priced rivals, because output dominates those bills.

What is the largest context window on the Grok API?

Grok 4.20 offers a 2M-token context window — the largest mainstream API window as of June 2026, roughly 1.5 million words per request. Filling it costs about $2.50 in input tokens per request.

Is Grok good for coding?

Grok Build 0.1 is xAI's dedicated coding model at $1/$2 per million tokens with a 256K context. Its economics are strong for high-volume code generation; for the hardest debugging and agentic work, frontier models still hold an edge.