The numbers, because they sound made up
When the gap is 50-100x, the usual model-selection framework inverts. The question stops being 'is the cheap model good enough to save money' and becomes 'is the expensive model better enough to justify 100x.' For a real slice of workloads — drafting, extraction, summarization, internal tools — I've found the honest answer is no.
The lineup: Flash, Pro, and R1
R1 ($0.70 / $2.50) is the reasoning specialist, the lineage that made DeepSeek famous in early 2025. Like all reasoning models it emits thinking tokens billed as output, so real costs per task run higher than the sticker suggests — the same dynamic as extended thinking on Claude, at a tiny fraction of the rate. My earlier deep-dive on R1 as a bargain reasoning model still holds; for budget reasoning overall, see the best budget reasoning picks.
Older V3.x models remain listed at sub-dollar prices but V4 Flash has mostly obsoleted them on both price and quality.
Why so cheap — and is it sustainable?
Sustainability is the fair question. My take from watching the feed daily: prices have been stable-to-falling for eighteen months, and because the weights are open, even a unilateral price hike wouldn't strand you — third-party hosts serve the same models at similar rates. The floor is set by competition among hosts, not by one company's pricing committee. That's a structural protection closed models can't offer, a theme I expand in the open-weight vs proprietary cost guide.
The caveats that actually matter
Operations second: the first-party API enforces tighter rate limits than US hyperscalers, and I've observed latency variance during Asia-Pacific peak hours that would bother an interactive product. Batch and overnight pipelines won't notice.
Jurisdiction third, stated plainly: DeepSeek is a Chinese company and the first-party API processes data under Chinese jurisdiction. For some compliance regimes that's disqualifying. The standard mitigation is using the open weights through a Western host (via OpenRouter, Together, Fireworks, and similar) at modestly higher prices — you keep most of the discount and move the data question to a provider you can contract with. I priced that route in the OpenRouter vs direct API comparison.
Where DeepSeek fits in a real stack
Concrete example I priced this week: a log-triage pipeline classifying 5M events a month (400 tokens in, 30 out each). V4 Flash: about $230/month. GPT-5.4-nano: $588. Gemini Flash-Lite: $725. Haiku 4.5: $2,750. Same task, same month, an order of magnitude of spread — check your own numbers on the calculator or the live table.