Qwen API Pricing

Qwen is Alibaba's family of open-weight language and multimodal models. The Qwen 2.5 and QwQ series rank near the top of open-weight benchmarks for coding and math, and offer strong multilingual support, particularly for Asian languages.

Official site: qwenlm.github.io

Cheapest

Qwen2.5 7B Instruct

$0.040/1M in

Flagship

Qwen3 Max

$0.780/1M in

Models

47 tracked

All tiers, latest pricing.

All Qwen Models

Model	Tier	Input / 1M tokens	Output / 1M tokens	Context (tokens)
Qwen2.5 7B Instruct	fast	$0.040	$0.100	33K
Qwen3.5-Flash	fast	$0.065	$0.260	1M
Qwen3 Coder 30B A3B Instruct	fast	$0.070	$0.270	262K
Qwen3 32B	fast	$0.080	$0.280	131K
Qwen3 235B A22B Instruct 2507	fast	$0.090	$0.550	262K
Qwen3 Next 80B A3B Thinking	reasoning	$0.098	$0.780	262K
Qwen3.5-9B	fast	$0.100	$0.150	262K
Qwen3 Next 80B A3B Instruct	fast	$0.100	$1.10	262K
Qwen3 30B A3B Instruct 2507	fast	$0.100	$0.300	262K
Qwen3 VL 32B Instruct	fast	$0.104	$0.416	131K
Qwen3 Coder Next	fast	$0.110	$0.800	262K
Qwen3 VL 8B Thinking	reasoning	$0.117	$1.36	131K
Qwen3 VL 8B Instruct	fast	$0.117	$0.455	262K
Qwen3 8B	fast	$0.117	$0.455	131K
Qwen3 VL 30B A3B Thinking	reasoning	$0.130	$1.56	262K
Qwen3 30B A3B Thinking 2507	reasoning	$0.130	$1.56	82K
Qwen3 30B A3B	fast	$0.130	$0.520	131K
Qwen3.6 35B A3B	fast	$0.140	$1.00	262K
Qwen3.5-35B-A3B	fast	$0.140	$1.00	262K
QwQ 32B	reasoning	$0.150	$0.600	131K
Qwen3 VL 30B A3B Instruct	fast	$0.150	$0.600	262K
Qwen3.6 Flash	fast	$0.188	$1.13	1M
Qwen3.5-27B	fast	$0.195	$1.56	262K
Qwen3 Coder Flash	fast	$0.195	$0.975	1M
Qwen3 VL 235B A22B Instruct	fast	$0.210	$1.90	262K
Qwen3 14B	fast	$0.228	$0.910	131K
Qwen3.5-122B-A10B	fast	$0.260	$2.08	262K
Qwen3.5 Plus 2026-02-15	fast	$0.260	$1.56	1M
Qwen3 VL 235B A22B Thinking	reasoning	$0.260	$2.60	131K
Qwen Plus 0728	fast	$0.260	$0.780	1M
Qwen-Plus	fast	$0.260	$0.780	1M
Qwen3.6 27B	fast	$0.289	$2.40	262K
Qwen3.5 Plus 2026-04-20	fast	$0.300	$1.80	1M
Qwen3 235B A22B Thinking 2507	reasoning	$0.300	$3.00	262K
Qwen3 Coder 480B A35B	fast	$0.300	$1.00	262K
Qwen3.7 Plus	balanced	$0.320	$1.28	1M
Qwen3.6 Plus	balanced	$0.325	$1.95	1M
Qwen 2.5 72B	balanced	$0.360	$0.400	33K
Qwen3.5 397B A17B	fast	$0.390	$2.34	262K
Qwen3 235B A22B	balanced	$0.455	$1.82	131K
Qwen3 Coder Plus	balanced	$0.650	$3.25	1M
Qwen2.5 Coder 32B Instruct	balanced	$0.660	$1.00	33K
Qwen3 Max Thinking	reasoning	$0.780	$3.90	262K
Qwen3 Max	flagship	$0.780	$3.90	262K
Qwen2.5 VL 72B Instruct	flagship	$0.800	$1.00	128K
Qwen3.6 Max Preview	flagship	$1.04	$6.24	262K
Qwen3.7 Max	flagship	$1.48	$4.42	1M
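
As a rough illustration of how the table above can be used, the sketch below parses a handful of its rows and picks the cheapest model (by input price) whose context window meets a minimum. The names `ModelPrice`, `parse_context`, `parse_row`, and `cheapest_with_context` are hypothetical helpers invented for this example, and reading "33K" as 33,000 tokens and "1M" as 1,000,000 tokens is an assumption about how the Context column is rounded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    tier: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: int         # context window in tokens (approximate)


def parse_context(text: str) -> int:
    """Convert a table value such as '33K' or '1M' to a token count.

    Assumption: K means 1,000 and M means 1,000,000.
    """
    units = {"K": 1_000, "M": 1_000_000}
    return int(float(text[:-1]) * units[text[-1]])


def parse_row(row: str) -> ModelPrice:
    """Parse one tab-separated row in the same layout as the table."""
    name, tier, inp, out, ctx = row.split("\t")
    return ModelPrice(name, tier, float(inp.lstrip("$")),
                      float(out.lstrip("$")), parse_context(ctx))


# A few rows copied from the table above.
ROWS = [
    "Qwen2.5 7B Instruct\tfast\t$0.040\t$0.100\t33K",
    "Qwen3.5-Flash\tfast\t$0.065\t$0.260\t1M",
    "Qwen3 32B\tfast\t$0.080\t$0.280\t131K",
    "QwQ 32B\treasoning\t$0.150\t$0.600\t131K",
    "Qwen3 Max\tflagship\t$0.780\t$3.90\t262K",
]


def cheapest_with_context(models, min_context):
    """Return the lowest input-price model with at least min_context tokens."""
    eligible = [m for m in models if m.context >= min_context]
    return min(eligible, key=lambda m: m.input_per_m, default=None)


models = [parse_row(r) for r in ROWS]
pick = cheapest_with_context(models, 100_000)
print(pick.name if pick else "no model fits")
```

Sorting on input price alone is a simplification. Output-heavy workloads may be better served by ranking on output price or on a blend of the two.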

Model Details

Qwen2.5 7B Instruct

$0.040 in

Qwen2.5 7B Instruct is an instruction-tuned model from the Qwen2.5 series of Qwen large language models.

Qwen3.5-Flash

$0.065 in

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3 Coder 30B A3B Instruct

$0.070 in

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use.
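
To make "128 experts (8 active per forward pass)" concrete, here is a minimal sketch of generic top-k expert routing. It is not confirmed to be the router this model uses: the `route` function, the random logits, and the choice to apply softmax over only the selected experts are all assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 128  # total experts, from the description above
TOP_K = 8          # experts active per forward pass, from the description above


def route(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only (an assumption;
    real routers may normalize differently).
    """
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in router scores for one token; a real model computes these
# from the token's hidden state.
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route(logits)
print(len(selected), round(sum(w for _, w in selected), 6))
```

Only the selected experts run for a given token, which is why a model with a large total parameter count can have a much smaller active parameter count per token.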

Qwen3 32B

$0.080 in

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue.

Qwen3 235B A22B Instruct 2507

$0.090 in

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass.

Qwen3 Next 80B A3B Thinking

$0.098 in

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default.

Qwen3.5-9B

$0.100 in

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and visual understanding in an efficient 9B-parameter architecture.

Qwen3 Next 80B A3B Instruct

$0.100 in

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces.

Qwen3 30B A3B Instruct 2507

$0.100 in

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference.

Qwen3 VL 32B Instruct

$0.104 in

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video.

Qwen3 Coder Next

$0.110 in

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows.

Qwen3 VL 8B Thinking

$0.117 in

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences.

Qwen3 VL 8B Instruct

$0.117 in

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video.

Qwen3 8B

$0.117 in

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue.

Qwen3 VL 30B A3B Thinking

$0.130 in

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks.

Qwen3 30B A3B Thinking 2507

$0.130 in

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking.

Qwen3 30B A3B

$0.130 in

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks.

Qwen3.6 35B A3B

$0.140 in

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token.

Qwen3.5-35B-A3B

$0.140 in

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency.

QwQ 32B

$0.150 in

QwQ 32B is Qwen's open-weight reasoning model, with chain-of-thought capabilities that rival much larger models at a low price. It is one of the best-value reasoning models available.

Qwen3 VL 30B A3B Instruct

$0.150 in

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks.

Qwen3.6 Flash

$0.188 in

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video input with a 1M token context window.

Qwen3.5-27B

$0.195 in

The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance.

Qwen3 Coder Flash

$0.195 in

Qwen3 Coder Flash is Alibaba's fast, cost-efficient version of its proprietary Qwen3 Coder Plus.

Qwen3 VL 235B A22B Instruct

$0.210 in

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video.

Qwen3 14B

$0.228 in

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue.

Qwen3.5-122B-A10B

$0.260 in

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3.5 Plus 2026-02-15

$0.260 in

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.

Qwen3 VL 235B A22B Thinking

$0.260 in

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video.

Qwen Plus 0728

$0.260 in

Qwen Plus 0728, based on the Qwen3 foundation model, is a hybrid reasoning model with a 1 million token context window that balances performance, speed, and cost.

Qwen-Plus

$0.260 in

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model that balances performance, speed, and cost.

Qwen3.6 27B

$0.289 in

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026.

Qwen3.5 Plus 2026-04-20

$0.300 in

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video input and produces text output, with a 1M token context window.

Qwen3 235B A22B Thinking 2507

$0.300 in

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks.

Qwen3 Coder 480B A35B

$0.300 in

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team.

Qwen3.7 Plus

$0.320 in

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series.

Qwen3.6 Plus

$0.325 in

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference.

Qwen 2.5 72B

$0.360 in

Qwen 2.5 72B is Alibaba's flagship open-weight model. It is strong on coding and math benchmarks and competitive with GPT-4o-class models at a fraction of the cost, making it a top choice for self-hosted general workloads.

Qwen3.5 397B A17B

$0.390 in

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3 235B A22B

$0.455 in

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass.

Qwen3 Coder Plus

$0.650 in

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B.

Qwen2.5 Coder 32B Instruct

$0.660 in

Qwen2.5-Coder is a series of code-specific Qwen large language models (formerly known as CodeQwen).

Qwen3 Max Thinking

$0.780 in

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning.

Qwen3 Max

$0.780 in

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version.

Qwen2.5 VL 72B Instruct

$0.800 in

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

Qwen3.6 Max Preview

$1.04 in

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters.

Qwen3.7 Max

$1.48 in

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series.

Calculate Qwen API Costs

Use the TokenRate calculator to estimate what Qwen models will cost for your workload.
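
For a quick back-of-the-envelope estimate, the arithmetic can be sketched as follows. This assumes flat per-token billing at the listed rates, with no caching discounts, volume tiers, or minimum charges, none of which this page describes. The `request_cost` helper and the workload figures (10,000 requests of 2,000 input and 500 output tokens) are made up for illustration.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Cost in USD of one request, given prices per 1M tokens."""
    return (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)


# Hypothetical workload: 10,000 requests per month,
# each with 2,000 input tokens and 500 output tokens.
REQUESTS = 10_000
IN_TOKENS, OUT_TOKENS = 2_000, 500

# Rates taken from the table on this page (USD per 1M tokens).
qwen3_max = REQUESTS * request_cost(IN_TOKENS, OUT_TOKENS, 0.780, 3.90)
qwen25_7b = REQUESTS * request_cost(IN_TOKENS, OUT_TOKENS, 0.040, 0.100)

print(f"Qwen3 Max:           ${qwen3_max:.2f}")   # $35.10
print(f"Qwen2.5 7B Instruct: ${qwen25_7b:.2f}")   # $1.30
```

Because output tokens are priced higher than input tokens for every model listed, the output side often dominates the bill for generation-heavy workloads.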

