TokenRate

Qwen API Pricing

Qwen is Alibaba's family of open-weight language and multimodal models. The Qwen 2.5 and QwQ series consistently top open-source benchmarks for coding and math, with strong multilingual support across Asian languages.

Official site: qwenlm.github.io

Cheapest

Qwen2.5 7B Instruct

$0.040/1M in

Flagship

Qwen2.5 VL 72B Instruct

$0.250/1M in

Models

47 tracked

All tiers, latest pricing.

All Qwen Models

| Model | Tier | Input / 1M | Output / 1M | Context |
| --- | --- | --- | --- | --- |
| Qwen2.5 7B Instruct | fast | $0.040 | $0.100 | 131K |
| Qwen3 30B A3B Instruct 2507 | fast | $0.048 | $0.193 | 131K |
| Qwen3 8B | fast | $0.050 | $0.400 | 131K |
| Qwen3.5-Flash | fast | $0.065 | $0.260 | 1M |
| Qwen3 Coder 30B A3B Instruct | fast | $0.070 | $0.270 | 160K |
| Qwen3 VL 8B Instruct | fast | $0.080 | $0.500 | 256K |
| Qwen3 30B A3B Thinking 2507 | reasoning | $0.080 | $0.400 | 131K |
| Qwen3 32B | fast | $0.080 | $0.280 | 131K |
| Qwen3 Next 80B A3B Instruct | fast | $0.090 | $1.10 | 262K |
| Qwen3 235B A22B Instruct 2507 | fast | $0.090 | $0.100 | 262K |
| Qwen3 Next 80B A3B Thinking | reasoning | $0.098 | $0.780 | 262K |
| Qwen3.5-9B | fast | $0.100 | $0.150 | 262K |
| Qwen3 235B A22B Thinking 2507 | reasoning | $0.100 | $0.100 | 262K |
| Qwen3 14B | fast | $0.100 | $0.240 | 132K |
| Qwen3 VL 32B Instruct | fast | $0.104 | $0.416 | 262K |
| Qwen3 Coder Next | fast | $0.110 | $0.800 | 262K |
| Qwen3 VL 8B Thinking | reasoning | $0.117 | $1.36 | 256K |
| Qwen3 30B A3B | fast | $0.120 | $0.500 | 131K |
| Qwen3 VL 30B A3B Thinking | reasoning | $0.130 | $1.56 | 131K |
| Qwen3 VL 30B A3B Instruct | fast | $0.130 | $0.520 | 262K |
| Qwen3.6 35B A3B | fast | $0.140 | $1.00 | 262K |
| Qwen3.5-35B-A3B | fast | $0.140 | $1.00 | 262K |
| QwQ 32B | reasoning | $0.150 | $0.600 | 131K |
| Qwen3.6 Flash | fast | $0.188 | $1.13 | 1M |
| Qwen3.5-27B | fast | $0.195 | $1.56 | 262K |
| Qwen3 Coder Flash | fast | $0.195 | $0.975 | 1M |
| Qwen3 VL 235B A22B Instruct | fast | $0.200 | $0.880 | 262K |
| Qwen3 Coder 480B A35B | fast | $0.220 | $1.80 | 1M |
| Qwen2.5 VL 72B Instruct | flagship | $0.250 | $0.750 | 131K |
| Qwen3.5-122B-A10B | fast | $0.260 | $2.08 | 262K |
| Qwen3.5 Plus 2026-02-15 | fast | $0.260 | $1.56 | 1M |
| Qwen3 VL 235B A22B Thinking | reasoning | $0.260 | $2.60 | 131K |
| Qwen Plus 0728 (thinking) | reasoning | $0.260 | $0.780 | 1M |
| Qwen-Plus | fast | $0.260 | $0.780 | 1M |
| Qwen3.6 27B | fast | $0.289 | $2.40 | 262K |
| Qwen3.5 Plus 2026-04-20 | fast | $0.300 | $1.80 | 1M |
| Qwen3.6 Plus | balanced | $0.325 | $1.95 | 1M |
| Qwen 2.5 72B | balanced | $0.360 | $0.400 | 131K |
| Qwen3.5 397B A17B | fast | $0.390 | $2.34 | 262K |
| Qwen3.7 Plus | balanced | $0.400 | $1.60 | 1M |
| Qwen3 235B A22B | balanced | $0.455 | $1.82 | 131K |
| Qwen3 Coder Plus | balanced | $0.650 | $3.25 | 1M |
| Qwen2.5 Coder 32B Instruct | balanced | $0.660 | $1.00 | 128K |
| Qwen3 Max Thinking | reasoning | $0.780 | $3.90 | 262K |
| Qwen3 Max | flagship | $0.780 | $3.90 | 262K |
| Qwen3.6 Max Preview | flagship | $1.04 | $6.24 | 262K |
| Qwen3.7 Max | flagship | $1.25 | $3.75 | 1M |
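
One way to use a listing like this is to filter it by context window and tier, then pick the lowest input price. The following is a minimal Python sketch, not part of TokenRate: `ModelPrice`, `context_tokens`, and `cheapest` are hypothetical names, only a handful of rows are copied in, and the context labels are assumed to be rounded decimal counts (131K treated as 131,000 tokens).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    tier: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    context: str         # context window as printed, e.g. "131K" or "1M"


def context_tokens(label: str) -> int:
    """Convert a label such as "262K" or "1M" to an approximate token count."""
    # Assumption: K and M are decimal multipliers, since the page rounds.
    multipliers = {"K": 1_000, "M": 1_000_000}
    return int(float(label[:-1]) * multipliers[label[-1]])


# A few rows copied from the pricing table; extend as needed.
MODELS = [
    ModelPrice("Qwen2.5 7B Instruct", "fast", 0.040, 0.100, "131K"),
    ModelPrice("Qwen3.5-Flash", "fast", 0.065, 0.260, "1M"),
    ModelPrice("Qwen3 30B A3B Thinking 2507", "reasoning", 0.080, 0.400, "131K"),
    ModelPrice("Qwen3 235B A22B Thinking 2507", "reasoning", 0.100, 0.100, "262K"),
    ModelPrice("Qwen3.7 Max", "flagship", 1.25, 3.75, "1M"),
]


def cheapest(models, min_context=0, tier=None):
    """Return the lowest input-price model meeting the filters, or None."""
    candidates = [
        m for m in models
        if context_tokens(m.context) >= min_context
        and (tier is None or m.tier == tier)
    ]
    return min(candidates, key=lambda m: m.input_per_m, default=None)


if __name__ == "__main__":
    print(cheapest(MODELS, min_context=500_000).name)
    print(cheapest(MODELS, tier="reasoning", min_context=200_000).name)
```

Ranking by input price alone mirrors how the table is sorted. For output-heavy workloads, a key that also weighs `output_per_m` may give a different answer.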

Model Details

Qwen2.5 7B Instruct

$0.040 in

Qwen2.5 7B belongs to the Qwen2.5 series of Qwen large language models.

Qwen3 30B A3B Instruct 2507

$0.048 in

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3...

Qwen3 8B

$0.050 in

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue.

Qwen3.5-Flash

$0.065 in

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3 Coder 30B A3B Instruct

$0.070 in

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use.

Qwen3 VL 8B Instruct

$0.080 in

Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...

Qwen3 30B A3B Thinking 2507

$0.080 in

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking mode,” where internal reasoning traces are separated...

Qwen3 32B

$0.080 in

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue.

Qwen3 Next 80B A3B Instruct

$0.090 in

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...

Qwen3 235B A22B Instruct 2507

$0.090 in

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...

Qwen3 Next 80B A3B Thinking

$0.098 in

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems: math proofs, code synthesis/debugging, logic, and agentic...

Qwen3.5-9B

$0.100 in

Qwen3.5-9B is Qwen's fast, low-cost model tuned for high-throughput tasks like classification, extraction, and simple chat. It costs $0.100 per million input tokens with a 262K-token context window and native image understanding.

Qwen3 235B A22B Thinking 2507

$0.100 in

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...

Qwen3 14B

$0.100 in

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue.

Qwen3 VL 32B Instruct

$0.104 in

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...

Qwen3 Coder Next

$0.110 in

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...

Qwen3 VL 8B Thinking

$0.117 in

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...

Qwen3 30B A3B

$0.120 in

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...

Qwen3 VL 30B A3B Thinking

$0.130 in

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks.

Qwen3 VL 30B A3B Instruct

$0.130 in

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks.

Qwen3.6 35B A3B

$0.140 in

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token.

Qwen3.5-35B-A3B

$0.140 in

The Qwen3.5 series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency.

QwQ 32B

$0.150 in

QwQ 32B is Qwen's open-weight reasoning model, with chain-of-thought capabilities that rival much larger models at a low price. It is one of the best-value reasoning models available.

Qwen3.6 Flash

$0.188 in

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3...

Qwen3.5-27B

$0.195 in

The Qwen3.5 27B native vision-language dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance.

Qwen3 Coder Flash

$0.195 in

Qwen3 Coder Flash is Alibaba's fast and cost-efficient version of its proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...

Qwen3 VL 235B A22B Instruct

$0.200 in

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...

Qwen3 Coder 480B A35B

$0.220 in

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...

Qwen2.5 VL 72B Instruct

$0.250 in

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects.

Qwen3.5-122B-A10B

$0.260 in

The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3.5 Plus 2026-02-15

$0.260 in

The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.

Qwen3 VL 235B A22B Thinking

$0.260 in

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....

Qwen Plus 0728 (thinking)

$0.260 in

Qwen Plus 0728, based on the Qwen3 foundation model, is a hybrid reasoning model with a 1-million-token context window that balances performance, speed, and cost.

Qwen-Plus

$0.260 in

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K-context model that balances performance, speed, and cost.

Qwen3.6 27B

$0.289 in

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026.

Qwen3.5 Plus 2026-04-20

$0.300 in

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba.

Qwen3.6 Plus

$0.325 in

Qwen3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference.

Qwen 2.5 72B

$0.360 in

Qwen 2.5 72B is Alibaba's flagship open-weight model. It is strong on coding and math benchmarks and competitive with GPT-4o-class models at a fraction of the cost, which makes it a top choice for self-hosted general workloads.

Qwen3.5 397B A17B

$0.390 in

The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.

Qwen3.7 Plus

$0.400 in

Qwen3.7 Plus is Qwen's balanced model that trades a little peak capability for much lower cost and faster responses. It costs $0.400 per million input tokens with a 1M-token context window and native image understanding.

Qwen3 235B A22B

$0.455 in

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...

Qwen3 Coder Plus

$0.650 in

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...

Qwen2.5 Coder 32B Instruct

$0.660 in

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen).

Qwen3 Max Thinking

$0.780 in

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...

Qwen3 Max

$0.780 in

Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...

Qwen3.6 Max Preview

$1.04 in

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters.

Qwen3.7 Max

$1.25 in

Qwen3.7 Max is Qwen's flagship-tier model built for the most demanding reasoning, coding, and long-form tasks. It costs $1.25 per million input tokens with a 1M-token context window.

Calculate Qwen API Costs

Use the TokenRate calculator to estimate exactly what Qwen models will cost for your workload.
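
For a rough estimate by hand, the arithmetic behind per-million-token pricing can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TokenRate calculator itself: `request_cost` and `monthly_cost` are hypothetical helper names, the workload numbers are made up, and the prices are the listed input and output rates for Qwen2.5 7B Instruct and Qwen3.7 Max.

```python
def request_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """USD cost of one request, given prices per 1M input and output tokens."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def monthly_cost(requests, input_tokens, output_tokens, input_per_m, output_per_m):
    """USD cost of `requests` identical requests (average token counts)."""
    return requests * request_cost(
        input_tokens, output_tokens, input_per_m, output_per_m
    )


if __name__ == "__main__":
    # Hypothetical workload: 100,000 requests per month,
    # averaging 2,000 input tokens and 500 output tokens each.
    workload = dict(requests=100_000, input_tokens=2_000, output_tokens=500)

    # Prices taken from the listing on this page.
    cheapest = monthly_cost(**workload, input_per_m=0.040, output_per_m=0.100)
    flagship = monthly_cost(**workload, input_per_m=1.25, output_per_m=3.75)

    print(f"Qwen2.5 7B Instruct: ${cheapest:.2f}/month")
    print(f"Qwen3.7 Max:         ${flagship:.2f}/month")
```

Because output tokens are priced separately and usually higher, the input-to-output ratio of a workload can matter as much as the headline input price.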
