Qwen API Pricing
Qwen is Alibaba's family of open-weight language and multimodal models. The Qwen 2.5 and QwQ series consistently top open-source benchmarks for coding and math, with strong multilingual support across Asian languages.
Models
47 tracked
All tiers, latest pricing.
All Qwen Models
| Model | Tier | Input / 1M | Output / 1M | Context |
|---|---|---|---|---|
| Qwen2.5 7B Instruct | fast | $0.040 | $0.100 | 131K |
| Qwen3 30B A3B Instruct 2507 | fast | $0.048 | $0.193 | 131K |
| Qwen3 8B | fast | $0.050 | $0.400 | 131K |
| Qwen3.5-Flash | fast | $0.065 | $0.260 | 1M |
| Qwen3 Coder 30B A3B Instruct | fast | $0.070 | $0.270 | 160K |
| Qwen3 VL 8B Instruct | fast | $0.080 | $0.500 | 256K |
| Qwen3 30B A3B Thinking 2507 | reasoning | $0.080 | $0.400 | 131K |
| Qwen3 32B | fast | $0.080 | $0.280 | 131K |
| Qwen3 Next 80B A3B Instruct | fast | $0.090 | $1.10 | 262K |
| Qwen3 235B A22B Instruct 2507 | fast | $0.090 | $0.100 | 262K |
| Qwen3 Next 80B A3B Thinking | reasoning | $0.098 | $0.780 | 262K |
| Qwen3.5-9B | fast | $0.100 | $0.150 | 262K |
| Qwen3 235B A22B Thinking 2507 | reasoning | $0.100 | $0.100 | 262K |
| Qwen3 14B | fast | $0.100 | $0.240 | 132K |
| Qwen3 VL 32B Instruct | fast | $0.104 | $0.416 | 262K |
| Qwen3 Coder Next | fast | $0.110 | $0.800 | 262K |
| Qwen3 VL 8B Thinking | reasoning | $0.117 | $1.36 | 256K |
| Qwen3 30B A3B | fast | $0.120 | $0.500 | 131K |
| Qwen3 VL 30B A3B Thinking | reasoning | $0.130 | $1.56 | 131K |
| Qwen3 VL 30B A3B Instruct | fast | $0.130 | $0.520 | 262K |
| Qwen3.6 35B A3B | fast | $0.140 | $1.00 | 262K |
| Qwen3.5-35B-A3B | fast | $0.140 | $1.00 | 262K |
| QwQ 32B | reasoning | $0.150 | $0.600 | 131K |
| Qwen3.6 Flash | fast | $0.188 | $1.13 | 1M |
| Qwen3.5-27B | fast | $0.195 | $1.56 | 262K |
| Qwen3 Coder Flash | fast | $0.195 | $0.975 | 1M |
| Qwen3 VL 235B A22B Instruct | fast | $0.200 | $0.880 | 262K |
| Qwen3 Coder 480B A35B | fast | $0.220 | $1.80 | 1M |
| Qwen2.5 VL 72B Instruct | flagship | $0.250 | $0.750 | 131K |
| Qwen3.5-122B-A10B | fast | $0.260 | $2.08 | 262K |
| Qwen3.5 Plus 2026-02-15 | fast | $0.260 | $1.56 | 1M |
| Qwen3 VL 235B A22B Thinking | reasoning | $0.260 | $2.60 | 131K |
| Qwen Plus 0728 (thinking) | reasoning | $0.260 | $0.780 | 1M |
| Qwen-Plus | fast | $0.260 | $0.780 | 1M |
| Qwen3.6 27B | fast | $0.289 | $2.40 | 262K |
| Qwen3.5 Plus 2026-04-20 | fast | $0.300 | $1.80 | 1M |
| Qwen3.6 Plus | balanced | $0.325 | $1.95 | 1M |
| Qwen 2.5 72B | balanced | $0.360 | $0.400 | 131K |
| Qwen3.5 397B A17B | fast | $0.390 | $2.34 | 262K |
| Qwen3.7 Plus | balanced | $0.400 | $1.60 | 1M |
| Qwen3 235B A22B | balanced | $0.455 | $1.82 | 131K |
| Qwen3 Coder Plus | balanced | $0.650 | $3.25 | 1M |
| Qwen2.5 Coder 32B Instruct | balanced | $0.660 | $1.00 | 128K |
| Qwen3 Max Thinking | reasoning | $0.780 | $3.90 | 262K |
| Qwen3 Max | flagship | $0.780 | $3.90 | 262K |
| Qwen3.6 Max Preview | flagship | $1.04 | $6.24 | 262K |
| Qwen3.7 Max | flagship | $1.25 | $3.75 | 1M |
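The table is sorted by input price, but output prices vary far more widely, so the cheapest model for a given workload depends on its mix of input and output tokens. The sketch below ranks a few rows copied from the table by a blended price. The `blended_cost` helper and the 3:1 input-to-output ratio are illustrative assumptions, not part of any Qwen or TokenRate API.

```python
# Rows copied from the table above: (model, input $/1M, output $/1M).
MODELS = [
    ("Qwen2.5 7B Instruct", 0.040, 0.100),
    ("Qwen3 8B", 0.050, 0.400),
    ("Qwen3 235B A22B Instruct 2507", 0.090, 0.100),
    ("Qwen3 Next 80B A3B Instruct", 0.090, 1.10),
]

def blended_cost(input_price, output_price, input_share=0.75):
    """Price per 1M tokens when `input_share` of all tokens are input.

    Hypothetical helper; the default 0.75 share is an assumed workload mix.
    """
    return input_share * input_price + (1 - input_share) * output_price

# Sort by blended price instead of input price alone.
ranked = sorted(MODELS, key=lambda m: blended_cost(m[1], m[2]))
for name, inp, out in ranked:
    print(f"{name}: ${blended_cost(inp, out):.4f} per 1M blended tokens")
```

Under this assumed mix, Qwen3 235B A22B Instruct 2507 ranks ahead of Qwen3 8B even though its input price is higher, because its output price is a quarter of the smaller model's.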
Model Details
Qwen2.5 7B Instruct
$0.040 input / 1M. Qwen2.5 7B is the latest series of Qwen large language models.
Qwen3 30B A3B Instruct 2507
$0.048 input / 1M. Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.
Qwen3 8B
$0.050 input / 1M. Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue.
Qwen3.5-Flash
$0.065 input / 1M. The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
Qwen3 Coder 30B A3B Instruct
$0.070 input / 1M. Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use.
Qwen3 VL 8B Instruct
$0.080 input / 1M. Qwen3-VL-8B-Instruct is a multimodal vision-language model from the Qwen3-VL series, built for high-fidelity understanding and reasoning across text, images, and video. It features improved multimodal fusion with Interleaved-MRoPE for long-horizon...
Qwen3 30B A3B Thinking 2507
$0.080 input / 1M. Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring extended multi-step thinking. The model is designed specifically for “thinking mode,” where internal reasoning traces are separated...
Qwen3 32B
$0.080 input / 1M. Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue.
Qwen3 Next 80B A3B Instruct
$0.090 input / 1M. Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable responses without “thinking” traces. It targets complex tasks across reasoning, code generation, knowledge QA, and multilingual...
Qwen3 235B A22B Instruct 2507
$0.090 input / 1M. Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following,...
Qwen3 Next 80B A3B Thinking
$0.098 input / 1M. Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” traces by default. It’s designed for hard multi-step problems: math proofs, code synthesis/debugging, logic, and agentic...
Qwen3.5-9B
$0.100 input / 1M. Qwen3.5-9B is Qwen's fast, low-cost model tuned for high-throughput tasks like classification, extraction, and simple chat. It has a 262K-token context window and native image understanding.
Qwen3 235B A22B Thinking 2507
$0.100 input / 1M. Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...
Qwen3 14B
$0.100 input / 1M. Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue.
Qwen3 VL 32B Instruct
$0.104 input / 1M. Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and reasoning across text, images, and video. With 32 billion parameters, it combines deep visual perception with advanced text...
Qwen3 Coder Next
$0.110 input / 1M. Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per...
Qwen3 VL 8B Thinking
$0.117 input / 1M. Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and...
Qwen3 30B A3B
$0.120 input / 1M. Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...
Qwen3 VL 30B A3B Thinking
$0.130 input / 1M. Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Thinking variant enhances reasoning in STEM, math, and complex tasks.
Qwen3 VL 30B A3B Instruct
$0.130 input / 1M. Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images and videos. Its Instruct variant optimizes instruction-following for general multimodal tasks.
Qwen3.6 35B A3B
$0.140 input / 1M. Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion active parameters per token.
Qwen3.5-35B-A3B
$0.140 input / 1M. The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency.
QwQ 32B
$0.150 input / 1M. QwQ 32B is Qwen's open-weight reasoning model, with chain-of-thought capabilities rivaling much larger models at a low price. It is one of the best-value reasoning models available.
Qwen3.6 Flash
$0.188 input / 1M. Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.
Qwen3.5-27B
$0.195 input / 1M. The Qwen3.5 27B native vision-language Dense model incorporates a linear attention mechanism, delivering fast response times while balancing inference speed and performance.
Qwen3 Coder Flash
$0.195 input / 1M. Qwen3 Coder Flash is Alibaba's fast and cost-efficient version of its proprietary Qwen3 Coder Plus. It is a powerful coding agent model specializing in autonomous programming via tool calling...
Qwen3 VL 235B A22B Instruct
$0.200 input / 1M. Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understanding across images and video. The Instruct model targets general vision-language use (VQA, document parsing, chart/table...
Qwen3 Coder 480B A35B
$0.220 input / 1M. Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over...
Qwen2.5 VL 72B Instruct
$0.250 input / 1M. Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects.
Qwen3.5-122B-A10B
$0.260 input / 1M. The Qwen3.5 122B-A10B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
Qwen3.5 Plus 2026-02-15
$0.260 input / 1M. The Qwen3.5 native vision-language series Plus models are built on a hybrid architecture that integrates linear attention mechanisms with sparse mixture-of-experts models, achieving higher inference efficiency.
Qwen3 VL 235B A22B Thinking
$0.260 input / 1M. Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across images and video. The Thinking model is optimized for multimodal reasoning in STEM and math....
Qwen Plus 0728 (thinking)
$0.260 input / 1M. Qwen Plus 0728, based on the Qwen3 foundation model, is a hybrid reasoning model with a 1-million-token context window and a balanced combination of performance, speed, and cost.
Qwen-Plus
$0.260 input / 1M. Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced combination of performance, speed, and cost.
Qwen3.6 27B
$0.289 input / 1M. Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026.
Qwen3.5 Plus 2026-04-20
$0.300 input / 1M. Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba.
Qwen3.6 Plus
$0.325 input / 1M. Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference.
Qwen 2.5 72B
$0.360 input / 1M. Qwen 2.5 72B is Alibaba's flagship open-weight model, strong on coding and math benchmarks and competitive with GPT-4o-class models at a fraction of the cost. It is a top choice for self-hosted general workloads.
Qwen3.5 397B A17B
$0.390 input / 1M. The Qwen3.5 series 397B-A17B native vision-language model is built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency.
Qwen3.7 Plus
$0.400 input / 1M. Qwen3.7 Plus is Qwen's balanced model that trades a little peak capability for much lower cost and faster responses. It has a 1M-token context window and native image understanding.
Qwen3 235B A22B
$0.455 input / 1M. Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...
Qwen3 Coder Plus
$0.650 input / 1M. Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agent model specializing in autonomous programming via tool calling and...
Qwen2.5 Coder 32B Instruct
$0.660 input / 1M. Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen).
Qwen3 Max Thinking
$0.780 input / 1M. Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that require deep, multi-step reasoning. By significantly scaling model capacity and reinforcement learning compute, it...
Qwen3 Max
$0.780 input / 1M. Qwen3-Max is an updated release built on the Qwen3 series, offering major improvements in reasoning, instruction following, multilingual support, and long-tail knowledge coverage compared to the January 2025 version. It...
Qwen3.6 Max Preview
$1.04 input / 1M. Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture with approximately 1 trillion total parameters.
Qwen3.7 Max
$1.25 input / 1M. Qwen3.7 Max is Qwen's flagship-tier model built for the most demanding reasoning, coding, and long-form tasks. It has a 1M-token context window.
Calculate Qwen API Costs
Use the TokenRate calculator to estimate exactly what Qwen models will cost for your workload.
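As a rough illustration of the arithmetic behind such an estimate: cost is input tokens times the input rate plus output tokens times the output rate, with each rate quoted per one million tokens. The sketch below is not the TokenRate calculator. `estimate_cost` and the workload figures are hypothetical, and the rates are the Qwen3 Max prices from the table above.

```python
def estimate_cost(input_tokens, output_tokens, input_per_m, output_per_m):
    """Return the USD cost of a workload given per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_per_m
    output_cost = (output_tokens / 1_000_000) * output_per_m
    return input_cost + output_cost

# Hypothetical workload: 10,000 requests, each with 2,000 input and 500 output tokens.
requests = 10_000
cost = estimate_cost(
    input_tokens=requests * 2_000,
    output_tokens=requests * 500,
    input_per_m=0.780,  # Qwen3 Max input price from the table
    output_per_m=3.90,  # Qwen3 Max output price from the table
)
print(f"Estimated cost: ${cost:.2f}")
```

For this assumed workload, the 20M input tokens cost $15.60 and the 5M output tokens cost $19.50, for $35.10 in total. Output accounts for more than half the bill despite being a fifth of the tokens.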