Qwen

Browse models from Qwen

19 models

Qwen3 235B A22B Thinking 2507

57M Tokens

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144 tokens of context. This "thinking-only" variant enhances structured logical reasoning, mathematics, science, and long-form generation, and is instruction-tuned for step-by-step reasoning, tool use, agentic workflows, and multilingual tasks.

by Qwen
$0.07/1M input tokens · $0.30/1M output tokens
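The per-million-token rates listed throughout this page translate directly into per-request costs. A minimal sketch of that arithmetic, using a hypothetical helper (`estimate_cost` is ours; the rates below are just the first entry's, for illustration):

```python
# Hypothetical helper: estimate one request's cost from $/1M-token rates
# like those listed on this page. Not tied to any provider's billing API.

def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float, output_rate: float) -> float:
    """Return the request cost in dollars, given rates in $ per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 50K-token prompt with a 4K-token completion at
# $0.07/1M input and $0.30/1M output.
cost = estimate_cost(50_000, 4_000, 0.07, 0.30)
print(f"${cost:.4f}")  # → $0.0047
```

Note that for reasoning ("thinking") models, the emitted thinking trace is billed as output tokens, which can dominate the total despite the lower input rate.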

Qwen3 Max

7M Tokens

Qwen3-Max, the updated model in the Qwen3 series, brings significant advances in reasoning, instruction following, multilingual support, and knowledge coverage compared to the January 2025 version. It offers better accuracy in math, coding, logic, and science, handles complex instructions in Chinese and English more reliably, reduces hallucinations, and gives higher-quality responses in open Q&A and conversations. Supporting 100+ languages, it improves translation and commonsense reasoning, and is optimized for retrieval-augmented generation (RAG) and tool use, though it lacks a specific “thinking” mode.

by Qwen
$0.60/1M input tokens · $3.00/1M output tokens

Qwen3 Next 80B A3B Thinking

27M Tokens

Qwen3-Next-80B-A3B-Thinking is a reasoning-focused model that generates structured “thinking” traces by default. It is suited for complex multi-step tasks such as math proofs, code synthesis, logic, and agentic planning. Compared to earlier Qwen3 models, it is more stable over long reasoning chains and scales efficiently during inference. Designed for agent frameworks, function calling, retrieval-based workflows, and benchmarks needing step-by-step solutions, it supports detailed completions and faster output through multi-token prediction. It runs only in thinking mode.

by Qwen
$0.07/1M input tokens · $0.70/1M output tokens

Qwen Image Edit 2509

192K Tokens

Qwen-Image-Edit-2509 is the latest iteration of the Qwen-Image-Edit model, released in September. It introduces multi-image editing capabilities by building on the original architecture and further training with image concatenation, supporting combinations like “person + person,” “person + product,” and “person + scene,” with optimal performance for 1 to 3 images. For single-image editing, Qwen-Image-Edit-2509 delivers improved consistency, particularly in person editing (better facial identity preservation and support for various portrait styles), product editing (enhanced product identity retention), and text editing (support for modifying fonts, colors, and materials in addition to content). The model also natively supports ControlNet features, such as depth maps, edge maps, and keypoint maps.

by Qwen
~$0.04/image

Qwen3 VL 235B A22B Thinking

99K Tokens

Qwen3-VL-235B-A22B Thinking is a multimodal model that combines advanced text generation with visual understanding for images and video, specifically optimized for multimodal reasoning in STEM and math. It delivers robust perception, strong spatial (2D/3D) understanding, and long-form visual comprehension, showing competitive performance in public benchmarks for both perception and reasoning. Beyond analysis, Qwen3-VL supports agentic interaction, tool use, following complex instructions in multi-image dialogues, aligning text with video timelines, and automating GUI operations. The model also enables visual coding workflows, such as turning sketches into code and assisting with UI debugging, while maintaining strong text-only capabilities on par with Qwen3 language models. This makes it ideal for use cases like document AI, multilingual OCR, UI/software help, spatial reasoning, and vision-language agent research.

by Qwen
$0.35/1M input tokens · $4.20/1M output tokens

Qwen3 VL 235B A22B Instruct

228K Tokens

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that combines strong text generation with advanced visual understanding for images and video. Designed for general vision-language tasks like VQA, document parsing, chart and table extraction, and multilingual OCR, the model emphasizes robust perception, spatial (2D/3D) understanding, and long-form visual comprehension, with competitive results on public benchmarks. Qwen3-VL also supports agentic interaction and tool use, following complex instructions in multi-image dialogues, aligning text to video timelines, operating GUIs for automation, and enabling visual coding workflows such as turning sketches into code or debugging UIs. Its strong text-only capabilities match Qwen3 language models, making it suitable for document AI, OCR, UI/software assistance, spatial reasoning, and vision-language agent research.

by Qwen
$0.35/1M input tokens · $1.40/1M output tokens

Qwen Image

351K Tokens

Qwen-Image is a foundation image generation model from the Qwen team, excelling at high-fidelity text rendering, complex text integration (including English and Chinese), and diverse artistic styles. It supports advanced editing features such as style transfer, object manipulation, and human pose editing, and is suitable for both image generation and understanding tasks.

by Qwen
~$0.01/image

Qwen3 Coder

3M Tokens

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning over repositories. The model has 480 billion total parameters, with 35 billion active per forward pass (8 of 160 experts), and supports variable pricing based on context length.

by Qwen
$0.15/1M input tokens · $0.60/1M output tokens

Qwen3 235B A22B

788K Tokens

Qwen3-235B-A22B is a 235B-parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and code tasks and a "non-thinking" mode for efficient general conversation, and demonstrates strong reasoning ability, multilingual support (100+ languages and dialects), advanced instruction following, and agent tool-calling capabilities.

by Qwen
$0.05/1M input tokens · $0.30/1M output tokens
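Besides a chat-template flag, the Qwen3 model cards document "soft switches" for the mode switching described above: appending `/think` or `/no_think` to a user turn toggles thinking for that turn. A minimal sketch of that convention (the helper name is ours; only the tag strings come from the model cards):

```python
# Sketch of Qwen3's documented soft-switch convention: appending /think
# or /no_think to a user turn toggles thinking mode for that turn.
# The helper is hypothetical; verify the tags against the model card.

def with_mode(prompt: str, thinking: bool) -> str:
    """Append the Qwen3 soft-switch tag to a user prompt."""
    tag = "/think" if thinking else "/no_think"
    return f"{prompt} {tag}"

print(with_mode("Prove that sqrt(2) is irrational.", thinking=True))
# → Prove that sqrt(2) is irrational. /think
print(with_mode("What's the capital of France?", thinking=False))
# → What's the capital of France? /no_think
```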

Qwen3 Coder Plus

7M Tokens

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B, designed as a powerful coding agent that excels in autonomous programming through tool use and environment interaction, blending strong coding skills with broad general-purpose capabilities.

by Qwen
$0.50/1M input tokens · $0.90/1M output tokens

Qwen3 Next 80B A3B Instruct

1M Tokens

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model from the Qwen3-Next series, designed for quick and stable responses without “thinking” traces. It handles complex tasks like reasoning, code generation, knowledge Q&A, and multilingual applications with strong alignment and formatting. Compared to earlier Qwen3 instruct versions, it offers higher throughput and stability, even with long inputs or multi-turn conversations. Ideal for RAG, tool use, and agentic workflows, it delivers consistent and reliable answers with efficient parameter use and fast inference.

by Qwen
$0.07/1M input tokens · $0.70/1M output tokens

Qwen 2.5 72B Instruct

8K Tokens

Qwen2.5 72B is the flagship of the Qwen2.5 large language model series, offering significant improvements in knowledge, coding, and mathematics. The series features specialized expert models, improved instruction following, long-text generation (over 8K tokens), structured data understanding, and robust multilingual support for over 29 languages. The model is optimized for resilience to diverse system prompts and enhanced role-play implementation.

by Qwen
$0.07/1M input tokens · $0.20/1M output tokens

Qwen3 235B A22B 2507

377K Tokens

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage, supports a native 262K-token context length, and delivers significant gains in knowledge coverage, long-context reasoning, and coding benchmarks.

by Qwen
$0.07/1M input tokens · $0.42/1M output tokens

Qwen3 30B A3B

89K Tokens

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique ability to switch seamlessly between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue ensures versatile, high-quality performance. The Qwen3-30B-A3B variant includes 30.5 billion parameters (3.3 billion activated), 48 layers, 128 experts (8 activated per task), and supports up to 131K token contexts with YaRN, setting a new standard among open-source models.

by Qwen
$0.05/1M input tokens · $0.15/1M output tokens
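The "131K token contexts with YaRN" figure above comes from RoPE scaling: the Qwen3 model cards describe extending the native 32,768-token window roughly 4× by adding a `rope_scaling` entry to the model's `config.json`. A sketch of that configuration as a Python dict (field names per the published model cards; verify the exact values against the card for your checkpoint):

```python
# Sketch of the YaRN rope-scaling entry the Qwen3 model cards describe
# for extending the 32,768-token native window about 4x (to ~131K).
# Treat the exact fields as something to verify for your checkpoint.

rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                               # scaling multiplier
    "original_max_position_embeddings": 32768,   # native window
}

extended_context = int(rope_scaling["factor"]
                       * rope_scaling["original_max_position_embeddings"])
print(extended_context)  # → 131072
```

The model cards also note that static YaRN scaling applies even to short inputs, so it is best enabled only when long-context processing is actually needed.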

Qwen Turbo

161K Tokens

Qwen-Turbo is a 1M context model based on Qwen2.5, designed for fast speed and low cost. It is suitable for simple tasks and applications where efficiency and affordability are prioritized over deep reasoning.

by Qwen
$0.02/1M input tokens · $0.10/1M output tokens

Qwen3 32B

71K Tokens

Qwen3-32B is a dense 32.8B-parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference and a "non-thinking" mode for faster, general-purpose conversation, and demonstrates strong performance in instruction following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects.

by Qwen
$0.05/1M input tokens · $0.15/1M output tokens

Qwen Max

66K Tokens

Qwen-Max is a large-scale Mixture-of-Experts (MoE) model from Qwen, based on Qwen2.5, and provides the best inference performance among Qwen models, especially for complex multi-step tasks. Pretrained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), it is designed for high-accuracy, high-recall applications. The exact parameter count is undisclosed.

by Qwen
$0.80/1M input tokens · $3.20/1M output tokens

Qwen3 14B

37K Tokens

Qwen3-14B is a dense 14.8B-parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, programming, and logical inference and a "non-thinking" mode for general-purpose conversation, and is fine-tuned for instruction following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects.

by Qwen
$0.04/1M input tokens · $0.12/1M output tokens

QwQ 32B

37K Tokens

QwQ-32B is the medium-sized reasoning model in the Qwen series, designed for advanced thinking and reasoning tasks. It achieves competitive performance against state-of-the-art models like DeepSeek-R1 and o1-mini, and is particularly strong on hard problems requiring deep analytical skills.

by Qwen
$0.14/1M input tokens · $0.20/1M output tokens