Deepseek v3.2 vs Deepseek V4 Pro

Compare Deepseek v3.2 and Deepseek V4 Pro on key metrics including price, context length, throughput, and other model features.

AuthorDeepseek

Context Length163.8k

Supports Tools

DeepSeek-V3.2 is a large language model optimized for high computational efficiency and strong tool-use reasoning. It features DeepSeek Sparse Attention (DSA), a mechanism that lowers training and inference costs while maintaining quality in long-context tasks. A scalable reinforcement learning post-training framework further enhances reasoning, achieving performance comparable to GPT-5 and earning top results on the 2025 IMO and IOI. V3.2 also leverages large-scale agentic task synthesis to improve reasoning in practical tool-use scenarios, boosting its generalization and compliance in interactive environments.

Activity

Last 14 days

Prompt

226M

Completion

Total

231M

Startup

Deepseek

Latency (p50)1.24s

Throughput (p50)88.1 tok/s

Pricing

Input$0.14/M tokens

Output$0.21/M tokens

Cached input$0.01/M tokens

Features

Input Modalitiestext

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model

AuthorDeepseek

Context Length1.0M

Supports Tools

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B active parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, delivering strong results across knowledge, mathematics, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it adds a hybrid attention system for efficient long-context processing and supports multiple reasoning modes to balance speed and depth based on the task. It is well suited for demanding workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis, where both performance and efficiency are essential.

Activity

Last 14 days

Prompt

348M

Completion

Total

356M

Startup

Deepseek

Latency (p50)0.44s

Throughput (p50)46.3 tok/s

Pricing

Input$0.22/M tokens

Output$0.43/M tokens

Cached input$0.00181/M tokens

Features

Input Modalitiestext

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model