DeepSeek V4 Pro vs DeepSeek V4 Flash

Compare DeepSeek V4 Pro and DeepSeek V4 Flash on key metrics, including price, context length, throughput, and other model features.

DeepSeek V4 Pro

Author: DeepSeek

Context length: 1.0M tokens

Supports Tools

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B active parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, delivering strong results across knowledge, mathematics, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it uses a hybrid attention system for efficient long-context processing and supports multiple reasoning modes to balance speed and depth based on the task. It is well suited for demanding workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis.

Activity

Last 14 days: 16M tokens total (prompt and completion)

Startup

Deepseek

Latency (p50): 0.46 s

Throughput (p50): 49.9 tok/s

Pricing

Input: $0.22/M tokens

Output: $0.43/M tokens

Cached input: $0.00181/M tokens

Features

Input modalities: text

Output modalities: text

Supported endpoints: Chat Completions

Vision

Supports Tools


DeepSeek V4 Flash

Author: DeepSeek

Context length: 1.0M tokens

Supports Tools

DeepSeek V4 Flash is an efficiency-focused Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B active parameters, supporting a 1M-token context window. It is built for fast inference and high-throughput workloads while preserving strong reasoning and coding capabilities. The model features hybrid attention for efficient long-context processing and offers configurable reasoning modes. It is a strong fit for use cases such as coding assistants, chat applications, and agent workflows where responsiveness and cost efficiency matter.
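
To put the two parameter counts side by side, the share of parameters that is active can be computed directly from the figures in the descriptions. This is a minimal sketch; the names `MODELS` and `active_fraction` are illustrative and not part of any DeepSeek tooling.

```python
# Parameter counts in billions, taken from the two model descriptions.
MODELS = {
    "DeepSeek V4 Pro": {"total_b": 1600.0, "active_b": 49.0},
    "DeepSeek V4 Flash": {"total_b": 284.0, "active_b": 13.0},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return active parameters as a share of total parameters."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


for name, params in MODELS.items():
    share = active_fraction(**params)
    print(f"{name}: {share:.1%} of parameters active")
```

Both models activate only a few percent of their weights at a time, which is the property that lets a Mixture-of-Experts model keep inference cost well below what its total size would suggest.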

Activity

Last 14 days: 27M tokens total (prompt and completion)

Startup

Deepseek

Latency (p50): 0.39 s

Throughput (p50): 77.4 tok/s
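
The latency and throughput medians for both models can be combined into a rough response-time estimate. The sketch below assumes that the p50 latency approximates the wait before the first token and that generation then proceeds at the p50 throughput; real requests will vary with load and prompt size. The function name `estimated_seconds` is hypothetical.

```python
# p50 figures from the listing: (latency in seconds, throughput in tokens/second).
P50 = {
    "DeepSeek V4 Pro": (0.46, 49.9),
    "DeepSeek V4 Flash": (0.39, 77.4),
}


def estimated_seconds(latency_s: float, throughput_tok_s: float, output_tokens: int) -> float:
    """Rough wall-clock estimate: initial wait plus steady-rate generation.

    Assumption: latency is time to first token and throughput stays constant.
    """
    if throughput_tok_s <= 0:
        raise ValueError("throughput_tok_s must be positive")
    return latency_s + output_tokens / throughput_tok_s


for name, (latency, throughput) in P50.items():
    seconds = estimated_seconds(latency, throughput, output_tokens=1000)
    print(f"{name}: about {seconds:.1f} s for 1,000 output tokens")
```

For long completions the throughput term dominates, so the difference in latency between the two models matters far less than the difference in tokens per second.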

Pricing

Input: $0.07/M tokens

Output: $0.14/M tokens

Cached input: $0.0014/M tokens
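
The per-million-token prices for both models translate into a per-request cost as sketched below. This assumes that cached input tokens are billed at the cached rate in place of the regular input rate, which the listing does not state explicitly; the `Pricing` class and `request_cost` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in US dollars per one million tokens, as listed."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


PRO = Pricing(input_per_m=0.22, output_per_m=0.43, cached_input_per_m=0.00181)
FLASH = Pricing(input_per_m=0.07, output_per_m=0.14, cached_input_per_m=0.0014)


def request_cost(p: Pricing, input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the dollar cost of one request.

    Assumption: cached tokens replace the regular input charge for those tokens.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * p.input_per_m
        + cached_tokens * p.cached_input_per_m
        + output_tokens * p.output_per_m
    )
    return total / 1_000_000


# Example: a 200k-token prompt with 150k tokens served from cache, 4k tokens out.
for name, pricing in (("DeepSeek V4 Pro", PRO), ("DeepSeek V4 Flash", FLASH)):
    cost = request_cost(pricing, input_tokens=200_000, output_tokens=4_000, cached_tokens=150_000)
    print(f"{name}: ${cost:.5f} per request")
```

At the listed prices, Pro costs roughly three times as much as Flash for both uncached input and output, so the choice largely comes down to whether a workload needs the larger model.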

Features

Input modalities: text

Output modalities: text

Supported endpoints: Chat Completions

Vision

Supports Tools
