DeepSeek V4 Flash vs DeepSeek V3.2 Exp

Compare DeepSeek V4 Flash and DeepSeek V3.2 Exp on key metrics, including price, context length, throughput, and other model features.

Author: DeepSeek

Context Length: 1.0M tokens

Supports Tools

DeepSeek V4 Flash is an efficiency-focused Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B active parameters, supporting a 1M-token context window. It is built for fast inference and high-throughput workloads while preserving strong reasoning and coding capabilities. The model features hybrid attention for efficient long-context processing and offers configurable reasoning modes. It is a strong fit for use cases such as coding assistants, chat applications, and agent workflows where responsiveness and cost efficiency matter.

Activity (last 14 days)

Prompt: 590M

Completion: 37M

Total: 627M

Startup

Deepseek

Latency (p50): 0.38s

Throughput (p50): 92.3 tok/s

Pricing

Input: $0.07/M tokens

Output: $0.14/M tokens

Cached input: $0.01/M tokens

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools

Author: DeepSeek

Context Length: 163.8k tokens

Supports Tools

DeepSeek-V3.2-Exp is an experimental large language model from DeepSeek, serving as an intermediate step between V3.1 and future architectures. It features DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that enhances training and inference efficiency for long-context tasks while preserving high output quality.

Activity (last 14 days)

Prompt: 131M

Completion: 14M

Total: 144M

Startup

Deepseek

Latency (p50): 3.25s

Throughput (p50): 20.5 tok/s
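
One rough way to read the p50 latency and throughput figures listed for the two models is to combine them into an end-to-end response time estimate. The sketch below assumes that latency is the time until the first token arrives and that throughput stays constant for the rest of the generation; neither assumption is confirmed by this page, and the function name `estimated_response_seconds` and the 500-token response size are hypothetical.

```python
# Rough response-time estimate from the p50 figures listed on this page.
# Assumption: total time = latency (time to first token) + tokens / throughput.

def estimated_response_seconds(latency_s: float, throughput_tok_s: float,
                               output_tokens: int) -> float:
    """Return an approximate wall-clock time for one completion."""
    if throughput_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / throughput_tok_s


# p50 values copied from the listings above.
MODELS = {
    "DeepSeek V4 Flash": {"latency_s": 0.38, "throughput_tok_s": 92.3},
    "DeepSeek V3.2 Exp": {"latency_s": 3.25, "throughput_tok_s": 20.5},
}

if __name__ == "__main__":
    output_tokens = 500  # hypothetical response length
    for name, stats in MODELS.items():
        seconds = estimated_response_seconds(
            stats["latency_s"], stats["throughput_tok_s"], output_tokens
        )
        print(f"{name}: about {seconds:.1f}s for {output_tokens} output tokens")
```

Because these are median figures, individual requests can be noticeably faster or slower than the estimate.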

Pricing

Input: $0.14/M tokens

Output: $0.20/M tokens

Cached input: -
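
To make the per-million-token prices easier to compare, the following sketch computes the cost of a single request for each model. The prices are the ones listed on this page. The function name `request_cost_usd`, the example token counts, and the rule that a model without a listed cached-input price bills cached tokens at the normal input rate are all assumptions made for illustration.

```python
# Per-request cost sketch using the prices listed on this page (USD per 1M tokens).
# Assumption: when no cached-input price is listed, cached tokens are billed
# at the regular input rate.
from typing import Optional

def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float,
                     cached_tokens: int = 0,
                     cached_price_per_m: Optional[float] = None) -> float:
    """Return the estimated cost of one request in USD."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if cached_price_per_m is None:
        cached_price_per_m = input_price_per_m  # fallback, see assumption above
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * input_price_per_m
             + cached_tokens * cached_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000


# Prices copied from the listings above.
V4_FLASH = {"input_price_per_m": 0.07, "output_price_per_m": 0.14,
            "cached_price_per_m": 0.01}
V32_EXP = {"input_price_per_m": 0.14, "output_price_per_m": 0.20,
           "cached_price_per_m": None}

if __name__ == "__main__":
    # Hypothetical request: 100k input tokens, 2k output tokens.
    for name, prices in (("DeepSeek V4 Flash", V4_FLASH),
                         ("DeepSeek V3.2 Exp", V32_EXP)):
        cost = request_cost_usd(100_000, 2_000, **prices)
        print(f"{name}: ${cost:.5f}")
```

Passing `cached_tokens` shows how much the cached-input rate changes the total for prompts that reuse a long shared prefix.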

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools
