DeepSeek V4 Pro vs MiniMax M3

Compare DeepSeek V4 Pro and MiniMax M3 on key metrics including price, context length, throughput, and other model features.

Author: DeepSeek

Context Length: 1.0M

Supports Tools

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B active parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, delivering strong results across knowledge, mathematics, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it adds a hybrid attention system for efficient long-context processing and supports multiple reasoning modes to balance speed and depth based on the task. It is well suited for demanding workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis, where both performance and efficiency are essential.

Activity

Last 14 days

Prompt

Completion

10M

Total

Startup

Deepseek

Latency (p50): 0.75s

Throughput (p50): 47.4 tok/s

Pricing

Input: $0.22/M tokens

Output: $0.43/M tokens

Cached input: $0.00181/M tokens

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools


Author: MiniMax

Context Length: 1M

Supports Tools

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding, and tool use. It is built on MiniMax Sparse Attention (MSA), which replaces full attention with KV-block selection to cut per-token compute at long context: roughly 1/20 the cost of the previous generation at 1M tokens, with substantially faster prefill and decode while retaining quality across most tasks. Trained as a native multimodal model on interleaved data and tuned for multi-turn, production-like collaboration via an interactive user-simulator framework, the model is oriented toward sustained, multi-step tasks rather than single-turn execution.
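To make "KV-block selection" concrete, here is a minimal, generic sketch of block-sparse attention for a single query vector. It is not MiniMax's MSA implementation, whose details this page does not give. The function name `block_sparse_attention`, the mean-key block scoring, and the `block_size` and `top_k` parameters are all assumptions chosen for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(query, keys, values, block_size, top_k):
    """Attend over only the top_k highest-scoring key/value blocks.

    Hypothetical illustration: blocks are scored by the query's dot
    product with each block's mean key (an assumed heuristic).
    Returns (output_vector, selected_positions).
    """
    dim = len(query)

    # Split key positions into contiguous blocks of block_size.
    blocks = [
        list(range(start, min(start + block_size, len(keys))))
        for start in range(0, len(keys), block_size)
    ]

    def block_score(positions):
        mean_key = [
            sum(keys[i][d] for i in positions) / len(positions)
            for d in range(dim)
        ]
        return dot(query, mean_key)

    # Keep the top_k blocks, then restore positional order.
    ranked = sorted(blocks, key=block_score, reverse=True)
    selected = sorted(i for block in ranked[:top_k] for i in block)

    # Standard scaled softmax attention, restricted to selected positions.
    scale = 1.0 / math.sqrt(dim)
    logits = [dot(query, keys[i]) * scale for i in selected]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)

    out_dim = len(values[0])
    output = [
        sum(w * values[i][d] for w, i in zip(weights, selected)) / total
        for d in range(out_dim)
    ]
    return output, selected
```

In this sketch the attention step touches only `top_k * block_size` positions instead of every position, which is where the per-token savings at long context would come from. Setting `top_k` to the total number of blocks recovers ordinary full attention.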

Activity

Last 14 days

Prompt

128M

Completion

Total

130M

Startup

Minimax

Latency (p50): 2.74s

Throughput (p50): 45.7 tok/s
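The latency and throughput figures for the two models can be combined into a rough response-time estimate. The sketch below assumes that latency means time to first token and that throughput is a steady decode rate. The page does not define either metric, and medians do not combine exactly, so treat the result as a ballpark. The function name `estimated_seconds` is hypothetical.

```python
def estimated_seconds(latency_s, throughput_tok_s, output_tokens):
    """Rough wall-clock estimate: wait for the first token, then decode.

    Assumes latency_s is time to first token and throughput_tok_s is a
    constant decode rate (both are assumptions, not page definitions).
    """
    if throughput_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / throughput_tok_s


# p50 figures copied from the listings on this page.
deepseek_v4_pro = estimated_seconds(0.75, 47.4, output_tokens=1000)
minimax_m3 = estimated_seconds(2.74, 45.7, output_tokens=1000)

print(f"DeepSeek V4 Pro: {deepseek_v4_pro:.1f} s")  # about 21.8 s
print(f"MiniMax M3:      {minimax_m3:.1f} s")       # about 24.6 s
```

Under these assumptions, decode time dominates a 1,000-token response and the throughput gap matters more than the latency gap. For very short responses, the 0.75s versus 2.74s latency difference accounts for most of the gap.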

Pricing

Input: $0.15/M tokens

Output: $0.60/M tokens

Cached input: $0.03/M tokens
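Because DeepSeek V4 Pro is cheaper on output and MiniMax M3 is cheaper on uncached input, which model costs less depends on the shape of the request. The sketch below estimates per-request cost from the listed prices. It assumes cached tokens are billed at the cached-input rate and the remaining input at the regular input rate, which the page does not spell out. The names `Pricing` and `request_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float         # USD per 1M uncached input tokens
    output_per_m: float        # USD per 1M output tokens
    cached_input_per_m: float  # USD per 1M cached input tokens


# Prices copied from the listings on this page.
DEEPSEEK_V4_PRO = Pricing(0.22, 0.43, 0.00181)
MINIMAX_M3 = Pricing(0.15, 0.60, 0.03)


def request_cost(pricing, input_tokens, output_tokens, cached_tokens=0):
    """Estimated USD cost of one request.

    cached_tokens is the portion of input_tokens served from cache
    (assumed billing model; see the note above the code).
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (
        uncached * pricing.input_per_m
        + cached_tokens * pricing.cached_input_per_m
        + output_tokens * pricing.output_per_m
    )
    return total / 1_000_000


# Input-heavy request: 100k input tokens, 2k output tokens, no cache hits.
print(f"{request_cost(DEEPSEEK_V4_PRO, 100_000, 2_000):.5f}")  # 0.02286
print(f"{request_cost(MINIMAX_M3, 100_000, 2_000):.5f}")       # 0.01620
```

With no cache hits, the two prices break even when input is about 2.4 times output (0.07 × input = 0.17 × output). Requests with more input than that favor MiniMax M3, and output-heavy requests favor DeepSeek V4 Pro. Heavy cache reuse shifts the balance toward DeepSeek V4 Pro, whose cached-input rate is far lower.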

Features

Input Modalities: text, image

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools
