DeepSeek
[Chart: token usage over time]
Browse models from DeepSeek
DeepSeek Chat v3.1 (Free)
DeepSeek-V3.1 is a 671B-parameter hybrid reasoning model (37B active), supporting both "thinking" and "non-thinking" modes via prompt templates. It extends DeepSeek-V3 with two-phase long-context training (up to 128K tokens) and uses FP8 microscaling for efficient inference. The model excels in tool use, code generation, and reasoning, with performance comparable to DeepSeek-R1 but with faster responses. It supports structured tool calling, code agents, and search agents, making it ideal for research and agentic workflows. Successor to DeepSeek V3-0324, it delivers strong performance across diverse tasks.
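Because the two modes are selected by prompt template rather than by separate model weights, switching between them typically amounts to changing the model ID in an OpenAI-compatible request. A minimal sketch follows, assuming DeepSeek's documented endpoint and the `deepseek-chat` / `deepseek-reasoner` identifiers; verify both against the provider's current model list before use.

```python
from openai import OpenAI

# Hedged sketch: endpoint and model IDs are assumptions based on DeepSeek's
# published API docs; confirm against the current reference before use.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

# Non-thinking mode: an ordinary chat completion.
fast = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "One-line summary of FP8 microscaling?"}],
)

# Thinking mode: same underlying weights, reasoning template applied server-side.
slow = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "One-line summary of FP8 microscaling?"}],
)

print(fast.choices[0].message.content)
print(slow.choices[0].message.content)
```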
DeepSeek Chat v3.1
DeepSeek-V3.1 is a 671B-parameter hybrid reasoning model (37B active), supporting both "thinking" and "non-thinking" modes via prompt templates. It extends DeepSeek-V3 with two-phase long-context training (up to 128K tokens) and uses FP8 microscaling for efficient inference. The model excels in tool use, code generation, and reasoning, with performance comparable to DeepSeek-R1 but with faster responses. It supports structured tool calling, code agents, and search agents, making it ideal for research and agentic workflows. Successor to DeepSeek V3-0324, it delivers strong performance across diverse tasks.
DeepSeek v3.2 Exp
DeepSeek-V3.2-Exp is an experimental large language model from DeepSeek, serving as an intermediate step between V3.1 and future architectures. It features DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that enhances training and inference efficiency for long-context tasks while preserving high output quality.
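To make "fine-grained sparse attention" concrete, here is a toy sketch in which each query attends only to its top-k highest-scoring keys instead of the full sequence. This illustrates the general idea only; it is not DeepSeek's DSA, whose selection mechanism and kernels differ.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k=4):
    """Toy sparse attention: each query row keeps only its top-k key logits.
    Shapes: q, k, v are (seq_len, d). Illustrative only -- not DSA itself."""
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (seq, seq) logits
    # Per-query threshold at the k-th largest logit; mask the rest to -inf.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over kept keys
    return weights @ v

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 16)) for _ in range(3))
print(topk_sparse_attention(q, k, v, top_k=4).shape)   # (8, 16)
```

The efficiency argument is that attention cost scales with the number of kept keys per query rather than with the full sequence length, which is what makes long-context training and inference cheaper.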
DeepSeek Chat
DeepSeek V3 is a 685B-parameter mixture-of-experts model and the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the previous DeepSeek V3 release and demonstrates strong performance across a variety of tasks.
DeepSeek Reasoner
DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1, utilizing more compute and advanced post-training techniques to push its reasoning and inference capabilities to the level of flagship models like o3 and Gemini 2.5 Pro. It excels on math, programming, and logic leaderboards, with a distilled 8B-parameter variant that rivals much larger models on key benchmarks.
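A practical detail when working with the reasoner: the intermediate reasoning is returned separately from the final answer. The sketch below assumes the `deepseek-reasoner` ID and the `reasoning_content` field described in DeepSeek's API docs; confirm both against the current reference.

```python
from openai import OpenAI

# Hedged sketch: model ID and the `reasoning_content` field are taken from
# DeepSeek's published docs and may change; verify before relying on them.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Is 9973 prime?"}],
)
msg = resp.choices[0].message
print(getattr(msg, "reasoning_content", None))  # intermediate reasoning, if exposed
print(msg.content)                              # final answer
```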
DeepSeek v3.2 Exp (Free)
DeepSeek-V3.2-Exp is an experimental large language model from DeepSeek, serving as an intermediate step between V3.1 and future architectures. It features DeepSeek Sparse Attention (DSA), a fine-grained sparse attention mechanism that enhances training and inference efficiency for long-context tasks while preserving high output quality.
DeepSeek Chat v3.1 Terminus
DeepSeek-V3.1 Terminus is an enhanced version of DeepSeek V3.1 that retains the original model’s capabilities while resolving user-reported issues with language consistency and agent functionality. The update further refines the model’s performance in coding and search-agent tasks. This large-scale hybrid reasoning model (671B parameters, 37B active) supports both thinking and non-thinking modes. Building on the DeepSeek-V3 foundation, it incorporates a two-phase long-context training approach, allowing for up to 128K tokens, and adopts FP8 microscaling for more efficient inference.
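Given the emphasis on coding and search-agent tasks, a brief sketch of structured tool calling through an OpenAI-compatible API may help. The `search_web` tool is a hypothetical placeholder, and the model identifier is an assumption; only the standard OpenAI tools schema is relied upon here.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_API_KEY")

# `search_web` is a hypothetical tool defined only for this sketch.
tools = [{
    "type": "function",
    "function": {
        "name": "search_web",
        "description": "Search the web and return result snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-chat",  # assumed ID; check the provider's model list
    messages=[{"role": "user", "content": "Who won the 2024 Nobel Prize in Physics?"}],
    tools=tools,
)
msg = resp.choices[0].message
if msg.tool_calls:  # the model chose to invoke the tool
    call = msg.tool_calls[0]
    print(call.function.name, call.function.arguments)
```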
DeepSeek Prover v2
DeepSeek Prover V2 is a 671B-parameter model, speculated to be geared towards logic and mathematics. It is likely an upgrade from DeepSeek-Prover-V1.5, but it was released without an official announcement or detailed documentation.
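For a sense of the domain such prover models target, here is a toy Lean 4 goal of the kind formal theorem provers are evaluated on. It is illustrative of the task only, not an output of DeepSeek-Prover.

```lean
-- A toy Lean 4 theorem of the kind formal-proof models are asked to close.
-- Illustrative of the task domain only; not DeepSeek-Prover output.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```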
DeepSeek Reasoner (Free)
DeepSeek-R1-0528 is a lightly upgraded release of DeepSeek R1, utilizing more compute and advanced post-training techniques to push its reasoning and inference capabilities to the level of flagship models like o3 and Gemini 2.5 Pro. It excels on math, programming, and logic leaderboards, with a distilled 8B-parameter variant that rivals much larger models on key benchmarks.
DeepSeek Chat (Free)
DeepSeek V3 is a 685B-parameter mixture-of-experts model and the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the previous DeepSeek V3 release and demonstrates strong performance across a variety of tasks.