Gemini 2.5 Flash Lite vs Deepseek v3.2

Compare Gemini 2.5 Flash Lite and Deepseek v3.2 on key metrics including price, context length, throughput, and other model features.

AuthorGoogle

Context Length1.0M

Supports Tools

Gemini 2.5 Flash-Lite is a streamlined reasoning model from the Gemini 2.5 family, designed for extremely low latency and cost-effectiveness. It delivers higher throughput, quicker token generation, and enhanced performance on standard benchmarks compared to previous Flash models.

Activity

Last 14 days

Prompt

Completion

11M

Total

Startup

Google

Latency (p50)0.82s

Throughput (p50)114.4 tok/s

Pricing

Input$0.05/M tokens

Output$0.20/M tokens

Cached input-

Features

Input Modalitiestext, image, file, audio

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model

AuthorDeepseek

Context Length163.8k

Supports Tools

DeepSeek-V3.2 is a large language model optimized for high computational efficiency and strong tool-use reasoning. It features DeepSeek Sparse Attention (DSA), a mechanism that lowers training and inference costs while maintaining quality in long-context tasks. A scalable reinforcement learning post-training framework further enhances reasoning, achieving performance comparable to GPT-5 and earning top results on the 2025 IMO and IOI. V3.2 also leverages large-scale agentic task synthesis to improve reasoning in practical tool-use scenarios, boosting its generalization and compliance in interactive environments.

Activity

Last 14 days

Prompt

150M

Completion

Total

153M

Startup

Deepseek

Latency (p50)0.78s

Throughput (p50)45.0 tok/s

Pricing

Input$0.14/M tokens

Output$0.21/M tokens

Cached input$0.01/M tokens

Features

Input Modalitiestext

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model