Nano Banana 2 (Gemini 3.1 Flash Image Preview) vs MiniMax M3

Compare Nano Banana 2 (Gemini 3.1 Flash Image Preview) and MiniMax M3 on key metrics including price, context length, throughput, and other model features.

Author: Google

Context Length: 128k

Supports Tools

Nano Banana 2 (Gemini 3.1 Flash Image) is Google DeepMind’s flagship Flash image model for high-fidelity generation and fast, advanced editing at scale, optimized for price–performance. It follows complex prompts more reliably and adds configurable thinking levels (Minimal vs High/Dynamic) to balance latency and quality. Nano Banana 2 improves in-image text rendering and supports in-image localization (generate/translate text across languages directly in the image), while leveraging stronger world knowledge and web image search for more grounded, realistic outputs. It supports native aspect ratios (including 4:1, 1:4, 8:1, 1:8) and 512px/1K/2K/4K resolutions.

Activity

Last 14 days

Prompt: 34M

Completion: 192M

Total: 226M

Startup

Google

Latency (p50): 1.15s

Throughput (p50): 135.7 tok/s

Pricing

Input: $0.13/M tokens

Output: $0.75/M tokens

Cached input: -

Features

Input Modalities: text, image, audio, file

Output Modalities: text, image

Supported Endpoints: Chat Completions

Vision

Supports Tools


Author: MiniMax

Context Length: 1M

Supports Tools

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a 1M-token context window, and is suited for long-horizon agentic work, coding, and tool use. It is built on MiniMax Sparse Attention (MSA), which replaces full attention with KV-block selection to cut per-token compute at long context — roughly 1/20 the cost of the previous generation at 1M tokens, with substantially faster prefill and decode while retaining quality across most tasks. Trained as a native multimodal model on interleaved data and tuned for multi-turn, production-like collaboration via an interactive user-simulator framework, the model is oriented toward sustained, multi-step tasks rather than single-turn execution.

Activity

Last 14 days

Prompt: 106M

Completion

Total: 108M

Startup

Minimax

Latency (p50): 2.16s

Throughput (p50): 25.4 tok/s
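
The p50 latency and throughput figures listed for the two models can be combined into a rough response-time estimate. The sketch below assumes that latency approximates time to first token and that throughput is a steady decode rate, so total time is roughly latency plus output tokens divided by throughput. The function name `estimated_seconds` and the `MODELS` table are illustrative, not part of any provider API, and real requests will vary with load and prompt size.

```python
# Rough response-time estimate from the listed p50 figures.
# Assumption: total time ~= latency + output_tokens / throughput.

MODELS = {
    # name: (p50 latency in seconds, p50 throughput in tokens/second)
    "Nano Banana 2": (1.15, 135.7),
    "MiniMax M3": (2.16, 25.4),
}


def estimated_seconds(latency_s: float, throughput_tok_s: float, output_tokens: int) -> float:
    """Return an approximate wall-clock time for one response."""
    return latency_s + output_tokens / throughput_tok_s


if __name__ == "__main__":
    for name, (latency, throughput) in MODELS.items():
        seconds = estimated_seconds(latency, throughput, 1000)
        print(f"{name}: about {seconds:.1f}s for 1,000 output tokens")
```

Under these assumptions the throughput gap dominates for long text outputs, while the latency gap matters mainly for short ones.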

Pricing

Input: $0.15/M tokens

Output: $0.60/M tokens

Cached input: $0.03/M tokens
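
To compare the two price lists on a concrete workload, the per-million-token rates can be turned into a per-request cost. This is a minimal sketch using only the prices listed on this page. It assumes cached tokens are billed at the cached rate instead of the input rate, and that a model with no listed cached price bills them at the regular input rate. The names `PRICES` and `request_cost` are hypothetical, and image outputs may be billed differently from what the per-token rates suggest.

```python
# Per-request cost from the listed per-million-token prices (USD).
# Assumption: cached tokens replace regular input tokens at the cached rate;
# a missing cached price falls back to the regular input rate.

PRICES = {
    "Nano Banana 2": {"input": 0.13, "output": 0.75, "cached_input": None},
    "MiniMax M3": {"input": 0.15, "output": 0.60, "cached_input": 0.03},
}


def request_cost(model: str, input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return the estimated cost in USD for one request."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    cached_rate = price["cached_input"] if price["cached_input"] is not None else price["input"]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * cached_rate
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        cost = request_cost(model, input_tokens=200_000, output_tokens=50_000)
        print(f"{model}: ${cost:.4f}")
```

Because MiniMax M3 has the higher input price but the lower output price, which model is cheaper depends on the ratio of input to output tokens and on how much of the input is served from cache.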

Features

Input Modalities: text, image

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools
