GPT-4.1 Nano

gpt-4.1-nano-2025-04-14
by OpenAI | Created May 25, 2025

The fastest and most cost-effective model in the GPT-4.1 series, designed for low-latency tasks such as classification and autocompletion. It keeps the 1-million-token context window and, despite its small size, outperforms some larger models on key benchmarks.

Throughput

Time-To-First-Token (TTFT)
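
As a rough illustration of how these two metrics are commonly derived, the sketch below computes them from token arrival timestamps. It assumes TTFT is the delay between sending the request and receiving the first token, and that throughput is counted in output tokens per second after the first token arrives; this page does not state its exact methodology, so both definitions are assumptions. The names `StreamMetrics` and `stream_metrics` and the sample timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class StreamMetrics:
    ttft_seconds: float       # delay from request start to the first token
    tokens_per_second: float  # output rate measured after the first token


def stream_metrics(request_start: float, token_times: Sequence[float]) -> StreamMetrics:
    """Compute TTFT and throughput from token arrival times (in seconds).

    Assumed definitions (not taken from this page):
      TTFT       = first token time - request start time
      throughput = (tokens after the first) / (last token time - first token time)
    """
    if not token_times:
        raise ValueError("no tokens were received")

    ttft = token_times[0] - request_start
    generation_time = token_times[-1] - token_times[0]

    # With n tokens there are n - 1 inter-token intervals to average over.
    if generation_time > 0:
        throughput = (len(token_times) - 1) / generation_time
    else:
        throughput = 0.0  # a single token gives no interval to measure

    return StreamMetrics(ttft_seconds=ttft, tokens_per_second=throughput)


# Hypothetical timestamps: request sent at t=0.0, five tokens arrive afterwards.
metrics = stream_metrics(0.0, [0.25, 0.5, 0.75, 1.0, 1.25])
print(metrics)
```

In practice the timestamps would be recorded with a monotonic clock such as `time.perf_counter()` while reading a streamed response. Some dashboards instead divide the total token count by total request time, which folds TTFT into the throughput figure.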