GPT-4.1 Mini (Free) vs GPT OSS 20B

Compare GPT-4.1 Mini (Free) and GPT OSS 20B on key metrics including price, context length, throughput, and other model features.

GPT-4.1 Mini (Free)

Author: OpenAI

Context Length: 1.0M

Supports Tools

A mid-sized GPT-4.1 model delivering performance competitive with GPT-4o at substantially lower latency and cost. Retains a 1 million token context window and demonstrates strong coding ability and vision understanding, making it suitable for interactive applications with tight performance constraints.

Activity

Last 14 days

Prompt: 130M

Completion: 45M

Total: 174M

Startup

OpenAI

Latency (p50): 0.98s

Throughput (p50): 63.9 tok/s

Pricing

Input: Free

Output: Free

Features

Input Modalities: text, image, file

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools

GPT OSS 20B

Author: OpenAI

Context Length: 131.1k

Supports Tools

OpenAI’s 21B-parameter open-weight Mixture-of-Experts (MoE) model, released under the Apache 2.0 license. Features 3.6B active parameters per forward pass, optimized for low-latency inference and deployability on consumer or single-GPU hardware. Trained in OpenAI’s Harmony response format, it supports reasoning level configuration, fine-tuning, and agentic capabilities such as function calling and structured outputs.

Activity

Last 14 days

Prompt: 135M

Completion: 12M

Total: 147M

Startup

OpenAI

Latency (p50): 0.11s

Throughput (p50): 314.1 tok/s
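
To put the latency and throughput figures side by side, the sketch below estimates the wall-clock time for a single response. It assumes that p50 latency approximates time to first token and that p50 throughput holds steady for the whole generation; neither assumption is stated on this page, and the function name and the 500-token response length are purely illustrative.

```python
def estimate_response_seconds(latency_s: float,
                              throughput_tok_s: float,
                              output_tokens: int) -> float:
    """Rough response time: wait for the first token, then stream the rest.

    Assumes latency_s is time to first token and throughput is constant.
    """
    return latency_s + output_tokens / throughput_tok_s


# p50 figures as listed on this page; 500 output tokens is an arbitrary example.
OUTPUT_TOKENS = 500
mini_seconds = estimate_response_seconds(0.98, 63.9, OUTPUT_TOKENS)
oss_seconds = estimate_response_seconds(0.11, 314.1, OUTPUT_TOKENS)

print(f"GPT-4.1 Mini (Free): {mini_seconds:.1f} s")
print(f"GPT OSS 20B:         {oss_seconds:.1f} s")
```

Under these assumptions, a 500-token reply takes roughly 8.8 s on GPT-4.1 Mini (Free) and roughly 1.7 s on GPT OSS 20B. Real timings vary with provider load and prompt length.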

Pricing

Input: $0.02/M tokens

Output: $0.10/M tokens
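
As a worked example of per-million-token pricing, the sketch below applies the GPT OSS 20B rates to the 14-day prompt and completion volumes listed for that model. The function name is hypothetical, and the result is only an illustration of the arithmetic: the activity figures are rounded, and actual billing may differ by provider.

```python
# Listed GPT OSS 20B prices, in USD per million tokens.
INPUT_USD_PER_M = 0.02
OUTPUT_USD_PER_M = 0.10


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost in USD from token counts at the listed rates."""
    input_cost = prompt_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = completion_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


# 14-day activity as listed: 135M prompt tokens, 12M completion tokens.
cost = estimate_cost_usd(135_000_000, 12_000_000)
print(f"${cost:.2f}")  # 135 * 0.02 + 12 * 0.10 = 2.70 + 1.20
```

At these rates the listed 14-day volume comes to about $3.90, while the same calculation for GPT-4.1 Mini (Free) is zero because both its input and output are listed as free.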

Features

Input Modalities: text, file

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools
