GLM 4.5 Air (free) vs GLM 5.1

Compare GLM 4.5 Air (free) and GLM 5.1 on key metrics including price, context length, throughput, and other model features.

GLM 4.5 Air (free)

Author: Z.ai

Context Length: 131.1k tokens

Supports Tools

GLM-4.5-Air is the lightweight version of our newest flagship model family, designed specifically for agent-focused applications. Like GLM-4.5, it uses a Mixture-of-Experts (MoE) architecture, but with a smaller parameter footprint. GLM-4.5-Air also supports hybrid inference modes, including a "thinking mode" for deeper reasoning and tool usage, and a "non-thinking mode" for real-time interactions.

Activity (last 14 days)

Prompt: 507M tokens

Completion: 31M tokens

Total: 538M tokens

Startup

Z.ai

Latency (p50): 35.89s

Throughput (p50): 18.2 tok/s

Pricing

Input: Free

Output: Free

Cached input: -

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

GLM 5.1

Author: Z.ai

Context Length: 202.8k tokens

Supports Tools

GLM-5.1 represents a major advance in coding ability, with especially notable improvements in tackling long-horizon tasks. Unlike earlier models designed for interactions lasting only minutes, GLM-5.1 can operate independently and continuously on a single task for over 8 hours, autonomously planning, executing, and refining its work throughout the process, ultimately producing complete, engineering-grade results.

Activity (last 14 days)

Prompt: 116M tokens

Completion: (not shown)

Total: 123M tokens

Startup

Z.ai

Latency (p50): 4.10s

Throughput (p50): 28.0 tok/s
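
The latency and throughput figures for the two models can be combined into a rough end-to-end response time. The sketch below is only an approximation. It assumes that the p50 latency is the wait before the first token arrives and that tokens then stream at the p50 throughput, neither of which the listing states. The function name `estimate_response_seconds` is hypothetical.

```python
def estimate_response_seconds(latency_s, throughput_tok_s, output_tokens):
    """Rough response time: initial latency plus streaming time.

    Assumption (not stated in the listing): latency is time to first token,
    and throughput stays constant for the whole completion.
    """
    if throughput_tok_s <= 0:
        raise ValueError("throughput must be positive")
    return latency_s + output_tokens / throughput_tok_s


# p50 figures copied from the comparison: (latency in s, throughput in tok/s)
MODELS = {
    "GLM 4.5 Air (free)": (35.89, 18.2),
    "GLM 5.1": (4.10, 28.0),
}

for name, (latency_s, throughput) in MODELS.items():
    seconds = estimate_response_seconds(latency_s, throughput, 500)
    print(f"{name}: ~{seconds:.1f}s for 500 output tokens")
```

Under these assumptions, a 500-token completion takes roughly 63 s on GLM 4.5 Air (free) and roughly 22 s on GLM 5.1. Most of that gap comes from the difference in latency, not throughput.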

Pricing

Input: $0.63/M tokens

Output: $1.98/M tokens

Cached input: $0.13/M tokens
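
To turn the GLM 5.1 per-million-token prices into a per-request cost, a minimal sketch follows. It assumes that billing is linear in token count and that cached input tokens are charged at the cached rate instead of the regular input rate. The listing does not spell out either rule, and `estimate_cost_usd` is a hypothetical helper name.

```python
# GLM 5.1 prices from the listing, in USD per million tokens.
INPUT_PER_M = 0.63
OUTPUT_PER_M = 1.98
CACHED_INPUT_PER_M = 0.13


def estimate_cost_usd(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the cost of one request in USD.

    cached_input_tokens is the part of input_tokens served from cache.
    Assumption: those tokens are billed at the cached rate only.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# 100k input tokens and 20k output tokens, without and with an 80k cache hit.
print(f"{estimate_cost_usd(100_000, 20_000):.4f}")
print(f"{estimate_cost_usd(100_000, 20_000, cached_input_tokens=80_000):.4f}")
```

With these numbers the request costs about $0.10 uncached and about $0.06 with the cache hit. The same request on GLM 4.5 Air (free) is listed as free for both input and output.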

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision
