Gemini 3.1 Flash Lite Preview vs GLM 5 Turbo

Compare Gemini 3.1 Flash Lite Preview and GLM 5 Turbo on key metrics including price, context length, throughput, and other model features.

AuthorGoogle

Context Length1.0M

Supports Tools

Gemini 3.1 Flash Lite Preview is Google’s high-efficiency model designed for high-throughput, high-volume use cases. It delivers better overall quality than Gemini 2.5 Flash Lite and comes close to Gemini 2.5 Flash performance across core capabilities. Enhancements include audio input/ASR, RAG snippet ranking, translation, data extraction, and code completion. It supports the full range of thinking levels (minimal, low, medium, high) to enable fine-grained cost/performance tuning. Pricing is set at half the cost of Gemini 3 Flash.

Activity

Last 14 days

Prompt

500M

Completion

56M

Total

555M

Startup

Google

Latency (p50)1.01s

Throughput (p50)128.9 tok/s

Pricing

Input$0.13/M tokens

Output$0.75/M tokens

Cached input$0.01/M tokens

Features

Input Modalitiestext, image, audio, file

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model

AuthorZ.ai

Context Length202.8k

Supports Tools

GLM-5 Turbo is a new model by Z.ai built for fast inference and strong performance in agent-based environments like OpenClaw scenarios. It is heavily optimized for real-world agent workflows with long execution chains, offering better decomposition of complex instructions, improved tool use, scheduled and persistent execution, and greater stability throughout extended tasks.

Activity

Last 14 days

Prompt

283M

Completion

Total

286M

Startup

Z.ai

Latency (p50)3.71s

Throughput (p50)24.9 tok/s

Pricing

Input$0.48/M tokens

Output$1.60/M tokens

Cached input$0.10/M tokens

Features

Input Modalitiestext

Output Modalitiestext

Supported EndpointsChat Completions

Vision

Supports Tools

Go to model