Grok 4.1 Fast Reasoning

grok-4.1-fast-reasoning
by xAI | Created Nov 20, 2025
Chat Completions

Grok 4.1 Fast Reasoning is xAI's most capable tool-calling model, engineered for production-grade agentic applications with a 2M-token context window. It achieves state-of-the-art results on Berkeley Function Calling v4 and leads agentic search benchmarks such as Research-Eval Reka (63.9) and FRAMES (87.6), and it excels at multi-turn conversations, long-horizon planning, and autonomous task execution. Trained with reinforcement learning (RL) in simulated real-world environments, Grok 4.1 Fast Reasoning performs strongly on complex enterprise scenarios such as customer support and finance, while cutting hallucination rates in half compared to its predecessor.

Pricing

Pay-as-you-go rates for this model.

Input Tokens (1M): $0.20
Cached Input Tokens (1M): $0.05
Output Tokens (1M): $0.50
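
As a rough illustration of how these rates combine, the sketch below estimates the cost of a single request. The function name `estimate_cost` is hypothetical. It also assumes that cached input tokens are counted within the input total and billed at the cached rate instead of the standard input rate. The exact billing rules are not spelled out on this page, so treat the result as an approximation.

```python
# Rates copied from the pricing list above, in USD per 1M tokens.
INPUT_PER_M = 0.20
CACHED_INPUT_PER_M = 0.05
OUTPUT_PER_M = 0.50


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_input_tokens is a subset of input_tokens and is
    billed at the cached rate rather than the standard input rate.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (
        uncached * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# 100,000 input tokens (20,000 of them cached) and 5,000 output tokens.
cost = estimate_cost(100_000, 5_000, cached_input_tokens=20_000)
print(f"${cost:.4f}")
```

For that example the arithmetic is 80,000 × $0.20 + 20,000 × $0.05 + 5,000 × $0.50, all divided by 1,000,000, which comes to roughly $0.0195.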

Capabilities

Input Modalities

Text
Image

Output Modalities

Text
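
Since the model accepts images as input, a request can pair text with an image. The sketch below builds such a message using the content-part layout of the OpenAI Chat Completions format (`text` and `image_url` parts). That this endpoint accepts exactly this layout is an assumption, and the helper name `build_image_message` and the example URL are placeholders.

```python
def build_image_message(prompt: str, image_url: str) -> dict:
    """Build a user message that combines text with one image (hypothetical helper).

    Assumption: the endpoint accepts OpenAI-style content parts.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL; substitute a reachable image of your own.
message = build_image_message(
    "Describe this chart.", "https://example.com/chart.png"
)
```

The resulting dictionary can be passed as an element of the `messages` list in a chat completion request.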

Supported Parameters

Available parameters for API requests

Max Completion Tokens
Parallel Tool Calls
Response Format
Temperature
Tool Choice
Tools
Top P
Web Search Options
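
To show how several of these parameters might fit together, here is a minimal sketch that assembles keyword arguments for a tool-calling request. It assumes the listed parameters map to the OpenAI SDK names `max_completion_tokens`, `parallel_tool_calls`, `temperature`, `tool_choice`, `tools`, and `top_p`. The `get_weather` tool and the helper `build_request_kwargs` are hypothetical. Response Format and Web Search Options are left out because their accepted values are not described on this page.

```python
def build_request_kwargs(question: str) -> dict:
    """Assemble chat completion arguments (hypothetical helper)."""
    # Hypothetical tool definition in the OpenAI function-calling schema.
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": "grok-4.1-fast-reasoning",
        "messages": [{"role": "user", "content": question}],
        "tools": [weather_tool],
        "tool_choice": "auto",          # let the model decide whether to call a tool
        "parallel_tool_calls": True,    # allow several tool calls in one turn
        "max_completion_tokens": 512,   # cap on generated tokens
        "temperature": 0.2,
        "top_p": 0.9,
    }


kwargs = build_request_kwargs("What's the weather in Paris and in Rome?")
```

With a configured client, these arguments could be passed as `client.chat.completions.create(**kwargs)`.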

Code Example

Basic request example for calling this model through our API with Python (OpenAI SDK). Replace YOUR_API_KEY with your own API key, and make sure the key has permission to use this model. For more details, see our documentation.

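from openai import OpenAI

# Point the OpenAI SDK at this platform's endpoint.
client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",  # replace with your API key
)

# Send a single-turn chat completion request.
resp = client.chat.completions.create(
    model="grok-4.1-fast-reasoning",
    messages=[
        {"role": "user", "content": "What's 2+2?"}
    ],
    temperature=0.2,
)

# Print the text of the first returned choice.
print(resp.choices[0].message.content)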