Grok 4.1 Fast Reasoning

Chat Completions

grok-4.1-fast-reasoning

xAI | Created Nov 20, 2025 | 2M context

Grok 4.1 Fast Reasoning is xAI's most capable tool-calling model, engineered for production-grade agentic applications with a 2M-token context window. It achieves state-of-the-art results on Berkeley Function Calling v4 and leads agentic search benchmarks such as Research-Eval Reka (63.9) and FRAMES (87.6), and it excels at multi-turn conversations, long-horizon planning, and autonomous task execution. Trained with reinforcement learning (RL) in simulated real-world environments, Grok 4.1 Fast Reasoning performs strongly on complex enterprise scenarios such as customer support and finance, while halving the hallucination rate of its predecessor.

Pricing (-50%)

Pay-as-you-go rates for this model.

Input Tokens (1M): $0.10

Cached Input Tokens (1M): $0.02

Output Tokens (1M): $0.25
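The per-million-token rates above translate into a request cost with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, and it assumes that cached input tokens are billed at the cached rate in place of the regular input rate and that `input_tokens` counts only non-cached tokens. The page does not spell out either billing detail.

```python
# USD per 1M tokens, copied from the pricing list on this page.
RATES_PER_MILLION = {
    "input": 0.10,
    "cached_input": 0.02,
    "output": 0.25,
}


def estimate_cost(input_tokens: int, output_tokens: int, cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (not confirmed by the page): input_tokens excludes cached
    tokens, and cached tokens are billed at the cached rate only.
    """
    total = (
        input_tokens * RATES_PER_MILLION["input"]
        + cached_input_tokens * RATES_PER_MILLION["cached_input"]
        + output_tokens * RATES_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200k fresh input tokens, 50k cached input tokens, 10k output tokens.
    cost = estimate_cost(200_000, 10_000, cached_input_tokens=50_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, one million non-cached input tokens plus one million output tokens would come to about $0.35.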

Capabilities

Input Modalities

Text, Image

Output Modalities

Text
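Since the model accepts image input, a request message can carry an image alongside text. The sketch below builds such a message in the OpenAI-style content-parts format with a base64 data URL. That this endpoint accepts the format is an assumption, not something the page confirms, and the helper name `image_message` is hypothetical.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message containing text plus one inline image.

    Assumption: the endpoint accepts OpenAI-style "image_url" content
    parts with a base64 data URL.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read them from an image file.
    msg = image_message("Describe this image.", b"\x89PNG")
    print(msg["content"][0]["text"])
```

The resulting dictionary would go into the `messages` list of a chat completion request.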

Supported Parameters

Available parameters for API requests

Max Completion Tokens, Response Format, Temperature, Tool Choice, Tools, Top P, Web Search Options, Parallel Tool Calls
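Several of the listed parameters can be combined in one request. The sketch below assembles the keyword arguments for a tool-calling request using the OpenAI SDK's snake_case parameter names. The `get_weather` tool is hypothetical, and the specific values (512 tokens, `top_p=0.9`) are arbitrary illustrations rather than recommendations from this page.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

request_kwargs = {
    "model": "grok-4.1-fast-reasoning",
    "messages": [{"role": "user", "content": "What's the weather in Paris and Rome?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",          # let the model decide whether to call a tool
    "parallel_tool_calls": True,    # allow more than one tool call per turn
    "max_completion_tokens": 512,   # arbitrary illustrative cap
    "temperature": 0.2,
    "top_p": 0.9,
}

# With a configured client, this would be sent as:
#   resp = client.chat.completions.create(**request_kwargs)
print(json.dumps(request_kwargs, indent=2))
```

Keeping the arguments in a plain dictionary makes it easy to log or inspect a request before sending it.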

Usage Analytics

[Charts: token usage of this model on our platform; throughput; time-to-first-token (TTFT)]

Code Example

Example code for calling this model through our API with Python (OpenAI SDK) or cURL. Replace the placeholders with your API key and model ID.

Basic request example. Make sure your API key has the required permissions. For more details, see our documentation.

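from openai import OpenAI

# Point the OpenAI SDK at this platform's endpoint.
client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",  # replace with your API key
)

# Send a single-turn chat completion request.
resp = client.chat.completions.create(
    model="grok-4.1-fast-reasoning",
    messages=[
        {"role": "user", "content": "What's 2+2?"}
    ],
    temperature=0.2,
)

# Print the text of the first returned choice.
print(resp.choices[0].message.content)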