Gemini 3 Flash Preview is a fast, cost-effective reasoning model built for agentic workflows, multi-turn conversation, and coding assistance. It offers near-Pro-level reasoning and tool use while delivering significantly lower latency than larger Gemini models, making it well suited to interactive development, long-running agent loops, and pair programming. Compared with Gemini 2.5 Flash, it improves noticeably on reasoning ability, multimodal comprehension, and overall reliability.

The model supports a 1M-token context window and accepts multimodal input (text, images, audio, video, and PDFs) with text output. Configurable reasoning levels, structured outputs, tool calling, and automatic context caching make it a strong choice for agentic use without the cost or latency of larger models.
from openai import OpenAI

# Point the standard OpenAI client at the OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="gemini-3-flash-preview",
    messages=[
        {"role": "user", "content": "What's 2+2?"},
    ],
    temperature=0.2,
)

# The reply text lives on the first choice's message.
print(resp.choices[0].message.content)
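Since the model accepts image input, the same endpoint can be sent mixed text-and-image messages using the OpenAI-compatible content-part format. A minimal sketch, assuming the endpoint forwards `image_url` parts with inline base64 data URLs unchanged; the helper name `build_image_message` is illustrative, not part of any documented API:

```python
import base64

def build_image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one user message combining a text part and an inline base64 image part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Inline data URL; remote HTTPS URLs are the other common form.
                "image_url": {"url": f"data:{mime};base64,{b64}"},
            },
        ],
    }

# Placeholder bytes for illustration; pass real image bytes in practice.
msg = build_image_message("Describe this chart.", b"\x89PNG...")
```

The resulting dict goes straight into the `messages` list of `client.chat.completions.create`, alongside plain text messages.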
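For the structured-outputs feature, the OpenAI-compatible route is the `response_format` parameter in JSON-schema mode. A sketch of the request payload, assuming the endpoint passes `json_schema` mode through unchanged; the schema and its `person` name are purely illustrative:

```python
# Illustrative JSON-schema response_format payload for structured output.
person_schema = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",          # arbitrary schema name, assumed here
        "strict": True,            # ask the model to match the schema exactly
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
            "additionalProperties": False,
        },
    },
}

# Passed as: client.chat.completions.create(..., response_format=person_schema)
```

With this in place, `resp.choices[0].message.content` should be a JSON string conforming to the schema, ready for `json.loads`.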