Qwen3 VL 235B A22B Instruct

qwen3-vl-235b-a22b-instruct
byQwen|Created Sep 24, 2025
Chat Completions

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that combines strong text generation with advanced visual understanding for images and video. Designed for general vision-language tasks like VQA, document parsing, chart and table extraction, and multilingual OCR, the model emphasizes robust perception, spatial (2D/3D) understanding, and long-form visual comprehension, with competitive results on public benchmarks. Qwen3-VL also supports agentic interaction and tool use, following complex instructions in multi-image dialogues, aligning text to video timelines, operating GUIs for automation, and enabling visual coding workflows such as turning sketches into code or debugging UIs. Its strong text-only capabilities match Qwen3 language models, making it suitable for document AI, OCR, UI/software assistance, spatial reasoning, and vision-language agent research.

Code Example

Example code for using this model through our API with Python (OpenAI SDK) or cURL. Replace placeholders with your API key and model ID.

Basic request example. Ensure API key permissions. For more details, see our documentation.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.naga.ac/v1",
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="qwen3-vl-235b-a22b-instruct",
    messages=[
        {{"role": "user", "content": "What's 2+2?"}}
    ],
    temperature=0.2,
)
print(resp.choices[0].message.content)