Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a sparse Mixture-of-Experts with 400B total parameters and roughly 13B active per token via 4-of-256 expert routing. It delivers strong performance in creative writing, storytelling, role-play, and conversational use, surpassing typical reasoning models in these areas, particularly for real-time voice assistance. The model also brings advanced agentic capabilities: it is optimized for agent frameworks such as OpenCode, Cline, and Kilo Code, and handles complex toolchains and long, constraint-heavy prompts. The architecture natively supports context windows of up to 512k tokens, though the current Preview API is served at 128k context with 8-bit quantization for efficient deployment. Trinity-Large-Preview reflects Arcee's efficiency-first design: a production-grade frontier model with open weights and permissive licensing, suited to both real-world deployment and rigorous experimentation.
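
To make the sparse-routing claim concrete, here is a minimal toy sketch of top-4-of-256 expert routing in PyTorch. The hidden size, expert layers, and gating details are illustrative assumptions rather than Trinity's actual implementation; the point is only to show why just a small fraction of the total parameters is active for any given token.

```python
import torch

# Toy sketch of sparse Mixture-of-Experts routing as described above:
# each token is scored against 256 experts and only the top 4 are run,
# so only a small slice of the total parameters is active per token.
# All dimensions here are illustrative, not Trinity's real sizes.

NUM_EXPERTS = 256   # experts per MoE layer (as described)
TOP_K = 4           # experts activated per token (as described)
D_MODEL = 64        # toy hidden size for illustration only

router = torch.nn.Linear(D_MODEL, NUM_EXPERTS, bias=False)
experts = torch.nn.ModuleList(
    [torch.nn.Linear(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (tokens, D_MODEL). Routes each token to its top-k experts."""
    probs = router(x).softmax(dim=-1)                       # (tokens, NUM_EXPERTS)
    weights, idx = torch.topk(probs, TOP_K, dim=-1)         # keep the 4 best experts
    weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize over the chosen k
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for k in range(TOP_K):
            out[t] += weights[t, k] * experts[int(idx[t, k])](x[t])
    return out

tokens = torch.randn(8, D_MODEL)
print(moe_forward(tokens).shape)  # torch.Size([8, 64])
```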
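
For the served Preview API, a request through any OpenAI-compatible chat-completions client would look roughly like the sketch below. The base URL and model identifier are placeholders assumed for illustration, not documented values from this page.

```python
from openai import OpenAI

# Hypothetical usage sketch: endpoint and model ID below are placeholders,
# not taken from the description above. Any OpenAI-compatible provider
# serving Trinity-Large-Preview would follow the same pattern.
client = OpenAI(
    base_url="https://example-inference-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="arcee-ai/trinity-large-preview",  # placeholder model ID
    messages=[
        {"role": "system", "content": "You are a concise creative-writing assistant."},
        {"role": "user", "content": "Write a two-sentence opening for a mystery set on a night train."},
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```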