Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a sparse Mixture-of-Experts with 400B total parameters and roughly 13B active per token via 4-of-256 expert routing. It delivers strong performance in creative writing, storytelling, role-play, and conversational use, surpassing typical reasoning models in these areas, particularly for real-time voice assistance. The model also brings advanced agentic capabilities: it is optimized for agent frameworks such as OpenCode, Cline, and Kilo Code, and handles complex toolchains and long, constraint-heavy prompts. The architecture natively supports context windows of up to 512k tokens, though the current Preview API is served at 128k context with 8-bit quantization for efficient deployment. Trinity-Large-Preview reflects Arcee's efficiency-first design: a production-grade frontier model with open weights and permissive licensing, suited to both real-world deployment and rigorous experimentation.
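
To make the sparse-routing claim concrete, here is a minimal toy sketch of top-4-of-256 expert routing in PyTorch. The hidden size, expert layers, and gating details are illustrative assumptions rather than Trinity's actual implementation; the point is only to show why just a small fraction of the total parameters is active for any given token.

```python
import torch

# Toy sketch of sparse Mixture-of-Experts routing as described above:
# each token is scored against 256 experts and only the top 4 are run,
# so only a small slice of the total parameters is active per token.
# All dimensions here are illustrative, not Trinity's real sizes.

NUM_EXPERTS = 256   # experts per MoE layer (as described)
TOP_K = 4           # experts activated per token (as described)
D_MODEL = 64        # toy hidden size for illustration only

router = torch.nn.Linear(D_MODEL, NUM_EXPERTS, bias=False)
experts = torch.nn.ModuleList(
    [torch.nn.Linear(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)]
)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """x: (tokens, D_MODEL). Routes each token to its top-k experts."""
    probs = router(x).softmax(dim=-1)                       # (tokens, NUM_EXPERTS)
    weights, idx = torch.topk(probs, TOP_K, dim=-1)         # keep the 4 best experts
    weights = weights / weights.sum(dim=-1, keepdim=True)   # renormalize over the chosen k
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for k in range(TOP_K):
            out[t] += weights[t, k] * experts[int(idx[t, k])](x[t])
    return out

tokens = torch.randn(8, D_MODEL)
print(moe_forward(tokens).shape)  # torch.Size([8, 64])
```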
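
For the served Preview API, a request through any OpenAI-compatible chat-completions client would look roughly like the sketch below. The base URL and model identifier are placeholders assumed for illustration, not documented values from this page.

```python
from openai import OpenAI

# Hypothetical usage sketch: endpoint and model ID below are placeholders,
# not taken from the description above. Any OpenAI-compatible provider
# serving Trinity-Large-Preview would follow the same pattern.
client = OpenAI(
    base_url="https://example-inference-provider.com/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="arcee-ai/trinity-large-preview",  # placeholder model ID
    messages=[
        {"role": "system", "content": "You are a concise creative-writing assistant."},
        {"role": "user", "content": "Write a two-sentence opening for a mystery set on a night train."},
    ],
    max_tokens=200,
)
print(response.choices[0].message.content)
```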