Nemotron 3 Super (free) vs Qwen3.7 Plus

Compare Nemotron 3 Super (free) and Qwen3.7 Plus on key metrics including price, context length, throughput, and other model features.

Nemotron 3 Super (free)

Author: Nvidia

Context Length: 262.1k

Supports Tools

NVIDIA Nemotron 3 Super is an open hybrid MoE model with 120B parameters, using only 12B active parameters to achieve high computational efficiency and strong accuracy in complex multi-agent scenarios. Based on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it offers more than 50% faster token generation than leading open models. The model includes a 1M-token context window, enabling long-term agent consistency, cross-document reasoning, and multi-step task planning. Latent MoE makes it possible to engage 4 experts at the inference cost of just one, enhancing both intelligence and generalization. Reinforcement learning across more than 10 environments provides top-tier benchmark performance, including AIME 2025, TerminalBench, and SWE-Bench Verified. Released fully open with weights, datasets, and recipes under the NVIDIA Open License, Nemotron 3 Super supports simple customization and secure deployment in any environment — from local workstations to the cloud.

Activity (last 14 days)

Prompt: 343M

Completion: 20M

Total: 363M

Startup

Nvidia

Latency (p50): 5.55s

Throughput (p50): 37.0 tok/s

Pricing

Input: Free

Output: Free

Cached input: -

Features

Input Modalities: text

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools


Qwen3.7 Plus

Author: Qwen

Context Length: 1M

Supports Tools

Qwen3.7-Plus is a cost-effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, building on the series' text capabilities with a comprehensive upgrade to its vision-language abilities while retaining full-stack, agent-level intelligence for coding, tool use, and productivity workflows. Its distinguishing trait is multi-modal interactive hybrid agent capability: it can perceive real-world scenes, read screens and interact with GUIs, generate code from visual references, and perform end-to-end navigation within mobile apps.

Activity (last 14 days)

Prompt: 160M

Completion

Total: 168M

Startup

Qwen

Latency (p50): 1.64s

Throughput (p50): 53.5 tok/s
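
The p50 latency and throughput figures for the two models can be combined into a rough estimate of how long a response takes. The sketch below assumes that latency means time to the first token and that throughput is a steady generation rate after that. The page defines neither term, so treat both as assumptions. The function name and the 370-token and 535-token response sizes are illustrative only.

```python
def estimate_response_seconds(latency_s: float,
                              throughput_tok_s: float,
                              output_tokens: int) -> float:
    """Rough wall-clock estimate: latency + output_tokens / throughput.

    Assumes latency is time to first token and throughput is constant
    afterwards; real requests vary with load and prompt length.
    """
    if throughput_tok_s <= 0:
        raise ValueError("throughput must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return latency_s + output_tokens / throughput_tok_s


# p50 figures as listed on this page.
MODELS = {
    "Nemotron 3 Super (free)": {"latency_s": 5.55, "throughput_tok_s": 37.0},
    "Qwen3.7 Plus": {"latency_s": 1.64, "throughput_tok_s": 53.5},
}

for name, stats in MODELS.items():
    seconds = estimate_response_seconds(
        stats["latency_s"], stats["throughput_tok_s"], output_tokens=500
    )
    print(f"{name}: about {seconds:.1f}s for a 500-token reply")
```

Because these are medians, the result is a typical-case figure, not a guarantee for any single request.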

Pricing

Input: $0.20/M tokens

Output: $0.80/M tokens

Cached input: $0.04/M tokens
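
The per-million-token prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming cached tokens are a subset of the prompt tokens and are billed at the cached rate instead of the regular input rate. The page does not spell out the billing rules, and the function name is hypothetical.

```python
# Qwen3.7 Plus prices as listed on this page, in USD per million tokens.
INPUT_USD_PER_M = 0.20
OUTPUT_USD_PER_M = 0.80
CACHED_INPUT_USD_PER_M = 0.04


def estimate_cost_usd(prompt_tokens: int,
                      completion_tokens: int,
                      cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_tokens is the part of prompt_tokens served from
    cache and is billed at the cached rate instead of the input rate.
    """
    if min(prompt_tokens, completion_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_USD_PER_M
        + cached_tokens * CACHED_INPUT_USD_PER_M
        + completion_tokens * OUTPUT_USD_PER_M
    )
    return total / 1_000_000


# Example: a 20k-token prompt, half of it cached, with a 1k-token reply.
cost = estimate_cost_usd(20_000, 1_000, cached_tokens=10_000)
print(f"Estimated cost: ${cost:.4f}")
```

Nemotron 3 Super (free) is listed at no charge for input and output, so the same calculation for it is zero.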

Features

Input Modalities: text, image

Output Modalities: text

Supported Endpoints: Chat Completions

Vision

Supports Tools
