Meta Llama

Browse models from Meta Llama

14 models

Llama 3.3 70B Instruct (Free)

2M Tokens

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by Meta Llama
Free

Llama 4 Scout 17B 16E Instruct (Free)

944K Tokens

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts (the "16E" in its name), of which one routed expert plus a shared expert is active per token, and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.

by Meta Llama
Free

Llama 3.1 8B Instruct

55K Tokens

Meta’s Llama 3.1 8B instruct-tuned model, designed for fast and efficient dialogue. It performs strongly in human evaluations and is ideal for applications requiring a balance of speed and quality.

by Meta Llama
$0.05/1M input tokens · $0.05/1M output tokens
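
Per-token pricing like the above translates to request cost as a simple proportion of the listed per-million-token rates. A minimal sketch in Python, using the Llama 3.1 8B Instruct rates from this listing with hypothetical token counts:

```python
# Estimate the cost of one request from per-million-token rates.
# Rates are the listed Llama 3.1 8B Instruct prices; token counts are hypothetical.
INPUT_RATE_PER_M = 0.05   # USD per 1M input tokens
OUTPUT_RATE_PER_M = 0.05  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M + (
        output_tokens / 1_000_000
    ) * OUTPUT_RATE_PER_M

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.6f}")  # -> $0.000125
```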

Llama 4 Maverick 17B 128E Instruct

313K Tokens

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward pass (400B total). It supports multilingual text and image input, and produces multilingual text and code output across 12 supported languages. Optimized for vision-language tasks, Maverick is instruction-tuned for assistant-like behavior, image reasoning, and general-purpose multimodal interaction. It uses early fusion for native multimodality and offers a 1 million token context window.

by Meta Llama
$0.28/1M input tokens · $1.10/1M output tokens
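
Because Maverick accepts mixed text-and-image input, a request typically interleaves an image with a text prompt. The sketch below uses the OpenAI-compatible chat format that many hosts expose for Llama models; the base URL, API key, and model identifier are placeholders, not values taken from this listing:

```python
# Hypothetical multimodal (image + text) request to Llama 4 Maverick through an
# OpenAI-compatible endpoint. Base URL, key, and model id are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-host/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="meta-llama/llama-4-maverick-17b-128e-instruct",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is shown in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```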

Llama 3.2 1B Instruct

18K Tokens

Llama 3.2 1B is a 1-billion-parameter language model focused on efficient natural language tasks, including summarization, dialogue, and multilingual text analysis. Its small size allows for deployment in low-resource environments while maintaining strong performance across eight core languages.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens

Llama 3.2 11B Vision Instruct

1K Tokens

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed for tasks combining visual and textual data. It excels at image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive dataset of image-text pairs, it is ideal for content creation, AI-driven customer service, and research.

by Meta Llama
$0.10/1M input tokens · $0.10/1M output tokens

Llama 4 Scout 17B 16E Instruct

17K Tokens

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts (the "16E" in its name), of which one routed expert plus a shared expert is active per token, and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.

by Meta Llama
$0.24/1M input tokens · $0.96/1M output tokens

Llama 3 70B Instruct

2K Tokens

Meta’s Llama 3 70B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Suitable for advanced conversational AI tasks.

by Meta Llama
$0.42/1M input tokens · $0.44/1M output tokens

Llama 3.2 3B Instruct

44K Tokens

Llama 3.2 3B is a 3-billion-parameter multilingual model optimized for advanced NLP tasks such as dialogue generation, reasoning, and summarization. It supports eight languages and is trained on 9 trillion tokens, excelling in instruction-following, complex reasoning, and tool use.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens

Llama 3 8B Instruct

57K Tokens

Meta’s Llama 3 8B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Ideal for efficient conversational AI.

by Meta Llama
$0.07/1M input tokens · $0.10/1M output tokens

Llama 3.1 70B Instruct

17K Tokens

Meta’s Llama 3.1 70B instruct-tuned model, optimized for high-quality dialogue use cases. It demonstrates strong performance in human evaluations and is suitable for a wide range of conversational AI applications.

by Meta Llama
$0.30/1M input tokens · $0.40/1M output tokens

Llama 3.1 405B Instruct

16K Tokens

Meta's Llama 3.1 405B instruct-tuned model, the flagship of the Llama 3.1 family, offers a 128K context window and strong evaluation scores. It is optimized for high-quality dialogue and performs competitively with leading closed-source models, including GPT-4o and Claude 3.5 Sonnet.

by Meta Llama
$1.50/1M input tokens · $1.50/1M output tokens

Llama Guard 4 12B

74K Tokens

Llama Guard 4 is a multimodal content safety classifier derived from Llama 4 Scout, fine-tuned for both prompt and response classification. It supports content moderation in English and several other languages, including mixed text-and-image prompts. The model is aligned with the MLCommons hazards taxonomy and is integrated into the Llama Moderations API for robust safety classification in text and images.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens
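
In practice, a safety classifier like Llama Guard is called much like any other chat model: the prompt (or a prompt/response pair) is sent as messages and the model returns a safety verdict. A minimal sketch; the endpoint, model identifier, and the verdict format ("safe" / "unsafe" plus a hazard category, as in earlier Llama Guard releases) are assumptions, not guaranteed by this listing:

```python
# Hypothetical prompt-safety check with Llama Guard 4 via an OpenAI-compatible
# endpoint. Base URL, key, and model id are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-host/v1", api_key="YOUR_API_KEY")

completion = client.chat.completions.create(
    model="meta-llama/llama-guard-4-12b",  # placeholder model id
    messages=[{"role": "user", "content": "User message to classify goes here."}],
)

verdict = completion.choices[0].message.content.strip()
# Earlier Llama Guard versions reply with "safe", or "unsafe" followed by a
# hazard category code (e.g. "S9"); treat that exact format as an assumption.
print("flagged" if verdict.lower().startswith("unsafe") else "ok")
```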

Llama 3.3 70B Instruct

537K Tokens

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by Meta Llama
$0.29/1M input tokens · $0.39/1M output tokens