Meta Llama

Browse models from Meta Llama

14 models

Llama 3.3 70B Instruct (Free)

2M Tokens

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by Meta Llama
Free

Llama 4 Scout 17B 16E Instruct (Free)

944K Tokens

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts (the "16E" in its name), of which one routed expert plus a shared expert is active per token, and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.

by Meta Llama
Free

Llama 3.1 8B Instruct

55K Tokens

Meta’s Llama 3.1 8B instruct-tuned model, designed for fast and efficient dialogue. It performs strongly in human evaluations and is ideal for applications requiring a balance of speed and quality.

by Meta Llama
$0.05/1M input tokens · $0.05/1M output tokens
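
Per-token pricing like the above translates to request cost as a simple proportion of the listed per-million-token rates. A minimal sketch in Python, using the Llama 3.1 8B Instruct rates from this listing with hypothetical token counts:

```python
# Estimate the cost of one request from per-million-token rates.
# Rates are the listed Llama 3.1 8B Instruct prices; token counts are hypothetical.
INPUT_RATE_PER_M = 0.05   # USD per 1M input tokens
OUTPUT_RATE_PER_M = 0.05  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M + (
        output_tokens / 1_000_000
    ) * OUTPUT_RATE_PER_M

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${request_cost(2_000, 500):.6f}")  # -> $0.000125
```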

Llama 4 Maverick 17B 128E Instruct

313K Tokens

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward pass (400B total). It supports multilingual text and image input, and produces multilingual text and code output across 12 supported languages. Optimized for vision-language tasks, Maverick is instruction-tuned for assistant-like behavior, image reasoning, and general-purpose multimodal interaction. It uses early fusion for native multimodality and offers a 1 million token context window.

by Meta Llama
$0.28/1M input tokens · $1.10/1M output tokens
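
Because Maverick accepts mixed text-and-image input, a request typically interleaves an image with a text prompt. The sketch below uses the OpenAI-compatible chat format that many hosts expose for Llama models; the base URL, API key, and model identifier are placeholders, not values taken from this listing:

```python
# Hypothetical multimodal (image + text) request to Llama 4 Maverick through an
# OpenAI-compatible endpoint. Base URL, key, and model id are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-host/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="meta-llama/llama-4-maverick-17b-128e-instruct",  # placeholder model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is shown in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```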

Llama 3.2 1B Instruct

18K Tokens

Llama 3.2 1B is a 1-billion-parameter language model focused on efficient natural language tasks, including summarization, dialogue, and multilingual text analysis. Its small size allows for deployment in low-resource environments while maintaining strong performance across eight core languages.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens

Llama 3.2 11B Vision Instruct

1K Tokens

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed for tasks combining visual and textual data. It excels at image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive dataset of image-text pairs, it is ideal for content creation, AI-driven customer service, and research.

by Meta Llama
$0.10/1M input tokens · $0.10/1M output tokens

Llama 4 Scout 17B 16E Instruct

17K Tokens

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts (the "16E" in its name), of which one routed expert plus a shared expert is active per token, and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.

by Meta Llama
$0.24/1M input tokens · $0.96/1M output tokens

Llama 3 70B Instruct

2K Tokens

Meta’s Llama 3 70B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Suitable for advanced conversational AI tasks.

by Meta Llama
$0.42/1M input tokens · $0.44/1M output tokens

Llama 3.2 3B Instruct

44K Tokens

Llama 3.2 3B is a 3-billion-parameter multilingual model optimized for advanced NLP tasks such as dialogue generation, reasoning, and summarization. It supports eight languages and is trained on 9 trillion tokens, excelling in instruction-following, complex reasoning, and tool use.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens

Llama 3 8B Instruct

57K Tokens

Meta’s Llama 3 8B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Ideal for efficient conversational AI.

by Meta Llama
$0.07/1M input tokens · $0.10/1M output tokens

Llama 3.1 70B Instruct

17K Tokens

Meta’s Llama 3.1 70B instruct-tuned model, optimized for high-quality dialogue use cases. It demonstrates strong performance in human evaluations and is suitable for a wide range of conversational AI applications.

by Meta Llama
$0.30/1M input tokens · $0.40/1M output tokens

Llama 3.1 405B Instruct

16K Tokens

Meta's Llama 3.1 405B instruct-tuned model, the flagship of the Llama 3.1 family, offers a 128K context window and strong evaluation scores. It is optimized for high-quality dialogue and performs competitively with leading closed-source models, including GPT-4o and Claude 3.5 Sonnet.

by Meta Llama
$1.50/1M input tokens · $1.50/1M output tokens

Llama Guard 4 12B

74K Tokens

Llama Guard 4 is a multimodal content safety classifier derived from Llama 4 Scout, fine-tuned for both prompt and response classification. It supports content moderation in English and several other languages, including mixed text-and-image prompts. The model is aligned with the MLCommons hazards taxonomy and is integrated into the Llama Moderations API for robust safety classification in text and images.

by Meta Llama
$0.02/1M input tokens · $0.02/1M output tokens
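
In practice, a safety classifier like Llama Guard is called much like any other chat model: the prompt (or a prompt/response pair) is sent as messages and the model returns a safety verdict. A minimal sketch; the endpoint, model identifier, and the verdict format ("safe" / "unsafe" plus a hazard category, as in earlier Llama Guard releases) are assumptions, not guaranteed by this listing:

```python
# Hypothetical prompt-safety check with Llama Guard 4 via an OpenAI-compatible
# endpoint. Base URL, key, and model id are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://example-host/v1", api_key="YOUR_API_KEY")

completion = client.chat.completions.create(
    model="meta-llama/llama-guard-4-12b",  # placeholder model id
    messages=[{"role": "user", "content": "User message to classify goes here."}],
)

verdict = completion.choices[0].message.content.strip()
# Earlier Llama Guard versions reply with "safe", or "unsafe" followed by a
# hazard category code (e.g. "S9"); treat that exact format as an assumption.
print("flagged" if verdict.lower().startswith("unsafe") else "ok")
```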

Llama 3.3 70B Instruct

537K Tokens

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

by Meta Llama
$0.29/1M input tokens · $0.39/1M output tokens