Meta Llama
Browse models from Meta Llama
Llama 3.3 70B Instruct (Free)
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
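For reference, the sketch below shows one way a multilingual instruct model like this is typically queried through an OpenAI-compatible chat completions client. The base URL, API key, and model identifier are placeholder assumptions, not documented values for any particular provider.

```python
# Minimal sketch of a multilingual chat request, assuming an OpenAI-compatible
# chat completions endpoint. The base URL, API key, and model identifier are
# placeholders, not confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct:free",  # placeholder model ID
    messages=[
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        # German prompt; the model is tuned for dialogue in the eight listed languages.
        {"role": "user", "content": "Fasse die Vorteile von Transformern in zwei Sätzen zusammen."},
    ],
)

print(response.choices[0].message.content)
```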
Llama 4 Scout 17B 16E Instruct (Free)
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout uses 16 experts per forward pass and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.
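Because Scout accepts mixed text-and-image input, the sketch below illustrates a multimodal message, assuming the serving layer mirrors the OpenAI-style vision message format. The endpoint, API key, model ID, and image URL are all placeholders.

```python
# Minimal sketch of a text-plus-image request, assuming an OpenAI-compatible
# vision message format. Endpoint, API key, model ID, and image URL are
# placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout:free",  # placeholder model ID
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},  # placeholder image URL
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```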
Llama 3.1 8B Instruct
Meta’s Llama 3.1 8B instruct-tuned model, designed for fast and efficient dialogue. It performs strongly in human evaluations and is ideal for applications requiring a balance of speed and quality.
Llama 4 Maverick 17B 128E Instruct
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward pass (400B total). It supports multilingual text and image input, and produces multilingual text and code output across 12 supported languages. Optimized for vision-language tasks, Maverick is instruction-tuned for assistant-like behavior, image reasoning, and general-purpose multimodal interaction. It uses early fusion for native multimodality and offers a 1 million token context window.
Llama 3.2 1B Instruct
Llama 3.2 1B is a 1-billion-parameter language model focused on efficient natural language tasks, including summarization, dialogue, and multilingual text analysis. Its small size allows for deployment in low-resource environments while maintaining strong performance across eight core languages.
Llama 3.2 11B Vision Instruct
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed for tasks combining visual and textual data. It excels at image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive dataset of image-text pairs, it is ideal for content creation, AI-driven customer service, and research.
Llama 4 Scout 17B 16E Instruct
Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model from Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout uses 16 experts per forward pass and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, it is instruction-tuned for multilingual chat, captioning, and image understanding.
Llama 3 70B Instruct
Meta’s Llama 3 70B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Suitable for advanced conversational AI tasks.
Llama 3.2 3B Instruct
Llama 3.2 3B is a 3-billion-parameter multilingual model optimized for advanced NLP tasks such as dialogue generation, reasoning, and summarization. It supports eight languages and is trained on 9 trillion tokens, excelling in instruction-following, complex reasoning, and tool use.
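Since this model is tuned for tool use, the sketch below shows an OpenAI-style function-calling request, assuming the provider supports the standard `tools` schema. The endpoint, API key, model ID, and the `get_weather` tool are hypothetical illustrations.

```python
# Minimal sketch of a tool-use (function-calling) request, assuming an
# OpenAI-style "tools" schema. Endpoint, API key, model ID, and the
# get_weather tool are hypothetical.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_API_KEY")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="meta-llama/llama-3.2-3b-instruct",  # placeholder model ID
    messages=[{"role": "user", "content": "What's the weather in Lisbon right now?"}],
    tools=tools,
)

# If the model chooses to call the tool, the structured call appears here.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```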
Llama 3 8B Instruct
Meta’s Llama 3 8B instruct-tuned model, optimized for high-quality dialogue and demonstrating strong performance in human evaluations. Ideal for efficient conversational AI.
Llama 3.1 70B Instruct
Meta’s Llama 3.1 70B instruct-tuned model, optimized for high-quality dialogue use cases. It demonstrates strong performance in human evaluations and is suitable for a wide range of conversational AI applications.
Llama 3.1 405B Instruct
The highly anticipated 400B class of Llama 3 is here, offering a 128K context window and impressive evaluation scores. This 405B instruct-tuned version is optimized for high-quality dialogue and demonstrates strong performance against leading closed-source models, including GPT-4o and Claude 3.5 Sonnet.
Llama Guard 4 12B
Llama Guard 4 is a multimodal content safety classifier derived from Llama 4 Scout, fine-tuned for both prompt and response classification. It supports content moderation for English and multiple languages, including mixed text-and-image prompts. The model is aligned with the MLCommons hazards taxonomy and is integrated into the Llama Moderations API for robust safety classification in text and images.
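A minimal sketch of prompt classification with a Guard-style model is shown below, assuming an OpenAI-compatible endpoint. The model ID is a placeholder, and the "safe"/"unsafe" plus hazard-code output convention, while typical of Llama Guard models, should be verified against the model card.

```python
# Minimal sketch of prompt classification with Llama Guard, assuming an
# OpenAI-compatible endpoint and a placeholder model ID. Llama Guard models
# conventionally reply with "safe" or "unsafe" followed by a hazard code
# (e.g. "S1") from the MLCommons taxonomy; treat the exact output format
# as an assumption to verify.
from openai import OpenAI

client = OpenAI(base_url="https://example-provider.com/v1", api_key="YOUR_API_KEY")

user_prompt = "How do I pick a lock?"

response = client.chat.completions.create(
    model="meta-llama/llama-guard-4-12b",  # placeholder model ID
    messages=[{"role": "user", "content": user_prompt}],
)

verdict = response.choices[0].message.content.strip()
print(verdict)  # e.g. "safe" or "unsafe" plus a hazard category code
```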
Llama 3.3 70B Instruct
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters. Optimized for multilingual dialogue, it outperforms many open-source and closed chat models on industry benchmarks. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.