Meta Releases Llama 4 Models With Multimodal Capabilities

Meta released its Llama 4 family of open-weight AI models, introducing multimodal capabilities and a new mixture-of-experts architecture.

cueball Editorial | Friday, 8 May 2026 | 4 min read

What Happened

Meta released its Llama 4 family of large language models on April 5, 2025, making two initial models available to developers: Llama 4 Scout and Llama 4 Maverick. The release marks the first time Meta has shipped natively multimodal models under the Llama brand, meaning the models can process both text and images as input.

The Models

Llama 4 Scout is a mixture-of-experts model with 17 billion active parameters and 16 experts. Meta states that it fits on a single NVIDIA H100 GPU when quantized to Int4, lowering the hardware barrier for developers who run the model on their own infrastructure. Llama 4 Maverick is the larger of the two: it also has 17 billion active parameters, but draws them from 128 experts. Meta reported that Maverick performs competitively against OpenAI's GPT-4o and Google's Gemini 2.0 Flash on several standard benchmarks, including coding and reasoning evaluations.

Meta lists a context window of up to 10 million tokens for Scout and up to 1 million tokens for Maverick, which it describes as among the largest available in open-weight models at this scale. A third model, Llama 4 Behemoth, was announced as a larger frontier model still in training. Meta described Behemoth as a teacher model used in the training process for Scout and Maverick, and released early benchmark figures for it without making the weights publicly available.

Architecture and Training

The Llama 4 series marks Meta's shift from dense transformer models to a mixture-of-experts architecture, a design that activates only a subset of model parameters for any given input. Because each token exercises only a fraction of the total parameters, the compute cost per token tracks the active parameter count rather than the full model size. Meta trained the models on a dataset it describes as including roughly 200 languages and more than 22 trillion tokens, an increase over the data volumes reported for Llama 3.
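The routing idea behind mixture-of-experts can be illustrated with a small sketch. This is a generic top-k router written for illustration, not Meta's implementation: the function names (`route_token`, `moe_layer`), the toy scaling "experts", and the random router scores are all assumptions, and Llama 4's actual routing details may differ.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=1):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are renormalized so the selected experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, experts, router_scores, top_k=1):
    """Run only the selected experts and combine their outputs by weight."""
    selected = route_token(router_scores, top_k)
    output = 0.0
    for idx, weight in selected:
        output += weight * experts[idx](token)
    return output, [idx for idx, _ in selected]


# Toy setup (hypothetical): 16 experts, each a simple scaling function.
NUM_EXPERTS = 16
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

random.seed(0)
scores = [random.uniform(-1, 1) for _ in range(NUM_EXPERTS)]
value, used = moe_layer(2.0, experts, scores, top_k=1)
print(f"experts used: {len(used)} of {NUM_EXPERTS}")
```

In this toy version only one of the 16 experts runs per input, which is why the amount of computation depends on the number of experts selected rather than the number that exist.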

Multimodal capability is built into the architecture rather than bolted on through a separate vision adapter, according to Meta's technical documentation accompanying the release. Meta calls the approach early fusion.

Availability and Licensing

Meta released the Scout and Maverick model weights under its custom Llama 4 Community License. The models are available for download through Meta's website and through third-party platforms including Hugging Face. The license permits commercial use, but companies whose products exceed 700 million monthly active users must request a separate license from Meta, a threshold that applies to a small number of the largest technology platforms.

The models are also accessible through Meta AI, the company's consumer-facing assistant, which is integrated into WhatsApp, Instagram, Messenger, and Facebook.

Background

Meta has released successive generations of Llama models since 2023, positioning the open-weight series as a direct alternative to closed commercial models from OpenAI, Anthropic, and Google. Llama 3, released in 2024, became one of the most widely downloaded open-weight model families according to download figures reported by Hugging Face. Meta Chief Executive Mark Zuckerberg has stated publicly that open-source AI development is a strategic priority for the company, arguing that broad access to capable models benefits Meta's own products and the broader developer ecosystem.

The Llama 4 release came less than two weeks before OpenAI made its o3 and o4-mini reasoning models publicly available, part of a period of accelerated frontier model releases across major AI developers.

What It Means in Practice

Developers can now deploy models with native image understanding without relying on closed commercial APIs, using weights that can be run on-premises or in private cloud environments. Context windows of one million tokens or more allow long documents, codebases, or extended conversations to be processed within a single model call. Enterprise users subject to data privacy requirements have cited on-premises deployment as a reason for preferring open-weight models over hosted commercial services.
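As a rough illustration of what a large context window means in practice, the sketch below estimates whether a set of documents would fit in a single call. Everything here is an assumption for illustration: the helper names, the output reserve, and especially the four-characters-per-token ratio, which is only a common rule of thumb for English text. An accurate count requires the model's own tokenizer.

```python
import math


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(texts, context_window=1_000_000,
                    reserved_for_output=4_096, chars_per_token=4.0):
    """Return (fits, estimated_tokens) for a list of input texts.

    reserved_for_output is a hypothetical budget held back for the
    model's reply, so the inputs must fit in the remainder.
    """
    total = sum(estimate_tokens(t, chars_per_token) for t in texts)
    budget = context_window - reserved_for_output
    return total <= budget, total


# Example: a 2-million-character corpus against a 1-million-token window.
documents = ["x" * 2_000_000]
ok, tokens = fits_in_context(documents)
print(f"estimated tokens: {tokens}, fits: {ok}")
```

A check like this is only a first pass; inputs close to the limit should be counted with the actual tokenizer before being sent.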

Meta has indicated that additional Llama 4 models, including the full Behemoth release, are scheduled to follow as training and evaluation are completed.
