We are excited to share that the first models in the Llama 4 herd are available today in Azure AI Foundry and Azure Databricks, enabling people to build more personalized multimodal experiences. These models from Meta are designed to seamlessly integrate text and vision tokens into a unified model backbone. This innovative approach allows developers to leverage Llama 4 models in applications that draw on large amounts of unlabeled text, image, and video data, opening up new possibilities in AI development.
Today we bring Meta’s Llama 4 Scout and Maverick models to Azure AI Foundry as managed compute deployments:
- Llama 4 Scout models
  - Llama-4-Scout-17B-16E
  - Llama-4-Scout-17B-16E-Instruct
- Llama 4 Maverick models
  - Llama-4-Maverick-17B-128E-Instruct-FP8
Azure AI Foundry is designed for agentic use cases, enabling seamless collaboration between different agents. This opens new frontiers in AI applications, from complex problem-solving to dynamic task management. Imagine a team of AI agents working together to analyze huge datasets, generate creative content, and provide real-time insights across multiple domains. The possibilities are endless.

To accommodate a wide range of use cases and developer needs, the Llama 4 models come in both smaller and larger options. These models integrate mitigations at every layer of development, from pre-training through post-training. Tunable system-level mitigations shield developers from adversarial attacks, empowering them to create helpful, safe, and adaptable experiences for their Llama-supported applications.
Llama 4 Scout Models: Power and Precision
We are sharing the first models in the Llama 4 herd, which allow people to build more personalized multimodal experiences. According to Meta, Llama 4 Scout is one of the best multimodal models in its class and is more powerful than Meta’s Llama 3 models, while still fitting on a single H100 GPU. Llama 4 Scout also raises the supported context length from 128K tokens in Llama 3 to an industry-leading 10 million tokens. This opens up a world of possibilities, including multi-document summarization, analysis of extensive user activity for personalized tasks, and reasoning over vast codebases.
Targeted use cases include summarization, personalization, and reasoning. Thanks to its long context and efficient size, Llama 4 Scout shines in tasks that require condensing or analyzing extensive information. It can generate summaries or reports from extremely lengthy inputs, personalize its responses using detailed user data (without forgetting earlier details), and perform complex reasoning across large bodies of knowledge.
For example, Scout could analyze all the documents in an enterprise SharePoint library to answer a specific question, or read through a multi-thousand-page technical manual to provide troubleshooting guidance. It is designed to be the diligent “scout” that combs through huge volumes of information and returns the highlights or answers you need.
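To make the 10-million-token figure concrete, here is a minimal sketch of a pre-flight check that estimates whether a document set fits in Scout’s context window before sending a request. The 4-characters-per-token heuristic, the function names, and the output budget are illustrative assumptions, not part of any SDK; real token counts depend on the model’s tokenizer.

```python
# Rough pre-flight check against Llama 4 Scout's 10M-token context window.
# The chars-per-token ratio is a crude heuristic, not the real tokenizer.
SCOUT_CONTEXT_TOKENS = 10_000_000
CHARS_PER_TOKEN = 4  # assumed average for English prose

def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(documents: list[str], reserve_for_output: int = 4096) -> bool:
    """Check whether all documents plus an output budget fit in the window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserve_for_output <= SCOUT_CONTEXT_TOKENS

# ~50 mid-sized documents comfortably fit in a 10M-token window.
docs = ["chapter text " * 1000] * 50
print(fits_in_context(docs))
```

A check like this is mostly useful when migrating workloads from 128K-context models, where chunking pipelines were previously mandatory.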
Llama 4 Maverick Models: Innovation at Scale
As a general-purpose LLM, Llama 4 Maverick contains 17 billion active parameters, 128 experts, and 400 billion total parameters, offering high quality at a lower price compared to Llama 3.3 70B. Maverick excels at image and text understanding across 12 languages, enabling the creation of sophisticated AI applications that cross language barriers. Maverick is ideal for precise image understanding and creative writing, making it well suited for general assistant and chat use cases. For developers, it offers state-of-the-art intelligence at high speed, optimized for the best response quality and tone.
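The gap between Maverick’s 17 billion active and 400 billion total parameters is the crux of its cost profile, and a little arithmetic makes it concrete. The figures below come from Meta’s announcement; the framing of compute versus memory is a general property of sparse MoE models, not a statement about Meta’s exact serving setup.

```python
# Llama 4 Maverick parameter figures from Meta's announcement.
ACTIVE_PARAMS = 17e9    # parameters used per token
TOTAL_PARAMS = 400e9    # parameters stored across all 128 experts
NUM_EXPERTS = 128

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active per token: {active_fraction:.2%} of total weights")
# Per-token compute scales with the 17B active parameters, while memory
# footprint scales with the 400B total -- which is why a sparse MoE model
# can offer large-model quality at closer to small-model serving cost.
```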
Targeted use cases include chat-optimized scenarios that require high-quality responses. Meta fine-tuned Llama 4 Maverick to be an excellent conversational agent. It is the flagship chat model of the Llama 4 family: think of it as a multilingual, multimodal counterpart to a chat-style assistant.
It’s particularly good for interactive applications:
- Customer support bots that need to understand user-uploaded images.
- Creative AI partners that can discuss and generate content in different languages.
- Internal enterprise assistants that can help employees by answering questions and handling rich media.
With Maverick, businesses can build high-quality AI assistants that converse naturally (and politely) with a global user base and draw on visual context when needed.
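As an illustration of the kind of request such an assistant handles, here is a sketch of a multimodal chat payload in the OpenAI-style chat-completions format that many Azure AI Foundry model endpoints accept. The helper function, the image URL, and the system prompt are all illustrative assumptions; verify the exact request schema against the service documentation for your deployment.

```python
# Illustrative multimodal chat request body (OpenAI-style chat format).
# The helper name and URLs are placeholders; the model name matches the
# Azure AI Foundry catalog entry.
def build_support_request(user_text: str, image_url: str) -> dict:
    """Build a chat-completions payload mixing text and an image."""
    return {
        "model": "Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [
            {"role": "system",
             "content": "You are a polite, multilingual support assistant."},
            {"role": "user",
             "content": [
                 {"type": "text", "text": user_text},
                 {"type": "image_url", "image_url": {"url": image_url}},
             ]},
        ],
        "max_tokens": 512,
    }

payload = build_support_request(
    "What does this error screen mean?",
    "https://example.com/error-screenshot.png",
)
print(payload["model"])
```

Because image and text arrive in one `content` array, the model sees both modalities in a single turn rather than as separate requests.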

Architectural Innovation in Llama 4: Early-Fusion Multimodality and MoE
According to Meta, two key innovations distinguish Llama 4: native multimodal support with early fusion, and a sparse mixture-of-experts (MoE) design for efficiency and scale.
- Early-Fusion Multimodal Transformer: Llama 4 uses an early-fusion approach, treating text, images, and video frames as a single sequence of tokens from the start. This allows the model to understand and generate different media together. It excels at tasks involving multiple modalities, such as analyzing documents with diagrams or answering questions about a video’s transcript and visuals. For businesses, this means AI assistants can process a complete report (text + graphics + video snippets) and provide an integrated summary or answer.
- Mixture-of-Experts (MoE) Architecture: To achieve strong performance without prohibitive compute costs, Llama 4 uses a sparse mixture-of-experts architecture. In essence, the model comprises numerous smaller sub-models, known as “experts,” with only a small subset active for any given input token. This design not only makes training more efficient but also improves inference scalability. As a result, the model can handle more queries concurrently by distributing the compute load across different experts, enabling deployment in production environments without requiring a single enormous GPU. The MoE architecture lets Llama 4 expand its capacity without escalating costs, offering a significant advantage for enterprises.
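To make the routing idea concrete, here is a minimal sketch of sparse top-k expert routing. The expert count, gating scheme, and tiny per-expert “networks” are all illustrative assumptions for the sketch, not Llama 4’s actual implementation.

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 8   # illustrative; Llama 4 Maverick has 128
TOP_K = 2         # experts activated per token (assumed for this sketch)
DIM = 4

# Each "expert" is a tiny stand-in for a feed-forward block: a scale vector.
experts = [[random.uniform(0.5, 1.5) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
# The router scores every expert for a token, then keeps only the top-k.
router = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token: list[float]) -> list[float]:
    """Route a token through only the top-k experts, weighted by the gate."""
    scores = [sum(w * t for w, t in zip(row, token)) for row in router]
    top = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    gate = softmax([scores[i] for i in top])  # renormalize over chosen experts
    out = [0.0] * DIM
    for g, i in zip(gate, top):
        for d in range(DIM):
            out[d] += g * experts[i][d] * token[d]  # expert i's weighted output
    return out

print(moe_layer([1.0, 0.5, -0.3, 0.8]))
```

Only `TOP_K` of the `NUM_EXPERTS` experts run for each token, which is why per-token compute tracks the active parameters rather than the full model size.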
Commitment to Safety and Best Practices
Meta built Llama 4 following the best practices outlined in its developer use guide for AI protections. This includes integrating mitigations at each layer of model development, from pre-training through post-training, along with tunable system-level mitigations that shield developers from adversarial attacks. And because these models are hosted in Azure AI Foundry, they come with the proven safety and security guardrails customers expect from Azure.
We empower developers to create helpful, secure, and adaptable experiences for their Llama-supported applications. Explore the Llama 4 models in the Azure AI Foundry model catalog and in Azure Databricks, and start building with the latest in multimodal, AI-powered innovation, backed by Meta’s research and the Azure platform.
The availability of Meta’s Llama 4 in Azure AI Foundry and through Azure Databricks offers customers unrivaled flexibility in choosing the platform that best suits their needs. This seamless integration allows users to harness advanced AI capabilities and strengthen their applications with powerful, secure, and customizable solutions. We are excited to see what you build.