Red Hat has announced the release of Red Hat AI 3, a new version of its enterprise artificial intelligence platform designed to simplify the deployment and management of AI workloads in production environments. The platform integrates recent developments from Red Hat AI Inference Server, Red Hat Enterprise Linux AI (RHEL AI), and Red Hat OpenShift AI.
As organizations move from experimental stages to operationalizing AI systems at scale, they face challenges such as data privacy concerns and cost management. A study by the Massachusetts Institute of Technology NANDA project indicates that most organizations have not yet seen significant financial returns from large investments in enterprise AI.
Joe Fernandes, vice president and general manager of the AI Business Unit at Red Hat, stated: “With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimizes these hurdles. By bringing new capabilities like distributed inference with llm-d and a foundation for agentic AI, we are enabling IT teams to more confidently operationalize next-generation AI, on their own terms, across any infrastructure.”
Red Hat AI 3 aims to address these challenges by offering a consistent experience for CIOs and IT leaders seeking to maximize the return on their investments in computing resources. The platform is designed to scale and distribute workloads across various environments, including data centers, public clouds, sovereign infrastructures, and edge locations, while supporting different hardware accelerators.
The focus of Red Hat’s update is on streamlining the “inference” phase of enterprise AI (the stage where trained models generate outputs) by building on open-source community projects such as vLLM and llm-d. With the general availability of llm-d in OpenShift AI 3.0, organizations can deploy large language models (LLMs) natively on Kubernetes clusters, using advanced scheduling features and cross-platform hardware support.
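vLLM, and llm-d built on top of it, serve models through an OpenAI-compatible HTTP API, so a deployed model can be queried with nothing but a standard HTTP client. The sketch below shows the shape of such a request; the endpoint URL and model name are placeholders for illustration, not part of the announcement.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat-completions payload, as served by vLLM."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(base_url: str, model: str, prompt: str) -> str:
    """POST to the /v1/chat/completions endpoint and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example payload; calling chat() itself requires a live endpoint,
# e.g. chat("http://localhost:8000", "openai/gpt-oss-20b", "Say hello.")
payload = build_chat_request("openai/gpt-oss-20b", "Say hello.")
print(json.dumps(payload, indent=2))
```

Because the interface is OpenAI-compatible, existing client libraries and tooling can be pointed at an llm-d-backed deployment without code changes beyond the base URL.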
Dan McNamara, senior vice president and general manager of Server and Enterprise AI at AMD, said: “As Red Hat brings distributed AI inference into production, AMD is proud to provide the high-performance foundation behind it. Together, we’ve integrated the efficiency of AMD EPYC processors, the scalability of AMD Instinct GPUs, and the openness of the AMD ROCm software stack to help enterprises move beyond experimentation and operationalize next-generation AI — turning performance and scalability into real business impact across on-prem, cloud, and edge environments.”
Key features introduced include Model as a Service (MaaS) capabilities for centralized model serving; an “AI hub” for managing foundational assets; a GenAI studio for rapid prototyping; and validated models including OpenAI’s gpt-oss, Whisper for speech-to-text, and Voxtral Mini for voice applications.
To facilitate next-generation “agentic” workflows, in which autonomous agents manage complex tasks, the update introduces a unified API layer based on Llama Stack that is compatible with industry standards. It also supports emerging protocols such as the Model Context Protocol (MCP), which streamlines integration between models and external tools.
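MCP messages are JSON-RPC 2.0, so a tool invocation from an agent to an external tool server is just a small structured request. The sketch below serializes one such message; the tool name and arguments are invented for illustration, and the transport (stdio or HTTP) is omitted.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP 'tools/call' request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Hypothetical tool exposed by an MCP server -- purely illustrative.
msg = mcp_tool_call("search_tickets", {"query": "outage", "limit": 5})
print(msg)
```

A standard message shape like this is what lets one protocol connect many models to many tool servers without bespoke integrations.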
Mariano Greco, CEO of ARSAT, commented: “As a provider of connectivity infrastructure for Argentina… By building our agentic AI platform on Red Hat OpenShift AI, we went from identifying the need to live production in just 45 days. Red Hat OpenShift AI has not only helped us improve our service and reduce the time engineers spend on support issues but also freed them up to focus on innovation and new developments.”
The toolkit now includes Python libraries built on InstructLab functionality for model customization, using open source technologies such as Docling for data processing. Additional features support synthetic data generation and fine-tuning of large language models.
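The core idea behind synthetic data generation is expanding a small set of human-written seed examples into many instruction-style training records. The toy sketch below illustrates that pattern in plain Python; it is not Red Hat's pipeline, and the seed facts and question templates are made up.

```python
import json
import random


def synthesize_qa(seed_facts: list, n: int, rng: random.Random) -> list:
    """Expand seed (subject, fact) pairs into chat-format training records
    by varying the question template -- a toy stand-in for the synthetic
    data generation that customization toolkits automate."""
    templates = [
        "What do you know about {s}?",
        "Describe {s}.",
        "Tell me about {s}.",
    ]
    records = []
    for _ in range(n):
        subject, fact = rng.choice(seed_facts)
        records.append({
            "messages": [
                {"role": "user", "content": rng.choice(templates).format(s=subject)},
                {"role": "assistant", "content": fact},
            ]
        })
    return records


seeds = [("llm-d", "llm-d distributes LLM inference across Kubernetes clusters.")]
data = synthesize_qa(seeds, 3, random.Random(0))
print(json.dumps(data, indent=2))
```

Real pipelines add a teacher model to paraphrase and verify the generated pairs, but the output format, chat-style message records, is the same shape fine-tuning jobs consume.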
Rick Villars from IDC noted: “2026 will mark an inflection point as enterprises shift from starting their AI pivot to demanding more measurable and repeatable business outcomes from investments… Companies that succeed in becoming AI-fueled businesses will be those who establish a unified platform to orchestrate these ever more sophisticated workloads in hybrid cloud environments…”
Ujval Kapasi from NVIDIA added: “Scalable, high-performance inference is key to the next wave of generative and agentic AI. With built-in support for accelerated inference with open source NVIDIA Dynamo and NIXL technologies, Red Hat AI 3 provides a unified platform that empowers teams to move swiftly from experimentation to running advanced AI workloads and agents at scale.”
Red Hat continues its strategy of providing open-source solutions aimed at making enterprise-scale artificial intelligence accessible across different industries.



