Intel CEO Pat Gelsinger speaks Tuesday, April 9, 2024, at the Intel Vision event in Phoenix, Arizona. During the event on April 8-9, Intel leaders joined partners, customers and luminaries to learn about advances that will transform how business uses artificial intelligence. (Credit: Intel Corporation)

At Intel Vision 2024, Intel introduced the Intel Gaudi 3 accelerator to bring performance, openness and choice to enterprise generative AI (GenAI), and unveiled a set of new scalable open systems, next-generation products and strategic collaborations to accelerate GenAI adoption.

“Thanks to silicon, innovation is advancing at an unprecedented pace and every company is rapidly becoming an AI company,” said Intel CEO Pat Gelsinger. “Intel is bringing AI to all parts of the enterprise, from the PC to the data center to the edge. Our Gaudi, Xeon and Core Ultra platforms offer a cohesive set of flexible solutions tailored to meet the changing needs of our customers and capitalize on the immense opportunities ahead.”

On this point, Marcelo Bertolami highlighted the three laws cited by Intel’s CEO: the economic law (the cloud is far more expensive than the edge); the physical law (the speed of light cannot be changed, so latency-sensitive workloads that need fast responses must be processed locally); and the law of the land (companies want to keep their data in their own country). “By 2026, machine learning and AI will be at the edge in at least half of all deployments,” he said.

The Intel Gaudi 3 AI accelerator will power AI systems with up to tens of thousands of accelerators connected over the common Ethernet standard. According to the company, it promises 4 times the AI compute for BF16 and 1.5 times the memory bandwidth of its predecessor. “The accelerator will offer a significant leap in AI training and inference for global companies seeking to deploy GenAI at scale,” he says.

“COMPARED TO NVIDIA H100, INTEL GAUDI 3 IS EXPECTED TO DELIVER 70% FASTER AVERAGE TRAINING TIME FOR LLAMA2 MODELS WITH 13B PARAMETERS. IN ADDITION, IT IS EXPECTED TO EXCEED H100 BY AN AVERAGE OF 50% IN INFERENCE AND 40% IN ENERGY EFFICIENCY FOR THE LARGE LANGUAGE MODELS (LLM) LLAMA 7B, 70B AND FALCON 180B.”

Intel Gaudi 3 provides community-based open software and industry-standard Ethernet networking. Additionally, it allows enterprises to flexibly scale from a single node to clusters, superclusters, and megaclusters with thousands of nodes, supporting inference, fine-tuning, and training at the largest scale.

Intel Gaudi 3 will be available to OEMs, including Dell Technologies, HPE, Lenovo and Supermicro, in the second quarter of 2024.

Open and scalable AI systems

The company also presented its strategy for open and scalable AI systems, which includes hardware, software, frameworks and tools. “Intel’s approach enables a broad, open ecosystem of AI players to deliver solutions that meet enterprise-specific GenAI needs. This includes equipment manufacturers, database vendors, system integrators, software and service providers, and more. It also allows companies to use the solutions and ecosystem partners they already know and trust,” he said.

“AI EVERYWHERE: INTEL WORKS TO MAKE AI SAFE, MORE ECONOMICAL, WITH LOWER CONSUMPTION, AND IN AN OPEN ECOSYSTEM” – MARCELO BERTOLAMI

He also said that AI PCs open a very large space for the channel. “There are 18 months until the end of Windows 10 support (October 14, 2025): there is a clear opportunity to step in and modernize those PC fleets,” he said.

In that sense, he recalled that, through the Intel Partner Alliance, the company provides training for its different channel types: hardware, software, services and developers.

“The development of AI in Latam is impressive; remember that it is software, and not only with Intel. CIOs today are looking at how to use AI to gain competitiveness. AI alone is not going to beat a human, but a human with AI is going to beat a company without AI,” he concluded.

Intel shared broad momentum with enterprise customers across industries to deploy Intel Gaudi Accelerator solutions for innovative new generative AI applications:

  • NAVER. Has confirmed Intel Gaudi’s foundational capability in executing compute operations for large-scale Transformer models with exceptional performance per watt, to develop the most powerful LLM for deploying advanced AI services globally, from the cloud to the device.
  • Bosch. Exploring new opportunities for smart manufacturing, including foundation models that generate synthetic data sets of manufacturing anomalies to provide robust, evenly distributed training sets for, for example, automated optical inspection.
  • IBM. Uses 5th generation Intel® Xeon® processors for its watsonx.data data store and works closely with Intel to validate the watsonx platform for Intel Gaudi accelerators.
  • Ola/Krutrim. It pre-trains and fine-tunes its foundational India model with generative capabilities in 10 languages, producing industry-leading performance/cost (1.5-2x better) vs. market solutions. Krutrim is training a large foundational model on a Gaudi 2 cluster.
  • Advent International/NielsenIQ. Deploys GenAI within its Discover platform, including training domain-specific large language models (LLMs) on the world’s largest database of consumer purchasing behavior, improving its customer services.
  • Seekr. A leader in trusted AI, it runs production workloads on Intel Gaudi 2, Intel Max Series GPUs and Intel Xeon processors in the Intel Developer Cloud for LLM development and production deployment.
  • IFF. Leading the next wave of sustainable consumer care solutions by establishing an integrated, AI-driven digital biology workflow to improve enzyme design and digital twin technology for fermentation process optimization.
  • CtrlS Group. Working with Intel to bring a 128-node Intel Developer Cloud Gaudi cluster to customers based in India and expanding CtrlS cloud services for India with additional Gaudi clusters.
  • Bharti Airtel. Building on Intel’s cutting-edge technology, Airtel plans to leverage its rich telecom data to enhance its AI capabilities and turbocharge its customers’ experiences. The deployments will be in line with Airtel’s commitment to stay at the forefront of technological innovation and help drive new revenue streams in a rapidly evolving digital landscape.
  • Landing AI. Domain-specific, large-scale vision model for cell segmentation and cancer detection.
  • Roboflow. It runs production workloads from YOLOv5, YOLOv8, CLIP, SAM, and ViT models for its end-to-end computer vision platform.
  • Infosys. Strategic collaboration to bring Intel technologies to Infosys Topaz, a set of AI-based services, solutions and platforms that accelerate business value using GenAI.

Intel has also announced collaborations with Google Cloud, Thales, and Cohesity to leverage Intel’s confidential computing capabilities in their cloud instances. This includes Intel Trust Domain Extensions (Intel TDX), Intel Software Guard Extensions (Intel SGX), and Intel Backup Service. Customers can run their AI models and algorithms in a trusted execution environment (TEE) and can leverage Intel Trust Services to provide independent verification of their C3 virtual machine instances.

Open enterprise AI platform

In collaboration with SAP, Oracle and other industry leaders, Intel has announced its intention to create an open platform for enterprise AI. “The industry-wide effort aims to develop open, multi-vendor GenAI systems that deliver best-in-class ease of deployment, performance and value, enabled by retrieval-augmented generation (RAG),” he announced. RAG incorporates structured and unstructured data from trusted sources outside of a model, improving the accuracy and reliability of GenAI while preserving the security of proprietary data.
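The RAG flow described above can be sketched in three steps: retrieve relevant documents, prepend them as context to the user’s query, then hand the augmented prompt to a generator. A minimal toy sketch in Python (the word-overlap retriever and the example documents are illustrative assumptions; production systems use an embedding model and a vector store):

```python
# Toy RAG sketch: retrieve context, then build an augmented prompt.
from collections import Counter

def score(query: str, doc: str) -> int:
    """Count word overlap between query and document (toy retriever)."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Augment the user query with retrieved context before generation."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Illustrative proprietary "knowledge base" kept outside the model.
docs = [
    "Gaudi 3 uses industry-standard Ethernet for scale-out networking.",
    "Xeon 6 with E-cores targets efficiency-focused data center workloads.",
]

query = "What networking does Gaudi 3 use?"
prompt = build_prompt(query, retrieve(query, docs))
print(prompt)  # this prompt would then be sent to the LLM
```

The point of the pattern is visible in the last line: the model is asked to answer from supplied, trusted context rather than from its parametric memory alone.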

As initial steps in this effort, Intel will release reference implementations for GenAI pipelines, publish a technical conceptual framework, and continue to add infrastructure capacity in the Intel Developer Cloud for ecosystem development and validation of RAG and future pipelines. Intel encourages greater ecosystem participation to join forces in this open effort to facilitate enterprise adoption and business results.

Roadmap

In addition to the Intel Gaudi 3 accelerator, Intel provided updates on its next-generation products and services across all segments of enterprise AI.

Intel Xeon 6

Intel Xeon processors offer a performance-efficient solution for running today’s GenAI solutions, including RAG, that produce business-specific results using proprietary data. Intel introduced the new brand for its next-generation processors for data centers, cloud and edge: Intel Xeon 6. “Efficient-core (E-core) models, launching in the second quarter of 2024, will offer exceptional efficiency, while Performance-core (P-core) models will deliver higher AI performance,” he highlighted.

Intel Xeon 6 processors with E-cores (formerly Sierra Forest):

  • 2.4 times more performance per watt and 2.7 times more rack density compared to 2nd generation Intel Xeon processors.
  • Customers can replace aging systems at a ratio of nearly 3 to 1, dramatically reducing energy consumption and helping meet sustainability goals.

Intel Xeon 6 processors with P-cores (formerly Granite Rapids):

  • Include software support for the MXFP4 data format, which improves inference performance by up to 2.5 times over BF16, with the ability to run 70-billion-parameter Llama 2 models.
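The memory side of the MXFP4 claim is easy to sanity-check with back-of-envelope arithmetic for a Llama 2 70B-class model. A small sketch (assuming the OCP Microscaling spec’s layout of 4-bit elements in blocks of 32 sharing an 8-bit scale; the figures are illustrative, not Intel’s):

```python
# Back-of-envelope weight-memory estimate for a 70B-parameter model.
# MXFP4 stores 4-bit elements in blocks of 32 that share an 8-bit scale,
# i.e. 4 + 8/32 = 4.25 bits per parameter on average.
PARAMS = 70e9

def weight_gb(bits_per_param: float) -> float:
    """Model weight footprint in gigabytes at a given precision."""
    return PARAMS * bits_per_param / 8 / 1e9

bf16 = weight_gb(16)     # 16-bit brain floating point
mxfp4 = weight_gb(4.25)  # 4-bit MX elements plus shared scales

print(f"BF16:  {bf16:.0f} GB")
print(f"MXFP4: {mxfp4:.1f} GB ({bf16 / mxfp4:.2f}x smaller)")
```

The roughly 3.8x smaller weight footprint is what makes a 70B-parameter model practical to serve from CPU-attached memory; the 2.5x inference speedup Intel cites is a separate, measured claim.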

Client, Edge and connectivity

Intel also announced customer momentum and updates to its Edge and connectivity roadmap:

Intel Core Ultra processors are driving new productivity, security and content creation capabilities, providing strong motivation for businesses to renew their PC fleets.

“WE EXPECT TO MARKET 40 MILLION PCs WITH ARTIFICIAL INTELLIGENCE IN 2024, WITH OVER 230 DESIGNS, FROM ULTRA-THIN PCS TO PORTABLE GAMING DEVICES.”

The next generation of the Intel Core Ultra client processor family (codenamed Lunar Lake), launching in 2024, will deliver more than 100 peak platform tera operations per second (TOPS) and more than 46 TOPS from the neural processing unit (NPU) for the next generation of AI PCs.
Intel has announced new Edge silicon products from the Intel Core Ultra, Intel Core and Intel Atom processor and Intel Arc graphics processing unit (GPU) families, targeting key markets such as retail, industrial manufacturing and healthcare. All new additions to the Intel Edge AI portfolio will be available this quarter and will be supported by the Intel Tiber Edge Platform this year.

To transform AI systems, Intel introduced the AI NIC (network interface card), based on the Ultra Ethernet Consortium’s open standard. This expands the company’s network connectivity portfolio of Intel® Ethernet network adapters and Intel infrastructure processing units (IPUs). Starting in 2026, the AI NIC will be available as an Ethernet card or chiplet and will provide optimized Ethernet-based network connectivity for training and inference in the largest AI clusters.

Intel Tiber

Intel introduced the Intel Tiber portfolio of enterprise solutions to streamline the deployment of enterprise software and services, including for GenAI.

“A unified experience makes it easier for customers and enterprise developers to find solutions that fit their needs, accelerate innovation, and unlock value without compromising security, compliance, or performance,” he reported.