Thursday, September 19, 2024

Accelerating Enterprise AI Innovation: NVIDIA and Oracle Combine Forces to Unlock Breakthrough Data Processing and AI Analysis

Introduction

Enterprises need powerful compute to support their AI workloads and accelerate data processing. Greater efficiency can improve the return on investments in AI training and fine-tuning, and improve the user experience of AI inference. This article covers the AI computing infrastructure that NVIDIA and Oracle recently announced and how it is meant to help organizations put AI to work.

Oracle CloudWorld Announcement

At the recent Oracle CloudWorld conference, Oracle Cloud Infrastructure (OCI) announced the first zettascale OCI Supercluster, accelerated by the NVIDIA Blackwell platform. This powerful computing infrastructure is designed to help enterprises train and deploy next-generation AI models using more than 100,000 of NVIDIA’s latest-generation GPUs.

The OCI Supercluster lets customers choose from a wide range of NVIDIA GPUs and deploy them anywhere: on premises, in the public cloud, or in a sovereign cloud. Set for availability in the first half of next year, the Blackwell-based systems can scale up to 131,072 Blackwell GPUs, using NVIDIA ConnectX-7 NICs for RoCEv2 or NVIDIA Quantum-2 InfiniBand networking, to deliver 2.4 zettaflops of peak AI compute to the cloud.
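As a quick sanity check on these scale figures, the sketch below divides the quoted peak by the quoted maximum GPU count. The constant and function names are illustrative only, and the result is a back-of-the-envelope average: the announcement does not say which numeric precision or sparsity setting the 2.4 zettaflops figure assumes.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the article.
MAX_BLACKWELL_GPUS = 131_072   # maximum cluster size quoted for the Blackwell-based systems
PEAK_AI_FLOPS = 2.4e21         # 2.4 zettaflops (1 zettaflop = 1e21 FLOPS)


def peak_petaflops_per_gpu(total_flops: float, gpu_count: int) -> float:
    """Average peak throughput per GPU, in petaflops (1e15 FLOPS)."""
    return total_flops / gpu_count / 1e15


per_gpu = peak_petaflops_per_gpu(PEAK_AI_FLOPS, MAX_BLACKWELL_GPUS)
print(f"{per_gpu:.1f} petaflops per GPU")  # roughly 18.3

# The quoted GPU count is an exact power of two (2**17).
print(MAX_BLACKWELL_GPUS == 2**17)
```

Under these figures, each GPU would contribute roughly 18 petaflops of peak AI compute on average.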

GB200 NVL72 and NVIDIA HGX H200

At the show, Oracle also previewed NVIDIA GB200 NVL72 liquid-cooled bare-metal instances to help power generative AI applications. These instances are capable of large-scale training with Quantum-2 InfiniBand and real-time inference of trillion-parameter models within the expanded 72-GPU NVIDIA NVLink domain, which can act as a single, massive GPU.

This year, OCI will offer NVIDIA HGX H200, connecting eight NVIDIA H200 Tensor Core GPUs in a single bare-metal instance via NVLink and NVLink Switch, and scaling to 65,536 H200 GPUs with NVIDIA ConnectX-7 NICs over RoCEv2 cluster networking. The instance is available to order for customers looking to deliver real-time inference at scale and accelerate their training workloads.
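To put the H200 figures in terms of instances, the sketch below works out how many eight-GPU bare-metal instances the quoted maximum cluster size corresponds to. The names are illustrative, and the calculation assumes every GPU in the cluster sits in an eight-GPU HGX H200 instance as described above.

```python
# Instance-count arithmetic using only the figures quoted in the article.
GPUS_PER_HGX_H200_INSTANCE = 8   # eight H200 GPUs per bare-metal instance
MAX_H200_GPUS = 65_536           # maximum cluster size quoted for H200


def instances_needed(total_gpus: int, gpus_per_instance: int) -> int:
    """Number of instances needed to host total_gpus, rounded up."""
    return -(-total_gpus // gpus_per_instance)  # ceiling division on integers


h200_instances = instances_needed(MAX_H200_GPUS, GPUS_PER_HGX_H200_INSTANCE)
print(h200_instances)  # 8192
```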

L40S GPU-Accelerated Instances

OCI also announced general availability of NVIDIA L40S GPU-accelerated instances for midrange AI workloads, NVIDIA Omniverse, and visualization. These instances offer a cost-effective and highly scalable solution for a wide range of AI applications.

Edge Computing with NVIDIA GPUs

For single-node to multi-rack solutions, Oracle’s edge offerings provide scalable AI at the edge accelerated by NVIDIA GPUs, even in disconnected and remote locations. For example, smaller-scale deployments with Oracle’s Roving Edge Device v2 will now support up to three NVIDIA L4 Tensor Core GPUs.

Partnership with Reka

Companies are using NVIDIA-powered OCI Superclusters to drive AI innovation. Foundation model startup Reka, for example, is using the clusters to develop advanced multimodal AI models for building enterprise agents.

Sovereign AI Worldwide

NVIDIA and Oracle are collaborating to deliver sovereign AI infrastructure worldwide, helping address the data residency needs of governments and enterprises. Brazil-based startup Wide Labs trained and deployed Amazonia IA, one of the first large language models for Brazilian Portuguese, using NVIDIA H100 Tensor Core GPUs and the NVIDIA NeMo framework in OCI’s Brazilian data centers.

Enterprise-Ready AI with NVIDIA and Oracle

Enterprises can accelerate task automation on OCI by deploying NVIDIA software such as NIM microservices and cuOpt with OCI’s scalable cloud solutions. These solutions enable enterprises to quickly adopt generative AI and build agentic workflows for complex tasks like code generation and route optimization.

Conclusion

NVIDIA and Oracle are working together to bring AI and accelerated data processing to the world’s organizations. Through this collaboration, enterprises can accelerate task automation, adopt generative AI, and build agentic workflows for complex tasks. The infrastructure announced at Oracle CloudWorld, from the zettascale OCI Supercluster to edge devices, shows the range of options now opening up to organizations.

Frequently Asked Questions

Q1: What is the OCI Supercluster?

The OCI Supercluster is a powerful computing infrastructure designed to help enterprises train and deploy next-generation AI models using more than 100,000 of NVIDIA’s latest-generation GPUs.

Q2: What is the NVIDIA Blackwell platform?

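The NVIDIA Blackwell platform is NVIDIA’s latest-generation GPU computing platform for AI workloads. It accelerates the zettascale OCI Supercluster, whose Blackwell-based systems can scale up to 131,072 GPUs, and it underpins the GB200 NVL72 instances that Oracle previewed.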

Q3: What is NVIDIA cuOpt?

NVIDIA cuOpt is NVIDIA’s GPU-accelerated optimization software, used for tasks such as route optimization. On OCI, enterprises can deploy it alongside NVIDIA NIM microservices to automate tasks and build agentic workflows.

Q4: What is NVIDIA Omniverse?

NVIDIA Omniverse is NVIDIA’s platform for building and running 3D simulation and visualization applications. On OCI, it is one of the workloads supported by the NVIDIA L40S GPU-accelerated instances, alongside midrange AI and visualization.

Q5: What is the future of AI computing infrastructure?

The future of AI computing infrastructure holds much promise, with continued advancements in GPU technology, high-speed networking, and storage solutions enabling faster, more efficient, and more powerful AI workloads. As AI continues to transform industries, the need for scalable and powerful computing infrastructure will only continue to grow.
