UPDATED 15:46 EDT / MARCH 25 2025

Turning hardware into systems: How Dell and Nvidia are creating a new IT paradigm with the AI Factory

Despite the sound of its name, the AI Factory is far more than a mere assembly line turning out artificial intelligence solutions for the enterprise.

When Dell Technologies Inc. introduced a set of infrastructure offerings in March 2024, the company positioned its AI Factory as an “end-to-end AI enterprise solution” for training and running models. With its releases during the SC24 conference in November 2024, including several significant enhancements based on its collaboration with Nvidia Corp., Dell signaled that it intends to move beyond traditional hardware solutions to a new world where systems will drive productivity and AI innovation.

“I think this year you’re starting to see real build-out around the infrastructure hardware and where hardware is turning into systems,” said John Furrier, executive analyst at theCUBE Research. “You’re going to start to see the game change and then the era’s here, the chapter’s closed, the old IT is over, and the new systems are coming in.”

Furrier spoke during coverage by theCUBE, SiliconANGLE Media’s livestreaming studio, of the SC24 high-performance computing conference in November. His on-site analysis was part of theCUBE’s three days of livestreaming at the event, which included interviews with Dell and Nvidia executives who provided insight into how their collaboration would help integrate AI across operations and drive digital transformation.

This feature is part of SiliconANGLE Media’s exploration of Dell’s efforts in enterprise AI. Be sure to revisit theCUBE’s analyst-led coverage of SC24. (* Disclosure below.)

Performance tools for the AI Factory

The AI-related impact of Dell’s ongoing collaboration with Nvidia came into sharper focus through a string of announcements during SC24. Dell added Nvidia GPUs to the new PowerEdge XE9685L, expanding GPU density and performance for AI and HPC workloads. The company also introduced AI Factory with Nvidia advancements, offering enhanced performance and deployment options for AI applications.

“[Customers] can start small and literally just stack that up, not only just within a rack, but create rack scale deployments,” said Adam Glick, senior director of AI portfolio marketing at Dell, in an interview with theCUBE about AI Factory solutions. “We make it super simple to be able to take the hardware. We’ve worked a lot with our friends at Nvidia.”

One of the key releases during SC24 involved RAG, or retrieval-augmented generation. RAG is a technique that combines large language models with information retrieval systems: relevant documents are fetched first and then supplied to the model as context, which makes the generated results more relevant and accurate.
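To make the retrieval step concrete, here is a minimal sketch in Python. It is an illustration only, not Dell's or Nvidia's implementation: the sample documents, the function names and the simple word-overlap scoring are all assumptions made for this example. A production system would typically use embeddings and a vector database, and would send the resulting prompt to a language model.

```python
import math
import re
from collections import Counter

# Hypothetical "proprietary" documents, invented for this example.
DOCUMENTS = [
    "The returns policy allows refunds within 30 days of purchase.",
    "Support hours are 9am to 5pm on weekdays.",
    "Rack maintenance windows are scheduled on the first Sunday of each month.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query (toy scoring)."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine the retrieved context with the question for a language model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("When are support hours?", DOCUMENTS))
```

The design point is that the model itself is unchanged; only the prompt is enriched with data the enterprise controls.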

Dell unveiled Agentic RAG with Nvidia, an offering that allows enterprise customers to scale RAG across multiple use cases and agentic workflows. The goal is to streamline performance and enable digital assistants to draw on the right data when completing complex tasks.

“RAG is the way to customize foundational models to incorporate proprietary data or data that you care about and want to be represented in your AI models,” said Dion Harris, director of accelerated data center GTM at Nvidia, in conversation with theCUBE. “As it relates to Nvidia, we’ve been down this path of trying to be a proponent of AI and help customers adopt AI. [We’re] providing more guidelines, blueprints, templates and APIs that make it easy to plug and play and leverage these tools.”

RAG has emerged as one of the key trends surrounding the scale-out of generative AI. It was a much-discussed element during Nvidia’s annual GTC gathering in San Jose in March 2024, and it will likely receive even more attention when the AI processor giant holds its conference again in 2025.

Leveraging Nvidia Grace and Blackwell

A key element in Dell’s AI Factory strategy for scale-out AI workloads involves the integration of Grace CPUs and Blackwell GPUs in its PowerEdge systems.

Nvidia debuted Grace as its first data center CPU for advanced AI workloads in 2021. The Arm-based chip has proven to be an attractive tool for AI supercomputing, and its high memory bandwidth has led to its use in scientific discovery and climate research.

“Everyone at this conference is obsessed with the idea of balance from a system design and platform design level,” said Ian Finder, group product manager of accelerated computing at Nvidia, during an appearance on theCUBE at SC24. “It’s the balance that causes you to need high-throughput storage to saturate your compute environments. Once you look into the box with Grace, we’ve architected a chip that has a tremendous amount of memory bandwidth. We have 512 gigabytes a second of memory bandwidth per socket in Grace.”

Shortly before the SC24 gathering, Dell launched its PowerEdge XE9712 system, built with 36 Nvidia Grace CPUs and 72 Nvidia Blackwell GPUs in one rack as part of the AI Factory. Blackwell is Nvidia’s GPU architecture for accelerated computing. Dell began shipping its PowerEdge racks equipped with Nvidia GPUs and CPUs in mid-November, according to a company blog post.
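For a rough sense of scale, the rack figures above can be combined with the per-socket bandwidth Finder quoted. The short Python sketch below is back-of-the-envelope arithmetic only. It assumes, for illustration, that each of the 36 Grace CPUs counts as one socket delivering the quoted 512 gigabytes per second; the function names and the 1,024-gigabyte example data size are hypothetical, and the results are theoretical peaks rather than measured performance.

```python
# Figures taken from the article.
GRACE_CPUS_PER_RACK = 36
BLACKWELL_GPUS_PER_RACK = 72
GRACE_BANDWIDTH_GB_PER_S = 512  # per socket, as quoted by Nvidia's Ian Finder


def gpus_per_cpu(cpus, gpus):
    """Ratio of GPUs to CPUs in the rack."""
    return gpus / cpus


def aggregate_cpu_bandwidth(cpus, per_socket_gb_per_s):
    """Theoretical peak CPU memory bandwidth across the rack, in GB/s.

    Assumption: every CPU is one socket reaching the quoted per-socket figure.
    """
    return cpus * per_socket_gb_per_s


def seconds_to_stream(data_gb, bandwidth_gb_per_s):
    """Idealized time to read data_gb once at the given bandwidth."""
    return data_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    print(gpus_per_cpu(GRACE_CPUS_PER_RACK, BLACKWELL_GPUS_PER_RACK))
    print(aggregate_cpu_bandwidth(GRACE_CPUS_PER_RACK, GRACE_BANDWIDTH_GB_PER_S))
    # Hypothetical example: stream 1,024 GB through a single socket.
    print(seconds_to_stream(1024, GRACE_BANDWIDTH_GB_PER_S))
```

Under these assumptions the rack pairs two GPUs with each CPU, which is one way to picture the system-level "balance" Finder describes.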

However, the message behind Dell’s partnership with Nvidia and its leveraging of the chipmaker’s advanced processor technology for the AI Factory goes beyond memory bandwidth and high-throughput storage. As Nvidia’s Harris described it on theCUBE, the strategy has been to design new systems that support the kind of computing infrastructure that AI will demand.

“Nvidia is no longer selling a GPU,” he explained. “We’re really selling customers a data center. We’re looking at lots of different innovative ways to give more performance out of that computing stack, out of that networking stack. And oftentimes it involves driving more compute density, which therefore requires different innovations, whether it be through liquid cooling, through the bus bars, through the entire infrastructure that supports the specific computing infrastructure.”

Addressing network complexity

Infrastructure for the AI Factory will require retooling of data center technologies and then connecting to the cloud for customization. Dell and Nvidia are working with technologies such as InfiniBand and RDMA to facilitate connectivity for GPUs at scale. This is a complex task, and Dell is paying close attention to minimizing the challenges that this new infrastructure will bring for its enterprise customers.

“As enterprise deployments begin to scale out, they’re going to face and are facing similar [complexity] issues,” said Scott Bils, vice president of product management, professional services, at Dell, in an interview with analysts from theCUBE. “Helping them think through the overall design architecture, not just for today, but going forward as they scale out the environment, is a big part of the capability we bring — then, the expertise from Nvidia and our other partners in the space as well.”

Nvidia has acknowledged this potential complexity as well. As companies turn to concepts such as the AI Factory, they are reinventing their AI infrastructure, which involves more than just hardware pieces that fit together. It also requires networking and a software stack that can deliver tested, integrated and optimized AI solutions at scale.

“We packaged this up in a simplified solution that can help companies get up and running very quickly,” said Jason Schroedl, director of product marketing for enterprise platforms at Nvidia, in a conversation with theCUBE. “It’s leveraging all the best practice and learnings that we’ve done at large scale, bringing that to the enterprise, helping them wherever they are on their journey, whether they’re just getting started with generative AI or whether they’re looking to go from proof of concept to proof of value.”

(* Disclosure: TheCUBE is a paid media partner for SC24. Neither Dell Technologies nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Image: SiliconANGLE/Microsoft Designer
