UPDATED 13:00 EDT / MAY 19 2025


Dell builds out its AI offerings with an emphasis on on-premises deployment

Kicking off its Dell Technologies World conference this week, Dell is expanding its AI Factory artificial intelligence product portfolio with infrastructure, software and services updates.

Its stated goal is to help enterprises move from AI experimentation to scaled deployment without relying on commercial cloud services. Dell said on-premises infrastructure is more efficient for AI inference at scale, especially as enterprise customers grow more sensitive to cloud costs, data sovereignty and operational flexibility. The company cited Enterprise Strategy Group Inc. research finding that the Dell AI Factory can be up to 62% more cost-effective for inferencing large models than public cloud options.

“We’re seeing up to 75% lower total cost of ownership running LLMs [large language models] on-prem with Dell infrastructure compared to the public cloud,” said Sam Grocott, senior vice president of product marketing at Dell.

AI laptop

Today’s announcements span the spectrum from portable inference at the edge to high-density training clusters in data centers. At the edge, the new Pro Max Plus laptop (pictured) includes 32 AI cores, 64 gigabytes of memory and a Qualcomm Inc. AI 100 PC inference card, making it the first mobile workstation with an enterprise-grade discrete neural processing unit, according to Dell. It’s aimed at running locally the large models that typically require cloud resources, supporting LLMs of more than 100 billion parameters.
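To see why 64 gigabytes of on-device memory makes models of that size plausible, consider the rough arithmetic below, which assumes 4-bit weight quantization, a common technique for on-device inference. The figures are illustrative, not Dell's published calculations:

```python
# Rough memory-footprint estimate for running a quantized LLM on-device.
# Figures are illustrative assumptions, not Dell's published numbers.

def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes."""
    return params_billions * 1e9 * (bits_per_weight / 8) / 1e9

for bits in (16, 8, 4):
    print(f"100B params @ {bits}-bit: {weight_memory_gb(100, bits):.0f} GB")

# 100B params @ 16-bit: 200 GB  -> far beyond a 64 GB laptop
# 100B params @ 8-bit:  100 GB  -> still too large
# 100B params @ 4-bit:   50 GB  -> fits in 64 GB with headroom
# (activations and KV cache consume additional memory beyond weights)
```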

In the data center, Dell introduced a new approach to AI-scale thermal management with a system designed to absorb 100% of server-generated heat using a self-contained airflow design. The PowerCool Enclosed Rear Door Heat Exchanger also operates with higher water temperatures, between 32°C and 36°C, to reduce reliance on traditional chillers.

Dell said the heat exchanger can cut cooling-related energy costs by up to 60% and allow up to 16% greater rack density without increased power consumption. Other features include leak detection, real-time thermal monitoring and unified rack-level management.
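The physics behind a rear-door heat exchanger reduces to how fast flowing water can carry heat away, Q = ṁ·c·ΔT. The sketch below works through that relationship; the rack power and water temperature rise are assumed figures for illustration, not Dell specifications:

```python
# Estimate the water flow needed for a rear-door heat exchanger to absorb
# a rack's full heat load. Rack power and delta-T are assumptions.

SPECIFIC_HEAT_WATER = 4186  # J/(kg*K)

def required_flow_lpm(rack_power_kw: float, delta_t_c: float) -> float:
    """Water flow in liters/minute to carry away rack_power_kw at delta_t_c rise."""
    kg_per_second = (rack_power_kw * 1000) / (SPECIFIC_HEAT_WATER * delta_t_c)
    return kg_per_second * 60  # 1 kg of water is about 1 liter

# Hypothetical 80 kW AI rack, water warming 6 C as it crosses the door:
print(f"{required_flow_lpm(80, 6):.0f} L/min")  # ~191 L/min
```

Warmer inlet water (32°C to 36°C) matters because the return water is then hot enough to be cooled with dry coolers or cooling towers in many climates, rather than energy-hungry chillers.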

New servers in the PowerEdge lineup — the PowerEdge XE9785 and XE9785L — support Advanced Micro Devices Inc.’s new Instinct MI350 graphics processing units with 288 gigabytes of High Bandwidth Memory 3E, a new generation of memory technology designed for high-performance computing applications. Dell said the new platforms deliver up to 35 times better inferencing performance than previous systems while also reducing cooling demands through liquid and air-cooled options.
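To put 288 gigabytes of HBM3E per GPU in context, a rough capacity sketch for an eight-GPU node follows. The node size, model size and precisions are illustrative assumptions, not figures from Dell or AMD:

```python
# How much model fits in an 8-GPU node with 288 GB of HBM3E per GPU?
# Illustrative arithmetic only; node and model sizes are assumptions.

HBM_PER_GPU_GB = 288
GPUS_PER_NODE = 8  # assumed node configuration

total_hbm_gb = HBM_PER_GPU_GB * GPUS_PER_NODE  # 2,304 GB

def model_weights_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * bytes_per_param

# A hypothetical 405B-parameter model:
fp16_gb = model_weights_gb(405, 2)  # 810 GB
fp8_gb = model_weights_gb(405, 1)   # 405 GB

print(f"Node HBM: {total_hbm_gb} GB | 405B FP16: {fp16_gb:.0f} GB | FP8: {fp8_gb:.0f} GB")
# Either precision fits with room left for the KV cache, which is what
# makes long-context inference practical without spilling to host memory.
```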

Campaign to disaggregate

The announcements are part of a continuing Dell push to promote what it calls “disaggregated architecture,” in which computing, storage, memory and networking are managed separately.

“Disaggregated infrastructure is where the puck is going for our customers because it combines the flexibility of three-tier with the simplicity of hyperconverged to enable dynamic resource allocation from a shared resource,” said Varun Chhabra, senior vice president of infrastructure and telecom marketing at Dell.
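As a conceptual illustration of the allocation model Chhabra describes (not Dell's management API; the names and numbers are invented for this sketch), disaggregation lets each workload draw compute and storage independently from shared pools rather than in fixed node-sized increments:

```python
# Conceptual sketch of allocation from a disaggregated shared pool.
# Not Dell's API; names and capacities are illustrative.

from dataclasses import dataclass

@dataclass
class SharedPool:
    cpu_cores: int
    storage_tb: int

    def allocate(self, cores: int, tb: int) -> None:
        """Draw only what the workload needs from the shared pool."""
        if cores > self.cpu_cores or tb > self.storage_tb:
            raise RuntimeError("pool exhausted")
        self.cpu_cores -= cores
        self.storage_tb -= tb

pool = SharedPool(cpu_cores=1024, storage_tb=500)
pool.allocate(cores=64, tb=2)    # compute-heavy training job: little storage
pool.allocate(cores=8, tb=120)   # storage-heavy RAG index: little compute
print(pool)  # remaining capacity stays fungible for the next workload

# In a hyperconverged model, each job would instead consume whole nodes,
# stranding whichever resource it doesn't fully use.
```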

As part of its partnership with Nvidia Corp., Dell announced a new generation of PowerEdge servers designed around Nvidia Blackwell Ultra GPUs. Built for LLM training and inference, the servers scale up to 256 GPUs per rack in liquid-cooled configurations and can deliver up to four times faster model training than the previous generation, Dell said.

A new high-density PowerEdge XE9712 server features the Nvidia GB300 NVL72 rack-scale design, which is optimized for training and offers a 50-fold gain in inference output along with five-fold throughput improvements. Another model, the XE7745, will support the Nvidia RTX Pro 6000 Server Edition starting in July 2025. It’s aimed at robotics, digital twins and multimodal workloads.

New networking options include the Dell PowerSwitch SN5600 and SN2201 Ethernet switches, part of the Nvidia Spectrum-X Ethernet networking platform, as well as Nvidia Quantum-X800 InfiniBand switches. The switches deliver up to 800 gigabits per second of throughput and are backed by Dell ProSupport and Deployment Services.

On the software front, Dell AI Factory is now validated to use Nvidia NeMo microservices and tools for retrieval-augmented generation and agentic workflows. The AMD partnership adds upgraded support for the ROCm (Radeon Open Compute) software stack and Day 0 container support for models such as Llama 4. Dell is also extending Red Hat OpenShift support to the AI Factory for managing containerized AI workloads.
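Retrieval-augmented generation, one of the workflow patterns the NeMo validation targets, follows a simple loop: embed the query, retrieve relevant documents, and condition the model's answer on them. Below is a minimal generic sketch of that pattern; it is not the NeMo API, and embed, vector_search and generate are placeholders for whatever embedding model, vector store and LLM a deployment uses:

```python
# Minimal retrieval-augmented generation (RAG) loop.
# Generic sketch; embed(), vector_search() and generate() are placeholders
# supplied by the caller, not a specific vendor API.

def rag_answer(query: str, embed, vector_search, generate, top_k: int = 4) -> str:
    query_vector = embed(query)                     # 1. embed the question
    documents = vector_search(query_vector, top_k)  # 2. retrieve context
    prompt = (
        "Answer using only the context below.\n\n"
        + "\n---\n".join(documents)
        + f"\n\nQuestion: {query}"
    )
    return generate(prompt)                         # 3. grounded generation
```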

AI reference architecture

Noting that models require reliable access to quality data, Dell announced updates to its ObjectScale object storage platform to support more compact configurations and integrate with Nvidia BlueField-3 and Spectrum-4 networking. New S3 over Remote Direct Memory Access support promises throughput performance gains of up to 230%, 80% lower latency and 98% less CPU overhead, all of which help improve GPU utilization.
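The GPU-utilization angle of those storage numbers comes down to how long accelerators sit idle waiting on data. The sketch below shows the effect of a 230% throughput gain on the time to load a large object set; the baseline throughput and object size are assumed figures, not Dell benchmarks:

```python
# Effect of the claimed up-to-230% S3-over-RDMA throughput gain on load time.
# Baseline throughput and object size are assumptions for illustration.

baseline_gb_per_s = 10.0                 # assumed conventional S3 read rate
rdma_gb_per_s = baseline_gb_per_s * 3.3  # a 230% gain means 3.3x the baseline
object_gb = 500                          # e.g., a large checkpoint shard set

for label, rate in [("TCP S3", baseline_gb_per_s), ("S3 over RDMA", rdma_gb_per_s)]:
    print(f"{label}: {object_gb / rate:.0f} s to load {object_gb} GB")

# TCP S3:       50 s
# S3 over RDMA: 15 s  -> GPUs spend less time stalled between steps
```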

A new Dell reference architecture combines PowerScale storage, Project Lightning, PowerEdge servers, and Nvidia’s NIXL libraries for large-scale LLM inferencing. Dell will also integrate support for Nvidia’s AI Data Platform for agentic application development.

“Internal testing shows that, when it rolls out later this year, the AI Data Platform will be the fastest parallel file system in the world,” Chhabra said. “It will outpace traditional file systems by up to 2X and offer 67% faster data access compared to its nearest competitor.”

Image: Dell
