Industry partnerships drive Supermicro’s storage approach for AI server solutions
Partnerships are playing a key role for Super Micro Computer Inc. as it develops AI server solutions that can meet the demands of graphics processing units, or GPUs, tailored for artificial intelligence workloads.
Supermicro’s contribution in this important area received a boost in June when the company introduced rack-scale, plug-and-play liquid-cooled AI SuperClusters for Nvidia Corp.’s Blackwell GPUs. Supermicro is one of only a handful of players that can develop customized servers capable of powering Nvidia GPUs for AI workloads.
This latest series of enhancements underscored the role of networking and GPUDirect Storage as data moves through the stages of extraction, cleansing and transformation that prepare AI workloads.
“We have networks from Nvidia that are also supplied by Supermicro, including switches and adapters in both InfiniBand and Ethernet,” said Rob Davis (pictured, back row, left), vice president of storage technology at Nvidia. “Those are connected all the way into the GPU servers using a technology called GPUDirect Storage. What GPUDirect Storage does is eliminate the interface to the local central processing unit and move data directly from the network storage into GPU memory.”
Davis spoke with theCUBE Research’s Rob Strechay at the Supermicro Open Storage Summit series, brought to you by Supermicro, Intel, AMD and Nvidia, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. He was joined by William Li (front row, left), director of solution management at Supermicro; Steve Hanna (front row, right), head of product management of high-capacity NVMe SSDs at Micron Technology Inc.; Shimon Ben-David (front row, center), chief technology officer of WekaIO Inc.; and Jon Toor (back row, right), chief marketing officer of Cloudian Inc. They discussed AI server solutions and how optimized GPU compute can meet specialized requirements for training large AI models. (* Disclosure below.)
Facilitating storage speed for AI server solutions
The reason why Supermicro and its partners, such as Nvidia, have been focused on storage is that processing for AI workloads relies on being able to rapidly move large amounts of data from various repositories. Supermicro has designed its Reference Architecture to help businesses fulfill unique high-performance computing requirements.
“Without fast storage, AI will be slowed down,” Li explained. “That’s why we have introduced Supermicro Reference Architecture for AI use cases. We have been working with all these great partners for a long time on total solutions, including GPU storage servers, high-performance computing, as well as the fastest network devices.”
Much of the current focus has been on object storage, a highly scalable technology for storing a wide range of structured and unstructured data that can be accessed across numerous hardware platforms. Unstructured data management can be a challenge in the AI era, and WekaIO has built its platform so that enterprise data stacks can handle a variety of IO patterns, data types and sizes at increasing volumes and velocities.
“We see customers currently using multiple different storage environments: maybe they are using an object store for high-capacity ingest, and then they are moving data to a parallel file system for preprocessing, cleaning, tagging and enriching it, and then training on the data,” Ben-David said. “We are working with our partners, with Supermicro and Micron, on larger-capacity, cost-effective drives, and with our partners at Cloudian on our ability to virtualize the capacity of an entire environment with an object store. WekaIO manages the whole data movement if needed, and that is cost-effective at scale.”
Simplifying large-scale AI deployments
The movement of large amounts of data across diverse workflows can still create complexity. To address this, Cloudian announced a strategic collaboration with Supermicro this month that delivered a data management solution designed to simplify and accelerate large-scale AI deployments. Cloudian’s software, deployed on Supermicro’s storage servers, is optimized for next-generation AI workflows.
“This is software that runs on standard servers, and you can start with a small number of servers, as few as three, and then you can grow that limitlessly within a single storage environment,” Toor said. “What this means for your AI workflow is you’ve now got a single source of truth, an S3-compatible storage pool that you can use for all your different workflows, from data ingest and machine learning to data analytics to deploying your model.”
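The "single source of truth" idea Toor describes can be pictured as one flat namespace of bucket/key objects that every workflow stage reads from and writes to. The toy sketch below illustrates those S3-style semantics only; it is not Cloudian's API, which implements the real S3 protocol over HTTP, and the bucket and key names are invented for illustration.

```python
# Toy sketch of S3-style object semantics: one flat namespace of
# (bucket, key) -> bytes shared by ingest, training and analytics.
# Illustrative only; a real S3-compatible store speaks HTTP.

class ToyObjectStore:
    def __init__(self):
        self._objects = {}  # (bucket, key) -> bytes

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        self._objects[(bucket, key)] = body

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]

    def list_objects(self, bucket: str, prefix: str = "") -> list:
        # Prefix listing is how S3 clients emulate directories.
        return sorted(k for (b, k) in self._objects
                      if b == bucket and k.startswith(prefix))

# Every stage addresses the same pool by key rather than copying
# data between separate storage silos:
store = ToyObjectStore()
store.put_object("ai-data", "raw/day1.jsonl", b'{"text": "..."}')
store.put_object("ai-data", "models/checkpoint-1.bin", b"\x00\x01")
print(store.list_objects("ai-data", prefix="raw/"))  # ['raw/day1.jsonl']
```

Because ingest, preprocessing and model deployment all share one namespace, there is no per-stage copy step, which is the simplification the Cloudian-Supermicro collaboration is aiming at.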
The storage market is also experiencing an industry transition from hard disk drive to solid-state drive technology, and this is having an impact on how users leverage server technology for processing AI workloads. Micron’s release of its 9550 PCIe Gen 5 SSD in July highlighted the increasingly important role that high-capacity drives now play in data center infrastructure.
“It’s explicitly designed for GPU feeding, AI training and caching,” Hanna said. “The HDD-to-SSD transition is a fairly recent phenomenon in these networked data lakes, where before, people would buy the GPUs but underinvest in their storage. The data set sizes are just getting absolutely massive; you need more storage. The speed at which we feed that pipe is the bottleneck, so we have to use SSDs, by definition, to keep the GPUs utilized. You just have to do it.”
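Hanna's point about the storage pipe as the bottleneck can be made concrete with back-of-the-envelope arithmetic. All figures below are illustrative assumptions, not vendor specifications: a nominal sequential read rate for a nearline hard drive, for a PCIe Gen 5 data-center SSD, and a made-up aggregate ingest target for a rack of GPUs.

```python
# Back-of-the-envelope look at why SSDs displace HDDs for GPU feeding.
# All numbers are illustrative assumptions, not published specs.
import math

HDD_SEQ_READ_GBPS = 0.28     # assumed: typical nearline hard drive
SSD_SEQ_READ_GBPS = 14.0     # assumed: PCIe Gen 5 data-center SSD
TARGET_INGEST_GBPS = 112.0   # assumed: read rate to keep a GPU rack busy

# Number of drives needed just to sustain the target read bandwidth:
hdds_needed = math.ceil(TARGET_INGEST_GBPS / HDD_SEQ_READ_GBPS)
ssds_needed = math.ceil(TARGET_INGEST_GBPS / SSD_SEQ_READ_GBPS)

print(f"HDDs needed: {hdds_needed}")  # 400
print(f"SSDs needed: {ssds_needed}")  # 8
```

Under these assumed figures, matching the same ingest bandwidth takes two orders of magnitude fewer SSDs than HDDs, which is the economics behind keeping expensive GPUs utilized with flash.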
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE Research’s coverage of the Supermicro Open Storage Summit series:
(* Disclosure: TheCUBE is a paid media partner for the Supermicro Open Storage Summit series. Neither Super Micro Computer Inc., the sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
Photo: SiliconANGLE