BIG DATA
AI inference workloads are drastically changing the data storage landscape, creating a new tier of SSD infrastructure between GPU memory and traditional data lakes.
The partnership between platform builder AIC Inc. and SSD maker Solidigm, a trademark of SK Hynix NAND Product Solutions Corp., illustrates how ecosystem collaboration is shaping a broader shift in storage, according to CT Sun (pictured, right), vice president of engineering and chief technology officer of AIC Inc. That shift creates a twofold challenge for AI storage: moving data fast enough to keep GPUs fed while also scaling to handle rapidly growing volumes.
“In AI storage, what you need is not only dealing with latency, because you need to feed a lot of data to GPUs,” Sun told theCUBE. “[You also] need enough capacity because the AI-generated data is more than [the amount of] data you get from the natural world.”
Sun and Pompey Nagra (left), product and ecosystem marketing manager at Solidigm, spoke with theCUBE’s Gemma Allen at the Nvidia GTC AI Conference & Expo for an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how SSD infrastructure is evolving to close the memory gap in AI inference architecture. (* Disclosure below.)
Scaling inference pushes the key-value cache past the limits of local GPU memory, forcing offloads to nearby flash storage. AIC, in collaboration with Nvidia Corp., has been building non-volatile memory express and data processing unit-enabled platforms to solve the problem since the technology's first generation, according to Sun. BlueField-4, a data processing unit integrated into Nvidia's Vera Rubin architecture, will accelerate the shift even further, he added.
“BlueField-4 will change the landscape of storage faster than we think, in the very near term,” Sun explained.
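To see why the key-value cache spills past GPU memory at inference scale, a rough back-of-the-envelope sizing helps. The sketch below uses the standard transformer KV-cache formula with hypothetical model dimensions (the specific layer counts, head sizes, and context lengths are illustrative assumptions, not figures from the interview):

```python
# Rough KV-cache sizing sketch. All model dimensions below are
# hypothetical examples, not numbers cited by AIC, Solidigm or Nvidia.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    # 2x for keys and values, one entry per layer, per KV head, per token.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Example: an 80-layer model with 8 KV heads and head_dim 128 in FP16,
# serving a batch of 32 requests at a 128K-token context.
total = kv_cache_bytes(80, 8, 128, seq_len=128_000, batch=32, bytes_per_elem=2)
print(f"{total / 2**30:.0f} GiB")  # prints "1250 GiB"
```

At roughly 1.25 TB for a single batch, the cache dwarfs the high-bandwidth memory on any individual GPU, which is what drives the offload to nearby NVMe flash that Sun describes.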
For its part, Solidigm is addressing the thermal demands of denser AI racks with a liquid-cooled E1.S SSD form factor designed for next-generation platforms, including the Vera Rubin architecture. Even with those advances, the gap between flash storage supply and demand could persist for another two or three years, making strategic allocation critical, Nagra noted.
“The amount of storage capacity at low-latency that’s required in any one system grows exponentially,” Nagra said. “Liquid cooling becomes a paramount requirement in the next generation of servers and storage.”
Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the Nvidia GTC AI Conference & Expo:
(* Disclosure: Solidigm sponsored this segment of theCUBE. Neither Solidigm nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)