UPDATED 11:53 EST / OCTOBER 03 2023

Major industry players aim to tame the database monster

Enterprise storage is in the midst of a striking transformation. Major players are eyeing next-generation storage solutions, as projections peg that market at more than $150 billion by 2032.

Along with that trend comes a heightened focus on cyber and data resilience amid the rise of artificial intelligence. It’s becoming clear to these major players that one can’t do AI without a cyber-resilient infrastructure architecture.

For decades, participants in the storage business were rewarded for building or buying systems that were fast, cheap and didn’t fail. But today, the shift to software-led storage has accelerated innovation by creating an abstraction layer that can be leveraged to simplify management and bring cloud-like experiences to on-prem and hybrid estates.

“It also enabled value to be delivered more quickly and injected into storage systems to support greater business resilience and more flexible features,” said theCUBE industry analyst Dave Vellante at the recent IBM Storage Summit, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. “The future of storage will increasingly be characterized by unlocking data value with AI and providing a foundation for businesses to better anticipate and withstand market shifts.”

This new world presents certain challenges but also brings with it various potential benefits. Big players, including IBM Corp., Pure Storage Inc., Hewlett Packard Enterprise Co. and Dell Technologies Inc., along with newcomers such as Vast Data Inc., have sought to respond to this new landscape with next-generation storage solutions.

Meanwhile, data platforms are growing in prominence and importance as AI drives a significant expansion of data supply chains. But that brings forward the next challenge: How might a company tame the so-called “database monster?” Here’s how the major players are responding to that challenge and why it’s becoming increasingly clear that companies can’t do AI without a cyber-resilient infrastructure architecture.

[This feature is a part of an ongoing series, made possible by IBM, as theCUBE explores the infrastructure angle for AI enterprise use cases.]

Balancing cloud-native apps and storage

Cloud-native applications are designed and built with scalable, resilient and highly available cloud services, and they often rely on SQL and NoSQL databases and data lakes, which can be scaled horizontally and vertically to meet application demands. But as application development teams move to cloud-native architectures and leverage containers, storage teams have a big challenge ahead of them.

Those teams need to figure out how to build an infrastructure that is elastic and can support new models of development while providing high levels of availability and data resilience. But unstructured data growth is very difficult to manage, which leads companies to adopt new technologies for lakehouses and data platforms, whether from Snowflake Inc., Databricks Inc. or IBM’s watsonx.data. The goal is to help storage administrators figure out how to bring all of that together while ensuring high levels of cyber resilience, according to Sam Werner, vice president of IBM Storage product management.

“The threats have gotten so much more significant. Storage administrators now [have] to protect from a hardware failure, from logical data corruption, from application developers making a mistake,” Werner told theCUBE. “But now you have targeted attacks come in. It’s completely different how you recover.”

When it comes to advancing cyber resilience, storage has a big role to play. What people may not have paid much attention to is the impact of unstructured data, according to Scott Baker, chief marketing officer and vice president of IBM Infrastructure Portfolio product marketing at IBM.

That doesn’t just have to do with growth, but also with the amount of risk unstructured data can pose to an organization if it doesn’t know how that data has been classified, categorized and valued in the business.

“We spend a lot of time focused on things like block-based storage devices, where you have some inherent protection that’s in that, where applications have direct access to the data,” Baker said during an exclusive broadcast on theCUBE. “You build your strategy, whether it’s cyber resiliency or data resiliency, around that one-to-one relationship, if you will.”

The company points to the modern lakehouse, which leverages advanced data management and provides easy access to diverse data sources across hybrid clouds while maintaining the advantages of traditional data warehouses and data lakes. The key to keep in mind? Storage optimization with cost-effective hybrid cloud-scale object storage, high-performance storage acceleration and integrated data governance.
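In practice, that lakehouse pattern typically means query engines working directly against open-format files sitting on low-cost object storage, rather than copying everything into a proprietary warehouse first. The following is a minimal PySpark sketch of that idea; the bucket, path and column names are hypothetical, and the Spark session is assumed to already be configured with credentials for the S3-compatible object store:

```python
from pyspark.sql import SparkSession

# A Spark session, assumed to already be configured with credentials
# for the S3-compatible object store backing the lakehouse.
spark = SparkSession.builder.appName("lakehouse-query-sketch").getOrCreate()

# Read open-format (Parquet) files directly from object storage.
# "analytics-bucket/sales" is a hypothetical location, not a real dataset.
sales = spark.read.parquet("s3a://analytics-bucket/sales/")

# Run the query where the data lives instead of copying it into a warehouse.
summary = (
    sales.groupBy("region")
         .sum("revenue")
         .withColumnRenamed("sum(revenue)", "total_revenue")
)
summary.show()
```

The design point this illustrates is the one the vendors describe: the bulk of the data stays on cost-effective, cloud-scale object storage, while acceleration and governance layers are added on top.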

The race for cloud-native AI storage: Adapting to new data demands

As companies have increasingly sought to introduce AI into their product offerings, so too have they had to juggle the implications of handling the vast amounts of data required. More than 85% of organizations will embrace a cloud-first principle by 2025 and will not be able to fully execute their digital strategies without the use of cloud-native architectures and technologies, according to Gartner Inc.

Pure Storage acquired Portworx Inc. at the end of 2020, understanding that traditional storage can’t keep up as cloud-native solutions become increasingly important. That product offering is intended to provide Kubernetes data services with the performance, data protection and security that modern apps require, according to the company.

“Cloud-native is exploding; containers are exploding,” Murli Thirumale, vice president and general manager of the Cloud Native Business Unit of Portworx, said earlier this year. “It’s kind of a well-known fact that 85% of the enterprise organizations around the world are pretty much going to be deploying containers, if not already, in the next couple of years.”
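Container-native data services of this kind are typically consumed by applications through standard Kubernetes persistent volume claims against a storage class defined by the platform team. Below is a minimal sketch using the official Kubernetes Python client; the storage class name "px-replicated" and the claim name are hypothetical placeholders rather than anything Portworx or Pure Storage ships by default:

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig (an in-cluster workload would
# call config.load_incluster_config() instead).
config.load_kube_config()

# A claim for 50Gi of dynamically provisioned, replicated block storage.
# "px-replicated" is a hypothetical StorageClass a platform team might define
# on top of a container-native storage layer such as Portworx.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="orders-db-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="px-replicated",
        resources=client.V1ResourceRequirements(requests={"storage": "50Gi"}),
    ),
)

core = client.CoreV1Api()
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```

The application only asks for capacity and an access mode; replication, protection and placement are decided by whatever data-services layer backs the storage class.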

Meanwhile, Databricks announced earlier this year that it was acquiring startup Rubicon Inc., an AI storage specialist founded by former Google LLC and Dropbox Inc. engineers. There’s likely to be a battleground in the year ahead for the future of AI-powered data apps, according to theCUBE industry analyst Rob Strechay. That’s because, in the middle of the last decade, evidence of a new data stack became clearer with cloud data platforms that separated compute from storage, simplified data warehousing and brought AI and machine learning to the table.

“Fast forward to the end of the decade, and the massive data management market opportunity pointed to a collision course between Snowflake and Databricks to become the dominant platform on which to build modern data apps,” Strechay said.

For IBM, IBM Storage Fusion is intended to provide the data services that support the scaling and resiliency required by the modern database. Storage Fusion also accelerates IBM watsonx.data queries.

“From application development to data science to infrastructure modernization, IBM Storage Fusion helps organizations navigate cloud-native technologies with a simple, consistent and highly scalable platform without the need to understand the underlying hybrid cloud infrastructure,” the company said in describing Storage Fusion.

Where does the industry go from here?

In this new environment, with AI a chief area of focus, companies must seek storage systems that act as flexible, elastic data platforms. The demand is not just for cloud-style scale-up and scale-down, but for support of new types of applications and digital representations of a business. But to get real value out of AI, an intentional strategy needs to be implemented, according to Werner.

“You have to be able to combine your own data with these foundation models, right? You have to bring together the data that you have as an organization, which is your knowledge of your industry, your knowledge of your customers,” he said. “This data exists everywhere. How do you bring that data together with a foundation model, whether it’s a large language model or some other specific industry vertical-type model you’ve created?”

What’s necessary is a way to bring that data together, combine it and do training, according to Werner. Being able to get insights in real time means a company also needs to be able to access that data from multiple sources.

“You can’t rely on ingesting everything. It’s too expensive; it’s too complicated,” he said. “That’s what we’re doing. We’re working on technology that virtualizes that.”
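Werner’s description of virtualizing access rather than ingesting everything is, in schematic terms, a retrieval pattern: query the data where it lives, then hand only the relevant slice to a foundation model. The sketch below illustrates that flow in Python; both helper functions and the prompt format are hypothetical placeholders, not a specific IBM or watsonx API:

```python
from typing import List


def query_in_place(source: str, question: str) -> List[str]:
    """Hypothetical helper: federate the question to a data source
    (warehouse, lakehouse table, object store) and return only the
    relevant rows or passages, rather than bulk-ingesting the source."""
    raise NotImplementedError("wire up to a query engine of choice")


def call_foundation_model(prompt: str) -> str:
    """Hypothetical helper: send the assembled prompt to whichever large
    language model (or industry-specific model) the organization uses."""
    raise NotImplementedError("wire up to a model endpoint of choice")


def answer_with_enterprise_context(question: str, sources: List[str]) -> str:
    # Pull only the relevant slices of data from each source, in place.
    context: List[str] = []
    for source in sources:
        context.extend(query_in_place(source, question))

    # Combine the organization's own data with the foundation model.
    prompt = (
        "Answer using the context below.\n\n"
        "Context:\n" + "\n".join(context) + "\n\n"
        "Question: " + question
    )
    return call_foundation_model(prompt)
```

The point of the pattern is the one Werner makes: the enterprise data stays where it is, and only the pieces needed for a given question travel to the model.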

The months to come will continue this conversation as major players respond to a storage industry undergoing rapid transformation, and as cyber and data resilience in an AI-driven future remains a critical focus.

