UPDATED 16:00 EDT / NOVEMBER 30 2023

CLOUD

How Amazon is reinventing storage and data access for generative AI

The shift from data lakes to generative artificial intelligence tools is pushing organizations to rethink how they store and access their data.

Organizations want rapid interactions with data, so Amazon S3 is being reinvented, according to Andy Warfield (pictured), vice president and distinguished engineer at Amazon.com Inc. The goal is to provide speed, cost-effectiveness and simplicity, while also accounting for the decoupling of storage from compute and the evolving storage equation, he explained.

“The thing that I think is really interesting in here is the customer experience of curating that data,” Warfield said. “They don’t want to think about storage. They absolutely want to have good, sound practices around the structure of their data and the governance. So, as customers are looking at generative AI, they don’t want to be taking their data out of their data lake and shipping it to some external model. They really want to be bringing the model to the data.”

Warfield spoke with theCUBE industry analyst John Furrier at the “Supercloud 5: The Battle for AI Supremacy” event, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed Amazon’s focus on reinventing storage and data access, integrating with open-source tools and new technologies to provide seamless data curation.

Vector databases

Vector databases are a hot trend, but the challenge lies in interoperability and architectural lock-in, according to Warfield. The approach focuses on choice and the ability to put vectors where customers want them to be. Some organizations have heavily embraced the transformation of their data practices. They made a significant investment and are now experiencing the benefits.

“The story that I loved the most was Pfizer. These examples of customers that leaned in heavily on changing their data practice,” Warfield said. “To invest in a data lake and then grow it … [they] are now realizing this agility to go and experiment with stuff like generative AI, to experiment with new things. I think that really speaks for itself in terms of what people are doing.”

Meanwhile, engineers are increasingly reengineering their environments using open-source tools, such as Apache Airflow, focusing on data engineering rather than data science or database administration, according to Warfield. With this in mind, Amazon S3 is being set up to work in conjunction with those tools, with a focus on understanding how data structure affects workloads and on investing in open-source and client-side software for data engineering and applications.

Organizations are struggling with GPU scarcity and are focused on cost and performance, while chip developers and model developers are working together to learn and innovate, according to Warfield.

“I think one thing that we are certainly seeing is … customers really want to keep those GPUs busy all the time,” he said. “That’s an example of the full systems view, whether it’s getting data onto the box, or getting data into GPU memory. There are innovation opportunities across that whole space.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of the “Supercloud 5: The Battle for AI Supremacy” event:

Photo: SiliconANGLE
