UPDATED 16:44 EDT / OCTOBER 25 2023

CLOUD

Riding the wave of gen AI: AWS outlines advances at Supercloud 4

These days, generative artificial intelligence is bringing together hype and reality at high speed, a collision theCUBE is unpacking at Supercloud 4 with a close look at next-generation cloud technologies, the industry and the developers building next-generation apps.

It’s been a tsunami of announcements for companies working with gen AI, including Amazon Web Services Inc., which rolled out general availability of Amazon Bedrock earlier this year. There have been developments since then, noted Bratin Saha (pictured, right), vice president and general manager of AI and machine learning at AWS.

“We have added more models to Bedrock,” he said. “Since the announcement earlier this year, we also added Cohere. And we are really excited at how customers are starting to use the models in Bedrock.”
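For readers curious what that growing model catalog looks like in practice, a minimal sketch using the AWS SDK for Python (boto3) can list the foundation models available through Bedrock in an account. The region and the printed fields are illustrative assumptions, not details from the interview.

```python
import boto3

# Assumes AWS credentials are configured and Bedrock is enabled in the chosen region.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# List the foundation models exposed through Bedrock, e.g. to see newly added providers.
response = bedrock.list_foundation_models()
for model in response["modelSummaries"]:
    print(model["providerName"], model["modelId"])
```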

Saha spoke with theCUBE industry analyst John Furrier (left) at Supercloud 4, during an exclusive broadcast on theCUBE, SiliconANGLE Media’s livestreaming studio. They discussed how companies are using the AI models and moving into production at scale.

Customers are looking for choice

Amazon CodeWhisperer was also a focus earlier this year, launching a customization capability so customers could tailor it to their own code bases and their own styles of coding, according to Saha. The company is also excited about what customers can do with Trainium and Inferentia, and the performance and performance-per-dollar improvements that come with them.

“A lot of customers are continuing to build generative AI models on top of SageMaker,” Saha said. “All in all, going in from Bedrock being the easiest way to build generative AI apps with its gen AI capabilities and its foundation models, to CodeWhisperer, to the infrastructure, we have the hardware infrastructure, which is purpose-built for generative AI. It gives you the best cost and performance.”

AWS also provides the machine learning infrastructure and software infrastructure for customers to build their own foundation models, Saha noted. There are a lot of models available on SageMaker JumpStart, including open-source models, as AWS believes a single model isn’t going to work necessarily for all use cases.
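As a rough illustration of that choice, an open-source model from SageMaker JumpStart can be deployed with a few lines of the SageMaker Python SDK. The model ID and instance type below are placeholders chosen for the sketch, not recommendations from AWS.

```python
from sagemaker.jumpstart.model import JumpStartModel

# Hypothetical example: pick any model ID from the JumpStart catalog.
model = JumpStartModel(model_id="huggingface-llm-falcon-7b-instruct-bf16")

# Deploy to a real-time endpoint; the instance type is an assumption for illustration.
predictor = model.deploy(instance_type="ml.g5.2xlarge", initial_instance_count=1)

# Invoke the hosted model; the payload shape varies by model family.
response = predictor.predict({"inputs": "Summarize the benefits of model choice."})
print(response)
```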

“Customers want this choice, and that is why we are excited about it. We actually see the industry moving in this direction,” Saha said.

Deploying gen AI at scale

These days, the industry is starting to see models move into production, which raises questions about how to do so at scale and with the right cost structure. Meanwhile, as people build their own models, they’ll need to interact with the big three or four proprietary models through APIs while managing compute and horsepower.

“It comes back to deploying generative AI at scale in the enterprise is a different kettle of fish than just having demos of consumer apps,” Saha said.

That’s because many factors come into play, including accuracy, cost, latency and performance. It’s a multidimensional, multifaceted optimization problem. “That is where having a choice of models is so important, because there are use cases where you may want to use a particular model that has been fine-tuned on your data,” Saha said.

The question remains: How might one make it easier for customers to deploy to production? Bedrock makes that easy, according to Saha. “We have done a lot of the heavy lifting of taking these models, training it, optimizing it, and we are going to do a lot of optimizations in terms of reducing the inference cost,” he said.
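To make that concrete, calling a Bedrock-hosted model from an application is a single API call against the runtime endpoint. This is a minimal sketch assuming an Anthropic Claude model; the request body schema differs by provider, and the model ID and prompt are illustrative only.

```python
import json
import boto3

# Runtime client for invoking models (separate from the control-plane "bedrock" client).
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Request body format is model-specific; this shape is an assumption based on
# Anthropic's Claude models as exposed through Bedrock.
body = json.dumps({
    "prompt": "\n\nHuman: Draft a short product description.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = runtime.invoke_model(modelId="anthropic.claude-v2", body=body)
print(json.loads(response["body"].read())["completion"])
```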

The second consideration is how to make these models scale as products are used by many consumers. That’s where a lot of heavy lifting is involved in getting the right latency, traffic control and auto-scaling.

“All of that is taken care of behind-the-scenes by Bedrock,” Saha said. “Now, there might also be situations where customers actually build their own proprietary models or take a model and fine-tune it and then deploy it. There are two options there. One, if you go do it with Bedrock, you actually get your own private copy of the model.”

When fine-tuning a model through Bedrock, the customer gets a private copy of the model, so the customer’s data doesn’t get into the base model. But there might also be cases where a customer wants to take one of these open-source models that works well, Saha noted.
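A hedged sketch of that first option: Bedrock exposes a model customization API for fine-tuning a base model into a private custom model. Every name, ARN, S3 path and hyperparameter below is a placeholder for illustration, and which base models support customization depends on the provider.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# All identifiers below are hypothetical; replace with real resources before use.
bedrock.create_model_customization_job(
    jobName="example-finetune-job",
    customModelName="example-private-model-copy",
    roleArn="arn:aws:iam::123456789012:role/BedrockCustomizationRole",
    baseModelIdentifier="amazon.titan-text-express-v1",
    trainingDataConfig={"s3Uri": "s3://example-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://example-bucket/output/"},
    hyperParameters={"epochCount": "2"},
)
```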

“That is where we have a number of features that we have added in SageMaker so that we can reduce the cost by an order of magnitude,” he said. “There are a couple of features that are now in private beta with customers that we’ll hopefully launch at re:Invent. And those are things that reduce cost by orders of magnitude. These are all the things that we are really thinking through.”

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s coverage of Supercloud 4:

Photo: SiliconANGLE
