AWS re:Invent 2024: CEO Matt Garman unveils the future of cloud with generative AI and agentic workflows
As AWS re:Invent 2024 approaches this coming week, anticipation is building for what promises to be a defining moment in the evolution of cloud computing.
In an exclusive interview at Amazon Web Services Inc.’s Seattle headquarters, I sat down with Chief Executive Matt Garman (pictured) to delve into the themes that will shape re:Invent and AWS’ strategy for the coming years.
From a renewed focus on core infrastructure to advancements in custom silicon, generative artificial intelligence and agentic workflows, Garman’s insights offer a roadmap for the future of cloud computing.
Accelerated cloud adoption fueled by generative AI
Businesses worldwide are rapidly embracing cloud technologies to drive innovation and revenue growth. Garman highlighted this trend, noting that customers are moving workloads to the cloud to capitalize on the latest advancements.
“We’re seeing a lot of people really leaning into the cloud,” Garman said. “They realize that they’ve got to get those workloads into the cloud so they can have the agility to take advantage of all the latest technologies that are coming out.”
Generative AI has emerged as a key catalyst in this shift. Companies recognize that to harness the full potential of AI models, they must migrate their data and applications to a scalable, flexible cloud environment.
“A lot of customers are seeing that gen AI is one of the things that’s driving that,” Garman noted. “They realize that cloud piece is there. We see that as a real tailwind to the business and something that customers are really excited about moving.”
Back to basics: core infrastructure innovation
While generative AI captures headlines, Garman emphasized that AWS remains deeply committed to innovating its foundational services: compute, storage and databases. The upcoming re:Invent conference will spotlight not only AI advancements but also significant enhancements in these core areas, including new high-performance silicon such as Trainium2.
“We’ll have a lot of announcements in things about AI and generative AI,” he revealed. “But we’re also going to have a lot of really cool innovations and announcements around the core of our infrastructure — thinking about compute, storage, and databases.”
This dual focus is central to AWS' strategy: meeting customers' foundational needs while enabling them to adopt gen AI systems and applications that power cutting-edge solutions.
“It’s kind of the core of what we do,” Garman affirmed. “Our customers are hungry for continued innovation across that whole stack, not just in the shiny new objects.”
AWS revolutionized the tech industry by creating the core foundational pillars of the cloud, and Garman made it clear that AWS isn’t resting on its laurels. Instead, it is evolving these pillars to meet the demands of modern workloads and prepare for a future defined by AI-driven applications.
“We’re also going to get back to some of the basics on just how we go and help customers innovate in all the different areas that they get to,” Garman said.
AWS’ focus on the fundamentals is a direct response to customer feedback. Enterprises today need scalable, cost-effective and high-performing solutions for their legacy and cloud-native workloads. This is especially critical as the majority of enterprise data still resides on-premises.
“The vast majority of our customer workloads are still on-premises,” Garman noted. “That actually will deliver huge amounts of value to their business so they can take advantage of these new capabilities.”
At re:Invent 2024, AWS will announce innovations across its core services, optimizing these foundational components for gen AI workloads, data-intensive applications and distributed computing environments. Expect enhancements that make migrating to the cloud faster, easier and more cost-effective, empowering customers to harness the agility and scalability of AWS.
Inference: the next core AWS building block
A significant theme emerging from our conversation is the critical role of inference in modern applications. Inference, the process by which AI models generate predictions or outputs, is becoming integral to a wide range of applications. Garman sees inference as a foundational component of AWS services, on par with compute, storage and databases.
“Inference is the next core building block,” Garman stated unequivocally. “If you think about inference as part of every application that you go build, it’s not just a separate generative AI application and then all my other applications — it’s just an integral part, just like you would think about databases.”
This bold assertion reframes how AWS thinks about generative AI and its role in the cloud. Traditionally, AI has been viewed as a specialized domain, separate from core cloud infrastructure. Garman made it clear that this perspective is changing rapidly.
“It’s not a database application and a non-database application. It’s just a part of what you might use,” he explained.
AWS is focused on lowering the cost of inference and helping customers integrate it into production environments to drive real enterprise value.
“We are pushing really hard to continue lowering the cost of inference,” he said. “Over the last year, the cost of inference has been coming down significantly.”
Making inference ubiquitous
The key here is integration. Garman’s vision is not to treat inference as an isolated capability but to embed it deeply into the fabric of application development. Just as databases are foundational to virtually every business application, inference is poised to become a baseline feature of cloud-native workflows.
“If you think about it that way, then you’ve got to think about how it gets integrated into the fabric of what you’re doing? That means you think about data workflows, how these models might interact really well with S3, EC2, your databases, and how they all kind of work together,” Garman said.
Rather than building siloed AI solutions, AWS is focused on creating tools and platforms that seamlessly integrate AI capabilities with its existing services. Whether it’s enabling AI models to interact with S3 for storage, EC2 for compute, or AWS’ suite of database offerings, the goal is to make inference an invisible but essential part of every application.
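As a concrete illustration of that pattern, here is a minimal sketch of inference embedded in an ordinary data workflow: pulling a document from S3 and summarizing it through Amazon Bedrock's runtime API via boto3. The bucket name, object key and prompt are illustrative, and the model ID is just one example of a model available on Bedrock; this is a sketch of the pattern Garman describes, not AWS reference code.

```python
import json
import boto3

s3 = boto3.client("s3")
bedrock = boto3.client("bedrock-runtime")

# Pull a document out of an existing data workflow (names are illustrative).
obj = s3.get_object(Bucket="example-reports-bucket", Key="reports/q3-summary.txt")
document = obj["Body"].read().decode("utf-8")

# Inference as just another application step, alongside storage and compute.
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": f"Summarize this report:\n{document}"}],
    }),
)
summary = json.loads(response["body"].read())["content"][0]["text"]
print(summary)  # the result feeds back into the same workflow, e.g. S3 or a database
```

The point of the sketch is that no separate "AI application" exists here: inference is one call among the S3 reads and database writes that make up a normal cloud workload.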
Why inference matters now
The maturation of AI adoption makes inference critical at this moment. Initially, businesses experimented with proof-of-concept projects such as chatbots, search enhancements, or other rudimentary decision-making tools. Today, organizations are looking to embed AI capabilities directly into their operational workflows to drive measurable business value. This requires inference to move from the periphery to the core of application architecture.
“When you’re thinking about startups out there or enterprises trying to solve a problem and they’re going and building a new application, it’s [inference] just a tool that they can get really powerful and exciting capabilities deeply embedded into their application,” Garman explained.
This shift underscores the need for cloud providers to make inference both accessible and cost-effective. AWS is investing heavily in reducing the costs of inference while ensuring that its customers — whether they’re startups or large enterprises — have the tools they need to succeed.
To achieve this, businesses need robust frameworks that can support inference in real-world production environments. This means scalable infrastructure, integration with existing data and workflows, and cost optimization to make inference economically viable across industries.
AWS is focused on making inference a core building block of its infrastructure and platform services, positioning itself as a critical partner in this wave of AI-driven business transformation.
The future of application development
If inference is as transformative as the invention of the database, the future of application development will look fundamentally different. Applications will evolve into dynamic, AI-powered tools capable of learning, adapting and delivering value in real time, radically changing how we conceptualize and write software.
“The entire way that you build generative AI applications is going to change and be reinvented,” Garman asserted. “Inference is the next core building block. If you think about inference as part of every application, it becomes integral, just like databases.”
AWS aims to play a leading role in this evolution by providing the infrastructure and integration points necessary to make inference seamless. By treating inference as a foundational component, it enables developers to innovate at scale and unlock new possibilities.
Integrating generative AI into the fabric of applications
Garman posits that generative AI should not be viewed as a separate entity but as an integral component of modern applications. This aligns with AWS’ strategy to embed AI functionalities seamlessly into its services, making AI capabilities as ubiquitous as databases or storage systems.
“I actually think they’re just applications,” he said. “Inference is the next core building block. It’s about integrating it into the fabric of what you’re doing — data workflows, S3, EC2, databases — all working together.”
This holistic approach simplifies the development process, allowing developers to incorporate AI features without extensive overhauls of existing systems. AWS aims to make it easier for customers to deeply embed powerful and exciting capabilities into their applications.
“Our job is to make it easier for customers,” Garman emphasized. “So that startups and enterprises can deeply embed powerful capabilities into their applications.”
By treating inference as a foundational building block, AWS wants to enable developers and businesses to innovate without complexity, integrating AI capabilities seamlessly into their application development processes.
The emergence of agents and managing multi-agent systems
Looking ahead, Garman highlighted the next evolution in AI: agents. Agents are autonomous systems capable of performing tasks and executing complex workflows.
“The next step is automating tasks — that’s what agents are all about,” he explained. “We’re at the point where you can have thousands of agents executing tasks, but that gets complicated fast.”
Managing agents at scale introduces new challenges in terms of scalability, resilience and security. AWS is developing frameworks and tools to help customers handle multi-agent systems efficiently.
“We want to make it possible for people to manage agents at scale with frameworks to help them,” Garman said. “Because if you have 1,000 or 10,000 or 100,000 agents out doing things, pretty soon that process gets unmanageable.”
By offering solutions to orchestrate agentic workflows, AWS plans to unlock new levels of automation and efficiency for businesses.
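AWS hasn't detailed those frameworks yet, but a toy sketch illustrates the scaling problem Garman describes: fan out even a thousand agents and you immediately need supervision, concurrency limits and failure handling. Everything below, from the agent stub to the supervisor and its concurrency cap, is a hypothetical illustration rather than an AWS API.

```python
import asyncio

async def run_agent(agent_id: int, task: str) -> str:
    # Stand-in for real agent work: plan, call tools, report a result.
    await asyncio.sleep(0)  # simulates I/O-bound tool or model calls
    return f"agent-{agent_id}: completed {task}"

async def supervisor(tasks: list[str], max_concurrency: int = 100) -> list[str]:
    # A semaphore caps how many agents run at once: one simple answer to
    # "thousands of agents gets unmanageable fast."
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(i: int, task: str) -> str:
        async with sem:
            return await run_agent(i, task)

    return await asyncio.gather(*(bounded(i, t) for i, t in enumerate(tasks)))

if __name__ == "__main__":
    results = asyncio.run(supervisor([f"task-{i}" for i in range(1000)]))
    print(f"{len(results)} agents completed")
```

Even this trivial version needs a throttle; production multi-agent systems add retries, audit trails and permissions on top, which is exactly the management burden AWS says it wants to absorb.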
Embracing serverless paradigms and abstracting complexity
Reflecting on the success of AWS Lambda and the serverless movement, Garman drew parallels to the current trajectory of AI integration.
“Just because it’s a new technology doesn’t mean that customers all of a sudden want to manage infrastructure,” he remarked. “They actually just want to get the capabilities.”
AWS' strategy involves abstracting the underlying complexities of AI services, much as it did with serverless computing. Services such as Amazon Bedrock and SageMaker are designed to provide powerful AI capabilities without requiring customers to manage intricate details.
“As you think about agentic workflows, again, Lambda is a key part of that,” Garman noted. “You can have this bit of compute that helps the models know where they’re going to go.”
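In that spirit, one concrete way Lambda fits into an agentic workflow is as a small, stateless tool that an agent or model invokes mid-task. The handler below is a hypothetical sketch: the event shape, the lookup table and the function's purpose are invented for illustration, not an AWS-prescribed pattern.

```python
import json

# Hypothetical AWS Lambda handler acting as a "tool" an agent can invoke.
# The event shape and the lookup values are invented for illustration.
def handler(event, context):
    ticker = event.get("ticker", "AMZN")
    prices = {"AMZN": 200.00, "MSFT": 420.00}  # stand-in for a real data source
    return {
        "statusCode": 200,
        "body": json.dumps({"ticker": ticker, "price": prices.get(ticker)}),
    }
```

The appeal is the same as in the original serverless pitch: the agent gets "this bit of compute" on demand, and nobody manages a server to provide it.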
Addressing customer challenges and legacy modernization
Despite rapid advancements in generative AI, many customers face the daunting task of modernizing legacy systems. Garman acknowledged this reality and stressed the importance of assisting customers with foundational challenges.
“They want help,” he said candidly. “There’s a ton of technology that’s flying at them.”
AWS recognizes that to fully leverage new technologies like AI, customers must first migrate and modernize existing workloads.
“They also tell us, ‘Don’t forget about the fundamentals of the things that I need to do,'” Garman shared. “We can’t forget some of those fundamentals. We’ve got to get that mainframe moved to the cloud.”
By providing support for legacy modernization, AWS ensures that customers can access new innovations, regardless of where they are on their cloud journey.
Strengthening the partner ecosystem
As innovation accelerates, customers are navigating a flood of new technology, and no single vendor can guide them through it alone. Garman pointed to AWS' partner ecosystem as essential to helping customers adopt these capabilities.
“There is more opportunity for partners than there has ever been,” he asserted. “Our partners are key. Customers need help, and they’re going to need help to make it easier — they’re going to need help to move faster.”
This collaborative approach underscores AWS' commitment to its partner ecosystem, which helps customers move faster and adopt new capabilities more easily.
“We’re not going to get where we need to be, frankly, as a business for AWS and all of our customers, if we can’t have that partner ecosystem help us,” Garman emphasized.
Supporting developers and startups amid industrywide capacity constraints
As innovation accelerates, startups and developers often face resource limitations, particularly in accessing specialized hardware such as graphics processing units. Garman acknowledged these challenges and outlined AWS’ efforts to alleviate them.
“Startups are near and dear to my heart,” he said. “We’re driving as fast as possible to get capacity. We’re adding capacity as fast as humanly possible. We’re adding power as fast as possible, and we’re thinking about how we offer lower-priced offerings.”
The goal of AWS’ investment in expanding capacity is to democratize access to advanced computing resources, enabling startups to innovate without prohibitive costs.
“We’re both thinking a lot about how we make sure that we just have more capacity for everyone,” Garman explained. “But also how do we have lower-cost options so that people have options out there.”
Investing in infrastructure and sustainable energy
Addressing the immense capital expenditures required for AWS’ growth, Garman discussed the company’s approach to infrastructure investment and sustainability.
“It turns out as you’re growing as fast as we are, it means you need to add data centers, you need to have servers,” he stated. “That is the business that we’re in.”
AWS balances aggressive expansion with responsible corporate governance, ensuring that investments align with customer demand and environmental considerations.
“We’re very intentional about how we go spend,” Garman said. “We’re very focused on carbon-zero energy, and I think we spend a ton on adding new carbon-free energy to the grid out there.”
Exploring sustainable energy options, including nuclear power, forms part of AWS’ broader strategy to minimize its environmental footprint.
“Nuclear is a fantastic additional option to that portfolio,” he added.
Encouraging innovation and building
At the heart of AWS’ mission is a desire to empower customers to innovate. Garman’s message for re:Invent attendees encapsulates this ethos.
“We want people to go out there and build,” he declared. “We build a lot of the technologies and we develop the services that we have so that our customers can go build and they can go invent.”
This focus extends beyond technology: Garman invites customers to rethink their business processes and embrace new possibilities.
“Come thinking about your business and thinking about the processes and thinking about how you could do things differently,” Garman advised. “It’s not about making 5% improvements. It really is about making stepwise changes in the capabilities that whole industries are able to accomplish.”
Final thoughts
AWS re:Invent 2024 will be Garman's inaugural keynote as CEO. I'm expecting him to underscore AWS' commitment to its foundational strengths while spotlighting new innovations in gen AI. For enterprises, startups and developers alike, re:Invent is poised to be the moment where cloud meets generative AI and agentic workflows.
“Everybody’s going to need to tune in for re:Invent,” Garman teased. “We’re very excited about many of the announcements.”
For customers, partners, developers and startups, Garman’s message is clear: AWS is committed to providing the tools, support and innovations necessary to drive the next wave of digital transformation.
“Our job is to make that easier for customers,” Garman concluded. “So that when you’re thinking about startups out there or enterprises trying to solve a problem and they’re going and building a new application, it’s just a tool that they can get really powerful and exciting capabilities deeply embedded into their application.”
Photo: SiliconANGLE