Dave Brown talks Trainium2: AWS’ secret weapon for generative AI leadership
Prior to Amazon Web Services Inc.’s re:Invent conference this week, Dave Brown, vice president of compute at AWS, gave me a glimpse into the future of cloud computing.
During an exclusive podcast interview, Brown unveiled the company’s newest silicon innovation — Trainium2 — and delved into how AWS is redefining the infrastructure landscape to meet the burgeoning demands of generative AI.
For years, AWS has been a driving force behind enterprise cloud computing, but as generative AI reshapes industries, the stakes have never been higher. Trainium2 exemplifies the infrastructure innovation that AWS has cultivated, promising a blend of raw performance and price efficiency that Brown believes will revolutionize AI workloads for enterprises of all sizes.
The power of purpose-built silicon
According to Brown, AWS’ foray into custom silicon began with a simple yet powerful question: How can cloud providers maximize performance while controlling costs? Trainium2 is the latest answer. Purpose-built for AI and machine learning workloads, the new chip delivers an impressive fourfold performance improvement over its predecessor, Trainium1. Brown emphasized its importance, stating, “Generative AI is transformative, but for it to scale, price performance must be prioritized.”
Each Trn2 instance boasts 16 Trainium2 chips interconnected via AWS’ proprietary NeuronLink protocol. This configuration allows workloads to utilize high-bandwidth memory and unified memory access across accelerators, enabling large-scale AI models to perform at unprecedented speeds. “This chip is our most advanced yet,” Brown said. “It’s designed to tackle the immense computational requirements of generative AI while keeping costs manageable.”
Early adopters such as Anthropic and Adobe have already integrated Trainium2 into their operations, leveraging its 30% to 40% price-performance advantage over competing accelerators. “When you’re training large language models with thousands of chips, a 40% savings can mean millions of dollars,” Brown noted.
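Brown’s point compounds quickly at scale. As a rough back-of-envelope illustration (all figures below are hypothetical assumptions for the sake of arithmetic, not actual AWS pricing):

```python
# Back-of-envelope illustration of the savings Brown describes.
# Every number here is an assumed figure for illustration only.
chips = 10_000              # "thousands of chips" in a large training run
hours = 24 * 30             # a month-long training job
cost_per_chip_hour = 5.00   # assumed accelerator cost in USD per chip-hour

baseline = chips * hours * cost_per_chip_hour
savings = baseline * 0.40   # the 40% price-performance advantage Brown cites

print(f"Baseline: ${baseline:,.0f}")
print(f"Savings at 40%: ${savings:,.0f}")
```

Under these assumed figures, a single month-long run costs $36 million at baseline, so a 40% advantage saves $14.4 million — squarely in the “millions of dollars” range Brown describes.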
Generative AI meets democratized supercomputing
The AI revolution has created a renaissance in high-performance computing, an area traditionally dominated by elite industries like aerospace and defense. With the benefits of speed and cost-efficiency, AWS is democratizing access to supercomputing resources. According to Brown, a cornerstone of this effort is its Capacity Blocks offering, which allows customers to reserve compute resources for short-term projects. Brown explained, “Instead of committing to hardware for years, enterprises can access cutting-edge chips like Trainium2 for a week or even a single day.”
Capacity Blocks have opened the door for startups and enterprises alike to explore ambitious projects, from indexing vast data lakes to training proprietary models. “What used to take months and millions of dollars is now accessible to companies of all sizes,” Brown said. “That’s the true promise of cloud computing.”
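In practice, a Capacity Block reservation is a short API exchange: search for an offering that matches the instance type, count and duration, then purchase it. The sketch below builds such a request; the instance type and cluster size are illustrative assumptions, and the boto3 calls that would consume it (which require AWS credentials) are shown only in comments:

```python
# Sketch: reserving an EC2 Capacity Block for a week-long Trainium2 job.
# Instance type and count below are illustrative assumptions, not
# recommendations.
offering_request = {
    "InstanceType": "trn2.48xlarge",   # assumed Trn2 instance size
    "InstanceCount": 16,               # a small training cluster
    "CapacityDurationHours": 7 * 24,   # "a week or even a single day"
}

# With AWS credentials configured, the reservation would proceed roughly as:
#   ec2 = boto3.client("ec2")
#   offerings = ec2.describe_capacity_block_offerings(**offering_request)
#   ec2.purchase_capacity_block(
#       CapacityBlockOfferingId=offerings["CapacityBlockOfferings"][0][
#           "CapacityBlockOfferingId"],
#       InstancePlatform="Linux/UNIX",
#   )

print(offering_request["CapacityDurationHours"])  # 168
```

The short, fixed duration is the point: capacity is paid for by the day or week rather than committed for years.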
A new compute stack for AI-driven enterprises
AWS’ layered approach to infrastructure ensures flexibility for diverse customer needs. At the base of the stack, SageMaker simplifies machine learning operations by acting as an orchestrator for compute jobs. Brown described SageMaker as “mission control,” managing node failures and optimizing clusters for training and inference workloads. For developers and enterprises seeking rapid deployment, Bedrock offers an abstraction layer for foundation models such as Meta Platforms Inc.’s Llama and Anthropic PBC’s Claude.
This stack allows AWS to cater to a wide spectrum of use cases. “SageMaker is ideal for those who need granular control, while Bedrock abstracts complexity, letting users focus on innovation rather than infrastructure,” Brown said. “It’s about meeting customers where they are in their AI journey.”
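The abstraction Brown describes is visible in the request shape itself: with Bedrock, the same message structure is sent regardless of which hosted foundation model serves it. The sketch below assembles such a request; the model ID and prompt are illustrative assumptions, and the live boto3 call (which requires AWS credentials) is shown only in comments:

```python
# Sketch of the Bedrock abstraction layer Brown describes: one request
# shape works across hosted foundation models. The model ID and prompt
# are illustrative assumptions.
request = {
    "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed ID
    "messages": [
        {"role": "user", "content": [{"text": "Summarize this report."}]}
    ],
}

# With AWS credentials configured, the call would look roughly like:
#   brt = boto3.client("bedrock-runtime")
#   reply = brt.converse(**request)
#   print(reply["output"]["message"]["content"][0]["text"])

print(request["modelId"])
```

Swapping models means changing the `modelId` string; the surrounding application code is untouched, which is exactly the “focus on innovation rather than infrastructure” trade-off Brown points to.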
The rise of custom silicon and strategic partnerships
AWS’ investment in custom silicon isn’t just a technological differentiator — it’s a strategic necessity. The company’s partnerships with industry leaders such as Nvidia Corp. complement its in-house innovations, creating a versatile ecosystem. Brown highlighted Project Ceiba, a 20,000-GPU cluster built in collaboration with Nvidia. “Our goal is to make AWS the best place to run Nvidia hardware while continuing to innovate with our own silicon,” he said.
AWS’ partnership with Anthropic highlights the transformative potential of Trainium2 infrastructure. Brown revealed that AWS is building a groundbreaking cluster of Trn2 UltraServers for Anthropic, containing hundreds of thousands of Trainium2 chips. According to Brown, this cluster delivers over five times the exaflops of computational power used to train Anthropic’s current generation of AI models. Leveraging AWS’ Elastic Fabric Adapter network, the tightly coupled design ensures unparalleled efficiency and scalability, crucial for training large language models.
“A 40% cost savings on a cluster of this magnitude is incredibly significant,” Brown emphasized. The integration shows how AWS’ next-generation infrastructure, combined with partners like Anthropic, pushes the boundaries of what’s possible in AI development, making breakthroughs more accessible and cost-effective for enterprises globally.
Yet AWS’ commitment to hardware goes beyond collaboration. The Trainium and Graviton chip families illustrate how the company has steadily refined its silicon expertise. Brown traced this evolution back to the company’s 2015 acquisition of Annapurna Labs, calling it “one of the most transformative deals in the industry.”
The future of compute: Tackling complexity at scale
Building and maintaining high-performance compute systems is no small feat. AWS has embraced innovations like water cooling in its data centers to accommodate the thermal demands of modern accelerators. Brown explained, “When chips consume over 1,000 watts per accelerator, traditional air cooling just doesn’t cut it.”
Operational challenges extend beyond cooling. The scale at which AWS operates allows the company to identify and resolve hardware faults that smaller data centers might never encounter. “At our scale, we’re able to fix issues proactively, ensuring stability and performance for our customers,” Brown said.
While generative AI has captured the spotlight, Brown is quick to point out that AWS’ innovation extends across the compute stack. Kubernetes, often described as “the new Linux,” remains a focus, with AWS introducing new features to simplify container orchestration. “Generative AI is exciting, but we’re also pushing the envelope in other areas of infrastructure,” Brown said.
Looking ahead, AWS plans to continue its rapid pace of innovation. Brown hinted at the development of Trainium3, which promises even greater performance gains. “We’re just scratching the surface of what’s possible,” he said. Indeed, today AWS announced that Trainium3 will be coming later next year.
What it means for customers
AWS’ advancements are not just technical achievements. They’re a blueprint for the future of cloud computing. Trainium2, SageMaker, Bedrock and Capacity Blocks collectively lower the barriers to entry for enterprises seeking to harness AI. Brown’s advice to customers is simple: “Get hands on keyboard. Start small, experiment, and scale from there.”
Final thoughts: AWS infrastructure AI evolution
AWS’ compute division is navigating a pivotal moment in the tech industry. With generative AI redefining what’s possible, the company’s investments in custom silicon, scalable infrastructure, and customer-centric solutions give AWS a strong hand in leading the next wave of cloud innovation.
As AWS looks to the future, the focus remains on delivering unparalleled performance at the right price. “We’re running as fast as we can,” Brown said. “The opportunity to innovate for our customers is enormous.”