UPDATED 11:00 EST / DECEMBER 02 2025

AI

AWS brings sovereign AI on-prem with new AI Factories alongside Trainium3 and Nvidia GB300 launches

Amazon Web Services Inc. today made a series of artificial intelligence infrastructure announcements spanning sovereign on-premises deployments, next-generation custom AI accelerators and the most advanced Nvidia Corp. GPU instances yet offered on AWS, all part of a push to dominate both cloud and private AI at scale.

The announcements included the launch of AWS AI Factories, the general availability of Amazon EC2 Trn3 UltraServers powered by the new Trainium3 chip and the introduction of P6e-GB300 UltraServers featuring Nvidia’s latest Blackwell-based GB300 NVL72 platform.

Leading the announcements is AWS AI Factories, a new offering that delivers dedicated, full-stack AWS AI infrastructure directly inside customers’ existing data centers.

The platform combines Nvidia accelerated computing, AWS Trainium chips, high-speed low-latency networking, energy-efficient infrastructure and core AWS AI services, including Amazon Bedrock and Amazon SageMaker.

AWS AI Factories are built primarily for governments and regulated industries and operate much like a private AWS Region, providing secure, low-latency access to compute, storage and AI services while ensuring strict data sovereignty and regulatory compliance. With the offering, customers use their own facilities, power and network connectivity, while AWS handles deployment, operations and lifecycle management. AWS says the model compresses deployment timelines that would otherwise take years.

As part of the AI Factories announcement, AWS also highlighted its deepening partnership with Nvidia around the platform, including support for the Grace Blackwell and upcoming Vera Rubin GPU architectures, as well as planned support for Nvidia NVLink Fusion interconnects in Trainium4.

“Large-scale AI requires a full-stack approach — from advanced GPUs and networking to software and services that optimize every layer of the data center,” said Ian Buck, vice president and general manager of Hyperscale and HPC at Nvidia. “Together with AWS, we’re delivering all of this directly into customers’ environments.”

Trainium3 UltraServers

AWS also announced that its Amazon EC2 Trn3 UltraServers, powered by the new three-nanometer Trainium3 AI chip, are now generally available.

Trn3 systems can scale up to 144 Trainium3 chips in a single UltraServer to deliver up to 4.4 times more compute performance, four times greater energy efficiency and nearly four times more memory bandwidth than Trainium2.

The UltraServers are designed for next-generation workloads such as agentic AI, mixture-of-experts models and large-scale reinforcement learning, with AWS-engineered networking that delivers sub-10-microsecond chip-to-chip latency.

In testing using OpenAI Group PBC’s open-weight model GPT-OSS, AWS customers achieved three times higher throughput per chip and four times faster inference response times versus the previous generation. Customers including Anthropic PBC, Karakuri Ltd., Metagenomi Inc., Neto.ai Inc., Ricoh Company Ltd. and Splash Music Inc. are already reporting up to 50% reductions in training and inference costs.

AWS also previewed Trainium4, which is expected to deliver major gains in FP4 and FP8 performance and memory bandwidth.

Nvidia GB300

Rounding out the AI infrastructure announcements, AWS introduced the new P6e-GB300 UltraServers, featuring Nvidia’s GB300 NVL72 platform, which AWS says is the most advanced Nvidia GPU architecture available in Amazon EC2.

The instances deliver the highest GPU memory and compute density on AWS, targeting trillion-parameter AI inference and advanced reasoning models in production.

The P6e-GB300 systems run on the AWS Nitro System and integrate tightly with services such as Amazon Elastic Kubernetes Service, allowing customers to deploy large-scale inference workloads securely and efficiently.
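To illustrate the Kubernetes integration, a cluster operator could pin an inference pod to one of the new GPU nodes with a standard node selector and a GPU resource request. The manifest below is a minimal sketch: the instance-type value and container image are assumed placeholders, not confirmed AWS names, and GPU scheduling assumes the NVIDIA device plugin is installed on the EKS cluster.

```yaml
# Hypothetical sketch: schedule an inference pod onto a P6e-GB300 node in EKS.
# The instance-type value and image below are placeholders; check AWS
# documentation for the actual EC2 instance size names.
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker
spec:
  nodeSelector:
    # Well-known Kubernetes label; the value here is an assumed name.
    node.kubernetes.io/instance-type: p6e-gb300.48xlarge
  containers:
    - name: model-server
      image: my-registry/model-server:latest  # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1  # exposed by the NVIDIA device plugin
```

The `node.kubernetes.io/instance-type` label and the `nvidia.com/gpu` resource name are standard Kubernetes conventions; only the specific instance size is a guess.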

Photo: Robert Hof/SiliconANGLE
