AWS debuts new compute-intensive and AI instances powered by custom chips
Amazon Web Services Inc. today introduced two new sets of cloud instances, one aimed at compute-intensive applications and the other optimized for artificial intelligence training, that are based on custom chips developed in-house by the cloud giant.
AWS’ new Amazon EC2 C7g instances target compute-intensive workloads such as analytics tools and scientific modeling software. The C7g series is based on the AWS Graviton3 chip, the third iteration of the cloud giant’s internally designed processor.
The Amazon.com Inc. unit today also expanded its compute portfolio with the introduction of the Amazon EC2 Trn1 instances for training AI models, which run on another custom chip called AWS Trainium. All the new instances made their debut at the cloud giant’s annual AWS re:Invent event this morning.
AWS has been investing in custom semiconductor development over recent years to provide more silicon options for its customers. The cloud giant also sources chips from multiple external suppliers. As a result, organizations using the Amazon unit’s cloud today have access to instances powered by silicon from Intel Corp., Advanced Micro Devices Inc., Nvidia Corp. and AWS itself.
The C7g instance series runs on one of the latest fruits of AWS’ semiconductor development efforts: the newly detailed Graviton3 processor. AWS estimates that the chip will deliver as much as 25% higher performance than its predecessor, the Graviton2.
AI and security optimizations
Companies that plan to use C7g instances for machine learning workloads can expect up to a threefold jump in AI performance. The speedup is partly the result of the processor’s support for bfloat16, a specialized data format.
The bfloat16 format is used to store floating point numbers, the basic unit of information that AI models and certain other applications work with. Industry adoption of bfloat16 has been growing because the technology enables applications to process floating point numbers faster, and with less memory, than FP32, the data format historically used for the task. Besides AWS, Intel has also implemented the technology in some of its chips.
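The space savings come from bfloat16’s layout: it is essentially a 32-bit IEEE 754 float with the low 16 mantissa bits dropped, so it keeps FP32’s full exponent range while halving storage. A minimal Python sketch of the conversion (simplified truncation; real hardware typically rounds to nearest):

```python
import struct

def fp32_to_bfloat16_bits(x: float) -> int:
    # Pack the value as a 32-bit IEEE 754 float and keep only the top 16 bits.
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits32 >> 16

def bfloat16_to_fp32(bits16: int) -> float:
    # Re-expand: the discarded low 16 mantissa bits simply become zero.
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]

value = 3.14159265
rounded = bfloat16_to_fp32(fp32_to_bfloat16_bits(value))
# rounded is close to value, but with only about 7 bits of mantissa precision
```

Because bfloat16 shares FP32’s 8-bit exponent, converting between the two is just a bit shift, which is part of why the format is cheap to support in hardware.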
For cryptographic tasks, Graviton3 delivers as much as double the performance of the Graviton2. Encrypting and decrypting data can require a significant amount of infrastructure resources because the process involves specialized mathematical operations. The faster a processor can perform cryptographic computations, the less overhead encryption adds to enterprise applications.
Increased performance is not the only selling point of AWS’ new Graviton3-powered EC2 C7g instances. The chip introduces a feature called pointer authentication to reduce the risk of cyberattacks. Some types of malware attempt to carry out cyberattacks by overwriting parts of a cloud instance’s memory with malicious code. AWS’ pointer authentication technology can detect malicious attempts to overwrite memory and block them.
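Conceptually, pointer authentication works by storing a short cryptographic signature in the otherwise-unused upper bits of a 64-bit pointer and verifying it before the pointer is used. The hardware does this with dedicated instructions and keys; the following Python sketch is only a toy illustration of the idea, with the key, signature width and bit layout invented for the demo:

```python
import hmac, hashlib

KEY = b"demo-key-0123456"  # fixed for the demo; real hardware holds a secret per-process key

def sign_pointer(addr: int) -> int:
    # Compute a short MAC (the "pointer authentication code") over the address
    # and pack it into the unused top 16 bits of the 64-bit value.
    mac = hmac.new(KEY, addr.to_bytes(8, "little"), hashlib.sha256).digest()
    pac = int.from_bytes(mac[:2], "little")
    return (pac << 48) | addr  # toy layout: 16-bit PAC above a 48-bit address

def authenticate(signed: int) -> int:
    # Strip the PAC, recompute it, and reject the pointer if the two disagree.
    addr = signed & ((1 << 48) - 1)
    if sign_pointer(addr) != signed:
        raise ValueError("pointer failed authentication")
    return addr
```

Malware that overwrites a signed pointer without knowing the key produces a value whose signature no longer matches, so the check rejects it before the corrupted pointer can be followed.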
C7g instances will be available in multiple configurations, including bare-metal versions. According to AWS, the series is the first in the cloud industry to feature DDR5, the latest generation of memory on the market today. DDR5 offers 50% higher bandwidth than the DDR4 memory used in the current generation of EC2 instances, AWS says.
“Powered by new Graviton3 processors, these instances are going to be a great match for your compute-intensive workloads: HPC, batch processing, electronic design automation (EDA), media encoding, scientific modeling, ad serving, distributed analytics, and CPU-based machine learning inferencing,” AWS Chief Evangelist Jeff Barr wrote in a blog post.
Faster AI training
AWS debuted the C7g series alongside a second, specialized line of cloud instances called Trn1. Instances in the Trn1 series are optimized for the task of training artificial intelligence models. The series runs on a custom chip called Trainium that AWS engineers have developed specifically for AI training use cases.
A single Trn1 instance can be provisioned with as many as 16 Trainium chips. Companies that require additional performance have the option to deploy multiple instances in a cluster. According to AWS, organizations can provision Trn1 clusters with thousands of Trainium chips and connect them using “petabit scale, non-blocking networking.”
Each instance comes with up to 800 gigabits per second of network throughput, twice as much as AWS’ GPU-based instances. The large amount of throughput allows for the rapid transfer of data between different parts of a company’s cloud environment. According to AWS, the Trn1 series’ Trainium chips, high bandwidth and other features allow it to offer the “best price performance for training deep learning models” in the cloud.
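As a back-of-the-envelope illustration of what 800 gigabits per second means in practice (the one-terabyte dataset here is an arbitrary example, not an AWS figure):

```python
# Time to move a hypothetical 1-terabyte training dataset between instances.
dataset_bits = 1e12 * 8          # 1 TB expressed in bits
trn1_gbps = 800                  # Trn1 network throughput per AWS
gpu_instance_gbps = 400          # half of Trn1, per AWS's comparison

trn1_seconds = dataset_bits / (trn1_gbps * 1e9)          # 10 seconds
gpu_seconds = dataset_bits / (gpu_instance_gbps * 1e9)   # 20 seconds
```

For training jobs that repeatedly shuttle large batches of data between chips and instances, halving every such transfer compounds into a meaningful reduction in total training time.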
Training large-scale AI models requires a significant amount of infrastructure. According to an analysis by OpenAI, the amount of compute resources used in the largest AI training runs grew by a factor of about 300,000 between 2012 and 2018. The steady increase in the infrastructure requirements of AI projects means that AWS can target a large and growing market with Trn1.
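A quick calculation on the figures cited above shows why that growth rate is so striking: a 300,000-fold increase over six years implies that compute usage roughly doubled every four months.

```python
import math

growth_factor = 300_000          # compute growth cited by OpenAI's analysis
years = 2018 - 2012

doublings = math.log2(growth_factor)            # ~18.2 doublings over the period
months_per_doubling = years * 12 / doublings    # ~4 months per doubling
```

That pace far outstrips the historical cadence of general-purpose hardware improvement, which is what makes specialized training silicon like Trainium commercially attractive.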
Both the Trn1 series and the Graviton3-powered C7g instances are currently in preview.
Today’s new instance announcements follow Monday’s general-availability launch of another instance line, the Amazon EC2 M6a series. The M6a series is powered by chips from AMD’s latest 3rd Gen EPYC portfolio of central processing units for servers. AWS says that M6a instances are available at a 10% lower cost than comparable instances based on Intel silicon.
Depending on their performance requirements, customers can provision M6a instances with anywhere from two to 192 vCPUs. The maximum amount of memory available for an instance is 768 gibibytes. AWS designed the M6a series for general-purpose workloads such as databases, SAP SE applications and software development environments.