Ampere to integrate its CPUs with Qualcomm’s cloud-based AI chips for running large models
Chipmaker Ampere Computing LLC, which makes powerful, Arm-based computer chips for smartphones and data center servers, said today it’s partnering with Qualcomm Inc. on a new joint solution that will enable enhanced artificial intelligence inference at lower costs.
The new partnership was announced today during an annual update that saw Ampere detail its 2024 product roadmap, which is focused on advancing high-performance, power-efficient computing for cloud and AI workloads. Aside from the partnership with Qualcomm, one of the major highlights is the launch of its new AmpereOne platform, with the first product being a 12-channel, 256-core central processing unit built on an advanced N3 process node.
Ampere is an agnostic, Arm-based chip developer that has forged strong partnerships with public cloud infrastructure giants such as Amazon Web Services Inc., Google Cloud and Microsoft Corp., helping them to design their own proprietary data center processors. For instance, Ampere worked with AWS to design that company’s Graviton chips.
The company’s main backer is another cloud infrastructure provider, Oracle Corp., which has invested more than $400 million in the company. Oracle was one of the first cloud companies to adopt Ampere’s Altra central processing units in an effort to increase its competitiveness in the cloud computing industry.
Ampere has seen a lot of success in the cloud, in contrast to other Arm chip manufacturers such as Qualcomm, Marvell Technologies Inc., Advanced Micro Devices Inc. and Samsung Electronics Co. Ltd., which all have so far failed to make much impact in the market for Arm-based data center chips.
Ampere’s Altra CPUs are said to be customized to run real-time AI workloads such as chatbots, data analytics and video content analysis, offering rapid inference capabilities at a fraction of the cost of Nvidia Corp.’s powerful graphics processing units, which power the majority of AI applications today.
In today’s update, Ampere explained that it will work with Qualcomm to create a joint, cloud-based AI processor that combines its most advanced Ampere Altra CPUs with Qualcomm’s low-powered Cloud AI 100 Ultra inference cards for data center servers.
According to Ampere Chief Product Officer Jeff Wittich, the new offering is designed to enable large language model inference for some of the most powerful generative AI models currently available. “We’ve taken our Ampere CPUs and paired them with AI100 Ultra accelerators from Qualcomm,” he told SiliconANGLE in an interview. “They will be deployed via Supermicro servers initially and eventually come to others.”
Wittich explained that AI is driving enormous demand for lower-cost data center processors, as companies are becoming increasingly concerned about the high costs of running workloads on GPUs, which are also becoming more difficult to procure.
“For AI training, most companies have GPUs and that’s fine, but they’re too power-hungry and expensive for inference workloads,” Wittich explained. “A lot of companies are saying they cannot continue to deploy lots of high-powered and expensive GPUs. We can help solve that problem.”
Wittich said CPUs are ideal for many LLMs, with Ampere’s regular chips more than able to cater to smaller models of about 7 billion parameters. And soon, they’ll also be able to handle much bigger LLMs of up to 70 billion parameters. “For those, that’s where Qualcomm’s solutions come in,” he said.
The combined Ampere/Qualcomm offering will be enhanced by the launch of Ampere’s most advanced AmpereOne CPU, which the company claims can deliver 40% greater performance than any CPU on the market today, even without any exotic platform designs. The 192-core chip, which supports 12-channel memory, will go into production soon and is expected to become available later this year.
Ampere said its efforts in the AI industry are being validated by strong customer adoption. One of its newest customers is Meta Platforms Inc., which is now running its Llama 3 LLM on Ampere CPUs in Oracle’s cloud platform. In an update, the company showed data that illustrates how Llama 3 running on a 128-core Ampere Altra chip with no GPU delivers the same performance as an Nvidia A10 GPU paired with an x86 CPU, while using two-thirds less energy.
Last year, Ampere announced that it had become a founding member of the new AI Platform Alliance, which sees a number of chipmakers combine their expertise to develop more sophisticated platforms for AI compute. The Alliance’s latest initiative will see Ampere leverage the open interface technology in its Ampere chips to incorporate rival chipmakers’ intellectual property into future CPUs. This suggests that the collaboration with Qualcomm might just be the first of many more in the coming years.
Ampere Chief Executive Renee James said the increasing power requirements and energy challenges associated with AI mean that the company’s low-power Arm chip designs are becoming more relevant than ever.
“We started down this path six years ago because it is clear it is the right path,” she said. “Low power used to be synonymous with low performance. But Ampere has proven that isn’t true. We have pioneered the efficiency frontier of computing and delivered performance beyond legacy CPUs in an efficient computing envelope.”
James said the use of power-hungry GPUs for running AI workloads is unsustainable in the long term, as the demands of AI models grow exponentially. What’s more, Nvidia is reportedly struggling to keep up with enterprise demand for its specialized processors. “We believe that the future data center infrastructure has to consider how we retrofit existing air-cooled environments with upgraded compute,” she said.
In addition, James stated her belief that the industry needs to build more sustainable data centers that do not put excessive strain on existing energy grids. “That is what we enable at Ampere,” she said.
With reporting from Robert Hof
Image: Ampere