UPDATED 15:57 EST / JUNE 05 2023


Freed from the corporate data center, cloud-based GPUs come of age

The time has come for running graphics processing unit chips in the cloud.

Virtual machines with attached GPUs are increasingly offered by the major cloud platforms, including Amazon Web Services Inc.’s EC2 P4d instances, Microsoft Corp.’s Azure N series and Google LLC’s Compute Engine. Some of these instances are very high-powered, designed for the most demanding applications such as machine learning and artificial intelligence, as well as other computationally intensive workloads such as 3D rendering and modeling.

And with the recent announcement about a new partnership between Microsoft and CoreWeave Inc., there will be further advances to support even higher-end cloud instances.

CoreWeave is just one of many GPU-centric cloud providers. The field is growing and includes Cirrascale Cloud Services LLC, Lambda Inc., Ace Cloud Hosting Inc., Linode LLC (now owned by Akamai Inc.) and Datacrunch Oy. Indeed, there’s a curated collection of providers with useful cost comparisons for more than a dozen cloud platforms and a wide selection of Nvidia Corp.’s GPU types.

Behind the cloud GPU push

These cloud GPU providers are sitting at the center of several megatrends: the AI and machine learning boom, the commoditization of GPU-aware apps, and the overall push to rent rather than buy expensive computing resources.

Cloud-based GPUs have come of age for several reasons. First, renting rather than owning these expensive pieces of hardware saves on capital costs. Second, they are highly scalable, which plays into the cloud model: GPUs can be added to or subtracted from compute instances with just a few clicks, as in the provisioning sketch below.
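As a rough illustration of that elasticity, here is a minimal sketch of provisioning a GPU instance programmatically with AWS’s boto3 library. The AMI ID is a placeholder, and the example assumes an account already set up with credentials and quota for P4d capacity:

```python
# Minimal sketch: programmatically provisioning a GPU instance on EC2.
# Assumes boto3 is configured with credentials and the account has quota
# for P4d instances; the AMI ID below is a placeholder.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder deep learning AMI
    InstanceType="p4d.24xlarge",       # 8x Nvidia A100 GPUs per instance
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```

Tearing the instance back down is just as quick, which is what makes the rent-rather-than-buy model attractive for bursty GPU workloads.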

One of the reasons for their heavy-duty processing power is that GPUs consist of thousands of processor cores, so they can divide and conquer performance-intensive tasks. The typical CPU, in contrast, has perhaps a dozen or so cores.
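To see the difference in practice, here is a minimal sketch, using PyTorch, that times the same large matrix multiplication on the CPU and on a CUDA-capable GPU, assuming one is present:

```python
# Rough illustration of the divide-and-conquer effect: the same large
# matrix multiplication runs on a handful of CPU cores versus thousands
# of GPU cores (requires PyTorch and a CUDA-capable GPU).
import time
import torch

a = torch.randn(8192, 8192)
b = torch.randn(8192, 8192)

start = time.time()
_ = a @ b                              # CPU: a dozen or so cores
print(f"CPU: {time.time() - start:.2f} s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()           # wait for the copy to finish
    start = time.time()
    _ = a_gpu @ b_gpu                  # GPU: thousands of cores
    torch.cuda.synchronize()           # wait for the kernel to finish
    print(f"GPU: {time.time() - start:.2f} s")
```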

GPUs are also very important in machine learning applications, which typically require parallel processing and moving massive amounts of data around. The two leading machine learning development environments, Google’s TensorFlow and Meta Platforms Inc.’s PyTorch, both support GPU acceleration. GPUs have also become fixtures in the world’s biggest supercomputers, housed at sites such as Lawrence Livermore National Laboratory and Chinese universities and catalogued in the current TOP500 list of the most powerful computers, which shows the growth in GPU popularity.
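As a quick illustration, both frameworks expose the GPU with a one-line check. The snippet below assumes PyTorch and TensorFlow are installed on a machine with a CUDA-capable device:

```python
# Both major frameworks make attached GPUs visible to the developer.
import torch
import tensorflow as tf

# PyTorch: tensors and models are moved to the GPU explicitly.
device = "cuda" if torch.cuda.is_available() else "cpu"
model_input = torch.randn(32, 3, 224, 224).to(device)

# TensorFlow: lists visible GPUs and places operations on them automatically.
print(tf.config.list_physical_devices("GPU"))
```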

Their presence in these high-end machines has motivated application developers to expand their reach. Nvidia has helped that along by working with developers to create applications specific to its GPU line, optimized to take advantage of the additional processors.

The company offers a collection of industry-specific software development kits, covering areas such as large language models, image processing, machine translation, speech recognition and high-performance computing. What about Advanced Micro Devices Inc.’s GPUs? They have been trailing, although GPUEater offers a series of cloud instances running AMD’s Radeon graphics cards.

A few drawbacks

Though impressive, the Nvidia catalog isn’t encyclopedic, and businesses that have developed their own apps might have trouble getting them to exploit the additional GPU horsepower. Another drawback is the difficulty of making a direct cost comparison with on-premises equipment. “Some businesses may assume that GPU cloud computing is always cheaper than on-premise computing, but this is not always the case,” SatoshiSpain SL wrote in a blog post.

Although all the cloud providers offer pricing details on their websites, figuring out a monthly cloud bill in advance isn’t always easy, given that prices are based on the usage of numerous resources, as the back-of-the-envelope sketch below illustrates. And finally, Nvidia sells many different GPU processors, each with its own benefits and features, so developers have to understand what hardware a potential cloud provider offers.
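Here is one way to frame that estimate. All of the rates below are hypothetical placeholders; actual prices vary by provider, region, GPU model and commitment term:

```python
# Back-of-the-envelope monthly estimate for a cloud GPU workload.
# All rates are hypothetical placeholders, not any provider's real pricing.
gpu_hourly_rate = 2.50        # $/hr per GPU instance (placeholder)
hours_per_month = 300         # actual utilization, not 24x7
storage_gb = 500
storage_rate = 0.10           # $/GB-month (placeholder)
egress_gb = 200
egress_rate = 0.09            # $/GB (placeholder)

total = (gpu_hourly_rate * hours_per_month
         + storage_gb * storage_rate
         + egress_gb * egress_rate)
print(f"Estimated monthly bill: ${total:,.2f}")
```

Even a rough model like this makes clear that compute hours are only one line item; storage and data egress can shift the on-premises-versus-cloud comparison.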

Image: JacekAbramowicz/Pixabay
