UPDATED 09:00 EST / AUGUST 13 2024

INFRA

Foundry Technologies launches resellable GPU instances for more cost-effective AI

Cloud computing infrastructure startup Foundry Technologies Inc. is looking to disrupt the artificial intelligence compute industry with the launch of Foundry Cloud Platform, a real-time market and orchestration engine that it says provides simpler and more affordable access to graphics processing units.

The company says Foundry Cloud Platform reduces the operational complexity of renting GPU servers while boosting cost efficiency by up to six times, making AI development a viable option for more organizations.

The startup, which was founded by alumni from Google LLC’s DeepMind, explains that the hype around AI has made GPUs one of the world’s most sought-after commodities. These days, almost every enterprise is scrambling to integrate AI into its business processes in one way or another, and that has led to demand for GPUs outstripping supply. As a result, the cost of renting GPUs from traditional cloud infrastructure providers has risen beyond what many companies can reasonably afford.

Access to GPUs is further hindered by industry-standard long-term contracts, Foundry says. The problem with these contracts is that many enterprises will overprovision GPU resources to ensure they have access to AI computing power when they need it. But that means there are thousands of GPUs in the world sitting idle.

Foundry founder and Chief Executive Jared Quincy Davis reckons that the cloud-based GPU compute market has become one of the most inefficient commodity markets in history, and that this inefficiency is holding back innovations that could benefit society.

“The majority of AI research and development teams struggle to access affordable and reliable compute for their workloads, while exceptionally well-funded organizations are forced to purchase long-term GPU reservations that they rarely utilize to maximize capacity,” Davis said. “Foundry Cloud Platform addresses this market failure by aggregating and redistributing idle compute capacity to enable faster breakthroughs while improving return on GPU investments.”

With Foundry Cloud Platform, companies have a reliable way to scale their AI projects while optimizing cost efficiency, the startup says. Foundry aggregates its compute resources into a single, dynamically priced pool and gives customers two ways to access those GPUs.

The first option is to purchase “resellable reserved instances.” Foundry says it provides self-serve access to teams that want to reserve short-term GPU capacity from its pool of virtual machines.

So rather than taking out a fixed, long-term contract as they would do with a traditional cloud infrastructure provider like Amazon Web Services Inc. or Microsoft Azure, customers can reserve interconnected GPU clusters for as little as three hours, the company explained. Once they have paid to reserve this capacity, should they find that they don’t need to use all of the GPUs they have set aside, they will be able to resell the overprovisioned capacity from their reservation, rather than letting it sit idle.

As an example, Foundry said, a customer that reserves 128 Nvidia H100 GPUs and sets aside 16 as healing buffer nodes would then have the option to temporarily relist those nodes on its secondary marketplace. While those GPUs are listed, the customer will generate “credits” until they’re recalled or the initial reservation period ends. Those credits can then be used to reserve instances at a later date.
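The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the article does not say how credits are calculated, so the linear accrual model, the function names, the `credits_per_gpu_hour` rate and the six-hour listing period below are all assumptions rather than Foundry's actual terms. Only the 128-GPU reservation and the 16 buffer nodes come from Foundry's example.

```python
def relisted_fraction(reserved_gpus: int, relisted_gpus: int) -> float:
    """Share of a reservation that has been put back on the secondary market."""
    if reserved_gpus <= 0:
        raise ValueError("reserved_gpus must be positive")
    if not 0 <= relisted_gpus <= reserved_gpus:
        raise ValueError("relisted_gpus must be between 0 and reserved_gpus")
    return relisted_gpus / reserved_gpus


def credits_earned(relisted_gpus: int, hours_listed: float,
                   credits_per_gpu_hour: float) -> float:
    """Credits accrued while GPUs are listed.

    Assumption: credits grow linearly with listed GPU-hours at a flat,
    hypothetical rate. The real accrual rule is not described in the article.
    """
    if hours_listed < 0 or credits_per_gpu_hour < 0:
        raise ValueError("hours_listed and credits_per_gpu_hour must be non-negative")
    return relisted_gpus * hours_listed * credits_per_gpu_hour


# Foundry's example: 128 H100s reserved, 16 buffer nodes relisted.
share = relisted_fraction(reserved_gpus=128, relisted_gpus=16)   # 0.125
# Hypothetical: listed for 6 hours at 1.5 credits per GPU-hour.
credits = credits_earned(relisted_gpus=16, hours_listed=6,
                         credits_per_gpu_hour=1.5)               # 144.0
print(f"{share:.1%} of the reservation relisted, {credits} credits earned")
```

Under these assumed numbers, relisting the buffer nodes puts one-eighth of the reservation back to work instead of leaving it idle.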

Foundry says this kind of scenario is ideal for companies running preplanned workloads such as AI training runs, or day-to-day developer tasks such as debugging and verification.

The second option is what Foundry calls “spot instances,” which are unreserved GPUs plus those relisted by customers who have previously reserved capacity. These resources are made available via an auction, so customers can bid for GPU access for interruption-tolerant workloads such as model inference, hyperparameter tuning and fine-tuning.

The spot instance marketplace uses what Foundry calls “auction theory” to price its reserved and spot instances dynamically based on the current level of supply and demand. If prices become too high, Foundry promises to invest in more GPUs so that it always has enough on hand to satisfy customers’ needs and keep prices stable.
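Foundry has not published the details of its pricing mechanism, but the general idea of auction-based spot pricing can be illustrated with a generic uniform-price auction. In the sketch below, the highest bids are filled first and every winner pays the lowest accepted bid. The `Bid` type, the `clear_spot_auction` function and all of the numbers are hypothetical and are not Foundry's API or algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Bid:
    bidder: str
    gpus: int      # number of GPUs requested
    price: float   # maximum price per GPU-hour the bidder will pay


def clear_spot_auction(bids: List[Bid],
                       supply: int) -> Tuple[Dict[str, int], Optional[float]]:
    """Generic uniform-price auction (an assumption, not Foundry's mechanism).

    Fills bids from the highest price down until supply runs out. The
    clearing price is the lowest bid that received any GPUs, or None if
    nothing was allocated.
    """
    allocations: Dict[str, int] = {}
    clearing_price: Optional[float] = None
    remaining = supply
    for bid in sorted(bids, key=lambda b: b.price, reverse=True):
        if remaining <= 0:
            break
        granted = min(bid.gpus, remaining)
        allocations[bid.bidder] = granted
        remaining -= granted
        clearing_price = bid.price
    return allocations, clearing_price


bids = [Bid("team-a", 8, 3.0), Bid("team-b", 16, 2.5), Bid("team-c", 8, 2.0)]

# Scarce supply: only the two highest bidders get GPUs.
print(clear_spot_auction(bids, supply=16))
# More supply: everyone is served and the clearing price falls.
print(clear_spot_auction(bids, supply=32))
```

With 16 GPUs available, the toy market clears at 2.5 per GPU-hour and the lowest bidder is shut out. With 32, all three bids are filled and the price drops to 2.0, which is the same supply-side lever Foundry describes when it promises to add GPUs if prices climb too high.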

Holger Mueller of Constellation Research Inc. said AI team decision makers will be following Foundry’s launch and progress with interest, as there is a lot of demand for more affordable and flexible GPU access.

“GPU resources are hard to come by and very costly, and so I’m not surprised to see some innovation in this area, with Foundry offering better utilization terms and pricing via its cloud platform,” the analyst said. “The challenge for Foundry is to provide enough capacity at a game-changing cost level. And if it does that, the next question will be, can it force the larger cloud vendors to follow up with a similar model?”

An interesting feature of Foundry’s GPU cloud is that it supports Kubernetes workload orchestration, eliminating the need to schedule AI workloads manually. Instead, users can programmatically add reserved and spot instances to a managed Kubernetes cluster. In this way, Foundry claims, it can scale capacity horizontally to help AI teams optimize price-performance and minimize inference latency during traffic spikes.

Another benefit of Foundry’s platform is that it gives AI teams more room to experiment with different kinds of GPUs. “Because we aren’t locked into a long-term contract, we have the flexibility to experiment with a variety of GPUs and empirically determine how to get the best price-performance for our workload,” said Matt Wheeler, an engineer at the AI startup Infinite Monkey.

Image: Foundry Technologies

