UPDATED 21:29 EST / FEBRUARY 19 2025

AI

Baseten grabs $75M to crank up high-performance inference for AI workloads

BaseTen Labs Inc., an artificial intelligence startup that’s focused on high-performance inference for large language models and other AI applications, said today it has closed on a $75 million Series C funding round led by IVP and Spark Capital.

The round, which also saw participation from Greylock, Conviction, South Park Commons, 01 Advisors and Lachy Groom, brings the startup’s total amount raised to $135 million.

Baseten has created an AI inference platform that’s used by enterprises to run LLMs either in the cloud or on their own, on-premises infrastructure. For AI applications to scale, they need access to extremely fast and reliable inference, which refers to the process of querying models and computing a response.
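To make the term concrete, inference is essentially a request-response loop against a deployed model: an application sends a prompt, the model runs on GPU-backed infrastructure, and the generated output comes back. The sketch below shows the general shape of such a call; the endpoint URL, header format and response fields are hypothetical placeholders for illustration, not Baseten's actual API.

```python
import os
import requests

# Hypothetical endpoint and credential names, used only to illustrate the
# shape of an inference request; these are not Baseten's real API.
ENDPOINT = "https://example-inference-host/v1/models/llama-3/predict"
API_KEY = os.environ["INFERENCE_API_KEY"]


def run_inference(prompt: str) -> str:
    """Send a prompt to a hosted model and return the generated text."""
    response = requests.post(
        ENDPOINT,
        headers={"Authorization": f"Api-Key {API_KEY}"},
        json={"prompt": prompt, "max_tokens": 256},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["output"]


if __name__ == "__main__":
    print(run_inference("Summarize the benefits of GPU autoscaling."))
```

At scale, an application may issue thousands of these calls per second, which is why latency, throughput and GPU availability become the deciding factors described below.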

Yet for many organizations, this has proven challenging, because the high-performance graphics processing units that power inference are often hard to come by. Even the best-funded companies can struggle to secure these resources, and if they can't find enough, the result is poorly performing applications and, occasionally, outright downtime. On top of that, the GPU shortage can mean paying inflated prices.

Instead of operating its own data centers, Baseten relies on public cloud infrastructure from providers such as Amazon Web Services Inc., Google Cloud and Microsoft Corp. By combining the resources of these cloud platforms, it says, it can provide better access to GPUs. Customers can also run the company's software in their own data centers.

Baseten’s platform provides everything required to get high-performance inference up and running. That includes a vast library of proprietary and open-source models, modern tooling and workflows for deploying, managing, scaling and maintaining LLMs in production, and the multicluster, multicloud infrastructure needed to scale across regions and AI model modalities.
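For a sense of what "deploying, managing and scaling an LLM in production" involves at the code level, the sketch below shows a common model-packaging pattern: a serving platform loads the model weights once per replica at startup, then routes every inference request to a predict method, which lets replicas be autoscaled across clusters and clouds. The class shape, method names and use of the Hugging Face transformers pipeline are illustrative assumptions, not Baseten's documented interface.

```python
# Hypothetical model-packaging sketch for a serving platform. The platform is
# assumed to call load() once per replica and predict() per request; names
# and structure are illustrative, not a specific vendor's required interface.
from transformers import pipeline


class Model:
    def __init__(self):
        self._pipeline = None

    def load(self):
        # Runs once when a replica starts, so weights are pulled onto the GPU
        # a single time rather than on every request.
        self._pipeline = pipeline("text-generation", model="gpt2")

    def predict(self, request: dict) -> dict:
        # Handles one inference request against the already-loaded model.
        outputs = self._pipeline(request["prompt"], max_new_tokens=128)
        return {"output": outputs[0]["generated_text"]}
```

The division between a one-time load step and a per-request predict step is what allows an orchestration layer to add or remove replicas across regions as traffic changes, which is the kind of multicluster scaling the platform advertises.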

It also provides access to applied research, so customers can use the latest techniques and frameworks to squeeze better performance and cost-efficiency out of their AI applications. There are also specialist AI engineers to assist customers in deploying those apps.

In an interview with CNBC, Baseten’s co-founder and Chief Executive Tuhin Srivastava said one of the main advantages his company provides is that it can guarantee access to GPU resources. While many companies do deploy their AI models on their own, most will struggle to get enough GPUs in the right geographical location, he said.

Customers also experience frustration with last-minute warnings that some of the GPUs they’re using will be moved into maintenance mode, meaning they can become unavailable at the drop of a hat. Baseten has enough resources to avoid these kinds of disruptions, Srivastava said.

“In this market your No. 1 differentiation is how fast you can move,” he claimed. “That is the core benefit for our customers.”

Baseten can also save its customers serious money: it claims the average customer sees its inference costs drop by about 40% after adopting its services, on top of better performance.

That might explain how the company has been able to grow its revenue more than sixfold in the last 12 months, though it didn’t provide any concrete sales figures.

It did, however, claim to have more than 100 enterprise customers on its books, including the crowdfunding platform Patreon Inc., the AI startup Writer Inc. and the video editing company Descript Inc.

Srivastava said his customers prioritize the ability to bring high-quality AI products to market quickly, and choose Baseten to ensure that happens. “Speed, reliability and cost-efficiency are non-negotiables, and that’s where we devote 100% of our focus,” he added.

Spark Capital General Partner Will Reed said that if an AI product hasn’t yet experienced problems with inference, it’s because it hasn’t managed to hit any real scale thus far.

“Every successful AI project needs exceptional inference performance, or nobody wants to use it,” he explained. “If you’re betting the future of your product or company on that performance, choosing the right partner is make or break.”

Image: SiliconANGLE/Meta AI
