UPDATED 11:00 EST / NOVEMBER 12, 2024

CLOUD

Cast AI introduces AI Enabler and zero-downtime live migration for Kubernetes workloads

Kubernetes operations and cost management startup Cast AI Group Inc. today announced two new capabilities designed to optimize cloud infrastructure costs and streamline workloads.

First up from Cast AI today is AI Enabler, a new tool designed to optimize the deployment of large language models while reducing operational costs. The tool leverages Cast AI's Kubernetes infrastructure capabilities to intelligently route queries to the most efficient LLMs, whether open-source or commercial, ensuring cost-effectiveness without compromising on quality.

AI Enabler addresses the challenge organizations face in selecting best-fit LLMs from an ever-expanding landscape of models. Traditionally, infrastructure teams have relied on manual processes to identify optimal models, often defaulting to expensive options. By automating this selection, AI Enabler reduces the complexity and costs associated with scaling AI operations.

One of the tool’s notable features is an intelligent LLM router that dynamically selects the most cost-efficient model for each query. Combined with detailed cost insights and real-time reporting, the approach allows businesses to manage LLM expenses more effectively, Cast AI says. With the LLM router, users can benchmark and customize configurations in Cast AI's Playground, optimizing performance without the need for code changes.
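Cast AI has not published the router's internals, but the basic idea of cost-aware routing can be sketched in a few lines of Python. In the hypothetical example below, the model catalog, per-token prices and quality scores are all illustrative placeholders, not Cast AI's actual configuration or API.

```python
# Hypothetical sketch of a cost-aware LLM router; model names, prices and
# quality scores are illustrative, not Cast AI's actual routing logic.
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str                  # an open-source or commercial model
    cost_per_1k_tokens: float  # blended input/output price, USD
    quality_score: float       # benchmark score in [0, 1]

CATALOG = [
    ModelOption("open-source-7b", 0.0002, 0.70),
    ModelOption("open-source-70b", 0.0010, 0.85),
    ModelOption("commercial-frontier", 0.0150, 0.95),
]

def route(prompt: str, min_quality: float = 0.75) -> ModelOption:
    """Pick the cheapest model whose quality meets the floor for this query."""
    eligible = [m for m in CATALOG if m.quality_score >= min_quality]
    # Fall back to the highest-quality model if nothing clears the bar.
    if not eligible:
        return max(CATALOG, key=lambda m: m.quality_score)
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)

if __name__ == "__main__":
    choice = route("Summarize today's cluster cost report.")
    print(f"Routing to {choice.name} at ${choice.cost_per_1k_tokens}/1k tokens")
```

In this toy version the quality floor is a fixed parameter; a production router could instead derive it from the query itself or from per-application policies.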

“Our customers have been asking for a way to harness the power of LLMs without the prohibitive costs of the most popular models,” said co-founder and Chief Product Officer Laurent Gil. “With automated model selection and the ability to launch models locally on spot GPUs, we’ve made large-scale LLM deployment feasible for companies who need real-time insights without the high price tag.”

The second release introduces Commercially Supported Container Live Migration to Cast AI's platform, a feature designed to ensure zero-downtime migrations for stateful workloads on Kubernetes.

The new solution addresses the challenge of moving critical applications, such as databases and artificial intelligence and machine learning jobs, without disrupting operations. By automating the migration process, organizations can carry out transitions while maintaining continuous uptime.

The feature is particularly helpful for businesses running resource-intensive applications that cannot afford downtime. Traditionally, migrating such workloads required shutting down services, leading to costly interruptions. With Cast AI's solution, businesses can migrate without interruption and optimize infrastructure usage by consolidating workloads onto fewer, cost-efficient nodes.
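Cast AI has not detailed the underlying mechanism here, but live migration in general rests on checkpointing a workload's in-memory state and restoring it elsewhere instead of restarting it cold. The toy Python sketch below illustrates that checkpoint-and-restore pattern with a pickled counter standing in for real process state; it is a conceptual analogy, not the product's implementation.

```python
# Toy illustration of the checkpoint/restore idea behind live migration.
# Real container live migration snapshots process memory, file descriptors
# and network state; here a pickled dict stands in for that state.
import pickle

class StatefulWorker:
    """Stands in for a stateful workload such as a database shard."""
    def __init__(self, state: dict | None = None):
        self.state = state or {"requests_served": 0}

    def handle_request(self) -> None:
        self.state["requests_served"] += 1

    def checkpoint(self) -> bytes:
        """Serialize in-memory state so it can move with the workload."""
        return pickle.dumps(self.state)

    @classmethod
    def restore(cls, snapshot: bytes) -> "StatefulWorker":
        """Resume on a new 'node' exactly where the old instance left off."""
        return cls(pickle.loads(snapshot))

# "Node A": serve some traffic, then checkpoint instead of shutting down cold.
worker_a = StatefulWorker()
for _ in range(3):
    worker_a.handle_request()
snapshot = worker_a.checkpoint()

# "Node B": restore and keep serving; no requests or state are lost.
worker_b = StatefulWorker.restore(snapshot)
worker_b.handle_request()
print(worker_b.state)  # {'requests_served': 4}
```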

Container Live Migration integrates with Cast AI's existing suite of automation tools, such as Bin-Packing, Cluster Rebalancing and Spot Fallback. These capabilities help reduce resource fragmentation so that applications run efficiently on right-sized nodes, letting users deploy stateful workloads confidently with minimal resource waste and greater cost savings.
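Bin-packing itself is a standard scheduling heuristic: place pods onto as few nodes as possible so under-utilized nodes can be drained away. The sketch below shows a simple first-fit-decreasing variant over CPU requests only; the pod names, requests and node capacity are made up, and a production autoscaler would also weigh memory, affinity rules and disruption budgets.

```python
# Minimal first-fit-decreasing bin-packing sketch on CPU requests only.
# Pod names, requests and node capacity are hypothetical.
def bin_pack(pod_requests_mcpu: dict[str, int], node_capacity_mcpu: int) -> list[dict[str, int]]:
    nodes: list[dict[str, int]] = []  # each node maps pod name -> mCPU request
    # Place the largest pods first to reduce fragmentation.
    for pod, request in sorted(pod_requests_mcpu.items(), key=lambda kv: -kv[1]):
        for node in nodes:
            if sum(node.values()) + request <= node_capacity_mcpu:
                node[pod] = request
                break
        else:
            nodes.append({pod: request})  # open a new node only when needed
    return nodes

pods = {"api": 1500, "worker": 900, "cache": 700, "cron": 300, "sidecar": 200}
for i, node in enumerate(bin_pack(pods, node_capacity_mcpu=2000), start=1):
    print(f"node-{i}: {node} (used {sum(node.values())}m of 2000m)")
```

Run on these made-up numbers, the heuristic packs five pods onto two nodes instead of leaving each on its own machine, which is the kind of consolidation the live migration feature is meant to make safe for stateful workloads.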

Image: SiliconANGLE/Ideogram
