UPDATED 11:00 EST / NOVEMBER 12, 2024

CLOUD

Cast AI introduces AI Enabler and zero-downtime live migration for Kubernetes workloads

Kubernetes operations and cost management startup Cast AI Group Inc. today announced two new capabilities designed to optimize cloud infrastructure costs and streamline workloads.

First up from Cast AI today is AI Enabler, a new tool designed to optimize the deployment of large language models while reducing operational costs. The tool leverages Cast AI's Kubernetes infrastructure capabilities to intelligently route queries to the most efficient LLMs, whether open-source or commercial, ensuring cost-effectiveness without compromising on quality.

AI Enabler addresses the challenge organizations face in selecting best-fit LLMs from an ever-expanding landscape of models. Traditionally, infrastructure teams have relied on manual processes to identify optimal models, often defaulting to expensive options. By automating this selection, AI Enabler reduces the complexity and costs associated with scaling AI operations.

One of the tool’s notable features is an intelligent LLM router that dynamically selects the most cost-efficient model for each query. Combined with detailed cost insights and real-time reporting, the approach allows businesses to manage LLM expenses more effectively, Cast AI says. With the LLM router, users can benchmark and customize configurations in Cast AI's Playground, optimizing performance without the need for code changes.
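Cast AI has not published the router's internals, but the basic idea of cost-aware routing can be sketched in a few lines of Python. In the hypothetical example below, the model catalog, per-token prices and quality scores are all illustrative placeholders, not Cast AI's actual configuration or API.

```python
# Hypothetical sketch of a cost-aware LLM router; model names, prices and
# quality scores are illustrative, not Cast AI's actual routing logic.
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str                  # an open-source or commercial model
    cost_per_1k_tokens: float  # blended input/output price, USD
    quality_score: float       # benchmark score in [0, 1]

CATALOG = [
    ModelOption("open-source-7b", 0.0002, 0.70),
    ModelOption("open-source-70b", 0.0010, 0.85),
    ModelOption("commercial-frontier", 0.0150, 0.95),
]

def route(prompt: str, min_quality: float = 0.75) -> ModelOption:
    """Pick the cheapest model whose quality meets the floor for this query."""
    eligible = [m for m in CATALOG if m.quality_score >= min_quality]
    # Fall back to the highest-quality model if nothing clears the bar.
    if not eligible:
        return max(CATALOG, key=lambda m: m.quality_score)
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)

if __name__ == "__main__":
    choice = route("Summarize today's cluster cost report.")
    print(f"Routing to {choice.name} at ${choice.cost_per_1k_tokens}/1k tokens")
```

In this toy version the quality floor is a fixed parameter; a production router could instead derive it from the query itself or from per-application policies.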

“Our customers have been asking for a way to harness the power of LLMs without the prohibitive costs of the most popular models,” said co-founder and Chief Product Officer Laurent Gil. “With automated model selection and the ability to launch models locally on spot GPUs, we’ve made large-scale LLM deployment feasible for companies who need real-time insights without the high price tag.”

The second release introduces Commercially Supported Container Live Migration to Cast AI's platform, a feature designed to ensure zero-downtime migrations for stateful workloads on Kubernetes.

The new solution addresses the challenge of moving critical applications, such as databases and artificial intelligence and machine learning jobs, without disrupting operations. By automating the migration process, organizations can carry out transitions while maintaining continuous uptime.

The feature is particularly helpful for businesses running resource-intensive applications that cannot afford downtime. Traditionally, migrating such workloads required shutting down services, leading to costly interruptions. With Cast AI's solution, businesses can migrate without interruption and optimize infrastructure usage by consolidating workloads onto fewer, cost-efficient nodes.
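Cast AI has not detailed the underlying mechanism here, but live migration in general rests on checkpointing a workload's in-memory state and restoring it elsewhere instead of restarting it cold. The toy Python sketch below illustrates that checkpoint-and-restore pattern with a pickled counter standing in for real process state; it is a conceptual analogy, not the product's implementation.

```python
# Toy illustration of the checkpoint/restore idea behind live migration.
# Real container live migration snapshots process memory, file descriptors
# and network state; here a pickled dict stands in for that state.
import pickle

class StatefulWorker:
    """Stands in for a stateful workload such as a database shard."""
    def __init__(self, state: dict | None = None):
        self.state = state or {"requests_served": 0}

    def handle_request(self) -> None:
        self.state["requests_served"] += 1

    def checkpoint(self) -> bytes:
        """Serialize in-memory state so it can move with the workload."""
        return pickle.dumps(self.state)

    @classmethod
    def restore(cls, snapshot: bytes) -> "StatefulWorker":
        """Resume on a new 'node' exactly where the old instance left off."""
        return cls(pickle.loads(snapshot))

# "Node A": serve some traffic, then checkpoint instead of shutting down cold.
worker_a = StatefulWorker()
for _ in range(3):
    worker_a.handle_request()
snapshot = worker_a.checkpoint()

# "Node B": restore and keep serving; no requests or state are lost.
worker_b = StatefulWorker.restore(snapshot)
worker_b.handle_request()
print(worker_b.state)  # {'requests_served': 4}
```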

Container Live Migration integrates with Cast AI's existing suite of automation tools, such as Bin-Packing, Cluster Rebalancing and Spot Fallback. These capabilities help reduce resource fragmentation so that applications run efficiently on right-sized nodes, letting users deploy stateful workloads confidently with minimal resource waste and greater cost savings.
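Bin-packing itself is a standard scheduling heuristic: place pods onto as few nodes as possible so under-utilized nodes can be drained away. The sketch below shows a simple first-fit-decreasing variant over CPU requests only; the pod names, requests and node capacity are made up, and a production autoscaler would also weigh memory, affinity rules and disruption budgets.

```python
# Minimal first-fit-decreasing bin-packing sketch on CPU requests only.
# Pod names, requests and node capacity are hypothetical.
def bin_pack(pod_requests_mcpu: dict[str, int], node_capacity_mcpu: int) -> list[dict[str, int]]:
    nodes: list[dict[str, int]] = []  # each node maps pod name -> mCPU request
    # Place the largest pods first to reduce fragmentation.
    for pod, request in sorted(pod_requests_mcpu.items(), key=lambda kv: -kv[1]):
        for node in nodes:
            if sum(node.values()) + request <= node_capacity_mcpu:
                node[pod] = request
                break
        else:
            nodes.append({pod: request})  # open a new node only when needed
    return nodes

pods = {"api": 1500, "worker": 900, "cache": 700, "cron": 300, "sidecar": 200}
for i, node in enumerate(bin_pack(pods, node_capacity_mcpu=2000), start=1):
    print(f"node-{i}: {node} (used {sum(node.values())}m of 2000m)")
```

Run on these made-up numbers, the heuristic packs five pods onto two nodes instead of leaving each on its own machine, which is the kind of consolidation the live migration feature is meant to make safe for stateful workloads.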

Image: SiliconANGLE/Ideogram
