UPDATED 09:00 EDT / SEPTEMBER 18, 2023

AI

Anyscale launches Endpoints, a more cost-effective platform for fine-tuning generative AI models

Anyscale Inc., creator of the open-source distributed computing platform Ray, is launching a new service called Anyscale Endpoints that it says will help developers integrate fast, cost-effective and scalable large language models into their applications.

The service was announced today at the 2023 Ray Summit, the project’s annual user conference, which is targeting generative artificial intelligence developers.

Anyscale’s open-source Python framework Ray is software that’s used to run distributed computing projects powered by multiple cloud servers. It features a universal serverless compute application programming interface and an expanded ecosystem of libraries. Using them together, developers can build scalable applications that run on multicloud platforms without needing to worry about the underlying infrastructure. That’s because Ray eliminates the need for in-house distributed computing expertise.

As for the Anyscale cloud platform, it’s a managed version of Ray that makes the software more accessible. It runs on Amazon Web Services and solves the difficulty of bringing artificial intelligence prototypes built on a laptop to the cloud, where they can be scaled across hundreds of machines.

With the launch of Anyscale Endpoints, developers now have a simple way to build distributed applications that leverage advanced generative AI capabilities through a straightforward application programming interface, much as they can with popular proprietary LLMs such as OpenAI LP’s GPT-4. Previously, developers who wanted that kind of capability with open-source models had little choice but to build it themselves: assembling their own machine learning pipelines, training or fine-tuning models, and then securely deploying them at large scale.

Now, Anyscale said, developers can “seamlessly add LLM superpowers to their distributed applications” without needing to build a custom AI platform. What’s more, Anyscale said, they can do so at much lower cost, with Endpoints said to cost less than half the price of comparable proprietary solutions.
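As a hedged illustration of what "adding LLM superpowers" through an API looks like, the sketch below constructs (but does not send) a request to an OpenAI-style chat-completions endpoint. The base URL, model identifier and token placeholder are assumptions for illustration; the actual values come from the service's documentation.

```python
import json
import urllib.request

API_BASE = "https://api.endpoints.anyscale.com/v1"  # assumed base URL

# OpenAI-style chat-completions payload; the model name is an assumed example.
payload = {
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize Ray in one sentence."},
    ],
}

req = urllib.request.Request(
    f"{API_BASE}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer $ANYSCALE_API_KEY",  # placeholder credential
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send the request; it is omitted here so
# the sketch stays runnable without credentials.
```

The appeal for developers is that this is the entire integration surface: a JSON payload and an API key, rather than a custom model-serving platform.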

Robert Nishihara, co-founder and chief executive at Anyscale, said that infrastructure complexity, compute requirements and costs have limited what developers can do with AI. “With seamless access via a simple API to powerful GPUs at a market-leading price, Endpoints lets developers take advantage of open-source LLMs without the complexity of traditional ML infrastructure,” he said.

Anyscale said LLMs provide strong value to applications thanks to their flexibility. They can be fine-tuned with an organization’s own data to perform very specific tasks, such as acting as a customer service bot or as a knowledge base for internal workers, among many other jobs.
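To make the fine-tuning step concrete: one common way to package an organization's own data is chat-style JSONL, with one training example per line. The field names below follow the widely used OpenAI-compatible convention, and the company name and dialogue are made up; the exact schema a given service expects may differ.

```python
import json

# Hypothetical customer-service training examples in chat-style JSONL format.
examples = [
    {"messages": [
        {"role": "system", "content": "You are Acme Corp's support bot."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant",
         "content": "Open Settings > Security and choose Reset password."},
    ]},
]

# Write one JSON object per line -- the JSONL layout fine-tuning jobs
# typically ingest.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```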

The company said users will be able to run Endpoints on their existing cloud accounts within AWS or Google Cloud, improving security for activities such as fine-tuning. Customers can also use existing security controls and policies. Endpoints integrates with popular Python machine learning libraries and frameworks such as Hugging Face and Weights & Biases.

In addition, users who upgrade from the open-source Ray to the full Anyscale AI Application Platform will get better controls to fully customize LLMs, and the ability to deploy dozens of AI-infused applications on the same infrastructure.

Perhaps the most appealing feature is the price, with Endpoints available at a cost of $1 per million tokens for LLMs such as Llama 2, and even less for some other models. Anyscale said this is less than half the cost of other, proprietary AI systems, making LLMs much more accessible to developers.
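A quick back-of-the-envelope calculation shows what the article's quoted rate of $1 per million tokens means in practice. This is just arithmetic on the published price; actual billing details may vary.

```python
PRICE_PER_MILLION_TOKENS = 1.00  # USD, the article's quoted Llama 2 rate

def cost_usd(tokens: int) -> float:
    """Cost of processing `tokens` tokens at the quoted rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(cost_usd(250_000))     # 0.25  -- a quarter for 250k tokens
print(cost_usd(10_000_000))  # 10.0  -- ten dollars for 10M tokens
```

At that rate, even a workload processing tens of millions of tokens a day stays in the tens of dollars, which is the accessibility argument Anyscale is making.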

Anyscale Endpoints is available today, and the company promised it will continue to evolve the service rapidly.

Back in March, Nishihara appeared on theCUBE, SiliconANGLE’s mobile livestreaming studio, where he talked in depth about the Anyscale platform and its ability to simplify AI workloads and harness machine learning frameworks at scale.

Image: Freepik
