UPDATED 15:16 EDT / DECEMBER 16 2021

OctoML introduces ultra-efficient AI models in latest platform release

OctoML Inc. today introduced a new release of its artificial intelligence platform that includes a collection of highly efficient neural networks.

The neural networks are optimized versions of popular open-source AI models that OctoML has fine-tuned. According to the startup, the optimized versions cost less to run than the original AI models and require less power as well.

The new platform release also introduces other improvements, including support for more machine learning development tools.

Seattle-based OctoML launched in 2019 and is backed by more than $130 million in funding. It provides an AI platform that reduces the amount of manual work involved in enterprise machine learning projects. The platform focuses on streamlining two tasks in particular: optimizing AI models and deploying them.

AI model optimization is the process in which a company’s engineers tweak the configuration of a neural network to increase its power efficiency, accuracy, speed or some combination thereof. The task can take weeks in some cases. OctoML’s platform streamlines the task by automatically finding ways of optimizing a neural network’s efficiency.

Deploying AI models once development is complete represents another challenge. Companies have to ensure their neural networks are optimized for the infrastructure on which they run. Limiting infrastructure costs is another priority. OctoML’s platform promises to help with those tasks as well.

The new platform release that the startup introduced today includes optimized versions of several popular AI models. The optimized versions cost half as much to run as the original AI models and use 50% less power, the startup says. As far as performance is concerned, OctoML is promising an average speedup of nearly 300% when using graphics processing units.

The new AI models added to OctoML’s platform target computer vision and natural language processing use cases. The collection includes an optimized version of OpenAI’s GPT-2, the predecessor to GPT-3. Also on the list: BERT, a natural language processing model that Google LLC uses in its search engine.

OctoML enables companies to deploy models they optimize using its platform on multiple types of infrastructure. For enterprises planning to run their software in the cloud, the new platform release adds the ability to deploy AI models to Microsoft Corp.’s Azure platform. The Azure compatibility complements the existing support that OctoML provides for Amazon Web Services and Google Cloud.

To simplify the deployment of AI models on edge devices, OctoML has added support for two additional Nvidia Corp. modules. Customers can now deploy their models on Nvidia’s Jetson AGX Xavier and Jetson Xavier NX. The former is commonly used in robots, while the latter targets systems such as medical devices that process large amounts of sensory data.

“Enterprises today face significant challenges with scaling the deployment of their trained models,” said OctoML Chief Executive Officer Luis Ceze. “This is because model performance tuning and optimization is largely done manually. Also, models, software platforms, and inference targets are rapidly evolving, requiring highly skilled resources on an ongoing basis. This latest iteration breaks these bottlenecks, making machine learning economically viable and enabling faster innovation.”

AI developers have an array of open-source tools at their disposal for building neural networks. Often, different companies use different sets of tools to build machine learning software. It’s important for AI software providers to support the open-source technologies that their customers use in their AI projects.

With today’s platform release, OctoML is adding support for a popular open-source AI tool called ONNX. The technology can package a neural network into a portable format that is compatible with many different software components. ONNX’s portability allows developers to easily mix and match the software components they use for an AI project while avoiding compatibility issues.

OctoML’s platform now also works with TensorFlow Lite, a specialized version of Google’s popular TensorFlow AI development framework that is designed for building neural networks that run on connected devices.

Photo: OctoML

