New IBM and Linux Foundation toolkit makes AI projects more organized

IBM Corp. has teamed up with the LF AI & Data Foundation, an open-source group operating as part of the Linux Foundation, to launch a new toolkit for managing enterprise machine learning projects.

The toolkit, called Machine Learning eXchange, made its debut today under an open-source license. According to IBM, the software aims to address a common challenge for enterprises developing artificial intelligence applications: duplicate work.

An AI model has numerous individual components. There’s the neural network that forms the basis of the AI model, which developers either obtain from the open-source ecosystem or build from scratch. There’s the training dataset that developers use to hone the neural network’s accuracy and speed. Finally, software teams must create an entire array of auxiliary components and custom scripts to handle the tasks involved in deploying a newly created AI model to production.

At a large company working on multiple AI models at the same time, at least some code components can be reused across projects. In practice, however, that doesn’t always happen because of organizational silos. Developers can find themselves having to reinvent the wheel and develop their own versions of AI components that were already created earlier by a team at a different business unit.

Machine Learning eXchange is designed to reduce duplicate work for organizations adopting AI. According to IBM, the toolkit enables enterprises to set up a central hub for sharing machine learning components across development teams.

A company’s internal Machine Learning eXchange deployment can store AI models and datasets. Additionally, the toolkit lends itself to sharing Jupyter Notebooks, which are interactive documents combining code and notes that developers use to perform tasks such as checking a neural network for bugs.
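The idea of a shared hub can be illustrated with a minimal Python sketch. It is not Machine Learning eXchange's actual API: the Component and Catalog classes, their fields and the team names are all hypothetical, chosen only to show how a lookup before building can prevent duplicate work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """A shareable asset. All fields are illustrative assumptions."""
    name: str
    kind: str   # e.g. "model", "dataset", "notebook" or "pipeline"
    owner: str  # team that published the component


class Catalog:
    """Hypothetical in-memory hub keyed by (kind, name)."""

    def __init__(self):
        self._items = {}

    def publish(self, component):
        key = (component.kind, component.name)
        if key in self._items:
            # Refuse duplicates so teams notice existing work.
            owner = self._items[key].owner
            raise ValueError(f"{component.kind} '{component.name}' already published by {owner}")
        self._items[key] = component

    def find(self, kind, name):
        # Returns None when nothing has been published under this key.
        return self._items.get((kind, name))


catalog = Catalog()
catalog.publish(Component("churn-features", "dataset", "marketing"))

# A second team checks the hub before building its own version.
existing = catalog.find("dataset", "churn-features")
missing = catalog.find("model", "churn-classifier")
```

In this sketch the second team finds the marketing team's dataset and can reuse it, while the model lookup comes back empty, signaling that the component still has to be built.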

Perhaps most important, Machine Learning eXchange provides features that make it possible to reuse AI pipelines across teams. An AI pipeline is the collection of software components responsible for turning an untrained neural network into an enterprise-grade AI capable of running in production. The exact task that each software component performs varies between projects. 

An AI pipeline can be configured to enrich the training dataset used to develop a neural network with information from additional sources. Then, after training is complete, the AI pipeline might run a series of pre-programmed tests to determine if the neural network is producing accurate results. Depending on a project’s requirements, AI pipelines can be used to manage other important tasks as well, which makes them an essential component of machine learning initiatives.
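The enrich, train and test sequence described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not how Machine Learning eXchange or Kubeflow Pipelines define pipelines: the step functions, the dictionary-based state, the one-number "model" and the 0.9 accuracy bar are all invented for the example.

```python
def enrich(state):
    # Add rows from an additional source to the training data.
    return {**state, "train": state["train"] + state["extra"]}


def train(state):
    # Toy stand-in for a neural network: a threshold halfway
    # between the mean of each class.
    neg = [x for x, label in state["train"] if label == 0]
    pos = [x for x, label in state["train"] if label == 1]
    threshold = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2
    return {**state, "threshold": threshold}


def validate(state):
    # Pre-programmed test: accuracy on held-out rows must reach the bar.
    hits = sum(int(x > state["threshold"]) == label for x, label in state["test"])
    accuracy = hits / len(state["test"])
    return {**state, "accuracy": accuracy, "passed": accuracy >= 0.9}


def run_pipeline(steps, state):
    # Each step receives the previous step's output.
    for step in steps:
        state = step(state)
    return state


result = run_pipeline(
    [enrich, train, validate],
    {
        "train": [(1.0, 0), (2.0, 0), (8.0, 1)],
        "extra": [(9.0, 1)],
        "test": [(0.5, 0), (7.0, 1)],
    },
)
```

Because each step is an ordinary function with the same input and output shape, another team could reuse the validate step unchanged in a different pipeline, which is the kind of reuse the toolkit is meant to encourage.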

Centralizing a company’s AI components in a single platform allows developers to find code developed by colleagues and avoid duplicate work. Moreover, IBM says, there’s a second benefit: easier governance. When everything is stored in one place, companies have an easier time ensuring that cybersecurity rules and other policies are being met. Enforcing governance policies consistently is more difficult when AI components are scattered across multiple teams.

Machine Learning eXchange includes several tools from the open-source ecosystem to simplify the logistics of deploying users’ AI models and pipelines. The main focus is making it easier to run machine learning software on Kubernetes. 

Companies using Kubernetes deploy their software not as one monolithic file, as was once standard practice, but as a collection of interconnected software containers. Making the right AI dataset available to the container comes with some technical challenges. Machine Learning eXchange uses an open-source framework called Datashim to ease AI data access tasks, which means developers have to spend less time manually defining configuration settings.

Also included are the Kubeflow Pipelines on Tekton and KFServing open-source tools, IBM says. The two tools simplify several of the key tasks involved in running AI models in production. The result: Software teams can spend more of their time on writing code.

Making AI models easier to build has become a major theme in the enterprise software market amid companies’ increasing adoption of machine learning. DataRobot Inc., a startup with a platform that simplifies many of the tasks involved in AI development, recently raised funding at a $6.3 billion valuation. The major public cloud providers, in turn, have all introduced so-called AutoML offerings that automate previously manual aspects of machine learning projects.

The growing demand for AI tools stems not only from the increase in the number of companies adopting machine learning, but also from the fact that firms with existing AI investments are rolling out the technology to more business units. As more enterprises find themselves in a situation where multiple business units are pursuing separate AI initiatives, priorities such as reducing duplicate work should become a bigger focus. The newly launched Machine Learning eXchange will allow IBM to address that requirement for its largest customers. 

Photo: IBM
