UPDATED 13:10 EDT / JANUARY 13 2023

HPE acquires AI startup Pachyderm

Hewlett Packard Enterprise Co. is acquiring Pachyderm Inc., a startup with a software platform designed to speed up artificial intelligence projects.

HPE announced the transaction on Thursday. It’s expected to close by the end of the month, after which HPE will integrate Pachyderm’s platform with its AI software portfolio. San Francisco-based Pachyderm previously raised $28.1 million from investors.

Enterprise software teams develop AI models with the help of training datasets. After a new neural network is built, it’s given the task of analyzing a training dataset until it learns to identify patterns of interest in the information. Once the neural network achieves a sufficiently high level of accuracy, it’s deployed in production to process live information. 

The training datasets that engineers use to hone AI models’ accuracy often can’t be processed in their original form. Before putting a training dataset to use, software teams have to filter out any duplicate and erroneous records it may contain. The preparation process often includes other tasks as well, such as converting the information into a form that can be processed using less hardware.
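The filtering step described above can be illustrated with a minimal Python sketch. The record layout (a `text` field and a `label` field) and the validity rule are assumptions made for illustration, not details from Pachyderm or HPE.

```python
from typing import Iterable


def clean_records(records: Iterable[dict]) -> list[dict]:
    """Drop erroneous records and exact duplicates, keeping first occurrences."""
    seen = set()
    cleaned = []
    for record in records:
        # Assumed validity rule: a record needs non-empty text and a label.
        if not record.get("text") or record.get("label") is None:
            continue
        key = (record["text"], record["label"])
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append(record)
    return cleaned


raw = [
    {"text": "good product", "label": 1},
    {"text": "good product", "label": 1},  # duplicate
    {"text": "", "label": 0},              # erroneous: empty text
    {"text": "bad product", "label": 0},
]
cleaned = clean_records(raw)
print(len(raw), "->", len(cleaned))
```

In this sample, two of the four records survive: the duplicate and the record with empty text are dropped.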

The process of preparing AI training datasets is performed with automated workflows known as data pipelines. Pachyderm offers a platform that makes it easier to build them. The platform can run on the major public cloud platforms, as well as companies’ on-premises infrastructure. 

Pachyderm enables developers to write scripts that automate individual data preparation tasks such as duplicate record removal. Developers can then combine those scripts into a data pipeline. The platform runs pipelines on the Kubernetes container orchestration engine, which enables it to add or remove hardware resources automatically according to an AI project’s requirements.
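The idea of combining single-purpose scripts into a pipeline can be sketched in plain Python. This is not Pachyderm’s API; the step functions and the `build_pipeline` helper are hypothetical names, and the sketch runs in one process rather than on Kubernetes.

```python
from functools import reduce
from typing import Callable

Step = Callable[[list[str]], list[str]]


def strip_whitespace(rows: list[str]) -> list[str]:
    """Trim leading and trailing whitespace from every row."""
    return [row.strip() for row in rows]


def drop_empty(rows: list[str]) -> list[str]:
    """Remove rows that are empty strings."""
    return [row for row in rows if row]


def remove_duplicates(rows: list[str]) -> list[str]:
    """Remove repeated rows while preserving first-seen order."""
    return list(dict.fromkeys(rows))


def build_pipeline(*steps: Step) -> Step:
    """Chain steps so that each one receives the previous step's output."""
    def run(rows: list[str]) -> list[str]:
        return reduce(lambda data, step: step(data), steps, rows)
    return run


pipeline = build_pipeline(strip_whitespace, drop_empty, remove_duplicates)
result = pipeline([" a ", "b", "", "a", "b "])
print(result)
```

Keeping each task in its own small function means a step can be reordered, replaced or tested without touching the others.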

The startup says its platform can process terabytes of data or more per AI project. The platform can ingest structured information such as spreadsheets, as well as server logs and other types of files.

Pachyderm creates a record of the changes that data pipelines make to the information they ingest. By evaluating this record, engineers can identify potential technical issues in a pipeline. The company says its platform also provides the ability to reproduce the results of past AI projects, which makes it easier to check their accuracy. 
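One common way to keep such a record is to fingerprint the data before and after every pipeline step. The Python sketch below assumes that approach purely for illustration; it does not describe how Pachyderm implements its tracking, and `fingerprint` and `run_with_lineage` are hypothetical helpers.

```python
import hashlib
import json


def fingerprint(data) -> str:
    """Return a stable SHA-256 hash of JSON-serializable data."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_with_lineage(data, steps):
    """Run steps in order, logging input and output hashes for each one."""
    lineage = []
    for step in steps:
        before = fingerprint(data)
        data = step(data)
        lineage.append(
            {"step": step.__name__, "input": before, "output": fingerprint(data)}
        )
    return data, lineage


def drop_negative(values):
    return [v for v in values if v >= 0]


def scale(values):
    return [v / 10 for v in values]


steps = [drop_negative, scale]
output, lineage = run_with_lineage([5, -1, 20], steps)
rerun_output, rerun_lineage = run_with_lineage([5, -1, 20], steps)

# Identical inputs and steps yield identical records, so a rerun can be verified.
print(lineage == rerun_lineage)
```

If a later run produces a different hash at some step, the mismatch points to the step where the data or the code changed.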

“As AI projects become larger and increasingly involve complex data sets, data scientists will need reproducible AI solutions to efficiently maximize their machine learning initiatives, optimize their infrastructure cost, and ensure data is reliable and safe no matter where they are in their AI journey,” said Justin Hotard, executive vice president and general manager of HPE’s high-performance computing and AI division. 

HPE plans to integrate Pachyderm with its Machine Learning Development System, a software platform for training AI models. The platform is based on technology that HPE obtained through its earlier acquisition of the startup Determined AI.

Photo: HPE
