UPDATED 09:00 EDT / MARCH 14 2017

BIG DATA

Pentaho pitches its integration platform as a machine learning aid

Pentaho Corp. is broadening the scope of its orchestration capabilities to include machine learning, saying its toolset can help teams of data scientists, engineers and analysts train, tune, test and deploy predictive models in a fraction of the time typically required.

Pentaho said its combined data integration and analytics platform enables predictive models to be deployed more quickly, regardless of use case or industry, and regardless of whether models are built in R, Python, Scala or Weka. The announcement amounts to a repositioning of the existing Pentaho 7.0 platform for a new audience. “We haven’t really been targeting that community in the past, but it makes sense for us to speak to data scientists,” said Arik Pelkey, senior director of product marketing.

Building predictive machine learning models is a chore because workflows must be defined for every data source and because most models don’t transition smoothly into production, said Wael Elrifai, director of enterprise solutions for Pentaho’s Europe/Middle East/Africa region. “If a train operator wants to predict where failures will occur and has 3,000 sensors generating 4 million data points per second, the data scientists need to write 3,000 workflows,” he said. “We can do all of these at a high level” using drag-and-drop metaphors.

Pentaho says it can bridge the gap between predictive models, which are typically captured in notebooks, and operational data flows. When building in Pentaho, “90 percent of your feature engineering ends up being part of production workflow,” Elrifai said. “Your feature problems are part of your operational model as well.”
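The idea of feature engineering carrying over from model building into production can be illustrated with a minimal sketch. This is not Pentaho's API: the function names, the toy threshold "model" and the sensor readings are all hypothetical, chosen only to show one feature function being reused unchanged by both the training path and the scoring path.

```python
from statistics import mean


def engineer_features(readings):
    """Turn a window of raw sensor readings into features (hypothetical)."""
    return {
        "mean": mean(readings),
        "range": max(readings) - min(readings),
        "last_delta": readings[-1] - readings[-2],
    }


def train_threshold(labelled_windows):
    """Toy 'model': midpoint between the average 'range' feature of
    healthy windows and of windows that preceded a failure."""
    healthy = [engineer_features(w)["range"]
               for w, failed in labelled_windows if not failed]
    failing = [engineer_features(w)["range"]
               for w, failed in labelled_windows if failed]
    return (mean(healthy) + mean(failing)) / 2


def score(readings, threshold):
    """Production path: calls the same engineer_features as training."""
    return engineer_features(readings)["range"] > threshold


# Made-up history: (window of readings, did a failure follow?)
history = [
    ([10.0, 10.5, 10.2], False),
    ([10.1, 10.3, 10.0], False),
    ([10.0, 14.0, 18.0], True),
    ([9.0, 15.0, 17.0], True),
]
threshold = train_threshold(history)

steady_alert = score([10.0, 10.2, 10.1], threshold)
spiking_alert = score([10.0, 13.0, 19.0], threshold)
```

Because `score` and `train_threshold` share `engineer_features`, a change to a feature definition reaches both paths at once, which is the property the quote describes.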

The task of building predictive models is frustrated by silos, which inhibit cross-functional workflow, the company said. Ventana Research Inc. has said that 92 percent of organizations plan to increase their use of predictive analytics, but half have difficulty integrating predictive models into existing architectures.

Pentaho is attacking this problem by making it easier to preserve the work that goes into building models as they transition into operation. Data scientists and engineers can use the platform to blend traditional sources, such as enterprise resource planning and enterprise asset management systems, with unstructured data sources in an automated process that combines data on-boarding, data transformation and data validation.

With integrations for languages such as R and Python, and for machine learning packages including Spark MLlib and Weka, Pentaho said it enables data scientists to train, tune, build and test models faster. Models developed by data scientists can then be embedded directly in a data workflow, thereby leveraging existing data and feature engineering efforts.

Data engineers and scientists can also re-train existing models with new data sets or make feature updates using custom execution steps. Prebuilt workflows can automatically update models and archive existing ones. Enhancements in version 7.0 enable visual debugging of data transformation processes, which can also be applied to machine learning models.
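A retrain-and-archive workflow of the kind described here might look roughly like the following sketch. It does not use any Pentaho component: the file layout, the `model_v<N>.json` naming scheme and the toy mean-based "model" are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from statistics import mean


def train(values):
    """Toy 'model' (hypothetical): the mean of the training values."""
    return {"mean": mean(values), "n": len(values)}


def retrain_and_archive(model_path, archive_dir, new_values):
    """Archive the current model file, if any, then write a retrained one."""
    model_path, archive_dir = Path(model_path), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    if model_path.exists():
        # Next version number = archived models so far + 1.
        version = len(list(archive_dir.glob("model_v*.json"))) + 1
        model_path.rename(archive_dir / f"model_v{version}.json")
    model = train(new_values)
    model_path.write_text(json.dumps(model))
    return model


# Usage: train once, then retrain on a new data set.
with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "model.json"
    archive = Path(tmp) / "archive"
    retrain_and_archive(live, archive, [1.0, 2.0, 3.0])
    retrain_and_archive(live, archive, [4.0, 6.0])
    archived = sorted(p.name for p in archive.iterdir())
    current = json.loads(live.read_text())
```

After the second call, the first model sits in the archive directory and the live file holds the model trained on the new data. A real deployment would also need locking and rollback, which this sketch leaves out.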

Photo: Clever Cogs! via photopin (license)
