UPDATED 09:00 EDT / MAY 22 2017

BIG DATA

Pentaho bids to bring Apache Spark to the masses

Data integration and analytics supplier Pentaho, a subsidiary of Hitachi Group Co., is throwing its arms around Apache Spark in the new release of its Pentaho Business Analytics product.

Pentaho said that with the 7.1 version, it’s the first data integration provider to offer “adaptive execution” on any engine for big data processing, with Spark the first platform supported. Apache Flink support is coming soon and other platforms are on the way.

This release also expands cloud integration with Microsoft’s Azure HDInsight cloud-based Hadoop offering, enterprise-level security for Hortonworks Inc. environments and improved in-line visualizations.

Pentaho executives positioned the announcement as a salve for the shortage of big data developers that they said is limiting the adoption of Spark. “We see Spark where Hadoop was three to five years ago. In order to work with it you need to be a developer,” said Arik Pelkey, senior director of product marketing at Pentaho.

With the latest revisions, “we’re running our full suite of visual transformation against Spark,” Pelkey said. The company is doing this using what it calls an adaptive execution layer, which automatically maps data integration logic to the execution environment.

In contrast, the company said, other data integrators require users to create Spark-specific data integration logic, which often requires Java programming skills. Pentaho executives said their approach will enable many other execution frameworks to be accommodated in the future. It will also reduce debugging and rework time by guaranteeing compatibility.

“We’re making big-data developers more productive because they now don’t have to regression-test their code to make it work,” Pelkey said. “We’re expanding the range of people who can work with Spark.”

Support for HDInsight largely mirrors the functionality that Pentaho already provides for Amazon Web Services Inc.’s cloud platform. “We’re supporting virtually all the same capabilities that we do with AWS, not just to connect to data but to run big data processing jobs in the cloud and work with a variety of the ecosystem components,” said Ben Hopkins, a senior product manager. The company’s engine enables integration projects to be split between cloud and on-premises data for efficiency and minimal latency. “You can process Salesforce data in the cloud and process SAP data on prem in the same job,” Pelkey said. “You can process that data where it lives.”

The new security features for Hortonworks environments also duplicate existing functionality the company offers for Cloudera Inc. environments. That includes Kerberos impersonation, which protects against cluster intrusions by creating a one-to-one relationship between a user working in the cluster and one working with Pentaho. The company is also adding support for the Apache Ranger Hadoop security framework in Hortonworks environments.

Image: Flickr CC

