Hortonworks bids for Spark leadership
Citing enterprise customers’ need for stability and predictability, Hortonworks Inc. said it’s changing the distribution schedule for its core Apache Hadoop platform and extended services. It’s also placing big new bets on the race to improve the performance and scalability of Apache Spark.
Core Apache Hadoop components such as HDFS, MapReduce, YARN and Apache ZooKeeper will now be updated annually and aligned with the schedules of other members of the Open Data Platform (ODP) consortium, of which Hortonworks is a founding member.
Extended services like Spark, Hive, HBase, Ambari and others, which run on top of the core platforms, will be logically grouped together and released continually throughout the year on a schedule that’s roughly in line with that of community development, Hortonworks said. The addition of support for Apache Ambari in the new release of Hortonworks DataFlow 1.2 makes it possible for users to install upgrades at will with minimal cluster downtime.
The latest release of Hortonworks’ Hadoop 2.4 platform is said to be the first to include support of Apache Spark 1.6. The new Hadoop engine is the foundation for “an entirely new way to manage data in motion and data at rest,” said Matt Morgan, vice president of product and alliance marketing. “The integration of both data in motion and data at rest provide the ability to build these modern apps that can manage data-at-rest at scale while capturing information at the jagged edge.”
DataFlow 1.2, which is based on technology Hortonworks acquired with Onyara Inc., is the centerpiece of Hortonworks’ “data in motion” play. Onyara was the commercial developer of the Apache NiFi enterprise integration and dataflow automation tool. NiFi’s large ecosystem of third-party developers encompasses more than 130 processors, including Kafka, Couchbase, Microsoft Azure Event Hub and Splunk.
DataFlow 1.2 adds support for Apache Kafka and Apache Storm, which collectively enable users to collect, analyze and move data from any source to any destination, Hortonworks said. A partnership with Impetus Technologies Inc. adds support for StreamAnalytix, a tool for designing and monitoring pipelines for streaming data. Hortonworks is also releasing a preview version of Apache Zeppelin, which Hortonworks President Herb Cunitz likened to “Tableau for Spark” in a reference to the popular visualization engine from Tableau Software Inc.
Hortonworks hopes its new Spark thrust will position it as a leader in both speed and scalability of analytics, a key capability in the emerging Internet of Things market. “There are lots of companies that can work around Spark but very few who can do Spark at scale,” Cunitz said.
The company also announced an agreement with Hewlett Packard Enterprise Co. under which Hortonworks will work to encourage adoption of HP Enterprise’s memory management and optimization technology by the Apache Spark project. HP Enterprise Chief Technology Officer Martin Fink said his company is seeing a 15-fold Spark performance improvement using Hortonworks’ technologies. “We’re open-sourcing our better memory usage technology and the best place to do that is where Hortonworks operates, which is 100% open and totally collaborative,” he said.
The move is likely a competitive swipe at IBM, which announced a major corporate commitment to Spark last June. Unlike IBM, HP does not have a big footprint in the open source world, and the partnership with Hortonworks is its best bet to gain acceptance of its technology.