

Cloudera Inc. is stepping up its data streaming game with the integration of the latest release of Apache Kafka into its popular Cloudera Enterprise Hadoop distribution.
For data-driven organizations, it’s no longer enough to simply get Hadoop up and running. Given the enormous amounts of data such organizations generate, they need newer, faster, and more scalable tools to keep up. That’s why so many of them are adopting tools like Apache Spark, which the Spark project says can run some workloads up to 100 times faster than MapReduce. Apache Kafka, too, is all about speeding up the analytics process.
Apache Kafka, first developed at LinkedIn, can be thought of as a kind of “circulatory” system that pumps Big Data throughout an organization. It collects data such as application metrics, user activity, logs, and stock tickers and turns it into a stream of data (like a bloodstream) that’s fed into Spark or other analytics software. The latest release adds critical security features, advancements in multi-tenant operations, and a simplified development experience for Big Data pipelines. Together, these updates enable users to more easily ingest and tap into the value of the growing volumes of data streaming from today’s world of connected devices.
“Secure, reliable pipelines for real-time data has never been more important,” said Charles Zedlewski, vice president, Products at Cloudera. “Our customers in every industry are facing a huge challenge: ingesting huge volumes of data from the growing wave of IoT-connected devices, especially as they’re looking to secure and manage this data as it streams into their enterprise data hub.”
Announcing the new release, Cloudera was keen to talk about how some of its customers have been using Kafka to get more performance out of their existing Hadoop deployments. It cites the experience of conferencing service provider Cisco WebEx, which was able to detect up to 17 times more fraud and saw a significant boost in its customer ratings after implementing Cloudera Enterprise with Kafka. WebEx uses Apache Spark to process streaming data in real time, while Kafka shares that data with its services and fraud teams so they can act on it.
Other examples include healthcare solutions provider Cerner, which claims to have saved “hundreds of patients’ lives” by developing new patient monitoring systems that can detect blood infections requiring immediate treatment. Those systems depend heavily on Kafka to deliver the necessary data where it is needed.
“Now that the latest version of Kafka is integrated directly into Cloudera’s platform, our customers can ensure their data pipelines meet the same stringent security requirements as the rest of their business,” Cloudera’s Zedlewski said. “With added enterprise capabilities such as rolling restarts and industry-leading monitoring and troubleshooting, customers are able to focus on the value these new data sources and applications provide, not on manual administration of the underlying tools.”