UPDATED 16:00 EDT / APRIL 20 2018

BIG DATA

Hortonworks enlists Apache Atlas to follow the trail of data breadcrumbs for GDPR

With the deadline for compliance with the General Data Protection Regulation barely a month away, companies are looking for tools to track and identify any information on individuals in the European Union that could be contained in corporate databases. The law gives those individuals the “right to be forgotten,” but to meet that standard, companies will need to know what to forget.

“If I’m a new customer and I ask to be forgotten, the only way that you can guarantee to forget me is to know where all of my data is,” said Scott Gnau (pictured), chief technology officer of Hortonworks Inc. “We’re creating that metadata, creating that trail of breadcrumbs that lets you piece together what’s there, what’s the relevance of it, and how you might use it for some correlation.”

Gnau spoke with James Kobielus (@jameskobielus), host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, at the DataWorks Summit EU in Berlin, Germany. They discussed the release of a new service to manage information and how Hortonworks is leveraging its tracking tools through the data lifecycle. (* Disclosure below.)

A system for scalable governance

To properly leverage metadata tagging and assist customers with GDPR compliance, Hortonworks recently released Data Steward Studio, a service designed to understand and govern information across data lakes. The new service builds on Hortonworks’ expertise in Apache Atlas, a system for scalable data governance within and outside of the Hadoop stack.

“What we’ve been trying to do is leverage our expertise in metadata management using the Apache Atlas project,” Gnau explained. “It’s all about finding the data that you have, where it is, where it came from, what’s the lineage of it, who had access to it, and what did they do to it. These are all governance kinds of things that are also mandated by laws like GDPR.”

Over the years, Hortonworks has expanded the Hadoop stack with tools to manage streaming data, such as Apache NiFi, Storm and Kafka. “There’s an opportunity to uplevel those services from an overall security and governance perspective,” Gnau said. “The whole idea behind doing that was to expand our footprint so that we would enable our customers to manage their data through its entire lifecycle.”

Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the DataWorks Summit EU. (* Disclosure: Hortonworks Inc. sponsored this segment of theCUBE. Neither Hortonworks nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
