Collaborating to drive data cataloging | #BigDataNYC
The exponential growth of data in volume and type makes it necessary to provide reference resources for collaboration among enterprise users, and one partnership is taking on the challenge. With plans to release in Q4 a new connectivity layer that catalogs queries from popular compute engines like SparkSQL and IBM Watson DataWorks, Alation, Inc. has caught the eye of Teradata Corp. for a resell partnership.
Stephanie McReynolds, VP of Marketing at Alation, and Mark Shainman, marketing director at Teradata, joined Dave Vellante (@dvellante) and Peter Burris (@plburris), cohosts of theCUBE, from the SiliconANGLE Media team, during BigDataNYC 2016 to discuss their partnership, how Data Catalog works for customers and how to handle big data.
Do you have a data lake or a data swamp?
Vellante brought up the point that while there is much complaining about Hadoop, including its data lake concept, it did get the data to where it needed to be. How companies deal with that data after collecting it is the issue, and that’s where Alation and Teradata come into play.
“Is it a data lake or a data swamp? … Different organizations are [all] at different phases of figuring out the data lake … [but they all] need governance,” said McReynolds. As more users come into the lake, the data, so carefully collected, can become useless if they have no way to see what is already there and what the quality of that information is. So it’s necessary to have “a catalog that reads and interprets data … as we get more people running queries … we need something like a data catalog to see and understand what’s in there,” continued McReynolds.
Presto, an open-source SQL query engine developed by Facebook, was designed and written for interactive analytics; it approaches the speed of commercial data warehouses while scaling to the size of large organizations. “[Presto was built by Facebook], then they open-sourced it. [Teradata] is a major contributor to the code base,” said Shainman. Teradata sees Presto as filling a specific niche: running interactive, low-latency queries against large data sets for many users.
Handling Big Data
The discussion moved to Teradata’s play in Big Data. Vellante asked, “What’s the most important part of your Big Data?”
Shainman answered: “Hadoop and Big Data are all synergistic to the data warehouse … [we realize] that multiple platforms are going to exist in one organization. … We’ve moved away from this silo[ed] set up … Alation brings in the governance and cataloging.”
Watch the complete video interview below, and be sure to check out more of SiliconANGLE and theCUBE’s coverage of BigDataNYC 2016.