Hadoop and Splunk Make Hunk: A Cute Name for a Powerful Big Data Tool
Hadoop is one of the most powerful and popular open-source frameworks for giving Big Data applications much-needed storage and access to terabytes of data and more. As we all know, storage and collection are becoming easier, but as a result analysis is becoming more complex. Splunk is already an extremely well-known analysis tool with wide-ranging capabilities, from pulling business intelligence out of logs to visualizing and enhancing security. Splunk does this by tying together data from numerous sources, agnostic to the structure and type of data.
As a result, when Splunk’s representatives introduced me to Hunk (now in beta), it became obvious that Hadoop and Splunk make an excellent pairing. Also, anyone who isn’t on board with a tool named “Hunk” probably doesn’t have a sense of humor about DevOps and big data tools; we have to keep this serious business in perspective somehow.
Hunk uses a familiar Splunk backend to access data stored in Hadoop (and mix in any other data sources), providing a great deal of information filtering at almost the press of a button. Anyone who is already familiar with Splunk’s query language will have virtually no learning curve when plumbing into a Hadoop-based big data store, and those who don’t know the language will probably not find it hard to learn. With Hunk, users can index Hadoop, explore all the data in one place, perform real-time interactive analysis, and produce reports, graphs, and everything else an analyst needs to communicate their findings.
“Hunk is an important addition to the Splunk product portfolio. Our customers love how Splunk software enables them to easily visualize and analyze data, and they asked us if we could help them do the same on the sizeable low-cost data stores they’ve built up in Hadoop. To create it, we extended our technology with a new patent-pending virtual index technology,” said Guido Schroeder, senior vice president of products, Splunk. “Hadoop is a tremendous technology full of potential – if you can get to the data and act on it. We developed Hunk as a standalone software product to help organizations give broader user groups insight into their data assets without custom development, costly data modeling or lengthy batch processing iterations. By providing interactive data exploration, discovery and analytics, Hunk empowers users to derive actionable insights from this raw data in Hadoop.”
Tools like Hunk will help us stop treating data scientists like “data butlers”
As with any emerging science, many different kinds of jobs interact with data science, and the distinctions between them can become extremely blurred. For DevOps teams and IT in general this is even more true, especially as the barriers break down between software engineers, operations engineers, and now even analysts.
To explain how Hunk would change the way Hadoop is used to collect and analyze big data, Clint Sharp, Splunk senior product manager, described how businesses and scientists look at the terabytes or petabytes of data they’re storing and ask: “How can we derive value from this data?” The result is that, instead of doing straight data science (building schema, or routes of analysis), data scientists are becoming gatekeepers for analysis or, as Sharp succinctly put it, “data butlers.”
Hunk helps analysts (one role of data scientists) by allowing them to build business intelligence by essentially thinking out loud. So, how does Hunk become a game changer? It introduces schema on-the-fly (or late-binding schema). To be analyzed, data must be structured somehow, but most Hadoop data is unstructured. Splunk tries to apply structure as late as possible, so the software can pull vast amounts of unstructured data out of Hadoop and transform it on-the-fly, and manpower isn’t needed to organize it beforehand.
Splunk essentially lets a business explore data in its amorphous form and run analytics on it before having to invest heavily in organizing it.
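To make the late-binding idea concrete, here is a minimal sketch in Python. It is not Hunk's implementation or Splunk's query language; the sample log lines, the key=value pattern, and the helper names (extract_fields, count_by) are all assumptions invented for illustration. The point is only that the raw events are stored untouched and the "schema" is a pattern applied at the moment a question is asked.

```python
import re
from collections import Counter

# Hypothetical raw events, kept exactly as written (no schema applied up front).
RAW_EVENTS = [
    "2013-06-01 12:00:01 status=200 path=/index.html bytes=512",
    "2013-06-01 12:00:02 status=404 path=/missing bytes=0",
    "2013-06-01 12:00:03 status=200 path=/about.html bytes=1024",
]

# The "schema" is just a pattern that is applied when a query runs.
KEY_VALUE = re.compile(r"(\w+)=(\S+)")


def extract_fields(line):
    """Parse key=value pairs out of one raw line at read time."""
    return dict(KEY_VALUE.findall(line))


def count_by(events, field):
    """Count events per value of a field, extracting fields on the fly."""
    return Counter(extract_fields(event).get(field) for event in events)


# Two different questions asked of the same unorganized data.
status_counts = count_by(RAW_EVENTS, "status")
total_bytes = sum(int(extract_fields(event).get("bytes", 0)) for event in RAW_EVENTS)

print(status_counts)
print(total_bytes)
```

If a new question comes up tomorrow (say, counting by path), only the query changes; nothing about how the data was stored has to be reorganized first.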
This means that data scientists can get back to doing what they do best: thinking up how to present reports based on the data given to them and developing schema from current and historical data. Developers and engineers, meanwhile, can put their expertise into play by focusing on how information sources are collected, building out and looking back into the data, and supporting what the data scientists need for surveying the results.
Not just a pretty name: Hunk has a lot to look forward to
From the presentation I saw, Hunk provides an extremely powerful and versatile tool to big data professionals as well as neophytes. Splunk has one of the best tool sets that I’ve seen for data analysis and as a system I have seen it moving through every layer of the data ecosystem. That means that as a product, we can expect Hunk to add a lot of value to Hadoop installations.
As more and more data becomes accessible to businesses, systems, and governments, the intensity increases, especially in light of the Internet of Things for consumer applications and the Industrial Internet on the enterprise side. DevOps and IT teams will need to keep up not just with the colossal proliferation of data but also with how to analyze and use it.
Businesses in the “business of data” are looking to pull as much value out of the collected data as possible, and it’s Hunk’s mission to make the value in that data more accessible.