UPDATED 12:00 EDT / NOVEMBER 17 2014

Is the future of Hadoop set? Experts talk market gaps, investment opportunities

Hadoop now powers Big Data applications for more than half of Fortune 50 companies and is used by businesses like Yahoo! Inc., IBM, and Facebook, Inc. As the need for Big Data analytics continues to grow, so will the demand for Hadoop. The global Hadoop market is projected to increase from $1.5 billion in 2012 to $8.74 billion in 2016, reaching $50.2 billion by 2020, according to market analysts.

The ecosystem supporting Hadoop continues to evolve, as evidenced during this year’s Hadoop World Conference in New York, where companies like Cray Inc., Tableau Software, Inc., and Revolution Analytics Inc. revealed new Hadoop-based tools. According to a recent survey from research firm Wikibon.org, 65 percent of Big Data practitioners have already shifted resources from the enterprise data warehouse (EDW) to Big Data and Hadoop. An additional 35 percent plan to do so by the end of 2014.

Another interesting indicator of Hadoop’s current influence: Hortonworks Inc., the second-largest Hadoop distributor, just filed for an initial public offering (IPO) scheduled to take place in 2014 or 2015. But is the future of Hadoop set? Here are some insights and forecasts about the future of this open-source technology.

Enterprise experimentation increases for Hadoop

 

Clint Sharp, Director of Product Management, Big Data & Operational Intelligence, Splunk, Inc.

Apache Hadoop is maturing as a loosely coupled stack for inexpensive batch storage. Hadoop can store an abundance of data and has the potential to serve a variety of analytic and data science applications, from e-commerce customer segmentation and A/B testing to fraud detection, machine learning, and medical research.

But trying to explore, analyze, and visualize data in Hadoop has often meant significant work manually writing jobs or setting up predefined schemas, which takes time and keeps vital data out of the reach of business and IT. New tools and analytic platforms, such as Hunk software from Splunk, allow even non-technical users to explore and understand massive data sets. These tools are enabling organizations to start exploring, analyzing, and visualizing unstructured data in Hadoop in hours instead of weeks and months.

Many Fortune 500 enterprise customers are sampling Hadoop distribution vendors before deciding to standardize on one Hadoop distribution. It’s worthwhile from the customer perspective to have multiple options for Hadoop distribution support. Security has been one of the limitations of Hadoop clusters. Kerberos for authentication is worthwhile but by itself is insufficient. The recent security acquisitions by Hadoop-based software providers help make Hadoop a less risky addition to organizations’ enterprise architectures.

Hadoop offers an easy and cheap way to store data. However, this frequently creates issues because the data sets become too big to move, and organizations find it difficult to get analytics out of their Hadoop clusters. Hunk: Splunk Analytics for Hadoop complements alternate approaches using Apache Hive, SQL on Hadoop or in-memory analytics stores by providing exploratory analytics without the requirement to write fixed schemas or move data. It offers a faster, easier way to unlock the value from huge amounts of historical data at rest.

Source: SiliconANGLE Survey

Market missing tools once Hadoop’s installed

 

Sam Grocott, Senior Vice President of Marketing & Product Management, ETD, EMC

We see the conversation going down two different paths: how do you build more value into the Hadoop stack northbound, into more management interfaces and other applications that take advantage of Hadoop data, and southbound, where the storage layer lives? The storage layer is where the EMC business has historically hunted, and that’s where we really see our differentiated opportunity with the entire ecosystem. We think that’s a match made in heaven from both an architecture standpoint and an application focus.

If you look at the amount of data that’s under Hadoop today… some people believe about half of all data stored will be under Hadoop management [in five or six years]. So it’s a huge industry shift going on to leverage data, huge investments within content and storage, able to analyze against that. We want to make sure we have best-of-breed technology, whether it’s a scale-out file architecture or object architectures, converged Hadoop storage or server-side Hadoop storage.

The biggest gap specifically with enterprise buyers is understanding what it takes to manage [Hadoop] over time. There is a huge lack of core expertise in the enterprise space about how to actually take advantage of Hadoop, not just set it up. It’s moved beyond just the technology point; it’s ‘how do I have the soft skills, management skills, and the wisdom to actually pull that information out and use these tools to maximum value?’ What the market is really missing is the post-integration knowledge of how to use these tools.

Source: theCUBE Interview

Where should Big Data investors put their money?

 

Jeff Kelly, Big Data Analyst, Wikibon.org

The enterprise data warehouse has not lived up to its promise of a 360-degree view of the customer, a single version of the truth, ubiquitous business intelligence capabilities for business users (not just analysts and data scientists), and real-time actionable intelligence. So, will Big Data live up to the hype?

When you’ve got a market that’s very heavy on services and hardware, it might not be surprising that today the leaders in terms of revenue in Big Data are IBM, Hewlett-Packard Co., and Dell Inc.

But what’s happening in the “real” Big Data market? Of the total $50 billion market we expect by 2017, between $3 billion and $4 billion is going to be generated through Hadoop and NoSQL software. Granted, this is a small slice of the larger market, but what’s important to remember here is that Hadoop and NoSQL are two of the foundational technologies in the digital fabric. So, this is a critical component of the Big Data market and really where a lot of the innovation is coming from.

Public market potential

So where do investors want to put their money? On the public markets, there aren’t too many options out there. Big Data “pure-plays” include Splunk, Inc., Tableau Software, Inc., and Qlik Technologies, Inc. But what everyone really wants to know is when the Hadoop pure-plays are going public. We think there are only three Hadoop distributions that matter: Cloudera Inc., MapR Technologies, Inc., and Hortonworks. I think MapR and Hortonworks are more likely to go public in the shorter term.

The Big Data market is just the Wild West right now. It’s important to boil it down to two basic but important questions that we consistently get from the Wikibon community. 1. Will anybody make any money in open source Big Data? 2. Will Oracle, SAP SE, Teradata Corp — all the industry heavyweights — swallow up a lot of these Big Data startups that we’re seeing in the ecosystem?

In my opinion, yes, there will be money to be made in open source Big Data, and I do think there will be a billion-dollar Big Data software company. If I had to place my bets, I’d say the most likely candidates are Cloudera or Hortonworks. Regarding the second question, there is going to be consolidation in this market. There are too many vendors for this market to support in the long term.

Biggest Big Data winners

So who are going to be the biggest winners in Big Data? In our opinion, the biggest winners in Big Data are going to be the practitioners. We believe the winners will be the companies that leverage the digital fabric to create new lines of business and new markets and that are really disrupting some of the old-line industries, for example Uber Inc., Netflix Inc., and Intuit Inc.

Source: Big Data NYC analyst presentation for theCUBE
Feature image by Murilo Morais. Photo credits: milos milosevic, CarbonNYC, and nromagna via photopin cc
