Big data market enters high growth phase, intersects with public cloud | #BigDataSV
Last week’s BigData SV/Hadoop-Strata 2016 event presented a market in ferment, with high growth supporting many new startups and attracting big vendors like IBM, Oracle and Hewlett-Packard Enterprise (HPE). TheCUBE, which has covered big data since the days when SiliconANGLE shared office space with big data pioneer Cloudera, Inc., offered three days of live-streamed, in-depth interviews with industry leaders old and new, and identified several important trends and issues. The recordings are available here.
Big data is set for explosive growth. Wikibon estimates that the market totaled $20 billion in 2014 and will top $92 billion by 2026, a compound annual growth rate (CAGR) of 14-16 percent (see video below). This is the start of a market large enough to provide ample opportunities for new entrants for several years and too big for a winner-take-all finish, SiliconANGLE Media Chief Research Officer Peter L. Burris (@PLBurris) said in his presentation at the SiliconANGLE Media BigData SV get-together Wednesday night.
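The arithmetic behind that forecast is straightforward to check. Here is a minimal sketch in Python using the article’s rounded figures (which is why the result lands just below the quoted range):

```python
# Quick check of the compound annual growth rate (CAGR) implied by the
# article's rounded figures: $20 billion in 2014 growing to $92 billion by 2026.
start, end = 20e9, 92e9
years = 2026 - 2014  # 12 years

cagr = (end / start) ** (1.0 / years) - 1
print("Implied CAGR: %.1f%%" % (cagr * 100))  # ~13.6%, near the cited 14-16 percent
```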
Today’s market is highly fragmented, with new startups appearing constantly. The biggest named share belongs to IBM at just 9.3 percent of the total market; SAP is second at 3.9 percent, followed by Oracle at 3.3 percent and HPE at 3.0 percent. The largest slice in the chart Burris displayed in his talk, however, was “all other,” at 67 percent of the market.
The SiliconANGLE/Wikibon team on the ground (SiliconANGLE Media Co-CEO John Furrier (@furrier), Wikibon Big Data Analyst George Gilbert (@ggilbert41), theCUBE Co-Host Jeff Frick (@jefffrick) and Burris) agreed that the entry of big vendors will drive customer innovation by facilitating the organizational and operational changes companies need to make to reorient to digital business and get full value from their big data investments. Startups develop new technologies but lack the clout and resources to drive that kind of cultural change in large businesses.
The complexity issue
On theCUBE, Furrier said the main impediment to growth is the high complexity of the Hadoop technology. A full Hadoop stack requires more than 30 open-source components that are not designed to work together easily, lack a coherent set of open APIs and evolve constantly at high velocity. This issue is driving two important trends.
The first is the fast growth in popularity of Apache Spark over the last year. Spark comes with an integrated stack built on a coherent set of open APIs, making it much easier to install and operate. Where Hadoop was designed for batch processing, Spark is a near-real-time analysis platform, and it is getting enthusiastic support from several important analytics vendors. That makes Spark the leading platform for forward-looking micro-trend analysis and customer support. Its main disadvantage is that it does not persist data: companies still need Hadoop or a high-volume NoSQL database to capture large amounts of data for deep analysis and business compliance. For more on Spark, look here.
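That division of labor is easy to picture in code. Below is a minimal PySpark sketch, with a hypothetical HDFS path and log layout, showing Hadoop/HDFS persisting the raw events while Spark handles the fast in-memory analysis:

```python
# A minimal sketch of the Spark-plus-Hadoop pattern described above:
# HDFS provides durable storage, Spark provides the analysis layer.
# The HDFS path and field layout are hypothetical examples.
from pyspark import SparkContext

sc = SparkContext(appName="ClickstreamTrends")

# Raw events persisted in HDFS (Hadoop as the storage layer).
events = sc.textFile("hdfs:///data/clickstream/2016/03/*")

# Spark parses, filters and aggregates the data in memory.
pages = (events
         .map(lambda line: line.split("\t"))   # hypothetical tab-delimited log
         .filter(lambda fields: len(fields) > 2)
         .map(lambda fields: (fields[2], 1))   # key on the page-URL field
         .reduceByKey(lambda a, b: a + b))

# Top 10 pages by hit count, a simple "micro-trend" style aggregate.
for url, count in pages.takeOrdered(10, key=lambda kv: -kv[1]):
    print(url, count)

sc.stop()
```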
The other trend, which was new to this year’s conference, is that some early adopters are turning to public cloud services to capture and store those high volumes of data. Providers like Amazon Web Services (AWS), Microsoft Azure, Google and IBM SoftLayer offer high-volume storage at very low cost while hiding the complexity of the stack.
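A minimal sketch of what that looks like in practice, assuming AWS credentials are configured and using hypothetical bucket and file names; the point is that landing raw data in a cloud object store is a single API call, with no cluster to assemble:

```python
# Storing raw data in a public cloud object store takes one API call;
# the provider handles durability, replication and scaling behind it.
# Bucket, key and file names are hypothetical.
import boto3

s3 = boto3.client("s3")

# Land a day's raw event log in S3.
s3.upload_file("events-2016-03-30.log.gz",
               "example-datalake-bucket",
               "raw/clickstream/2016/03/30/events.log.gz")
```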
Production systems
Analytics, the team agreed, is the key to realizing business value from big data investments. This conference saw the first strong reports of companies moving from Hadoop trials to full production systems, with a couple of the interviewees reporting that some of their clients already have two or three production big data systems in operation.
The big issue in analytics is the dearth of data scientists and data engineers, and supply is never likely to catch up with demand. Furrier said it reminded him of comments from the early days of the automotive industry that cars would never catch on because there weren’t enough chauffeurs. The answer, which the data analytics vendors discussed at length on theCUBE, is automation. The leading analytics vendors use machine learning to capture the expertise of teams of data scientists in tools that business users can access directly. Those tools guide users to build analytics on the fly that answer business questions, unlocking the huge amounts of new value that big data promises.
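The vendors’ tools are proprietary, but the underlying idea can be sketched with off-the-shelf machine learning libraries: an expert’s search over models and parameters is encoded once, and a business user then supplies only the data. A minimal illustration in Python with scikit-learn (not any vendor’s actual product):

```python
# Sketch of "automation capturing data-science expertise": the expert's
# choice of model and parameter ranges is encoded once in a search; a
# non-expert just provides data and gets back a tuned model.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for a business user's dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# The "captured expertise": which knobs matter and what ranges to try.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [5, 10, None]},
    cv=3,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Cross-validated accuracy: %.3f" % search.best_score_)
```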
Putting a value on data
Another issue is putting a value on the data that companies collect, both in their data warehouses and in new data lakes. Frick and Furrier talked about a Las Vegas casino hotel that recently went through bankruptcy, only to discover afterward that the data it held on its customers was worth $1 billion. Had it known that earlier, it might have avoided the bankruptcy.
Data is the new currency of the digital business era, but translating data into monetary value is not easy. Furrier said the value of data ultimately will not be established until a data marketplace is created. In the meantime, companies need to be conscious that their data has real value, and they need to account for it among their business assets.
Next week theCUBE will travel to Hadoop Summit Dublin to get the European view of big data. Watch two days of live streaming from the show on live.siliconangle.tv Wednesday and Thursday, April 13-14.