The development of distinct, competing camps of vendors is one sure sign that an emerging market is maturing. This is precisely what we’re seeing now in the Big Data market.
This morning Cloudera and MongoDB announced they are forming a “strategic partnership” that will see the vendors even more tightly integrate their two technologies and, in their words, “help enterprises define long-term, successful big data strategies and manage modern data.” The two also said MongoDB Connector for Hadoop is now certified on Cloudera Enterprise 5.
The partnership makes sense on two levels. MongoDB and Hadoop are complementary technologies that are increasingly used in tandem – the former for supporting web-based applications that serve up lots of multi-structured data, the latter for storing and crunching that data for insights into customer behavior and other analysis. Better integration between the two technologies will make it easier for practitioners to close the feedback loop and tie analytic insight to transactional processing in near-real-time.
But taking a step back, this partnership fits into a larger trend we identified earlier this year. As Big Data early adopters look to move proofs-of-concept into production, they need to fit all the puzzle pieces together. That means integrating new approaches, such as NoSQL databases and Hadoop, with one another as well as with existing infrastructure and business process workflows. To expedite this process and fuel the market, Big Data vendors are increasingly forming tight technology integration partnerships with one another and with legacy technology providers to help early adopters move from PoC to production.
One consequence of this trend is the formation of alliances between vendors. Some of these alliances form because of like-minded technology and business model approaches, others as responses to moves from competitors. Most of the time, both factors are at play. What we’re seeing in the Big Data market are the beginnings of these alliances. They are not etched in stone and a lot could change between now and the IPO market, but here’s what the landscape looks like now as I see it:
- Intel-Cloudera-MongoDB: Intel invested an unheard-of $740* million in Cloudera last month, giving it an 18 percent stake in the company. Intel is likely to use its investment and influence to push Cloudera as the de facto Hadoop-based platform for supporting Internet-of-Things/Industrial Internet-based analytics. Intel is also a small investor in MongoDB, and the two companies have worked to optimize the NoSQL database on Intel hardware. Meanwhile, MongoDB and Cloudera have similar approaches to their respective business models – i.e., subscription-based open core plus proprietary software and services.
- Hortonworks-Teradata-Microsoft-SAP: Hortonworks takes a 100 percent open source approach to Hadoop and has, from its inception, focused on core Hadoop and nothing more. This has endeared it to incumbent data warehouse and database vendors, who are keen to integrate with Hadoop rather than be displaced by it. As a result, Hortonworks has been able to establish reseller arrangements and technical integration partnerships with the heavy hitters in the data warehouse market – Teradata, Microsoft, and SAP. Hortonworks is counting on these reseller arrangements to significantly ramp up revenue this year.
- Pivotal: Pivotal is part of the EMC Federation along with EMC itself, VMware and RSA. The story is more complicated than that, however. From a Big Data vendor partnership perspective, Pivotal is largely going it alone, having developed its own Hadoop distribution and related analytic database products and tools (HAWQ/Greenplum and GemFire/GemFire XD). But Pivotal is also co-developing products with customers (such as Pivotal Data Dispatch, developed by NYSE) and with investors. In particular, Pivotal is working closely with General Electric, which invested $105 million for a 10 percent stake in Pivotal, on GE's Industrial Internet platform and tools. Pivotal takes a much more open approach on the PaaS side of its business with Cloud Foundry, collaborating with the likes of IBM, Intel, CenturyLink and others.
- IBM: Similarly, IBM is an entity unto itself on the Big Data front. The company has famously spent close to $20 billion over the last six years acquiring analytics and Big Data-related vendors, and has simultaneously developed a number of Big Data tools internally. These include BLU Acceleration, Watson, InfoSphere Streams and BigInsights (its Hadoop distribution layered with IBM tooling). From a hardware and converged systems perspective, IBM recently introduced its Power8 chips, designed to support scale-out Big Data deployments, and its PureData line includes integrated appliances for Big Data analytics. IBM has also established a number of partnerships with emerging Big Data vendors (such as Datameer, Couchbase and even MongoDB), but nothing I would describe as strategic. IBM very much wants to be a one-stop shop for Big Data.
- HP-MapR: HP’s approach to Big Data was complicated by the Autonomy acquisition debacle, but the company has regained its Big Data footing thanks largely to another, much more successful acquisition – Vertica. The Vertica business continues to show steady growth, and the technology is an important part of HP’s HAVEn reference architecture. HAVEn, introduced last year, integrates Hadoop, Autonomy, Vertica, and ArcSight to enable Big Data application development. HP Vertica also established a technical integration partnership with MapR earlier this year, enabling practitioners to run both technologies on the same cluster of hardware. Like IBM, HP also delivers its Big Data analytics software in appliance form, optimized for HP hardware. But HP is also an important hardware partner to Big Data software providers, including SAP (HANA).
The Big Data market is a fluid one and alliances can change quickly. And the open source nature of Big Data technology means there are a lot of informal integration efforts going on all the time. But with real money changing hands and engineering-level collaboration happening, we are starting to see these alliances solidify. This is largely a good thing for the market, as it gives customers a better frame of reference when evaluating potential vendors. I’ll have more analysis of these emerging alliances in later posts.
* Correction: This post incorrectly stated that Intel invested $900 million in Cloudera. Intel in fact invested $740 million in Cloudera.