

In an e-mail interview this week with Forrester Senior Analyst James Kobielus, I asked about Hadoop’s real-time capabilities. The conversation turned to what he sees as five distinct Hadoop deployment patterns.
It’s a good primer for Hadoop World next week, where SiliconAngle will live stream from theCube.
Here they are:
Leverage Hadoop Proprietary Distros: Use proprietary near-real-time/real-time features of some commercial Hadoop distros (e.g., HStreaming, Outerthought, Hadapt).
Leverage Hadoop Core Sub-Projects: Use HBase as the database/storage layer for near-real-time analysis and Cassandra for real-time requirements beneath your MapReduce modeling/execution abstraction layer.
Leverage Hadoop and Other NoSQL Databases: Supplement and/or replace HBase/Cassandra with Membase, Couchbase, or other real-time and/or in-memory databases under MapReduce.
Leverage Hadoop and Real-Time Features of Commercial Enterprise Databases and Data Warehousing Platforms: Support batch or real-time features of Hadoop (open-source and/or proprietary distros) with change data capture, complex event processing, or other real-time data ingest/processing features of commercial enterprise data warehouse (EDW) platforms such as Teradata, Oracle Exadata, IBM Smart Analytic System, EMC Greenplum Database, and other commercial offerings.
Leverage Hadoop and Stand-Alone Complex Event Processing or Message-Oriented Middleware: Support batch or real-time features of Hadoop (open-source and/or proprietary distros) with complex event processing and/or message-oriented middleware (MOM) from IBM, SAP/Sybase, StreamBase, TIBCO, etc.
One thing Kobielus points out is Hadoop’s immaturity. For example, in his report, “Enterprise Hadoop: The Emerging Core of Big Data,” Kobielus says that among Hadoop specifications, only Cassandra offers transactional functionality to a wider range of enterprise applications above and beyond Hadoop’s core focus on advanced analytics. The proprietary vendors have added features, such as two-phase commit and rollback, to bring online transaction processing functionality to their offerings.
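For readers unfamiliar with the term, two-phase commit can be sketched in a few lines of Python. This is a toy, in-memory illustration of the general protocol only, not how any Hadoop distro or vendor implements it; the `Participant` class and `two_phase_commit` function are hypothetical names invented for this example.

```python
class Participant:
    """Hypothetical resource manager that votes on a change, then applies or undoes it."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: a flag simulating a yes/no vote
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if this participant could make the change durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Return True if every participant committed, False if all were rolled back."""
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: unanimous yes, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2: at least one no vote, so roll back everywhere.
    for p in participants:
        p.rollback()
    return False
```

The point of the two phases is that no participant applies the change until all of them have agreed they can; a single "no" vote rolls everyone back, which is the all-or-nothing behavior online transaction processing depends on.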
Kobielus says vendors are offering their own extensions, such as real-time and high-availability features, to address limitations of the current Apache Hadoop open-source distribution. “The Hadoop community is evolving the core codebase to address these deficiencies, but it may take several years before the open-source distribution becomes a more robust cloud analytics and transaction platform.”
The reality: Hadoop is still very early in its development. But is it too slow? That’s a question we will be asking a lot next week at Hadoop World.