UPDATED 12:27 EDT / NOVEMBER 04 2011


5 Distinct Hadoop Deployment Patterns

In an e-mail interview this week with Forrester Senior Analyst James Kobielus, I asked about Hadoop’s real-time capabilities. The conversation turned to what he sees as five distinct Hadoop deployment patterns.

It’s a good primer for Hadoop World next week, where SiliconANGLE will stream live from theCUBE.

Here they are:

Leverage Proprietary Hadoop Distros: Use the proprietary near-real-time/real-time features of some commercial Hadoop distros (e.g., HStreaming, Outerthought, Hadapt).

Leverage Hadoop Core Sub-Projects: Use HBase as the database/storage layer for near-real-time analysis and Cassandra for real-time requirements beneath your MapReduce modeling/execution abstraction layer.

Leverage Hadoop and Other NoSQL Databases: Supplement and/or replace HBase/Cassandra with Membase, Couchbase, or other real-time and/or in-memory databases under MapReduce.

Leverage Hadoop and Real-Time Features of Commercial Enterprise Databases and Data Warehousing Platforms: Support batch or real-time features of Hadoop (open-source and/or proprietary distros) with change data capture, complex event processing, or other real-time data ingest/processing features of commercial enterprise data warehouse (EDW) platforms such as Teradata, Oracle Exadata, IBM Smart Analytic System, EMC Greenplum Database and other commercial offerings.

Leverage Hadoop and Stand-Alone Complex Event Processing or Message-Oriented Middleware: Support batch or real-time features of Hadoop (open-source and/or proprietary distros) with complex event processing and/or message-oriented middleware (MOM) from IBM, SAP/Sybase, StreamBase, TIBCO, etc.

Services Angle

One thing Kobielus points out is Hadoop’s immaturity. For example, in his report, Enterprise Hadoop: The Emerging Core of Big Data, Kobielus says that among Hadoop specifications, only Cassandra offers transactional functionality that reaches a wider range of enterprise applications beyond Hadoop’s core focus on advanced analytics. The proprietary vendors have added online transaction processing features, such as two-phase commit and rollback, to their offerings.

Kobielus says vendors are offering their own extensions, such as real-time processing and high availability, to address limitations of the current Apache Hadoop open-source distribution. “The Hadoop community is evolving the core codebase to address these deficiencies, but it may take several years before the open-source distribution becomes a more robust cloud analytics and transaction platform.”

The reality: Hadoop is still very early in its development. But is it too slow? That’s a question we will be asking a lot next week at Hadoop World.

