UPDATED 12:04 EDT / JUNE 28 2013

MapR Addresses Many Issues Limiting Hadoop’s Implementation In Production Data Centers | #hadoopsummit

John Furrier and Jeff Kelly, theCUBE co-hosts, sat down with Jack Norris, CMO of MapR, to discuss the adoption of Hadoop, improvements in customers’ performance and the company’s enterprise-grade efforts.

What has changed at this year’s event compared with last year’s is the mentality, Norris notes. The question is no longer “if” enterprises should use Hadoop, but “how.” That, at least, is his impression: enterprises have become very interested in adopting the technology now that they have realized the impact it could have at production scale.

  • MapR + Fusion-io

MapR is future-proofing the Hadoop cluster for organizations that look at new technology and try to figure out what it means for their investment. “The announcement with Fusion-io was that we’re 25x faster on reading HBase applications,” Norris says. “As organizations are deploying Hadoop, and looking at technology changes, they can rest assured that they’ll be able to take advantage of those in a much more aggressive fashion with MapR than with other distributions.”

  • The reality of enterprise-grade

“Enterprise-grade is more the UX than a marketing claim,” Norris explains. “We are talking about the capabilities and features that they’ve grown to expect: the ability to meet full SLA, full HA, recovery from multiple failures, rolling upgrades, data protection with consistent snapshots, business continuity, ability to share a cluster across multiple groups. There’s a host of features that fall under the umbrella of enterprise-grade. And when you move from no support for any of those features to support for a few of them, that’s not going to HA, it’s more like moving to low availability.”

Picking up on the term, Furrier asked Norris to expand on the idea of “low availability.” “If you have an HA solution that can’t recover from multiple failures, that’s downtime. If you have an HBase application that’s running online, and you have data that goes down and it takes 10 to 30 minutes to have the RegionServers recover from another place in the distribution, that’s downtime,” says Norris.

“If you have snapshots that aren’t consistent across the cluster, that doesn’t provide data protection; there’s no point in time recovery for a cluster. So, it’s basically down to interruptions, downtime and the potential of losing data. Our answer is that you need a series of features that are hardened and proven to deliver that.”

MapR prides itself on its extensive recoverability capabilities. “Right now there’s no point-in-time recovery for HDFS and HBase tables,” clarifies Norris. “There’s snapshot support, and snapshots are being compared to copy table.” MapR develops, distributes, and supports a distribution of Apache Hadoop that addresses many of the enterprise quality issues currently limiting its implementation in production data centers. MapR replaces the Hadoop Distributed File System (HDFS) with a file system that eliminates weak points in HDFS but is fully compatible with MapReduce, HDFS and HBase.

As MapR has said before, Apache Hadoop could be made more enterprise-ready by:

  • Eliminating well-known single points of failure (NameNode and JobTracker)
  • Addressing the potential for data loss resulting from data corruption that can be propagated across data copies created by HDFS
  • Providing disaster recovery capabilities through the implementation of remote data mirroring
  • Advancing the manageability of Hadoop clusters by IT administrators who have little expertise with Hadoop going into enterprise data center-level implementations

Organizations take three main areas into account when adopting Hadoop:

  1. Ease of use and administration
  2. Dependability (including full HA)
  3. Performance

“What companies will start to realize is that instead of having large data moving across different silos, they’re going to be faster, more agile, and more competitive by processing data in one place, sending small result sets,” says Norris. That puts the spotlight on which data platforms can support a broad set of applications and offer a broad set of functionality. “We are delivering an enterprise-grade, mission-critical support platform that supports MapReduce, does high-performance, provides NFS, integrates enterprise-grade NoSQL applications, so that the customers can do high-speed, consistent performance and real-time operations in addition to batch, streaming, integrated search,” he concludes.

