UPDATED 17:05 EST / JUNE 14 2012

Facebook’s AvatarNode Solves a Major Hadoop Pain Point With High Availability

Facebook is such a big fan of Hadoop Distributed File System (HDFS) that it stores 100 petabytes of data in a single HDFS filesystem, across 100 clusters – including very probably the largest single cluster of its type.

So it’s no wonder that Facebook’s engineering team went out of its way to solve the single point of failure problem (SPOF) problem that’s characterized Hadoop to date, an approach that Facebook engineer Andrew Ryan explained at this week’s Hadoop Summit (and summarized in a blog post here).

The way a standard HDFS client works is such that all filesystem metadata requests have to be filtered through a single server called the NameNode, with filesystem sending and receiving shunted through a pool of Datanodes. These Datanodes are redundant, and the filesystem can handle the failure of any one. But if that one NameNode goes down, HDFS is functionally out of commission, and no applications connected to it will function properly.

For Facebook, a highly-available NameNode is highly desirable: Ryan estimates that while such a thing is only a preventative measure against 10 percent of unplanned data warehousing outages, it could shave off as much as 50 percent of planned downtime by allowing the team to do hardware and software maintenance on the cluster without having to flip the switch.

Enter Avatarnode, named for the James Cameron movie, providing a manual failover point for HDFS NameNodes. Ryan’s diagram illustrates the basic gist:

 

 

 

 

 

 

 

 

 

 

AvatarNode, which Facebook has open sourced back to the community, provides a two-node NameNode with manual failover. This was achieved by wrapping Facebook in the Apache Zookeeper configuration management tool. AvatarNode is now used in production, handling some of Facebook’s heaviest workloads. You can grab Facebook’s distribution of Hadoop, including AvatarNode, here.

“Moving forward, we’re working to improve Avatarnode further and integrate it with a general high-availability framework that will permit unattended, automated, and safe failover,” as Ryan puts it in that blog entry.

Eliminating NameNode as an SPOF is the goal of many. Cloudera and MapR both include redundant NameNode systems in their Hadoop distributions, and the currently-in-beta Apache Hadoop 2.0 itself also removes NameNode as a single point of failure with a similar manual failover approach.

In the meanwhile, enterprises looking to use Hadoop/HDFS in production might well be able to take advantage of Facebook’s recipe for high availability.


A message from John Furrier, co-founder of SiliconANGLE:

Support our mission to keep content open and free by engaging with theCUBE community. Join theCUBE’s Alumni Trust Network, where technology leaders connect, share intelligence and create opportunities.

  • 15M+ viewers of theCUBE videos, powering conversations across AI, cloud, cybersecurity and more
  • 11.4k+ theCUBE alumni — Connect with more than 11,400 tech and business leaders shaping the future through a unique trusted-based network.
About SiliconANGLE Media
SiliconANGLE Media is a recognized leader in digital media innovation, uniting breakthrough technology, strategic insights and real-time audience engagement. As the parent company of SiliconANGLE, theCUBE Network, theCUBE Research, CUBE365, theCUBE AI and theCUBE SuperStudios — with flagship locations in Silicon Valley and the New York Stock Exchange — SiliconANGLE Media operates at the intersection of media, technology and AI.

Founded by tech visionaries John Furrier and Dave Vellante, SiliconANGLE Media has built a dynamic ecosystem of industry-leading digital media brands that reach 15+ million elite tech professionals. Our new proprietary theCUBE AI Video Cloud is breaking ground in audience interaction, leveraging theCUBEai.com neural network to help technology companies make data-driven decisions and stay at the forefront of industry conversations.