UPDATED 17:05 EST / JUNE 14 2012

Facebook’s AvatarNode Solves a Major Hadoop Pain Point With High Availability

Facebook is such a big fan of Hadoop Distributed File System (HDFS) that it stores 100 petabytes of data in a single HDFS filesystem, across 100 clusters – including very probably the largest single cluster of its type.

So it’s no wonder that Facebook’s engineering team went out of its way to solve the single point of failure problem (SPOF) problem that’s characterized Hadoop to date, an approach that Facebook engineer Andrew Ryan explained at this week’s Hadoop Summit (and summarized in a blog post here).

The way a standard HDFS client works is such that all filesystem metadata requests have to be filtered through a single server called the NameNode, with filesystem sending and receiving shunted through a pool of Datanodes. These Datanodes are redundant, and the filesystem can handle the failure of any one. But if that one NameNode goes down, HDFS is functionally out of commission, and no applications connected to it will function properly.

For Facebook, a highly-available NameNode is highly desirable: Ryan estimates that while such a thing is only a preventative measure against 10 percent of unplanned data warehousing outages, it could shave off as much as 50 percent of planned downtime by allowing the team to do hardware and software maintenance on the cluster without having to flip the switch.

Enter Avatarnode, named for the James Cameron movie, providing a manual failover point for HDFS NameNodes. Ryan’s diagram illustrates the basic gist:

 

 

 

 

 

 

 

 

 

 

AvatarNode, which Facebook has open sourced back to the community, provides a two-node NameNode with manual failover. This was achieved by wrapping Facebook in the Apache Zookeeper configuration management tool. AvatarNode is now used in production, handling some of Facebook’s heaviest workloads. You can grab Facebook’s distribution of Hadoop, including AvatarNode, here.

“Moving forward, we’re working to improve Avatarnode further and integrate it with a general high-availability framework that will permit unattended, automated, and safe failover,” as Ryan puts it in that blog entry.

Eliminating NameNode as an SPOF is the goal of many. Cloudera and MapR both include redundant NameNode systems in their Hadoop distributions, and the currently-in-beta Apache Hadoop 2.0 itself also removes NameNode as a single point of failure with a similar manual failover approach.

In the meanwhile, enterprises looking to use Hadoop/HDFS in production might well be able to take advantage of Facebook’s recipe for high availability.


A message from John Furrier, co-founder of SiliconANGLE:

Your vote of support is important to us and it helps us keep the content FREE.

One click below supports our mission to provide free, deep, and relevant content.  

Join our community on YouTube

Join the community that includes more than 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, Dell Technologies founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more luminaries and experts.

“TheCUBE is an important partner to the industry. You guys really are a part of our events and we really appreciate you coming and I know people appreciate the content you create as well” – Andy Jassy

THANK YOU