UPDATED 13:12 EDT / NOVEMBER 05 2012

NEWS

Is 100 Petabytes Too Much Data? Not for Facebook

How much data is enough? Something like 105 terabytes every 30 minutes, or a total of 100 petabytes? If that sounds like too much, think again: for at least one company, even this gigantic volume is not enough. None other than our favorite social networking site, Facebook, has hit the limits of Hadoop with a cluster whose total volume currently stands at 100 petabytes.

Every day, Facebook receives 2.7 billion Likes, and 2.5 billion content items are shared on the site. It uses Hadoop to power many of its features, such as messaging, as well as to optimize advertising performance and conduct data analysis. With Hadoop-based analysis, it compares the effectiveness of features or advertisements against each other for specific demographics, and uses the results to tweak features and improve targeting.

To handle and analyze this volume of data, Facebook relies on two tools. The first is Hive, an open source project created by Facebook that lets users query Hadoop using a subset of SQL; it is the most widely used access layer within the company. The second is HiPal, the social network’s homegrown, closed-source end-user tool. While Hive gives Facebook its business intelligence capability, HiPal complements it by enabling data discovery, query authoring, charting, and dashboard creation in graphical form.

In short, Facebook, which says it runs the world’s largest Hadoop cluster, has reached the upper limit of raw Hadoop capacity.

To get past this limit, Facebook has started the Prism project. The limitation it targets is that a Hadoop cluster must confine its data to one physical data center location. Prism adds a logical abstraction layer so that a Hadoop cluster can run across multiple data centers, effectively removing the limit on capacity. Facebook says it will open-source Prism soon.
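To make the idea of a logical abstraction layer concrete, here is a minimal toy sketch in Python. It is not Prism's design, which the article does not describe; the class names (`PhysicalCluster`, `LogicalNamespace`), the data center names, and the "most free space" placement rule are all assumptions made purely for illustration. It only shows how a single logical view can sit on top of several clusters that are each confined to one data center.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalCluster:
    """One cluster confined to a single data center (hypothetical model)."""
    data_center: str
    capacity_pb: float
    used_pb: float = 0.0

    @property
    def free_pb(self) -> float:
        return self.capacity_pb - self.used_pb


@dataclass
class LogicalNamespace:
    """Toy abstraction layer: one logical view over several physical clusters."""
    clusters: list
    placement: dict = field(default_factory=dict)  # dataset name -> data center

    def total_capacity_pb(self) -> float:
        # Capacity seen by users is the sum over all data centers.
        return sum(c.capacity_pb for c in self.clusters)

    def store(self, dataset: str, size_pb: float) -> str:
        # Assumed placement rule: pick the cluster with the most free space.
        target = max(self.clusters, key=lambda c: c.free_pb)
        if target.free_pb < size_pb:
            raise RuntimeError(f"no single cluster can hold {dataset}")
        target.used_pb += size_pb
        self.placement[dataset] = target.data_center
        return target.data_center

    def locate(self, dataset: str) -> str:
        # Callers ask by logical name and never need to know the data center.
        return self.placement[dataset]


ns = LogicalNamespace([
    PhysicalCluster("dc-east", capacity_pb=100.0, used_pb=100.0),  # already full
    PhysicalCluster("dc-west", capacity_pb=100.0),
])
print(ns.total_capacity_pb())    # 200.0
print(ns.store("ad_logs", 5.0))  # dc-west
print(ns.locate("ad_logs"))      # dc-west
```

In this sketch, adding capacity means appending another `PhysicalCluster` to the list, while code that calls `store` and `locate` stays unchanged. A real system would also have to deal with replication, cross-data-center bandwidth, and failures, none of which is modeled here.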

After all, Facebook’s data volume is certain to keep growing!

