UPDATED 14:04 EDT / APRIL 13 2016

The hidden Facebook: Panel sheds light on News Feed design | #F82016

Facebook: the social network so big and widely used, you probably have it open in another tab while reading this article. Every day, countless activities around the world go through Facebook, from sharing a piece of breaking news to liking a picture of someone’s dog. (I myself have liked more than a fair share of dog pictures.)

But what supports such a massive social network? It’s easy to ignore the multitude of computations applied every moment someone posts a comment, but for those who work at Facebook, the website’s infrastructure is their bread and butter.

At F8, the Facebook Developers Conference, a two-part panel examined the guts of social networking with “Inside Facebook’s Infrastructure: The System That Serves Billions.”

What Goes Into Every Interaction

Domas Mituzas, a Production Engineer at Facebook, hosted the first part. He explained how “every interaction begins with a single person who has something to share.” Just to make a single post appear requires multiple processes, taking into account things most people don’t even consider.

For example, regional differences require different PoP (Point of Presence) locations, which host Facebook’s services in key spots so requests are served quickly on each region’s networks. Posts made from different locations use different parameters, different clusters, and different data centers to ensure everyone gets an optimal experience.
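
To make the idea concrete, here is a hedged sketch of routing a request to the lowest-latency point of presence. It is purely illustrative: the PoP names, latency figures, and the pick_pop helper are invented for this example and are not Facebook’s actual routing logic.

```python
# Hypothetical sketch: send a request to the lowest-latency PoP.
# The PoP names and latency figures are invented for illustration;
# real edge routing considers far more than a single measurement.

MEASURED_LATENCY_MS = {
    "pop-us-east": 18,
    "pop-eu-west": 95,
    "pop-apac": 160,
}

def pick_pop(latencies: dict[str, float]) -> str:
    """Return the PoP with the lowest measured latency for this client."""
    return min(latencies, key=latencies.get)

if __name__ == "__main__":
    print(pick_pop(MEASURED_LATENCY_MS))  # -> "pop-us-east"
```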

The data centers themselves are large buildings that use a fabric network topology to enable high speeds. They are built so that pods of racks connect to planes, which connect to spines, which connect to backbones, and so on, allowing bandwidth to scale with the requirements of the network. The Facebook Open Switching System scales and manages these switching processes, making the data center’s servers capable of running all the required applications.
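
As a purely illustrative aside, that layered arrangement can be pictured as nested data structures. The sketch below models racks grouped into pods and pods attached to planes; the class names and counts are arbitrary and are not Facebook’s real fan-out.

```python
# Toy model of a layered fabric: racks grouped into pods, pods attached to
# planes (spines and the backbone sit above these levels and are omitted
# here for brevity). Counts are arbitrary, not real-world values.

from dataclasses import dataclass, field

@dataclass
class Rack:
    name: str

@dataclass
class Pod:
    name: str
    racks: list[Rack] = field(default_factory=list)

@dataclass
class Plane:
    name: str
    pods: list[Pod] = field(default_factory=list)

def build_fabric(planes: int = 2, pods_per_plane: int = 2, racks_per_pod: int = 3) -> list[Plane]:
    """Build a toy fabric; adding a plane or pod scales bandwidth horizontally."""
    return [
        Plane(
            name=f"plane-{p}",
            pods=[
                Pod(
                    name=f"plane-{p}-pod-{q}",
                    racks=[Rack(f"plane-{p}-pod-{q}-rack-{r}") for r in range(racks_per_pod)],
                )
                for q in range(pods_per_plane)
            ],
        )
        for p in range(planes)
    ]

if __name__ == "__main__":
    fabric = build_fabric()
    print(sum(len(pod.racks) for plane in fabric for pod in plane.pods))  # -> 12 racks
```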

Those applications started out fairly standard, but as Facebook grew and improved, more and more features were added. Applications interact with the data by treating it as objects (posted content such as statuses, photos, and videos) and associations (reactions, likes, and so on).
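
A hedged sketch of that object-and-association view, using an invented in-memory store rather than any real Facebook API, might look like this:

```python
# Toy in-memory store of "objects" (posts, photos, videos) and
# "associations" (likes, reactions) between users and objects.
# It mirrors the object/association framing above, nothing more.

from collections import defaultdict

objects: dict[int, dict] = {}  # object_id -> content record
associations: dict[tuple[int, str], set[int]] = defaultdict(set)  # (object_id, type) -> user ids

def create_object(obj_id: int, kind: str, content: str) -> None:
    """Store a piece of posted content."""
    objects[obj_id] = {"kind": kind, "content": content}

def add_association(user_id: int, obj_id: int, assoc_type: str) -> None:
    """Record a reaction, like, etc. from a user on an object."""
    associations[(obj_id, assoc_type)].add(user_id)

def count_associations(obj_id: int, assoc_type: str) -> int:
    return len(associations[(obj_id, assoc_type)])

create_object(1, "status", "Hello from F8!")
add_association(42, 1, "like")
add_association(43, 1, "like")
print(count_associations(1, "like"))  # -> 2
```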

However, that’s just for posts and likes. How does Facebook decide what shows up in your News Feed? For that, it uses a multi-feed system: groups of servers spread across multiple racks aggregate results from the different leaf nodes (which represent users’ posts) and determine which News Feeds those posts will be sent into.
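
A heavily simplified, hypothetical sketch of that leaf/aggregator split is shown below; the posts, shard layout, and recency-only ranking are invented for illustration.

```python
# Toy leaf/aggregator feed assembly: each "leaf" holds recent posts for a
# shard of authors; an aggregator pulls candidates from every leaf for the
# people a viewer follows and merges them into one feed.
# Data and ranking (recency only) are invented for illustration.

import heapq

# leaf index -> {author_id: [(timestamp, post), ...]}
LEAVES = [
    {1: [(100, "Alice: photo of her dog"), (250, "Alice: conference post")]},
    {2: [(180, "Bob: long-winded rant")]},
]

def aggregate_feed(followed_ids: set[int], limit: int = 10) -> list[str]:
    """Collect candidate posts from all leaves and return the newest first."""
    candidates = []
    for leaf in LEAVES:
        for author, posts in leaf.items():
            if author in followed_ids:
                candidates.extend(posts)
    return [post for _, post in heapq.nlargest(limit, candidates)]

print(aggregate_feed({1, 2}))
```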

All of these calculations happen in under a second, and each interaction — every like or smiley face emoji, every long-winded rant riddled with typos and grammar that would make a first-grader wince — is sent back through the system to be added to those posts. Now keep in mind that millions of these interactions happen every second of every day; the servers are constantly working to ensure that every single interaction goes through.

Infrastructure Part 2

For the second half of the panel, fellow Production Engineer Rachel Kroll took the stage. She explained how the infrastructure is kept reliable, or as she put it: “Keeping the site up, or in other words: when things go very, very wrong.”

The role there is to be both proactive and reactive: to stop problems before they appear, and to resolve them immediately when they do. Kroll framed it in terms of three types of problems: something is down or inaccessible, something is really slow, or something seems amiss although it’s not clear what.

Find and Resolve

There are four main signals the Facebook team uses to identify these problems: egress, response time, fatal errors, and attempts vs. completions.

Egress is a measure of how much content is being pushed out to the outside world through Facebook posts, comments, and other media; in other words, all the stuff sent over Facebook each moment. Any time there’s a sudden change in activity, such as a huge drop or spike in data usage, it’s noted and tracked to see whether there’s a reason for it. Big world events, for instance, often produce huge upswings in activity, as the egress graphs showed during a major sporting event.
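
As a hedged illustration of how such a swing might be flagged (not Facebook’s actual monitoring), a simple check against a rolling baseline looks like this; the traffic numbers and 50% tolerance are made up.

```python
# Flag sudden swings in an egress time series by comparing each sample to the
# mean of the preceding window. Thresholds and data are invented.

from statistics import mean

def egress_anomalies(samples_gbps: list[float], window: int = 5, tolerance: float = 0.5) -> list[int]:
    """Return indices where egress deviates from the rolling mean by more than 50%."""
    flagged = []
    for i in range(window, len(samples_gbps)):
        baseline = mean(samples_gbps[i - window:i])
        if abs(samples_gbps[i] - baseline) > tolerance * baseline:
            flagged.append(i)
    return flagged

traffic = [10.1, 10.3, 9.9, 10.0, 10.2, 10.1, 17.8, 10.0]  # spike during a big event
print(egress_anomalies(traffic))  # -> [6]
```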

Tracking response time is important not just for customer satisfaction, but for keeping tabs on the website’s health. Requests should complete at or below a consistent time threshold; if they don’t, something is very wrong.
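
A hedged sketch of that kind of latency check follows; the 500 ms budget, the nearest-rank percentile helper, and the sample data are all invented for the example.

```python
# Check a high percentile of recent response times against a latency budget.
# The 500 ms budget and sample data are invented for illustration.

def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(values)
    rank = max(0, int(round(pct / 100 * len(ordered))) - 1)
    return ordered[rank]

def latency_healthy(samples_ms: list[float], budget_ms: float = 500.0) -> bool:
    """True if the p99 of recent response times is within the budget."""
    return percentile(samples_ms, 99) <= budget_ms

recent = [120, 95, 140, 110, 480, 105, 130]
print(latency_healthy(recent))  # -> True: p99 is within the 500 ms budget
```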

When something is very wrong, Facebook cares about the cause and wants to investigate. Requests and failures are tracked in parts per million (ppm), an essential unit when a website is as large as Facebook. The team hunts down strange or unintended configuration errors, determines why something went wrong, and learns from it to prevent similar issues from occurring again.
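
Parts per million is simply failures divided by requests, scaled by one million; the tiny sketch below, with made-up figures, shows why the unit is convenient at this scale.

```python
# Failure rate in parts per million (ppm): at billions of requests, even a
# tiny fraction is a meaningful absolute number. Figures below are invented.

def failure_ppm(failures: int, requests: int) -> float:
    return failures / requests * 1_000_000

print(failure_ppm(failures=1_200, requests=3_000_000_000))  # -> 0.4 ppm
```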

One interesting metric compares users starting something against users finishing it. That’s not in regard to the lengthy rant you considered posting about everything that annoys you, but rather to starting to use a feature and then giving up halfway through.

For instance, Facebook will occasionally roll out a new feature that changes the filter on one’s profile picture to commemorate certain events. Early on, Facebook noticed that a large number of people were giving up halfway through the conversion process, until the team realized that converting the large, high-quality original photos was time-consuming. Once they switched to converting the smaller, lower-resolution profile pictures instead, the rate of users completing the filter shot up. Those metrics are important for providing an optimal experience, or users may follow through on those empty threats to “return to MySpace.”
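
The attempts-versus-completions idea boils down to a completion rate with an alert threshold. The sketch below uses invented counts purely to illustrate the before-and-after effect described above.

```python
# Compare how many users start a flow (e.g., applying a profile-picture
# filter) with how many finish it. Counts and the 0.8 threshold are invented.

def completion_rate(started: int, completed: int) -> float:
    return completed / started if started else 0.0

before_fix = completion_rate(started=10_000, completed=4_200)  # slow full-size conversion
after_fix = completion_rate(started=10_000, completed=9_100)   # converting smaller photos

for label, rate in [("before", before_fix), ("after", after_fix)]:
    status = "OK" if rate >= 0.8 else "investigate"
    print(f"{label}: {rate:.0%} completed -> {status}")
```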

Photos by Robert Pleasant
