UPDATED 02:29 EDT / DECEMBER 31 2015

NEWS

LinkedIn reflects on its open-source successes in 2015

With 2015 coming to an end, LinkedIn Corp. has taken a look back at its year of using, developing and contributing to open-source software.

Over the past year, LinkedIn made some of its biggest-ever contributions to the open-source community by releasing ten new original projects, including Burrow, Goblin and Pinot, while pushing major updates to existing projects such as Apache Samza, Apache Kafka, Rest.li and Voldemort.

“We’ve worked to scale our infrastructure as we reached 400 million LinkedIn members, so it’s no surprise many of our open-source projects this year focus on building out our data pipelines and tools to help make sense of our data,” wrote LinkedIn’s Igor Perisic in a blog post. “The infrastructure improvements we’ve made in Kafka have allowed us to handle 1.3 trillion messages per day, and Espresso now serves 2.2 million rows per second.”

LinkedIn open-sourced its Pinot real-time analytics infrastructure last June. The technology allows LinkedIn to sort through and analyze enormous amounts of data in real time for a wide variety of its products.

“At LinkedIn, we have a large deployment of Pinot storing hundreds of billions of records and ingesting over a billion records every day,” said Kishore Gopalakrishna, a senior software engineer at LinkedIn, in a blog post describing how it works. “Pinot serves as the backend for more than 25 analytics products for our customers and members. This includes products such as Who Viewed My Profile, Who Viewed My Posts and the analytics we offer on job postings and ads to help our customers be as effective as possible and get a better return on their investment. In addition, more than 30 internal products are powered by Pinot…”

LinkedIn also open-sourced PalDB, its lightweight technology for storing side data, last October. As LinkedIn engineer Matthieu Monsch explains, side data is the extra read-only data a process needs to do its job, such as a list of stop words used by a natural language processing algorithm. Machine learning models used in machine translation, content classification or spam detection are also side data. The problem is that when this side data becomes too large, it creates bottlenecks for applications that depend on it. PalDB was built to provide a read-only embeddable database that makes it easier to scale side data.
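To make the idea of side data concrete, here is a minimal Python sketch. It does not use PalDB or its API; it only illustrates the stop-word case mentioned above, with a small hard-coded list standing in for real side data. The names `STOP_WORDS` and `remove_stop_words` are hypothetical and chosen for illustration.

```python
# Side data: read-only reference data a process needs to do its job.
# This tiny hard-coded stop-word list is an illustrative assumption,
# not data taken from LinkedIn or PalDB.
STOP_WORDS = frozenset({"a", "an", "the", "of", "to", "and"})


def remove_stop_words(text: str) -> list[str]:
    """Return the lowercase tokens of `text` that are not stop words."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]


if __name__ == "__main__":
    print(remove_stop_words("The state of the open source community"))
```

An in-memory set like this is fine while the side data is small. The bottleneck described above arises when such data grows too large to load comfortably into every process, which is the case a read-only embeddable store is meant to address.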

In his blog post, Perisic said he believes that LinkedIn’s engineers benefit from open-sourcing their projects because it means their work is exposed to the entire developer community.

“It seems paradoxical to think that developers write better software for others than they do for themselves, but it actually makes sense,” Perisic wrote. “When software is written ‘internally,’ developers have a tendency to cut some corners—and I’m as guilty as anyone—especially around documenting, making code easily readable and reusable and having all the right tests in order.”

“With open source, developers’ names are attached to the software they create and the entire community can look at it,” he continued. “This puts a human face on code and reputations on the line. Once a developer open sources some software, their names will be forever associated with it. This is a huge incentive to cross their T’s and dot their I’s. A developer wants to be associated with good stuff that is well written.”

Image credit: Minachan via pixabay.com
