UPDATED 15:12 EDT / JUNE 18 2012

Build Your Own Splunk-Like Central Log Management Tool With Open Source Software

In the age of big data, log management is becoming an absolute necessity, as developers, operations, and, yes, DevOps have to deal with and process huge amounts of machine-generated data. Many organizations have turned to Splunk, a pioneer in the space, to help manage the rising tide of log data – but Splunk can get really, really expensive.

There’s still no single general-purpose alternative to Splunk. But over the weekend, Booking.com SysAdmin Brad Lhotsky documented his quest to build his own central log management system using only open source software.

Of course, his blog entry contains much deeper technical insight, but at a high level, he broke his solution down into three components: log centralization (rsyslog), log management (logstash/Kibana) and log visualization (Graphite).

Rsyslog was tapped for log centralization over the similarly popular alternative syslog-ng because the former offers guaranteed delivery and encrypted transfer in its open source edition, two features that Lhotsky says are increasingly important to regulatory compliance auditors. With rsyslog, Lhotsky was able to build a reliable way to transport event logs from Unix hosts to a central repository.
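To make the transport step a little more concrete, here is a minimal Python sketch of what a classic syslog line looks like before a daemon such as rsyslog forwards it. It is not taken from Lhotsky's setup: the host name `web01`, the tag `app` and the helper names are hypothetical, and the only protocol detail relied on is the standard priority calculation (facility × 8 + severity).

```python
from datetime import datetime

# Month abbreviations are spelled out so the output does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Standard syslog facility and severity numbers.
FACILITY_LOCAL0 = 16
SEVERITY_ERROR = 3
SEVERITY_INFO = 6


def syslog_priority(facility: int, severity: int) -> int:
    """Return the PRI value carried at the start of every syslog message."""
    return facility * 8 + severity


def format_syslog_line(facility: int, severity: int, when: datetime,
                       host: str, tag: str, message: str) -> str:
    """Build a traditional BSD-style syslog line (hypothetical helper)."""
    # The traditional format pads single-digit days with a space, not a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    pri = syslog_priority(facility, severity)
    return f"<{pri}>{timestamp} {host} {tag}: {message}"
```

Calling `format_syslog_line(FACILITY_LOCAL0, SEVERITY_INFO, datetime(2012, 6, 18, 15, 12, 0), "web01", "app", "hello")` yields a line beginning with `<134>`, since 16 × 8 + 6 = 134. Guaranteed delivery and encryption are properties of how rsyslog moves such lines, not of the line format itself.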

This is where Lhotsky starts entering Splunk’s territory, calling the company “the 1,000 lb Gorilla in the room.” But in lieu of Splunk, Lhotsky writes that he took the MongoDB-powered Graylog2 for a test drive before settling on logstash. Graylog2 is great, he says, but he suggests that its ElasticSearch indexing scheme is “broken,” and that if you have to keep a large amount of logs around for compliance reasons, you’re going to take a performance hit. Lhotsky goes so far as to speculate that this is because Graylog2 only adopted ElasticSearch for search fairly late in the game.

On the other side of the coin, logstash also uses ElasticSearch, but with far more of a focus on scalability, inputs, filters and outputs. The cost, Lhotsky writes, is the lack of a polished front end. Enter Kibana, a PHP front end for logstash that takes the ElasticSearch indexes and adds an interface for search and analysis, making the whole platform a lot more usable.
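The input, filter and output stages mentioned above can be pictured with a small Python sketch. This is only an illustration of the pipeline idea, not logstash's actual implementation or configuration syntax; the "LEVEL message" line format, the `parse_failure` tag and all function names are assumptions made for the example.

```python
import re

# Hypothetical pattern for a simple "LEVEL message" log line; real logs vary.
LINE_PATTERN = re.compile(r"^(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def read_input(lines):
    """Input stage: wrap each raw line in an event dictionary."""
    for line in lines:
        yield {"raw": line.rstrip("\n")}


def parse_filter(events):
    """Filter stage: extract structured fields, tagging lines that do not match."""
    for event in events:
        match = LINE_PATTERN.match(event["raw"])
        if match:
            event.update(match.groupdict())
        else:
            event["tags"] = ["parse_failure"]
        yield event


def collect_output(events):
    """Output stage: here just a list; a real setup would index into a search store."""
    return list(events)


def run_pipeline(lines):
    """Chain the three stages together, one event at a time."""
    return collect_output(parse_filter(read_input(lines)))
```

Because each stage is a generator, events flow through one at a time, and a stage can be swapped out without touching the others, which is roughly the appeal of the pluggable design described above.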

“Kibana fills the gap with the Logstash interface so perfectly. It doesn’t give me everything I’d get with Splunk, but I’ve just touched the functionality I can extract with Logstash,” as Lhotsky puts it.

Finally, he suggests the popular Graphite for data visualization and graphing all the log data you’ve now collected.
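As a rough idea of how log data could end up in Graphite, the Python sketch below counts events per level and formats the counts in Graphite's plaintext protocol, where each line reads "metric.path value timestamp". The metric prefix `logs.web01` and the function names are hypothetical, and this is not necessarily how Lhotsky feeds his graphs.

```python
from collections import Counter


def count_by_level(levels):
    """Count how many events were seen for each log level."""
    return Counter(levels)


def graphite_lines(counts, prefix, timestamp):
    """Format counts as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    return [
        f"{prefix}.{name} {value} {timestamp}"
        for name, value in sorted(counts.items())
    ]
```

For example, `graphite_lines(count_by_level(["error", "info", "error"]), "logs.web01", 1340046720)` produces one line per level. In a typical installation, such lines are written, newline-terminated, to Graphite's carbon listener over TCP, which listens on port 2003 by default.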

As Lhotsky says, this is just how he tried to match Splunk-like functionality with open source tools, and it’s still a work in progress. He’s on Twitter if you want to talk to him about this implementation directly, or else leave a comment below.

Regardless, there’s a definite and growing need for log management tools, and I’m wondering why Splunk is still relatively unchallenged in the space.

 

