Etsy is known for helping people “make a living making things”. The engineers at Etsy also specialize in making things, particularly when it comes to managing the website’s servers. While some companies have embraced fancier logging tools like Flume, Scribe, or Logstash, Etsy has decided to stick with a tried and true technology: Syslog-ng. The engineers have perfected it for the company’s large-scale needs with some clever performance tuning.
Etsy runs syslog-ng on a dedicated server and configures all of its other servers to send their log data to it. The central syslog server is an 8-core system with 12 GB of RAM. It currently absorbs approximately 60,000 events per second, with CPU load peaking at about 25 percent. These events include Apache web server access and error logs, Squid proxy logs, and other important server information.
Some examples of the performance tuning Etsy is doing include:
- Rule ordering – Rather than running everything sent from the servers through a long chain of filters, Etsy uses Chef to automate role matching and sorting of logs. Ordering the syslog-ng rules so that the most frequently logged lines are matched first, rather than last, can greatly reduce CPU load.
- Log windows and buffer sizing – Syslog-ng’s “flow control” feature lets it buffer data when your servers send more traffic than the program can handle at once. The window and buffer sizes determine how much of a burst it can absorb.
- Try disabling power saving features – Find the power settings that work best for your server. Etsy was able to greatly reduce CPU utilization by adjusting these settings. Experiment and see what works.
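To see why rule ordering matters, here is a minimal Python sketch rather than Etsy's actual configuration. It assumes each message stops at the first rule it matches (in syslog-ng, roughly what log paths marked `flags(final)` do) and counts how many filter checks a batch of messages costs under two orderings. The traffic mix, the program names, and the `filter_checks` helper are all hypothetical.

```python
from collections import Counter


def filter_checks(rules, events):
    """Count filter evaluations when each event stops at its first matching rule."""
    checks = 0
    for program in events:
        for rule in rules:
            checks += 1
            if rule == program:
                break  # first match wins; later rules are never evaluated
    return checks


# Hypothetical traffic mix: mostly web server logs, some proxy logs, a few cron lines.
events = ["apache"] * 900 + ["squid"] * 90 + ["cron"] * 10

# Worst case: the rarest source is checked first.
rare_first = ["cron", "squid", "apache"]

# Tuned order: the most frequently logged source is checked first.
frequent_first = [program for program, _ in Counter(events).most_common()]

print(filter_checks(rare_first, events))      # 2890
print(filter_checks(frequent_first, events))  # 1110
```

With this made-up mix, putting the busiest source first cuts the number of filter evaluations by more than half. The real savings depend on how expensive each filter is and on the actual traffic distribution, so measure on your own server.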
Overall, Etsy was able to reduce CPU consumption from 70 percent to 15 percent with a little research and testing to find the best ways to aggregate server logs. All of the technical details, including graphs, are available on the Etsy developer blog.