

Practically every important application in the enterprise today comes with safeguards designed to protect against common technical problems like server outages. But there’s often a big difference between how a service is expected to handle an issue and how it does so in practice, which requires organizations to painstakingly test for any weak points that may have slipped through their quality controls. LinkedIn Inc. moved to ease the chore this week by open-sourcing the homegrown system that its engineers use internally to assess the resilience of its infrastructure.
The company developed Simoorg, as the software is called, after finding older failure induction technologies like Chaos Monkey (the brainchild of fellow web giant Netflix Inc.) to be inadequate for its purposes. LinkedIn needed a tool that could not only check how well workloads deal with technical trouble in general, but also simulate the specific operational conditions under which its internal processes are likely to run into trouble. That includes details such as the amount of traffic an application handles and how much latency it’s experiencing.
Simoorg also provides the ability to customize the way a test is carried out to ensure that it reflects what a real-life outage would look like. An engineer could point the system at a certain group of servers, set how long each machine will be taken offline and then specify the precise sequence in which the process should be executed. LinkedIn even included the option to have hardware components disabled in random order, an addition that makes it possible to check how an application performs in situations that the IT department can’t necessarily anticipate.
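To make the idea concrete, here is a minimal Python sketch of how such a test plan could be modeled: a list of target servers, a downtime per machine, and either a fixed or a randomized order. It is an illustration only and does not reflect Simoorg’s actual configuration format or API; the names `FailureStep` and `build_plan`, the host names and the `seed` parameter are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureStep:
    """One step of a hypothetical failure plan: take `host` offline for a while."""
    host: str
    downtime_seconds: int


def build_plan(hosts, downtime_seconds, shuffle=False, seed=None):
    """Return the ordered list of steps for a failure test.

    hosts: the group of servers to target, in the desired sequence.
    downtime_seconds: how long each machine stays offline.
    shuffle: if True, randomize the order instead of using the given sequence.
    seed: optional seed so a randomized run can be reproduced later.
    """
    order = list(hosts)  # copy so the caller's list is left untouched
    if shuffle:
        random.Random(seed).shuffle(order)
    return [FailureStep(host=h, downtime_seconds=downtime_seconds) for h in order]


# Hypothetical usage: a fixed sequence, then a reproducible random one.
fixed_plan = build_plan(["web-1", "web-2", "web-3"], downtime_seconds=60)
random_plan = build_plan(["web-1", "web-2", "web-3"], 60, shuffle=True, seed=7)
```

Seeding the shuffle is a design choice in this sketch rather than something the article attributes to Simoorg: it keeps the order unpredictable to the people running the service while still letting a test that exposed a problem be replayed exactly.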
The versatility of Simoorg enables organizations to simulate everything from the effects of a bad patch to severe hardware failures spread across an entire data center. Its customizability also allows for tests to be tweaked with relative ease, which gives users the ability to explore more nuanced issues like whether a service’s susceptibility to hardware outages increases above a certain traffic threshold. The knowledge gleaned using the system is useful both for developers looking to improve the resilience of their applications and for operations professionals charged with troubleshooting problems with their organizations’ infrastructure.