PagerDuty was founded by three Amazon.com operations veterans to solve what at first seems like a simple problem: sending server alerts to the responsible person. PagerDuty’s “secret sauce” is that it not only aggregates notifications from several monitoring and alerting sources, but also manages on-call rotation schedules so that those phone or SMS alerts are routed to the right person.
It features out-of-the-box integrations with Nagios, Zenoss, Pingdom, Splunk, Munin, Monit, Cloudkick (now part of Rackspace) and Keynote. PagerDuty also integrates with the Networked Helpdesk API, and offers an API of its own for creating further integrations.
The software de-duplicates alerts, so if a problem is affecting 200 servers, the person on call receives a single alert reporting 200 incidents rather than 200 separate alerts.
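The de-duplication described above can be illustrated with a minimal Python sketch. It is not PagerDuty's implementation: the `deduplicate` function, the `problem` and `host` fields, and the host names are all hypothetical, and the sketch simply assumes that alerts sharing a problem key belong to the same underlying issue.

```python
from collections import defaultdict


def deduplicate(alerts):
    """Group raw alerts by problem key and return one notification per problem.

    Assumption: two alerts describe the same problem when their "problem"
    field matches. A real system would likely use a richer key.
    """
    hosts_by_problem = defaultdict(list)
    for alert in alerts:
        hosts_by_problem[alert["problem"]].append(alert["host"])
    # One notification per distinct problem, carrying the incident count.
    return [
        {"problem": problem, "incident_count": len(hosts)}
        for problem, hosts in hosts_by_problem.items()
    ]


# 200 servers report the same (hypothetical) problem.
alerts = [{"problem": "disk_full", "host": f"web{i:03d}"} for i in range(200)]
notifications = deduplicate(alerts)
print(notifications)
```

In this sketch the person on call would be sent one notification with an incident count of 200, instead of 200 individual messages.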
PagerDuty’s David Hayes says the software assigns each alert to a sequence of people, one at a time. If the first person does not acknowledge the alert within a certain amount of time, it is escalated to the next person, and so on until someone responds. PagerDuty doesn’t do group alerts, so a person who receives an alert knows that it is their responsibility to decide whether or not to act on it.
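The one-person-at-a-time escalation Hayes describes might look roughly like the following Python sketch. It is an illustration under stated assumptions, not PagerDuty's code: the `escalate` function, the `acknowledges` callback (standing in for "this person responded before the timeout") and the names in the chain are hypothetical, and returning `None` when nobody responds is an assumption, since the article does not say what happens in that case.

```python
def escalate(alert, chain, acknowledges):
    """Offer the alert to one person at a time; return whoever acknowledges it.

    `acknowledges(person, alert)` is a hypothetical callback that returns True
    if the person acknowledged the alert within the allowed time.
    """
    for person in chain:
        # Exactly one person is responsible for the alert at any moment.
        if acknowledges(person, alert):
            return person
        # No acknowledgement in time: move on to the next person in the chain.
    return None  # assumption: the chain was exhausted without a response


chain = ["alice", "bob", "carol"]  # hypothetical on-call sequence
available = {"bob", "carol"}       # alice does not respond in this example
owner = escalate("db-primary down", chain, lambda person, alert: person in available)
print(owner)
```

Because the alert is never broadcast to the whole group, whoever is returned as `owner` is unambiguously the person responsible for it.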
The company was founded in 2009 by Amazon.com alumni Alex Solomon, Baskar Puvanathasan and Andrew Miklas, all of whom carried pagers at Amazon.com. In 2010 the company was accepted into Y Combinator. The service is now used by some of the best-known DevOps shops around, including Etsy, Heroku, Netflix and Opscode.