Yesterday, the Internet saw what can happen when a DNS outage strikes a major webhost when millions of people could not access web pages provided by Go Daddy. Initially, reports went out that Anonymous had taken credit for a DDoS attack against the Go Daddy’s DNS infrastructure and thus caused intermittent failures for numerous websites—but today we’ve heard from the “daddy” himself that it was caused by a critical failure internal to the webhost and not an external attack.
“It was not a ‘hack’ and it was not a denial of service attack (DDoS),” GoDaddy Interim CEO Scott Wagner wrote in an e-mail to Ars Technica. “We have determined the service outage was due to a series of internal network events that corrupted router data tables.”
Service was interrupted beginning at 10 a.m. PT and being fully restored about 4 p.m. PT.
“Throughout our history, we have provided 99.999% uptime in our DNS infrastructure,” said Wagner. “This is the level our customers expect from us and the level we expect of ourselves. We have let our customers down and we know it. We take our business and our customers’ businesses very seriously. We apologize to our customers for these events and thank them for their patience.”
Of course, during the six hours that these effects and aftershocks were felt across the Internet a cell of Anonymous crawled out of the woodwork to claim credit for the outage. It was quickly reported here at SiliconANGLE that the outage was the result of a DDoS attack—and that sort of claim would have been credible: it’s well known that it’s possible to knock out DNS infrastructure with distributed denial of service attack.
Although there have been fears that the current global DNS service is susceptible to this sort of attack, this is unlikely because it’s a lot harder to take the entire thing down than just one portion. The lesson learned from GoDaddy’s outage, however, suggests that large enough vendor can generate a great deal of mayhem due to poor configuration as the result of a failure or a DDoS attack. After all, reports are that millions of users found themselves unable to connect to web pages because the DNS failed to give them the appropriate addresses. The company hosts 5 million web sites and manages a total of 52 million domain names—all of these web sites would have been affected in some form or another.
I personally found myself unable to access SiliconANGLE.com (although my colleagues were largely unaffected) with ongoing, intermittent DNS failures for most of yesterday.
Go Daddy has replied that they do use redundant systems to prevent this sort of thing. Although the DNS infrastructure uses a tree-like hierarchy with higher level tiers providing addressing for lower levels, it is possible to have backups that can come online if primary fail to respond in time. That both the primary and redundant systems at Go Daddy were affected at the same time by a router failure seems to suggest that they may not keep them as separate as they should.
A single (or even multiple interrelated) router failure should not be able to cause wide disruption to an entire DNS infrastructure if the redundancy is well designed.