UPDATED 18:46 EDT / DECEMBER 22 2021

CLOUD

AWS cloud suffers a third major outage this month

Amazon Web Services Inc.’s cloud computing operation suffered another outage Wednesday morning – its third this month – causing a huge number of online services to briefly shut down.

AWS reported on its status page that it had suffered a loss of power at a data center in Northern Virginia, leading to connectivity issues that began around 7:30 a.m. EST. The outage disrupted numerous services, ranging from the messaging app Slack to the video games store of Epic Games Inc.

Other services affected included the cryptocurrency exchange Coinbase Global Inc., the video game Fortnite, the dating app Grindr and the delivery company Instacart Inc.

The outage was quickly fixed and normal services had resumed by around 10 a.m., but AWS will no doubt be concerned that today’s incident was the third to affect its cloud this month. Two weeks ago, service problems that were later blamed on malfunctioning network devices took out multiple services across the U.S., ranging from Netflix and Disney+ to connected devices such as Amazon.com Inc.’s Ring security cameras and iRobot Corp.’s Roomba vacuums.

In that incident, AWS was down for around five hours before the problems were fixed, only to suffer a second, minor incident last week.

AWS is the world’s biggest cloud infrastructure service, which allows companies to rent computer servers and processing power instead of buying and managing their own. Cloud services have revolutionized the internet in many ways with their promise of a reliable online backbone that’s always available.

However, outages such as today’s underscore how consolidating the internet’s previously distributed infrastructure can lead to big problems when a single failure occurs at the wrong moment.

Maintaining an enormous cloud of global data centers and ensuring they remain online at all times is not an easy task. AWS employs thousands of engineers who need to test each change they make to the underlying infrastructure before it’s deployed, then closely monitor it afterwards. That usually involves creating an automatic way to back out and revert to the previous configuration in case something goes wrong.
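As a rough illustration of the "deploy, monitor, back out" pattern described above, here is a minimal Python sketch. It is not AWS's actual tooling: the function name `deploy_with_rollback`, the `apply` and `is_healthy` callbacks and the dictionary-based configuration are all assumptions made for the example.

```python
from typing import Callable, Dict

# Hypothetical configuration type: a flat mapping of setting names to values.
Config = Dict[str, str]


def deploy_with_rollback(
    current: Config,
    candidate: Config,
    apply: Callable[[Config], None],
    is_healthy: Callable[[], bool],
    checks: int = 3,
) -> Config:
    """Apply `candidate`, then revert to `current` if any health check fails.

    `apply` and `is_healthy` are caller-supplied stand-ins for whatever
    deployment and monitoring systems an operator really uses.
    Returns the configuration that is left in place.
    """
    apply(candidate)
    for _ in range(checks):
        if not is_healthy():
            # Automatic back-out: restore the previous configuration.
            apply(current)
            return current
    return candidate
```

In this sketch a single failed health check triggers the revert. A real system would more likely watch several metrics over a longer period and roll a change out gradually.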

Analyst Charles King of Pund-IT Inc. told SiliconANGLE that technical issues and power outages are unfortunately just a fact of life for most data center operators. The problem is that when it comes to major public cloud providers such as AWS, their reliability issues result in disruption for numerous other businesses.

“Along with being embarrassing for AWS and painful for its customers, these outages are also bringing into question the company’s assurances of being a prime enterprise vendor,” King said. “It’s usually a mistake to judge a person or organization by worst case events. But when those problems impact core business functions, they can leave customers understandably concerned.”

Holger Mueller of Constellation Research Inc. is also concerned. He said his worry is that AWS is supposedly built to withstand power loss, so the issue should not have taken services down.

“It’s also worrying that this issue occurred once more in the venerable US EAST region, as the previous two outages did,” Mueller said. “So one has to ask, why are so many issues coming out of that region?”

AWS said in a postmortem of the five-hour outage on Dec. 7 that it was caused by a glitch in automated software, leading to “unexpected behavior” which then “overwhelmed” several key AWS networking devices.

The second outage, which lasted less than an hour on Dec. 15, was reportedly caused by “network congestion” as a result of internal engineering that inadvertently shifted “more traffic than expected to parts of the AWS backbone that affected connectivity,” the company explained.

As for the latest incident, the company has yet to publish a full postmortem. No doubt customers will be eagerly waiting for an explanation as to why a power failure, which AWS is supposedly built to withstand, managed to cause such serious problems.

Image: AWS
