Amazon.com Inc. said automated processes in its cloud computing business caused an internet blackout this week, affecting everything from Disney theme parks and Netflix videos to robot vacuums and Adele ticket sales.
In a statement Friday, Amazon said the problem started on Dec. 7 when an automated computer program, designed to make its network more reliable, caused a "large number" of its systems to behave unexpectedly. That, in turn, sparked a wave of activity on Amazon's networks, ultimately preventing users from accessing some of its cloud services.
"Basically, a bad piece of code ran automatically and caused a snowball effect," said Forrester analyst Brent Ellis. The outage persisted "because their internal control and monitoring systems were taken offline due to the storm of traffic caused by the original problem."
Amazon explained the outage in a highly technical statement posted online. The problems started on Dec. 7 at 10:30 a.m. New York time and lasted several hours before Amazon was able to resolve them. In the meantime, social media was flooded with complaints from consumers angry that their smart home gadgets and other internet-connected services had suddenly stopped working.
Some experts said the explanation doesn't help users fully understand what went wrong.
"They don't explain what this unexpected behavior was and they didn't know what it was. So they were guessing when they were trying to fix it, that's why it took so long," said Corey Quinn, cloud economist at Duckbill Group.
Amazon Web Services, or AWS, is generally a reliable service. The cloud division last had a major incident in 2017, when an employee accidentally turned off more servers than intended during repairs to a billing system. Still, the latest outage reminded the world how many products and services are centralized in shared data centers operated by just a handful of major tech companies, including Amazon, Microsoft Corp. and Alphabet Inc.'s Google.
"We know that this event has affected many customers in a significant way," the company said in its jargon-filled statement. “We will do everything we can to learn from this event and use it to further improve our availability.”