In December, according to reporting by the Financial Times, Amazon Web Services experienced at least two outages in mainland China. The cause, the paper reported, was “errors involving its own employees.”

We’re used to blaming complex systems, rogue updates, and even mysterious “technical anomalies.” In this case, the issue came down to internal mistakes. One incident reportedly involved Amazon’s Kiro AI coding tool, which decided the best fix was to “delete and recreate” the system environment. Amazon’s response, as cited by the Financial Times, framed it as “user error, not AI error.” 

To be clear, this was separate from the large AWS disruption in October that rippled across parts of the internet. An AWS spokesperson described the December incidents as an “extremely limited event” affecting one of two mainland China regions. The second outage, they said, did not impact “customer facing AWS service.” 

Most users outside the region probably never felt the disruption. But for businesses operating there, even a limited outage can stall transactions, delay deployments, and shake confidence. When your infrastructure runs on someone else’s servers, you inherit their risks along with their scale.

That’s the trade-off with cloud computing. Companies outsource hardware headaches and gain flexibility. In return, they depend on processes they can’t see. When those processes fail because of internal missteps, the reassurance that “it was contained” only goes so far. 

Amazon says it has “implemented numerous safeguards” to prevent a repeat. That likely means tighter controls, clearer separation between automated tools and production environments, and more layers of approval. The cloud industry has learned, sometimes the hard way, that guardrails matter as much as speed. 

As AI tools become embedded in development workflows, the line between automation and human oversight gets blurrier. When something breaks, the question shifts from what failed to who was responsible. 
