The October 20 AWS Outage

A Monday Morning That Broke the Internet

On the morning of Monday 20 October 2025, millions around the world opened their devices to find their favourite apps, websites and even critical services completely offline. From HMRC and Lloyds Bank in the UK to Snapchat, Fortnite and Zoom globally, the internet seemed to flicker out of sync.

The cause? A major outage in Amazon Web Services (AWS), the cloud computing division of Amazon, which powers much of the modern internet. The disruption began in the US-East-1 region (Northern Virginia), a central hub that hosts an enormous portion of global traffic.

According to AWS, the fault originated from an issue with DNS resolution for their DynamoDB database service. DNS resolution, short for Domain Name System resolution, is the process that translates website names (like logixal.co.uk) into the numerical addresses computers use to communicate. When that system fails, websites become unreachable, even if the servers themselves are perfectly fine.
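
To make that failure mode concrete, here is a minimal Python sketch of a DNS lookup built on the standard library's socket.getaddrinfo. The function name resolve_host and its injectable lookup parameter are illustrative assumptions, not anything AWS uses. The point is only that when resolution fails, the client has no address to connect to, however healthy the server behind the name may be.

```python
import socket


def resolve_host(hostname, lookup=socket.getaddrinfo):
    """Return the sorted IP addresses for hostname, or [] if DNS fails.

    `lookup` is injectable (a hypothetical convenience for testing);
    by default it is the standard library's socket.getaddrinfo.
    """
    try:
        results = lookup(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Resolution failed: the name could not be turned into an address,
        # so the client cannot even attempt a connection.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in results})
```

Calling resolve_host("logixal.co.uk") on a working network should return one or more addresses; an empty list means the name could not be resolved, which is roughly what dependent applications experienced during the outage.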

Within minutes, the issue cascaded across networks worldwide, demonstrating just how interconnected our online world has become.

A Timeline of the Global Blackout

The incident began around 8 a.m. UK time (3 a.m. Eastern Time), when AWS users reported slow loading and timeouts. By 8:30 a.m., thousands of major services had begun to fail.

At the height of the outage:

Snapchat, Zoom, Ring, Slack and Fortnite were all reported down.
Financial institutions like Lloyds, Halifax and Coinbase experienced service disruptions.
Public services, including HMRC’s online portal, were intermittently inaccessible.
Even Amazon’s own retail site saw delays and intermittent downtime.

As AWS engineers worked to isolate the problem, the company’s status page confirmed “increased error rates and latencies” in multiple systems. By midday, the issue was traced to the DNS subsystem linked to DynamoDB in the US-East-1 region.

By late afternoon, recovery began. AWS announced that services were “returning to normal operations,” but the ripple effect lasted well into the evening for some platforms that rely on cached data and dependent APIs.

Understanding What Went Wrong

So what exactly caused such widespread disruption?

The outage stemmed from a failure in how AWS’s internal systems handled requests to DynamoDB, one of its primary database services. When the DNS component in that process began misrouting or dropping connections, it effectively broke the chain of communication for countless applications.

Because AWS’s US-East-1 region hosts “control plane” functions for several of its global services, a single technical fault there can affect workloads far beyond the US.

Think of it like this: if one control tower at an airport stops communicating properly, flights across other cities could be delayed or grounded simply because they depend on its signals. That’s what happened here: a technical issue in one region disrupted digital “air traffic” everywhere.

AWS has since confirmed that it was not a cyberattack but a technical failure, likely triggered by an internal misconfiguration or routing error.

The Domino Effect: Who Was Affected

The scale of dependency on AWS became instantly clear.

Consumer Platforms

Everyday apps that millions rely on, such as Snapchat, Fortnite, Ring, Signal and Zoom, all suffered partial or total outages. Users flooded social media with complaints, highlighting just how invisible cloud providers normally are until something goes wrong.

Financial Services

Banks such as Lloyds and Halifax, along with the cryptocurrency exchange Coinbase, saw temporary failures in online banking and transaction systems. These platforms use AWS for parts of their backend infrastructure, meaning when AWS stumbled, so did their customer interfaces.

Public Sector and UK Services

Even HMRC, which runs some of the UK’s most critical online services, was affected. Though disruption was short-lived, it was a clear reminder that essential public systems also depend on third-party cloud providers to stay online.

Business Applications

Video conferencing, file storage and customer portals all saw disruptions as popular tools like Slack, Zoom and RingCentral failed to connect. For businesses, this translated to lost time, frustrated staff, and halted workflows.

Why This Outage Matters So Much

The AWS incident didn’t just knock out websites; it exposed the fragility of global digital infrastructure.

The cloud has revolutionised modern business by providing scalable, flexible, cost-efficient computing power. Yet, this outage served as a powerful reminder that even the biggest cloud platforms have vulnerabilities.

When a single region within a cloud network experiences issues, it can impact millions of users worldwide. This is because most services are built for efficiency, not necessarily for redundancy: the ability to switch over seamlessly if one system fails.

Redundancy, in simple terms, is like having a backup generator when the power goes out. Many systems skip this safeguard to save cost or complexity, until a day like this reminds us why it matters.

Businesses often assume that “cloud” means “always available.” But while uptime across providers like AWS is impressively high, no provider guarantees zero downtime.
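
A quick calculation shows what "impressively high" still leaves on the table. The sketch below is illustrative arithmetic only: the availability figures are example inputs, not AWS's published commitments, and the function names are made up for this article. The second function also assumes failures are completely independent, which real cloud regions are not.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability):
    """Expected downtime per year, in minutes, for an availability in 0..1."""
    return (1 - availability) * MINUTES_PER_YEAR


def combined_availability(availability, copies):
    """Availability of `copies` replicas when any single one is enough.

    Assumes failures are fully independent, which is optimistic: shared
    control planes and shared dependencies make real failures correlated.
    """
    return 1 - (1 - availability) ** copies
```

For example, downtime_minutes_per_year(0.999) comes to about 526 minutes, a little under nine hours a year. Two fully independent copies at 99.9% each would in theory cut that to well under a minute, which is the basic argument for redundancy, even though real-world figures will be less flattering.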

Lessons for Businesses: Planning Beyond the Outage

For companies of every size, this event carries valuable lessons about resilience, risk, and readiness.

1. Understand Your Dependencies

Many organisations don’t realise how deeply their services depend on cloud platforms. A single API connection (an interface between systems) can make or break your operations. Mapping those dependencies helps identify hidden risks.
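
One way to start mapping is to write the dependencies down as data and ask what breaks when a single component fails. The sketch below is a minimal illustration: the service names in DEPENDENCIES are entirely hypothetical, and a real inventory would come from your own architecture documentation or tooling.

```python
from collections import deque

# Hypothetical dependency map: each service lists what it directly depends on.
DEPENDENCIES = {
    "customer-portal": ["auth-api", "orders-api"],
    "orders-api": ["orders-db"],
    "auth-api": ["session-store"],
    "reporting": ["orders-db"],
    "orders-db": [],
    "session-store": [],
}


def affected_by(failed, dependencies):
    """Return every service that directly or indirectly depends on `failed`."""
    # Invert the map: for each component, record who depends on it.
    dependents = {}
    for service, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outwards from the failed component.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return sorted(affected)
```

With this toy map, affected_by("orders-db", DEPENDENCIES) reports the orders API, reporting and the customer portal, even though the portal never talks to the database directly. Hidden, indirect dependencies of that kind are exactly what a mapping exercise is meant to surface.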

2. Adopt Multi-Region or Multi-Cloud Strategies

Instead of relying on one cloud region or provider, businesses can distribute workloads across multiple locations or platforms.

Multi-cloud means using more than one cloud vendor (for example, AWS + Microsoft Azure), so if one fails, the other keeps operations running.
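
In code, the simplest form of this idea is to try a primary location and fall back to a secondary one if it fails. The following is a rough sketch under stated assumptions: call_with_failover and AllRegionsFailed are hypothetical names, and the handlers stand in for whatever per-region or per-provider client your application really uses. Production failover also has to deal with data replication and health checks, which this deliberately ignores.

```python
class AllRegionsFailed(Exception):
    """Raised when no configured region could serve the request."""


def call_with_failover(regions, request):
    """Try each (name, handler) pair in order and return the first success.

    `regions` is an ordered list of (region_name, callable) pairs. The
    callables are hypothetical stand-ins for real per-region clients.
    """
    errors = {}
    for name, handler in regions:
        try:
            return name, handler(request)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors[name] = exc
    raise AllRegionsFailed(f"all regions failed: {sorted(errors)}")
```

The returned region name makes it easy to log or alert when traffic is being served from the fallback rather than the primary.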

3. Plan for Graceful Failure

Not every system needs to crash when one service goes down. Smart design can allow websites or apps to run in a “degraded mode”, offering limited functionality until full service returns. This preserves business continuity.
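
A common way to implement a degraded mode is to serve the last known good data when a live dependency is unavailable, and to tell the caller that the data may be stale. The sketch below assumes a hypothetical product list and an in-memory dictionary as the cache; both are placeholders for whatever your application actually serves and stores.

```python
def get_product_list(fetch_live, cache):
    """Return (products, degraded).

    Tries the live source first. If that fails, falls back to the last
    cached copy and flags the response as degraded so the interface can
    show a notice such as "prices may be out of date".
    """
    try:
        products = fetch_live()
    except Exception:
        # Live dependency is down: serve stale data instead of an error page.
        return list(cache.get("products", [])), True
    cache["products"] = list(products)  # refresh the fallback copy
    return products, False
```

The design choice here is to keep read-only features working from stale data while features that genuinely need the live service, such as checkout, can be switched off until it returns.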

4. Test Your Continuity Plans

Backup systems are only useful if they’re tested regularly. Drills, simulations, and recovery tests ensure your failover plans actually work when needed.
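
A drill can be as simple as switching off one component at a time in a test environment and recording whether the service still responds. The sketch below illustrates that loop; run_failover_drill and example_serve are hypothetical names, and the toy service stands in for a real test harness that would disable actual components.

```python
def run_failover_drill(components, serve):
    """Disable each component in turn and record whether `serve` still works.

    `serve(disabled=name)` is a hypothetical function that handles one
    request with the named component switched off, raising on failure.
    """
    report = {}
    for name in components:
        try:
            serve(disabled=name)
            report[name] = "survived"
        except Exception as exc:
            report[name] = f"FAILED: {exc}"
    return report


def example_serve(disabled):
    """Toy service: the database has a replica, the payment API has no fallback."""
    if disabled == "payment-api":
        raise RuntimeError("no fallback for payment-api")
    # Losing either database node is fine because the other takes over.
    return "ok"
```

Running the drill against the toy service flags the payment API as a single point of failure while both database nodes pass, which is the kind of finding you want from a rehearsal rather than from a real outage.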

5. Communicate Transparently

During outages, silence damages trust. Quick, clear communication, both internally and to clients, helps maintain confidence even when systems falter.

A Wake-Up Call for the Digital Age

The 20 October AWS outage may have lasted only hours, but its impact will echo for years. It highlighted how dependent the global economy has become on a few major cloud infrastructures, and how quickly the digital world can stall when one of them falters.

For businesses, it’s not a reason to abandon the cloud; it’s a reason to use it more wisely.

Resilience isn’t built overnight, but every company can take the first step by understanding their cloud architecture, identifying vulnerabilities, and planning for continuity.

Because in today’s world, being “always online” isn’t just a convenience; it’s a responsibility.

If your organisation wants to review its cloud architecture or continuity strategy, Logixal’s experts can help identify vulnerabilities and strengthen resilience.
