DNS: The Invisible Outage That Takes Down Businesses

October 21, 2016. A Friday morning that started like any other.

Then, across the United States, the internet started breaking. Not slowly - all at once. Twitter wouldn’t load. Netflix was down. Spotify, silent. Reddit, unreachable. PayPal, Amazon, Airbnb, The New York Times, GitHub - all inaccessible.

The strange part? Every one of these services’ servers was running perfectly. Their code was fine. Their infrastructure was healthy.

The problem was something most people had never heard of: a massive DDoS attack on Dyn, a DNS provider that served as the “phone book” for a significant portion of the internet.

Without DNS, it didn’t matter that these websites were online. Nobody could find them.

The Phone Book of the Internet

To understand why DNS failures are so devastating, you need to understand what DNS does.

When you type “amazon.com” into your browser, your computer doesn’t actually know where Amazon is. It needs to ask: “What’s the IP address for amazon.com?” That question goes to a DNS server, which responds with something like “52.94.236.248” - the actual location of Amazon’s servers.
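The lookup described above can be reproduced with Python’s standard library. This is a minimal sketch: resolve_ipv4 is a hypothetical helper name, and it asks whichever resolver the operating system is configured to use, so the addresses it returns will differ by network and over time.

```python
import socket


def resolve_ipv4(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses the system resolver gives for hostname."""
    # getaddrinfo hands the name to the operating system's resolver, which asks DNS.
    results = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in results})
```

Calling resolve_ipv4("amazon.com") returns a list of IP addresses when DNS is healthy. When the name cannot be resolved, the call raises socket.gaierror, which is roughly what every client experienced during the Dyn outage.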

This lookup happens whenever your device doesn’t already have a fresh answer cached. It takes milliseconds, happens invisibly, and occurs billions of times per day.

DNS is the translation layer between human-readable domain names and machine-readable IP addresses. Without it, the internet essentially stops working for anyone who doesn’t have IP addresses memorized.

The Single Point of Failure

Here’s the alarming reality: according to CSC’s 2024 Domain Security Report, only 17% of companies in the Global 2000 employ DNS redundancy.

That means 83% of the world’s largest companies have DNS as a single point of failure.

When that single point fails, everything fails - even if every other system is operating perfectly.

DNS Configuration Errors: The 70% Problem

Not every DNS outage comes from external attacks. DNS configuration errors account for approximately 70% of all DNS-related downtime incidents.

These errors include:

Typos in DNS records - Mastercard had a typo in their nameserver records for nearly five years, referencing “akam.ne” instead of “akam.net”
Expired domains - Forgetting to renew a domain means your entire online presence vanishes
Incorrect TTL settings - Making changes take hours or days longer than necessary
Missing or misconfigured MX records - Breaking email without realizing it
Dangling CNAME records - Creating security vulnerabilities attackers can exploit

A single misplaced character can make your entire business unreachable.
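A typo like the one in the nameserver example can be caught with a simple suffix check. The sketch below is one possible approach, not how any particular company audits its records: find_suspicious_nameservers is a hypothetical helper, and the list of expected suffixes is something you would have to maintain yourself.

```python
def find_suspicious_nameservers(nameservers, expected_suffixes):
    """Return the nameservers whose hostname does not end with an expected suffix."""
    suspicious = []
    for ns in nameservers:
        host = ns.rstrip(".").lower()  # ignore the trailing root dot and letter case
        matches = any(
            host == suffix or host.endswith("." + suffix)
            for suffix in expected_suffixes
        )
        if not matches:
            suspicious.append(ns)
    return suspicious


# Hypothetical hostnames; only the "akam.ne" vs "akam.net" typo comes from the text.
records = ["a1-29.akam.net.", "a9-64.akam.ne"]
print(find_suspicious_nameservers(records, ["akam.net"]))  # ['a9-64.akam.ne']
```

Running a check like this whenever nameserver records change would flag the one-character mistake immediately rather than years later.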

The Cost of DNS Failure

The financial impact of DNS outages is staggering.

According to the 2020 Global DNS Threat Report, 79% of organizations experienced DNS attacks, with the average cost of each attack standing at $924,000. In North America, that figure rises to $1,073,000 per attack.

For prolonged outages, the numbers get worse. A 16-hour DNS resolution issue cost impacted companies an average of $8.64 million each.
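To put that figure in per-minute terms, here is the arithmetic, assuming the cost accrues evenly across the outage (a simplification; real losses are rarely linear):

```python
total_cost_usd = 8_640_000  # average cost per company, from the figure above
outage_hours = 16

cost_per_hour = total_cost_usd / outage_hours
cost_per_minute = cost_per_hour / 60

print(f"${cost_per_hour:,.0f} per hour")      # $540,000 per hour
print(f"${cost_per_minute:,.0f} per minute")  # $9,000 per minute
```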

And these attacks are frequent. Organizations face an average of more than 9.5 DNS attacks per year.

The Cascade Effect

DNS failures don’t just affect websites. They cascade through every connected system:

Email stops working - MX records resolve through DNS
APIs fail - Services can’t find each other
Authentication breaks - OAuth and SSO rely on DNS
CDNs become unreachable - Even cached content becomes inaccessible
Internal tools go dark - If they rely on domain names

When AWS experienced a DNS resolution failure on October 20, 2025, affecting its us-east-1 region, the cascade brought down services across the internet. The root cause wasn’t failed server hardware - it was a DNS resolution failure for the DynamoDB API endpoint, which triggered a chain reaction affecting millions of users globally.

When the Internet Broke: Major DNS Incidents

The Dyn Attack (2016)

The Dyn DDoS attack remains one of the most significant internet outages in history. Attackers used a botnet of compromised IoT devices - cameras, smart TVs, baby monitors - running malware called Mirai to flood Dyn’s servers with traffic.

The attack disrupted access to Twitter, Netflix, Spotify, PayPal, Amazon, Reddit, GitHub, and countless other services. It demonstrated how concentrated DNS infrastructure creates systemic risk for the entire internet.

The Cloudflare Outage (November 2025)

On November 18, 2025, Cloudflare suffered a global outage that affected roughly one in five webpages at its peak. One-third of the world’s 10,000 most popular websites became inaccessible, including X (formerly Twitter), ChatGPT, Spotify, Canva, Zoom, and Coinbase.

The root cause wasn’t an attack - it was a database permissions change that caused duplicate rows in a configuration file. A simple internal error brought down a significant portion of the internet.

The Cloudflare 1.1.1.1 Outage (July 2025)

On July 14, 2025, Cloudflare’s 1.1.1.1 public DNS resolver went down for 62 minutes. Users and networks that relied solely on 1.1.1.1 for resolution could not look up domain names during that window, so most websites appeared unreachable to them even though the sites themselves were up.

The Pattern

These incidents share a common theme: concentration of DNS infrastructure creates outsized risk. When everyone relies on the same few providers, a single failure affects millions.

The DNS Propagation Problem

Even when DNS is working correctly, changes don’t happen instantly. This is called DNS propagation - the time it takes for DNS changes to spread across the internet.

Understanding TTL

Every DNS record has a TTL (Time To Live) - a value in seconds that tells DNS resolvers how long to cache that record before checking for updates.

Common TTL values:

TTL Setting	Duration	Use Case
300 seconds	5 minutes	Dynamic/frequently changing records
3600 seconds	1 hour	Standard websites
86400 seconds	24 hours	Stable infrastructure

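When you change a DNS record, resolvers that have the old answer cached keep serving it until its TTL expires, and in the worst case it can take 48-72 hours for all DNS servers worldwide to reflect the update. During this window, some users see the old record and some see the new one.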
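The caching behavior behind TTL can be illustrated with a toy resolver cache. This is a simplified sketch, not how any real resolver is implemented: TtlCache and its lookup callback are hypothetical names, and the clock is injectable so the behavior can be shown without waiting.

```python
import time


class TtlCache:
    """Toy resolver cache: remembers an answer until its TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (answer, expiry time)

    def resolve(self, name, lookup):
        """Return the cached answer, or call lookup(name) -> (answer, ttl_seconds)."""
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: the authoritative server is not asked
        answer, ttl = lookup(name)
        self._entries[name] = (answer, now + ttl)
        return answer
```

With a 300-second TTL, a record changed one second after it was cached keeps being served in its old form for the remaining 299 seconds. Multiply that by thousands of independent resolvers, each of which cached the record at a different moment, and you get the mixed old-and-new window described above.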

Why Propagation Takes So Long

Several factors affect propagation time:

Your current TTL setting - If your TTL is 24 hours, a resolver that has just cached the record won’t check for updates for another 24 hours
ISP caching behavior - Some ISPs ignore TTL and cache longer than they should
Geographic distribution - Different regions update at different rates
Resolver diversity - Users on different networks may see different results

The Propagation Strategy

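For planned DNS changes, best practice is to lower your TTL to 30-300 seconds at least 48 hours before making the change. The lead time needs to be at least as long as the old TTL, so that every resolver has discarded the long-lived record and picked up the short one. Then, when you make the actual change, cached copies of the old record expire within minutes.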

For emergency situations - like a DDoS attack requiring IP address changes - a high TTL means you’re stuck with the problem for hours or days while propagation completes.
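The timing behind this strategy can be worked out with a few lines of date arithmetic. The sketch below assumes every resolver honors TTL, which the previous section notes is not always true; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def earliest_safe_change(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Time by which every TTL-respecting cache has dropped the long-TTL record."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def worst_case_stale_window(current_ttl_seconds: int) -> timedelta:
    """Longest a TTL-respecting resolver can keep serving the old answer after a change."""
    return timedelta(seconds=current_ttl_seconds)


lowered = datetime(2025, 1, 1, 9, 0)
print(earliest_safe_change(lowered, 86400))  # 2025-01-02 09:00:00
print(worst_case_stale_window(300))          # 0:05:00
print(worst_case_stale_window(86400))        # 1 day, 0:00:00
```

The last two lines show the difference between a planned and an unplanned change: with the TTL already lowered, stale answers clear in five minutes, while an emergency change against a 24-hour TTL can leave some users on the old address for a full day. Waiting longer than the old TTL before the change adds margin for resolvers that cache longer than they should.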

DNS Security: The Attacks You Don’t See

Beyond outages, DNS is increasingly targeted by sophisticated attackers.

DNS Hijacking

According to security research, 47% of organizations have experienced DNS hijacking - where attackers redirect your domain to their servers. Victims think they’re visiting your website, but they’re actually on an attacker-controlled page designed to steal credentials or distribute malware.

In 2025, a threat actor called Hazy Hawk exploited abandoned DNS records to hijack subdomains belonging to the CDC, Deloitte, PwC, and Ernst & Young. The attackers exploited “dangling” CNAME records pointing to abandoned cloud resources.
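A basic check for dangling CNAME records can be sketched as follows. This is an illustrative approach, not the method used by any specific tool: find_dangling_cnames and resolves are hypothetical helpers, and the record inventory is assumed to come from your DNS provider’s export.

```python
import socket


def resolves(hostname):
    """Return True if the system resolver can resolve hostname."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def find_dangling_cnames(cname_records, target_resolves=resolves):
    """Return (name, target) pairs whose CNAME target no longer resolves.

    cname_records maps each record name to its CNAME target.
    target_resolves is a callable(hostname) -> bool.
    """
    return [
        (name, target)
        for name, target in sorted(cname_records.items())
        if not target_resolves(target)
    ]
```

This only catches targets that have stopped resolving entirely. A target that still resolves but points at a cloud resource nobody owns anymore needs a provider-specific check, so treat this as a first-pass filter.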

DNS Poisoning

DNS poisoning attacks corrupt the DNS cache, causing legitimate queries to return malicious IP addresses. A China-linked campaign ran DNS poisoning attacks for over two years (2022-2024), compromising an ISP to push malicious software through fake updates.

The Scale of DNS Attacks

The numbers are sobering:

Nearly 90% of organizations experienced DNS attacks in the past year
The average cost per attack: approximately $950,000
In Q1 2024 alone, there were 1.5 million DNS DDoS attacks
82% of businesses suffered application outages from DNS intrusions

Why Standard Monitoring Misses DNS Issues

Here’s the insidious part about DNS failures: your internal monitoring often can’t detect them.

If your monitoring system is inside your network, it likely has the correct DNS information cached or uses internal DNS servers. It will report that everything is fine while customers worldwide can’t reach you.

DNS problems are often:

Regional - Affecting some geographic areas but not others
ISP-specific - Affecting users on certain networks
Resolver-dependent - Different DNS providers show different results
Intermittent - Appearing and disappearing as caches expire

The most reliable way to catch these issues is to monitor from outside your infrastructure, from multiple geographic locations, using the same DNS resolution path your customers use.

Protecting Your Business from DNS Failure

Given the critical role DNS plays, what can businesses do to reduce risk?

1. Use Multiple DNS Providers

Don’t rely on a single DNS provider. If Cloudflare goes down and that’s your only DNS, you go down too. Configure secondary DNS providers as backup.
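One rough way to check whether a domain’s nameservers span more than one provider is to group them by parent domain. The sketch below uses a naive heuristic (the last two labels of each hostname) and hypothetical function names. It miscounts suffixes such as co.uk, and a single provider that spreads its nameservers across several domains would be counted more than once.

```python
from collections import defaultdict


def group_nameservers_by_provider(nameservers):
    """Group nameserver hostnames by their last two labels (a naive provider guess)."""
    groups = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        provider = ".".join(labels[-2:])
        groups[provider].append(ns)
    return dict(groups)


def has_provider_redundancy(nameservers):
    """Return True if the nameservers appear to belong to at least two providers."""
    return len(group_nameservers_by_provider(nameservers)) >= 2


# Hypothetical nameserver hostnames for illustration.
single = ["ns1.provider-a.example", "ns2.provider-a.example"]
print(has_provider_redundancy(single))                               # False
print(has_provider_redundancy(single + ["ns1.provider-b.example"]))  # True
```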

2. Monitor DNS Externally

Monitor your DNS records from outside your network. Check that your domain resolves correctly from multiple geographic locations and multiple DNS resolvers.
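A check against several specific resolvers can be sketched with only the standard library by sending a raw DNS query over UDP. This is a deliberately minimal illustration with hypothetical function names: it handles only A records, does not verify the response ID, does not retry over TCP when a response is truncated, and only counts as external monitoring if it runs from machines outside your own network.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query packet asking for the A record of name."""
    # Header: id, flags (recursion desired), 1 question, no other sections.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(packet: bytes, offset: int) -> int:
    """Return the offset just past the (possibly compressed) name at offset."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if (length & 0xC0) == 0xC0:  # a compression pointer is two bytes and ends the name
            return offset + 2
        offset += 1 + length


def parse_a_records(packet: bytes) -> list[str]:
    """Extract IPv4 addresses from the answer section of a DNS response."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = _skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", packet[offset:offset + 10])
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record carrying a 4-byte IPv4 address
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def query_resolver(resolver_ip: str, name: str, timeout: float = 3.0) -> list[str]:
    """Ask one specific resolver for the A records of name over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver_ip, 53))
        response, _ = sock.recvfrom(512)
    return sorted(parse_a_records(response))


def check_resolvers(name, expected, resolver_ips, query=query_resolver):
    """Return {resolver_ip: problem} for resolvers that fail or disagree with expected."""
    problems = {}
    for ip in resolver_ips:
        try:
            answers = query(ip, name)
        except OSError as exc:  # includes timeouts
            problems[ip] = f"no answer: {exc}"
            continue
        if set(answers) != set(expected):
            problems[ip] = f"unexpected answer: {answers}"
    return problems
```

A call such as check_resolvers("example.com", expected_ips, ["1.1.1.1", "8.8.8.8"]) returns an empty dictionary when every resolver agrees with the addresses you expect. Any entry in the result is a resolver that timed out or returned something different, which is the signal internal monitoring tends to miss.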

3. Keep TTLs Appropriate

Use lower TTLs (5-15 minutes) for records you might need to change quickly
Avoid TTLs of 24 hours or more on critical production records that you may need to change in an emergency
Lower TTLs before planned changes

4. Audit DNS Records Regularly

Remove unused records (especially CNAMEs pointing to abandoned services)
Verify all records point where they should
Check for unauthorized changes
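The audit steps above can be partly automated by comparing the current records against a known-good snapshot. The sketch below assumes you can export your records as a mapping from (name, record type) to a set of values; diff_records is a hypothetical helper, and the records shown are made up.

```python
def diff_records(baseline, current):
    """Compare two DNS record snapshots.

    Each snapshot maps (name, record_type) -> set of values.
    """
    added = {key: current[key] for key in current.keys() - baseline.keys()}
    removed = {key: baseline[key] for key in baseline.keys() - current.keys()}
    changed = {
        key: (baseline[key], current[key])
        for key in baseline.keys() & current.keys()
        if baseline[key] != current[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


baseline = {("www.example.com", "A"): {"192.0.2.10"}}
current = {
    ("www.example.com", "A"): {"198.51.100.7"},           # value changed
    ("old.example.com", "CNAME"): {"app.cloud.example"},  # record nobody approved
}
report = diff_records(baseline, current)
print(report["changed"])  # {('www.example.com', 'A'): ({'192.0.2.10'}, {'198.51.100.7'})}
print(report["added"])    # {('old.example.com', 'CNAME'): {'app.cloud.example'}}
```

Any non-empty entry in the report is a change that should match an approved request. If it doesn’t, it is either an error or an unauthorized modification.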

5. Secure Your Domain Registrar

Your domain registrar holds the keys to your kingdom. Enable:

Two-factor authentication
Domain lock
Alert notifications for any changes
Registry lock for high-value domains

6. Plan for DNS Incidents

Have a playbook for DNS failures:

How will you detect the problem?
Who has access to make DNS changes?
What’s your communication plan during an outage?
How long will propagation take for emergency changes?

The Invisible Foundation

DNS is the invisible foundation of the internet. When it works, nobody thinks about it. When it fails, nothing else matters.

The businesses that stay online aren’t the ones who assume DNS will always work. They’re the ones who acknowledge its fragility and build redundancy, monitoring, and incident response around it.

Your servers might be running perfectly. Your code might be flawless. But if nobody can find you, none of that matters.

Don’t let DNS be your blind spot. FlareWarden monitors your DNS records from multiple global locations and alerts you when resolution fails - before your customers notice.