99.9% Uptime SLAs: What They Actually Guarantee

99.9% uptime allows 8+ hours of downtime per year. Learn what SLA fine print means and how to evaluate vendor promises.

FlareWarden Team
8 min read

“We guarantee 99.9% uptime.”

It sounds impressive. It sounds like near-perfection. It’s become such a standard claim that most businesses nod along without questioning what it actually means.

Here’s what most people don’t realize: 99.9% uptime allows for over 8 hours of downtime per year. And that might be the least surprising thing hiding in your vendor’s SLA.

The Math Behind the Nines

The difference between uptime percentages isn’t linear - it’s logarithmic. Each additional “nine” cuts the permitted downtime by a factor of 10. Here’s what each level actually permits:

Uptime %   Common Name   Downtime/Year   Downtime/Month   Downtime/Day
99%        Two nines     3.65 days       7.31 hours       14.4 minutes
99.9%      Three nines   8.76 hours      43.8 minutes     1.44 minutes
99.99%     Four nines    52.6 minutes    4.38 minutes     8.6 seconds
99.999%    Five nines    5.26 minutes    26.3 seconds     0.86 seconds

Source: uptime.is

A simple way to remember: five nines allows approximately 5 minutes of downtime per year. Each fewer nine multiplies that by 10.

That “99.9% guarantee” your hosting provider advertises? It means they can be down for 43 minutes every single month and still technically meet their SLA.
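
The figures in the table follow directly from the percentage. Here is a minimal sketch of that arithmetic, assuming a 365-day year and an average month of one twelfth of a year. The name `allowed_downtime` and the constants are made up for illustration.

```python
from datetime import timedelta

# Assumed period lengths: a 365-day year and an "average" month of 1/12 year.
YEAR = timedelta(days=365)
MONTH = YEAR / 12
DAY = timedelta(days=1)


def allowed_downtime(uptime_percent: float, period: timedelta) -> timedelta:
    """Return the downtime that an uptime percentage permits over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period * (1 - uptime_percent / 100)


# Print the yearly and monthly allowance for each level in the table.
for percent in (99.0, 99.9, 99.99, 99.999):
    print(percent, allowed_downtime(percent, YEAR), allowed_downtime(percent, MONTH))
```

Swapping in a different `period` (a billing month, a quarter) shows how much downtime a provider can accumulate in that window before the SLA is technically breached.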

What the Major Cloud Providers Actually Promise

Let’s look at what the big three actually guarantee:

Amazon Web Services

AWS’s compute SLA offers different guarantees based on your architecture:

  • Multi-AZ deployments: 99.99% (52 minutes/year)
  • Single-instance in one AZ: 99.5% (1.83 days/year)

That’s a massive difference. If you’re running a single EC2 instance without redundancy, AWS only promises to be up 99.5% of the time - allowing for nearly 44 hours of downtime per year.

Microsoft Azure

Azure’s VM SLA similarly varies by configuration:

  • VMs across Availability Zones: 99.99%
  • Single-instance VMs with premium storage: 99.9%

Google Cloud

Google Compute Engine promises 99.99% for instances deployed across multiple zones.

The pattern is clear: high availability SLAs require you to architect for redundancy. A single server with a 99.9% SLA has a very different risk profile than a distributed system with 99.99%.
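
One way to see why redundancy moves the numbers: if replicas failed independently (a simplifying assumption that real outages often violate), the chance that all of them are down at once is the product of their individual unavailabilities. This is a rough sketch of that reasoning, not how any provider derives its SLA, and `combined_availability` is an illustrative name.

```python
def combined_availability(replica_availability: float, replicas: int) -> float:
    """Availability of N redundant replicas, ASSUMING independent failures.

    The service counts as up while at least one replica is up.
    Availabilities are fractions, e.g. 0.995 for 99.5%.
    """
    if not 0.0 <= replica_availability <= 1.0:
        raise ValueError("replica_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    all_down = (1.0 - replica_availability) ** replicas
    return 1.0 - all_down


# Two replicas that are each available 99.5% of the time.
print(combined_availability(0.995, 2))
```

Correlated failures (a shared region, a shared deployment pipeline, a shared DNS provider) make the real figure lower than this idealized one.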

The Fine Print That Changes Everything

Here’s where SLAs get interesting - and by interesting, I mean concerning.

Exclusions That Void the Guarantee

Most cloud provider SLAs include exclusions that can void your protection entirely. Common exclusions include:

  • Force majeure events - Natural disasters, wars, government actions
  • Internet access problems - Issues outside the provider’s network
  • Customer actions or inactions - Including configuration errors
  • Customer equipment or software - Problems in your code or infrastructure
  • Scheduled maintenance - If they notify you in advance, it doesn’t count
  • Third-party failures - Services they depend on but don’t control

That last one is particularly important. If your website goes down because of a DNS provider failure, your hosting company may argue it wasn’t their fault - even though your customers still couldn’t reach you.

Architecture Requirements

This is the gotcha that catches many businesses: credits only apply if you’ve architected correctly.

If your application goes down because you deployed a single instance in one availability zone and that zone has an outage, the 99.99% multi-AZ commitment never applied to you - only the weaker single-instance commitment did.

The 99.99% guarantee often requires:

  • Deployment across multiple availability zones
  • Proper load balancing configuration
  • Redundant database instances
  • Specific storage configurations

Running a simpler architecture? You’re likely covered by a much lower SLA than the headline number suggests.

How Uptime Is Measured

Different providers measure “unavailability” differently:

  • Network-level availability - The server is reachable
  • Service-level availability - The application responds
  • End-user availability - Customers can actually use the service

A provider might measure their SLA at the network level while your users experience application-level problems. The infrastructure is “up” by their definition while your business is effectively down.

Request Requirements

Here’s something many businesses don’t realize: SLA credits aren’t automatic.

Google Cloud requires customers to notify technical support within 30 days of becoming eligible for a credit and to provide log files showing downtime periods with dates and times. Failure to comply forfeits your right to receive credits.

Most providers require you to:

  1. Detect and document the outage yourself
  2. File a support ticket within a specified window
  3. Provide evidence of the downtime
  4. Wait for the provider to validate your claim

If you don’t have monitoring in place to detect and document outages, you may never know you were eligible for credits.
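
The "detect and document" steps can be sketched as turning a log of monitoring checks into outage windows you can attach to a claim. This is a minimal illustration that assumes you already record a timestamp and a pass/fail result per check. All names are hypothetical, and `window_days` must come from your provider's SLA, as must the event that starts the clock.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple

Check = Tuple[datetime, bool]        # (time of check, did it succeed?)
Outage = Tuple[datetime, datetime]   # (first failed check, first good check after it)


def outage_windows(checks: Iterable[Check]) -> List[Outage]:
    """Collapse time-ordered check results into outage windows.

    An outage still in progress at the final check is closed at that
    final check's timestamp.
    """
    windows: List[Outage] = []
    started: Optional[datetime] = None
    last_seen: Optional[datetime] = None
    for when, ok in checks:
        last_seen = when
        if not ok and started is None:
            started = when                      # outage begins
        elif ok and started is not None:
            windows.append((started, when))     # outage ends
            started = None
    if started is not None and last_seen is not None:
        windows.append((started, last_seen))
    return windows


def total_downtime(windows: Iterable[Outage]) -> timedelta:
    """Sum the length of all outage windows."""
    return sum((end - start for start, end in windows), timedelta())


def claim_deadline(clock_start: datetime, window_days: int) -> datetime:
    """Last moment to file, given when the provider's claim clock started."""
    return clock_start + timedelta(days=window_days)
```

The resolution of the windows is only as good as the check interval: with one check per minute, each boundary can be off by up to a minute.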

The Compensation Gap: Why SLA Credits Don’t Cover Your Losses

Let’s talk about what happens when the SLA is breached.

The Math of SLA Credits

Here’s a realistic scenario: Your business uses a small AWS instance costing $3 per month. The instance goes down for 6 hours due to a provider issue.

Six hours out of roughly 730 in a month works out to about 99.2% uptime. Since the monthly uptime is still above 99%, you receive 10% of your monthly bill as credit - approximately 30 cents.

Meanwhile, those 6 hours may have cost your business thousands in lost sales, damaged reputation, and customer support overhead.

Even in a worst-case scenario where the resource is down for more than 36 hours, you’d only receive a full refund of that resource’s monthly cost - still nothing compared to actual business losses.
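
The scenario above can be sketched as a tiered lookup. The 10% tier (uptime still above 99%) and the full refund (more than about 36 hours down, which is roughly 95% monthly uptime) come from the text. The 30% middle tier and the 730-hour month are assumptions for illustration, and the 99.5% default is the single-instance figure quoted earlier. Check your provider's current SLA for the real schedule.

```python
HOURS_PER_MONTH = 730.0  # assumed average month: 365 days * 24 hours / 12


def monthly_uptime_percent(downtime_hours: float,
                           hours_in_month: float = HOURS_PER_MONTH) -> float:
    """Uptime for the month as a percentage."""
    return 100.0 * (1 - downtime_hours / hours_in_month)


def credit_percent(uptime_percent: float, sla_percent: float = 99.5) -> int:
    """Share of the monthly bill credited for a given monthly uptime."""
    if uptime_percent >= sla_percent:
        return 0        # SLA met, no credit
    if uptime_percent >= 99.0:
        return 10       # tier described in the scenario
    if uptime_percent >= 95.0:
        return 30       # ASSUMED middle tier, not stated in the article
    return 100          # full refund of that resource's monthly cost


def sla_credit(monthly_bill: float, downtime_hours: float) -> float:
    """Credit in the same currency as monthly_bill."""
    uptime = monthly_uptime_percent(downtime_hours)
    return monthly_bill * credit_percent(uptime) / 100


# The $3 instance that was down for 6 hours.
print(sla_credit(3.0, 6))
```

Whatever the exact tiers, the credit is bounded by the bill for the affected resource, which is why the result stays small even for long outages.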

Real-World Disparity

One analysis found a case where a SaaS provider offered $3,200 in service credits for an outage that caused over $2 million in actual customer losses - roughly 0.16% of the real impact.

This isn’t an anomaly. Downtime costs large businesses an average of $9,000 per minute, while SLA credits are capped at monthly subscription fees, so the gap between compensation and losses is inherent to how SLAs are structured.

The “Sole Remedy” Clause

Most enterprise SLAs include language making credits your “sole and exclusive remedy” for any unavailability. This limits your legal recourse and ensures you can’t seek damages beyond the credit amount.

The SLA isn’t designed to make you whole after an outage. It’s designed to give the provider an incentive to maintain service and to serve as a marketing signal that they take reliability seriously.

The 100% Uptime Myth

Some providers advertise “100% uptime guarantees.” Be skeptical.

One analysis found a SaaS provider promising “100% uptime” whose SLA only provided compensation after 0.05% downtime per month - more than 20 minutes of allowed downtime.

True 100% uptime is practically impossible. Even the most reliable systems experience occasional issues. A provider claiming 100% is either:

  • Using creative definitions of “uptime”
  • Hiding the real terms in fine print
  • Making a promise they can’t keep

What Actually Matters When Evaluating SLAs

Given all these caveats, here’s how to actually evaluate vendor reliability:

1. Look at Track Record, Not Just Promises

Historical uptime data matters more than SLA promises. Ask vendors for:

  • Actual uptime statistics over the past 12-24 months
  • Incident history and post-mortems
  • Status page transparency

A vendor with a 99.9% SLA who has actually delivered 99.99% is better than one promising 99.99% with a history of outages.
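
To compare a vendor's track record with its promise, you can turn an incident history into a delivered uptime figure. This is a small sketch that assumes you have the duration of each incident and that incidents do not overlap. The function name is illustrative.

```python
from datetime import timedelta
from typing import Iterable


def achieved_uptime_percent(incident_durations: Iterable[timedelta],
                            period: timedelta) -> float:
    """Uptime actually delivered over a period, from incident durations.

    Assumes incidents do not overlap; overlapping incidents would be
    double-counted.
    """
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    downtime = sum(incident_durations, timedelta())
    downtime = min(downtime, period)  # cannot be down longer than the period
    return 100.0 * (1 - downtime / period)


# Hypothetical history: two incidents in the past 12 months.
history = [timedelta(minutes=30), timedelta(minutes=22, seconds=34)]
print(achieved_uptime_percent(history, timedelta(days=365)))
```

Running this over 12 and 24 months of a vendor's published incidents gives a number you can set directly against the SLA percentage.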

2. Understand the Architecture Requirements

Ask specifically:

  • What architecture is required to qualify for the headline SLA?
  • What’s the SLA for single-region or single-zone deployments?
  • What configurations void the guarantee?

3. Read the Exclusions

Identify what’s explicitly excluded:

  • Scheduled maintenance windows
  • Third-party dependencies
  • “Customer-caused” issues
  • Force majeure

4. Understand the Claim Process

Know before you need it:

  • How long do you have to file a claim?
  • What documentation is required?
  • How is “downtime” measured?

5. Calculate Your Actual Risk

Do the math for your business:

  • How much does an hour of downtime cost you?
  • How does that compare to maximum SLA credits?
  • What’s your risk exposure beyond what’s covered?
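
The questions above reduce to a short calculation. This is a minimal sketch, assuming credits are capped at some fraction of the monthly bill (the full bill by default). The function names and example figures are hypothetical; plug in your own.

```python
def outage_loss(cost_per_hour: float, outage_hours: float) -> float:
    """Estimated business loss for an outage of the given length."""
    return cost_per_hour * outage_hours


def max_sla_credit(monthly_bill: float, credit_cap_fraction: float = 1.0) -> float:
    """Largest credit the SLA can pay, ASSUMING a cap tied to the monthly bill."""
    return monthly_bill * credit_cap_fraction


def uncovered_loss(cost_per_hour: float, outage_hours: float,
                   monthly_bill: float, credit_cap_fraction: float = 1.0) -> float:
    """Loss left over after the maximum possible credit, never below zero."""
    loss = outage_loss(cost_per_hour, outage_hours)
    credit = max_sla_credit(monthly_bill, credit_cap_fraction)
    return max(loss - credit, 0.0)


# Hypothetical business: $2,000 per hour of downtime, a $500 monthly bill,
# and a 4-hour outage.
print(uncovered_loss(cost_per_hour=2_000, outage_hours=4, monthly_bill=500))
```

If the uncovered figure is a large share of the loss, the SLA is not what protects you, and the spend belongs in monitoring and architecture instead.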

Building Your Own Safety Net

Given that SLAs are marketing tools more than insurance policies, smart businesses build their own protection:

Monitor Independently

Don’t rely on your provider’s status page to know when there’s a problem. External monitoring from multiple locations gives you:

  • Early warning of issues
  • Documentation for SLA claims
  • Data your provider might not report
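
A bare-bones external check can be sketched with Python’s standard library. It records a timestamped result per request, which is the raw material for an SLA claim. This is an illustration, not a production monitor: the `CheckResult` type and function names are made up, "healthy" is assumed to mean any 2xx or 3xx status, and it runs from a single location, so you would need to run it from several hosts to match the advice above.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class CheckResult:
    url: str
    checked_at: datetime      # UTC time the check started
    ok: bool
    status: Optional[int]     # HTTP status, if a response arrived
    error: Optional[str]      # description of the failure, if none arrived


def http_status(url: str, timeout: float = 10.0) -> int:
    """Fetch the URL and return its HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx/5xx responses still carry a status code


def check(url: str, fetch: Callable[[str], int] = http_status) -> CheckResult:
    """Run one check and record what happened and when."""
    now = datetime.now(timezone.utc)
    try:
        status = fetch(url)
    except Exception as exc:  # DNS failure, timeout, refused connection, ...
        return CheckResult(url, now, False, None, repr(exc))
    return CheckResult(url, now, 200 <= status < 400, status, None)
```

The `fetch` parameter exists so the check logic can be exercised without network access. Calling `check("https://example.com")` on a schedule and appending each result to a log gives you the evidence trail.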

Architect for Failure

Assume things will break. Design systems that:

  • Span multiple availability zones
  • Have automated failover
  • Can operate in degraded mode
  • Recover quickly from failures

Have a Backup Plan

For critical services, consider:

  • Multi-cloud redundancy for essential systems
  • Geographic distribution across providers
  • Documented procedures for provider failures

Document Everything

When outages occur:

  • Log exact start and end times
  • Screenshot error messages and status pages
  • Save any communication from the provider
  • Calculate actual business impact

The Bottom Line

An SLA is not insurance. It’s not a guarantee that you won’t experience downtime. It’s a baseline commitment with significant limitations, exclusions, and caps on compensation.

The 99.9% uptime guarantee that sounds impressive allows for 8+ hours of annual downtime, may not cover single-server deployments, excludes many common failure scenarios, and compensates you with pennies when breached.

Smart businesses treat SLAs as one data point among many when evaluating vendors - not as protection against the real cost of downtime.

The only reliable protection against downtime impact is your own preparation: independent monitoring, resilient architecture, and business continuity planning that doesn’t depend on vendor compensation.


Want to know when your services are actually down - regardless of what your vendor’s status page says? FlareWarden monitors your infrastructure from outside your network and alerts you immediately, giving you the documentation you need for SLA claims and the early warning you need to respond.