What Is an SLA vs SLO vs SLI? (Clear Comparison)

The Quick Answer

Term	What It Is	Example
SLI	A measurement	"Availability is 99.95%"
SLO	A target	"Availability should be ≥ 99.9%"
SLA	A promise with consequences	"If availability < 99.5%, customer gets 10% credit"

SLI = What you measure
SLO = What you aim for internally
SLA = What you promise externally

SLI: Service Level Indicator

An SLI is a metric that quantifies some aspect of your service.

Common SLIs

Availability

SLI = (Successful requests / Total requests) × 100

Example: 99.95% of requests succeeded

Latency

SLI = Response time at a percentile

Example: p99 latency is 180ms
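A percentile latency SLI can be computed from raw response times. The sketch below is one way to do it, using the nearest-rank method; the function name `percentile` and the sample latencies are illustrative assumptions, and monitoring tools may interpolate differently and give slightly different values.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical response times in milliseconds.
latencies_ms = [120, 95, 180, 110, 140, 105, 160, 130, 100, 150]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50 = {p50}ms, p99 = {p99}ms")
```

With only ten samples, p99 is simply the slowest request. Percentile SLIs become meaningful only with enough traffic in the measurement window.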

Error Rate

SLI = (Failed requests / Total requests) × 100

Example: 0.1% of requests failed
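The availability and error rate formulas above can be sketched as two small functions. The function names and request counts here are hypothetical; the sketch also assumes every request counts as either a success or a failure, so the two values add up to 100%.

```python
def availability_pct(successful, total):
    """Availability SLI: share of requests that succeeded, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return successful / total * 100


def error_rate_pct(failed, total):
    """Error rate SLI: share of requests that failed, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total * 100


# Hypothetical counts for one measurement window.
total_requests = 200_000
failed_requests = 100
successful_requests = total_requests - failed_requests

availability = availability_pct(successful_requests, total_requests)
error_rate = error_rate_pct(failed_requests, total_requests)
print(f"availability = {availability:.2f}%, error rate = {error_rate:.2f}%")
```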

Throughput

SLI = Requests per second

Example: System handles 5,000 RPS

What Makes a Good SLI?

Good SLIs are:

Measurable: You can collect the data
Meaningful: They reflect user experience
Actionable: You can improve them
Specific: Clear definition, no ambiguity

Bad SLI: "The system is fast"
Good SLI: "p95 response time in milliseconds"

SLIs Measure User Experience

The best SLIs measure what users actually experience:

Can they reach the service?
Is it responding quickly?
Are requests succeeding?

Internal metrics (CPU, memory) aren't SLIs—they're diagnostic data. SLIs measure outcomes, not internals.

SLO: Service Level Objective

An SLO is a target value for an SLI. It defines "good enough."

SLO Examples

SLI	SLO
Availability	≥ 99.9%
p99 latency	≤ 200ms
Error rate	≤ 0.1%
Data freshness	≤ 5 minutes stale
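One way to make a table like this checkable is to store each SLO as data and compare it against the measured SLI. This is a minimal sketch: the `SLO` class, the SLI names, and the measured values are assumptions made for illustration (the 7-minute freshness reading is invented to show a miss).

```python
import operator
from dataclasses import dataclass

# Maps the comparison symbol used in the table to a Python operator.
COMPARATORS = {">=": operator.ge, "<=": operator.le}


@dataclass(frozen=True)
class SLO:
    sli_name: str
    comparator: str  # ">=" or "<="
    threshold: float
    unit: str

    def is_met(self, measured):
        return COMPARATORS[self.comparator](measured, self.threshold)


# The four targets from the table above.
SLOS = [
    SLO("availability", ">=", 99.9, "%"),
    SLO("p99_latency", "<=", 200, "ms"),
    SLO("error_rate", "<=", 0.1, "%"),
    SLO("data_freshness", "<=", 5, "min"),
]

# Hypothetical current SLI readings.
measured = {
    "availability": 99.95,
    "p99_latency": 180,
    "error_rate": 0.1,
    "data_freshness": 7,
}

results = {slo.sli_name: slo.is_met(measured[slo.sli_name]) for slo in SLOS}
for name, met in results.items():
    print(f"{name}: {'met' if met else 'MISSED'}")
```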

SLOs Are Internal Targets

SLOs are for your team, not your customers. They define:

When to prioritise reliability work
When to slow down feature development
When the service is "healthy enough"

Setting Good SLOs

Too aggressive: 99.99% availability when you can barely hit 99.5%

Too loose: 95% availability when users expect 99.9%

Good SLOs are:

Achievable with current architecture
Aligned with user expectations
Ambitious enough to drive improvement

Error Budgets

If your SLO is 99.9% availability, you have an error budget of 0.1%.

Over a 30-day month (43,200 minutes):

0.1% budget = 43.2 minutes of allowed downtime
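The same arithmetic as a short sketch, assuming a 30-day window. The function names and the 12 minutes of downtime are hypothetical, chosen only to show how the remaining budget falls out.

```python
def error_budget_minutes(slo_pct, period_days=30):
    """Allowed downtime in minutes for an availability SLO over the period."""
    period_minutes = period_days * 24 * 60
    return (100 - slo_pct) / 100 * period_minutes


def remaining_budget_minutes(slo_pct, downtime_minutes, period_days=30):
    """Budget left after the downtime so far (negative means overspent)."""
    return error_budget_minutes(slo_pct, period_days) - downtime_minutes


budget = error_budget_minutes(99.9)              # about 43.2 minutes
remaining = remaining_budget_minutes(99.9, 12)   # after 12 minutes of downtime
print(f"budget = {budget:.1f} min, remaining = {remaining:.1f} min")
```

A negative remaining budget is the signal, in the error budget model, to prioritise reliability work over new features.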

Error budgets let you:

Ship features (spending budget on risk)
Prioritise reliability (when budget is exhausted)
Have objective conversations about trade-offs

SLA: Service Level Agreement

An SLA is a contract that specifies consequences for missing service levels.

SLA Examples

"If monthly availability falls below 99.5%, affected customers receive a 10% service credit."

"If p95 latency exceeds 500ms for more than 1 hour, customer may terminate without penalty."
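The first clause above could be evaluated roughly as follows. This is only a sketch: the function name, the default threshold and credit percentage (taken from the example clause), and the 1,000 monthly fee are illustrative, and a real contract would define the measurement method and claim process in much more detail.

```python
def service_credit(monthly_availability_pct, monthly_fee,
                   threshold_pct=99.5, credit_pct=10.0):
    """Credit owed for the month: credit_pct of the fee if availability
    fell below the SLA threshold, otherwise nothing."""
    if monthly_availability_pct < threshold_pct:
        return monthly_fee * credit_pct / 100
    return 0.0


# Hypothetical customer paying 1,000 per month.
credit_bad_month = service_credit(99.2, 1000.0)   # below 99.5%: credit owed
credit_good_month = service_credit(99.7, 1000.0)  # at or above 99.5%: no credit
print(credit_bad_month, credit_good_month)
```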

SLAs vs SLOs

Aspect	SLO	SLA
Audience	Internal team	External customers
Consequences	Prioritisation decisions	Financial/legal penalties
Typical level	Stricter	Looser
Negotiation	Engineering decision	Business/legal decision

Why SLAs Are Looser Than SLOs

Smart companies set SLAs looser than their SLOs:

SLO: 99.9% availability (internal target)
SLA: 99.5% availability (external promise)

This buffer means:

You can miss your SLO without SLA violations
Customers still get reliable service
You have room for unexpected issues

SLAs Need Teeth

An SLA without consequences isn't an agreement—it's marketing.

Real SLAs define:

What's measured and how
The threshold for violation
What happens when violated (credits, refunds, termination rights)
How violations are reported and claimed

How They Work Together

SLI (Measurement)
    ↓
SLO (Target)
    ↓
SLA (Promise)

Example: An API Service

SLI Definition:

Availability = (2xx responses) / (total responses)
Latency = p99 response time
Measured every minute, aggregated monthly

SLO Targets:

Availability ≥ 99.9%
p99 latency ≤ 150ms

SLA Promise:

Availability ≥ 99.5% or 10% credit
p99 latency ≤ 300ms or 5% credit

The Flow

You measure availability (SLI): Currently 99.85%
You compare to target (SLO): Below 99.9%, need attention
You check against promise (SLA): Above 99.5%, no violation

Even though you missed your internal target, customers aren't impacted from an SLA perspective.
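The three-step flow can be sketched as a single check. The function name and status labels are hypothetical; the default thresholds are the SLO and SLA values from the API example.

```python
def evaluate_availability(measured_pct, slo_pct=99.9, sla_pct=99.5):
    """Classify a measured availability against the internal target (SLO)
    and the external promise (SLA)."""
    if measured_pct >= slo_pct:
        return "healthy"        # SLO met, nothing to do
    if measured_pct >= sla_pct:
        return "slo_missed"     # needs attention, but no SLA violation
    return "sla_violated"       # contractual consequences apply


status = evaluate_availability(99.85)
print(status)
```

For the 99.85% reading in the example, the result is `slo_missed`: the buffer between the SLO and the SLA is what keeps this from becoming a contractual problem.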

Common Mistakes

Mistake 1: No SLIs

Setting SLOs without clear measurement definitions leads to arguments about whether you're meeting them.

Fix: Define exactly how each SLI is calculated before setting SLOs.

Mistake 2: SLOs = SLAs

If your SLO equals your SLA, every SLO miss is also an SLA violation.

Fix: Build a buffer. SLA should be achievable even when you miss SLO.

Mistake 3: Too Many SLOs

Tracking 50 SLOs means none get focus.

Fix: Track 3-5 SLOs that capture user experience. Everything else is metrics, not objectives.

Mistake 4: SLOs Without Error Budgets

SLOs without error budgets are just numbers. There's no framework for decisions.

Fix: Calculate error budgets. Use them to balance reliability and velocity.

Mistake 5: Ignoring the User

SLOs based on internal metrics (CPU, memory) miss the point.

Fix: Base SLOs on what users experience—availability, latency, correctness.

Practical Implementation

Step 1: Choose Your SLIs

Start with availability and latency. Add more only if needed.

Step 2: Measure Baseline

What are your current SLI values? You need this before setting targets.

Step 3: Set SLOs

Based on:

Current performance
User expectations
Business requirements

Step 4: Calculate Error Budgets

Error budget = (1 - SLO) × time period

For 99.9% availability over 30 days:

Budget = 0.1% × 43,200 minutes = 43.2 minutes

Step 5: Define SLAs (If Needed)

Only if you have external customers. Set them looser than your SLOs.

Step 6: Monitor and Alert

Dashboard showing SLI values
Alerts when approaching SLO thresholds
Error budget burn rate tracking
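Burn rate tracking can be sketched as below. The article does not define burn rate, so this uses a common definition as an assumption: the observed error rate divided by the error rate the SLO allows. A burn rate of 1 would use up the budget exactly at the end of the period. The function names, request counts, and the alert threshold of 2 are all hypothetical.

```python
def burn_rate(failed, total, slo_pct):
    """Observed error rate divided by the error rate the SLO allows."""
    if total <= 0:
        raise ValueError("total must be positive")
    allowed_error_rate = (100 - slo_pct) / 100
    observed_error_rate = failed / total
    return observed_error_rate / allowed_error_rate


def hours_until_exhausted(rate, period_days=30):
    """Hours until a full budget is gone if this burn rate stays constant."""
    if rate <= 0:
        return float("inf")
    return period_days * 24 / rate


# Hypothetical last hour of traffic: 50 failures out of 10,000 requests.
rate = burn_rate(failed=50, total=10_000, slo_pct=99.9)
hours_left = hours_until_exhausted(rate)

ALERT_THRESHOLD = 2.0  # hypothetical; tune to your own paging policy
should_alert = rate >= ALERT_THRESHOLD
print(f"burn rate = {rate:.1f}, budget gone in about {hours_left:.0f} hours")
```

Alerting on burn rate rather than on the raw SLI means a short, sharp outage and a slow, steady leak can both be caught before the budget runs out.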

Summary

SLI (Service Level Indicator): A measurement of service behaviour.

Example: "Availability is 99.95%"

SLO (Service Level Objective): An internal target for an SLI.

Example: "Availability should be ≥ 99.9%"

SLA (Service Level Agreement): An external promise with consequences.

Example: "If availability < 99.5%, customer gets credit"

The hierarchy:

SLIs measure what matters
SLOs set targets for measurements
SLAs make promises based on those targets

Start with SLIs, set realistic SLOs, and only make SLA promises you can keep.
