Uptime Monitoring for Servers: External Checks + Internal Metrics

Who This Is For

This guide is for system administrators, DevOps engineers, and developers who manage servers (VPS, cloud VMs, bare metal) and need to ensure they stay online. Whether you run a single droplet or a fleet of EC2 instances, this guide covers effective server uptime monitoring.

What Is Server Uptime Monitoring?

Server uptime monitoring verifies that your servers remain accessible and healthy, and alerts you when they are not. It has two components:

External Monitoring (Is It Reachable?)

Checks from outside the server:

HTTP/HTTPS: Web server responding?
TCP: Service port open?
Ping (ICMP): Server reachable?
DNS: Domain resolving?

Internal Monitoring (Is It Healthy?)

Checks from inside the server:

CPU: Processing capacity available?
Memory: RAM not exhausted?
Disk: Storage space remaining?
Processes: Required services running?

You need both. A server can be reachable but unhealthy (high CPU). A server can be healthy but unreachable (network issue).

The Server Uptime Monitoring Stack

Layer 1: Network Reachability

Ping monitoring verifies basic connectivity:

Monitoring server → ICMP ping → Your server
Result: Server is reachable on the network

Catches: Network outages, routing issues, server crashes

Layer 2: Service Availability

TCP monitoring verifies services are listening:

Monitoring server → TCP connect port 443 → Your server
Result: HTTPS service is accepting connections

Catches: Service crashes, firewall misconfigurations, bind failures
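The TCP check above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not how any particular monitoring service implements it; the function name tcp_port_open and the 3-second default timeout are assumptions chosen for the example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (yourserver.com is the placeholder host used in this guide):
# tcp_port_open("yourserver.com", 443)
```

A successful connect only shows that something is listening on the port. It says nothing about whether the application behind it works, which is what Layer 3 is for.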

Layer 3: Application Health

HTTP monitoring verifies applications work:

Monitoring server → GET /api/health → Your server
Response: 200 OK, {"status": "healthy"}

Catches: Application bugs, dependency failures, configuration errors
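As a rough sketch of the HTTP check above, the following Python function requests a health URL and treats anything other than a 200 response with "status": "healthy" in the JSON body as a failure. The function name check_health and the 5-second timeout are assumptions for illustration; the /api/health path and response shape come from the example above.

```python
import json
import urllib.error
import urllib.request


def check_health(url: str, timeout: float = 5.0) -> bool:
    """Return True only for HTTP 200 with a JSON body whose status is "healthy"."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            body = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # URLError: connection failures and 4xx/5xx responses (HTTPError).
        # OSError: socket-level timeouts. ValueError: body is not valid JSON.
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"


# Example call (placeholder host from this guide):
# check_health("https://yourserver.com/api/health")
```

Checking the body as well as the status code catches the case where a proxy or error page returns 200 while the application itself is broken.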

Layer 4: Resource Health

Agent monitoring tracks server resources:

Agent on server → Collect CPU/Memory/Disk → Send to monitoring
Data: CPU 45%, Memory 72%, Disk 60%

Catches: Resource exhaustion before it causes outages
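To make the agent's job concrete, here is a minimal sketch of collecting memory and disk usage with the Python standard library. It is not the Wakestack agent's implementation. It assumes a Linux host (it reads /proc/meminfo), and it uses the one-minute load average per core as a stand-in for CPU pressure, since a true CPU percentage needs two samples over time. All function names are hypothetical.

```python
import os
import shutil


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def memory_used_percent(meminfo_text: str) -> float:
    """Compute used memory from the text of /proc/meminfo (Linux only)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    # MemAvailable estimates memory usable without swapping.
    return 100.0 * (1.0 - fields["MemAvailable"] / fields["MemTotal"])


def collect_snapshot() -> dict:
    """Gather one sample of each metric; a real agent would send this upstream."""
    with open("/proc/meminfo") as f:
        memory = memory_used_percent(f.read())
    one_minute_load, _, _ = os.getloadavg()  # Unix only
    return {
        "memory_percent": round(memory, 1),
        "disk_percent": round(disk_used_percent("/"), 1),
        "load_per_core": round(one_minute_load / (os.cpu_count() or 1), 2),
    }
```

Calling collect_snapshot() on a schedule and shipping the result to a central store is the basic loop most agents follow.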

Wakestack's Server Monitoring Approach

Component	What It Does
Uptime checks	HTTP, TCP, DNS, Ping from multiple regions
Server agent	CPU, Memory, Disk, Process metrics
Nested hosts	Group monitors by server
Status pages	Communicate status to users

Why This Combination Matters

Scenario: Your API goes down at 2 AM

With uptime-only monitoring:

Alert: api.example.com is down
Action: SSH in, investigate
Time to diagnose: 10+ minutes

With Wakestack's approach:

Alert: api.example.com is down
Dashboard: Server CPU at 98%, runaway process
Action: Kill process, investigate
Time to diagnose: 2 minutes

Setting Up Server Uptime Monitoring

Step 1: Add External Monitors

Create checks for your server's endpoints:

HTTP Monitor:
  URL: https://yourserver.com
  Interval: 1 minute
  Expected: 200 OK

TCP Monitor:
  Host: yourserver.com
  Port: 22 (SSH)
  Interval: 5 minutes

Ping Monitor:
  Host: yourserver.com
  Interval: 1 minute

Step 2: Install Server Agent

Deploy the Wakestack agent on your server:

# Download and install
curl -sSL https://wakestack.co.uk/install.sh | bash
 
# Verify running
systemctl status wakestack-agent
 
# Check logs
journalctl -u wakestack-agent -f

Step 3: Link Monitors to Host

In the Wakestack dashboard:

Create a host for your server
Edit each monitor
Set parent host to your server

Step 4: Configure Alert Thresholds

Uptime Alerts:
  - Failures before alert: 2 consecutive
  - Alert channels: Slack, Email
 
Server Alerts:
  - CPU warning: > 80%
  - CPU critical: > 95%
  - Memory warning: > 85%
  - Memory critical: > 95%
  - Disk warning: > 80%
  - Disk critical: > 90%
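The logic behind these settings is simple enough to sketch. The Python below is an illustration of how such thresholds could be evaluated, not Wakestack's code: the THRESHOLDS table copies the values above, while the names classify and should_alert are hypothetical.

```python
# (warning, critical) thresholds in percent, copied from the settings above.
THRESHOLDS = {
    "cpu": (80.0, 95.0),
    "memory": (85.0, 95.0),
    "disk": (80.0, 90.0),
}


def classify(metric: str, percent: float) -> str:
    """Map a metric reading to "ok", "warning", or "critical"."""
    warning, critical = THRESHOLDS[metric]
    if percent > critical:
        return "critical"
    if percent > warning:
        return "warning"
    return "ok"


def should_alert(results: list[bool], failures_before_alert: int = 2) -> bool:
    """Decide whether to alert on an uptime check.

    results holds check outcomes, oldest first, with True meaning success.
    Alert only when the most recent N checks all failed.
    """
    if len(results) < failures_before_alert:
        return False
    return not any(results[-failures_before_alert:])
```

Requiring two consecutive failures means a single dropped packet or slow response does not page anyone, at the cost of one extra check interval before the alert fires.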

Step 5: Test Everything

Trigger a test alert (intentionally fail a check)
Verify notifications reach you
Verify server metrics are flowing

What to Monitor on Each Server Type

Web Servers (nginx, Apache)

Check	Type	Why
Homepage	HTTP	User experience
Health endpoint	HTTP	Application status
Port 443	TCP	TLS termination
Port 80	TCP	HTTP redirect
CPU/Memory	Agent	Resource health

Application Servers (Node, Python, Java)

Check	Type	Why
/api/health	HTTP	Application up
/api/ready	HTTP	Dependencies ready
Application port	TCP	Service binding
Process exists	Agent	App running
CPU/Memory	Agent	Resource health

Database Servers (PostgreSQL, MySQL)

Check	Type	Why
Database port	TCP	Accepting connections
Query endpoint	HTTP	If exposed via API
CPU	Agent	Query performance
Memory	Agent	Buffer cache
Disk	Agent	Data storage
Disk I/O	Agent	Query latency

Cache Servers (Redis, Memcached)

Check	Type	Why
Cache port	TCP	Accepting connections
Memory	Agent	Cache capacity
CPU	Agent	Operation speed
Process	Agent	Cache running

Server Monitoring Best Practices

1. Monitor from Multiple Locations

A server might be reachable from one region but not another. Check from at least three geographic regions, for example:

US East
US West or Europe
Asia Pacific
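One way to combine results from several regions is a simple majority rule, sketched below in Python. The rule itself is an assumption made for illustration (this guide does not specify how regional results should be combined), and the function name summarize_regions is hypothetical.

```python
def summarize_regions(results: dict[str, bool]) -> str:
    """Combine per-region check results into a single state.

    results maps a region name to True if the check succeeded from there.
    Returns "up" when no region fails, "down" when more than half fail,
    and "partial" when only a minority fail.
    """
    if not results:
        raise ValueError("no region results to summarize")
    failing = sum(1 for ok in results.values() if not ok)
    if failing == 0:
        return "up"
    if failing * 2 > len(results):
        return "down"
    return "partial"
```

With three regions, one failing region yields "partial", which usually points to a routing or regional network problem rather than a server outage.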

2. Set Appropriate Check Intervals

Server Type	Recommended Interval
Production web	30-60 seconds
Production API	30-60 seconds
Production DB	1-2 minutes
Staging/Dev	5 minutes
Internal tools	5-10 minutes

3. Use Health Check Endpoints

Don't just check whether the port is open. Create an endpoint that verifies each dependency and reports the result:

// /api/health
{
  "status": "healthy",
  "database": "connected",
  "cache": "connected",
  "queue": "connected"
}
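A response like the one above could be assembled as follows. This Python sketch is framework-agnostic: it takes one callable per dependency and returns a status code and JSON body that any web framework can send. The "connected" and "healthy" values match the example; the 503 status code and the "disconnected" and "unhealthy" values for failures are assumptions, as is the function name.

```python
import json
from typing import Callable


def build_health_response(checks: dict[str, Callable[[], bool]]) -> tuple[int, str]:
    """Run each dependency check and build (status_code, json_body)."""
    report = {}
    healthy = True
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises counts as a failed dependency.
            ok = False
        report[name] = "connected" if ok else "disconnected"
        healthy = healthy and ok
    body = {"status": "healthy" if healthy else "unhealthy", **report}
    # Assumption: 503 signals failure so status-code-only monitors also notice.
    return (200 if healthy else 503), json.dumps(body)
```

In practice each callable would do something cheap, such as running SELECT 1 against the database or sending PING to the cache, so the health endpoint stays fast.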

4. Alert on Sustained Load, Not Spikes

Brief CPU spikes are normal. Set alerts for sustained issues:

Alert if: CPU > 85% for 5+ minutes
Not: CPU > 85% once
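The "for 5+ minutes" condition can be evaluated by walking backwards through recent samples and measuring how long the metric has stayed above the threshold. The Python below is a minimal sketch under that assumption; the function name sustained_above and the (timestamp, value) sample format are hypothetical.

```python
def sustained_above(
    samples: list[tuple[float, float]],
    threshold: float = 85.0,
    duration_s: float = 300.0,
) -> bool:
    """Return True if the metric has stayed above threshold for duration_s.

    samples is a list of (unix_timestamp, percent) pairs, oldest first.
    """
    if not samples:
        return False
    latest_ts = samples[-1][0]
    breach_start = None
    # Walk back from the newest sample until the first reading at or below
    # the threshold; breach_start ends up at the start of the current streak.
    for ts, value in reversed(samples):
        if value <= threshold:
            break
        breach_start = ts
    if breach_start is None:
        return False
    return latest_ts - breach_start >= duration_s
```

Any single reading at or below the threshold resets the streak, so a brief spike followed by recovery never triggers the alert.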

5. Set Up Disk Growth Alerts

Don't wait until the disk is 90% full. Use tiered thresholds, and track the growth rate so you know how much time each threshold leaves you:

Warning at: 70% (plan capacity)
Alert at: 80% (schedule expansion)
Critical at: 90% (immediate action)
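Growth rate turns a percentage into a deadline. The following Python sketch projects when a disk will fill using a straight line between the oldest and newest samples. Linear growth is an assumption and real usage is often burstier, so treat the result as a rough estimate. The function name and sample format are hypothetical.

```python
from typing import Optional


def days_until_full(
    samples: list[tuple[float, float]],
    full_percent: float = 100.0,
) -> Optional[float]:
    """Estimate days until disk usage reaches full_percent.

    samples is a list of (day, used_percent) pairs, oldest first.
    Returns None when there is too little data or usage is not growing.
    """
    if len(samples) < 2:
        return None
    (first_day, first_pct), (last_day, last_pct) = samples[0], samples[-1]
    if last_day <= first_day:
        return None
    rate = (last_pct - first_pct) / (last_day - first_day)  # points per day
    if rate <= 0:
        return None
    return (full_percent - last_pct) / rate
```

For example, a disk that went from 60% to 70% over 10 days is growing at one percentage point per day, which projects 30 days until it is full and 20 days until it reaches the 90% critical threshold (pass full_percent=90.0).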

Common Server Issues and Detection

Issue: Memory Leak

Symptoms:

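Memory usage climbs slowly over hours or days
Eventually the kernel's OOM killer terminates processes, or the application crashes

Detection:

Agent monitoring shows the upward memory trend
Alert at the warning threshold, well before usage reaches 95%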

Issue: Disk Filling

Symptoms:

Logs or data growing unbounded
Application errors when full

Detection:

Disk monitoring alerts at 80%
That leaves time to clean up or expand before the disk fills

Issue: CPU Saturation

Symptoms:

Slow responses
Request timeouts

Detection:

CPU monitoring shows sustained high usage
Process list shows the culprit

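Issue: Orphaned and Zombie Processes

Symptoms:

Orphaned or stuck processes consuming CPU or memory without doing useful work
Gradual performance degradation
Zombie (defunct) entries accumulating in the process table (these use no CPU or memory, but each one holds a PID)

Detection:

Process monitoring shows unexpected or defunct processes
CPU usage that does not correlate with request volume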

Try Wakestack for Server Monitoring

Monitor your servers with external checks and internal metrics.

Uptime monitoring from multiple regions
Server agent for CPU, memory, disk
Nested organization for clarity
Free tier to get started


