Uptime Monitoring for Servers: External Checks + Internal Metrics

Learn how to effectively monitor server uptime by combining external availability checks with internal server metrics. The complete approach for server reliability.

Wakestack Team

Engineering Team

6 min read

Who This Is For

This guide is for system administrators, DevOps engineers, and developers who manage servers (VPS, cloud VMs, bare metal) and need to keep them online. Whether you run a single droplet or a fleet of EC2 instances, it explains how to combine external availability checks with metrics collected on the server itself.

What Is Server Uptime Monitoring?

Server uptime monitoring verifies that your servers remain accessible and healthy. It has two components:

External Monitoring (Is It Reachable?)

Checks from outside the server:

  • HTTP/HTTPS: Web server responding?
  • TCP: Service port open?
  • Ping (ICMP): Server reachable?
  • DNS: Domain resolving?

Internal Monitoring (Is It Healthy?)

Checks from inside the server:

  • CPU: Processing capacity available?
  • Memory: RAM not exhausted?
  • Disk: Storage space remaining?
  • Processes: Required services running?

You need both. A server can be reachable but unhealthy (for example, CPU pinned near 100%), or healthy but unreachable (for example, a network or firewall issue).

The Server Uptime Monitoring Stack

Layer 1: Network Reachability

Ping monitoring verifies basic connectivity:

Monitoring server → ICMP ping → Your server
Result: Server is reachable on the network

Catches: Network outages, routing issues, server crashes

Layer 2: Service Availability

TCP monitoring verifies services are listening:

Monitoring server → TCP connect port 443 → Your server
Result: HTTPS service is accepting connections

Catches: Service crashes, firewall misconfigurations, bind failures
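
The TCP check above can be sketched in a few lines of Python. This is a minimal illustration, not how Wakestack implements its checks; the function name `tcp_port_open` and the 5-second default timeout are assumptions made for the example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `tcp_port_open("yourserver.com", 443)` only proves that something is accepting connections on the port. It says nothing about whether the application behind it works, which is what Layer 3 adds.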

Layer 3: Application Health

HTTP monitoring verifies applications work:

Monitoring server → GET /api/health → Your server
Response: 200 OK, {"status": "healthy"}

Catches: Application bugs, dependency failures, configuration errors
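
As a rough sketch of what an HTTP health check does, the following uses only the Python standard library. It assumes the endpoint returns the `{"status": "healthy"}` body shown above; the function names and the 10-second timeout are illustrative choices, not part of any Wakestack API.

```python
import json
import urllib.error
import urllib.request


def is_healthy(status_code: int, body: bytes) -> bool:
    """Healthy means HTTP 200 and a JSON object whose "status" is "healthy"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not valid JSON (or not valid UTF-8).
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(url: str, timeout: float = 10.0) -> bool:
    """Fetch url and apply is_healthy; any HTTP or network error is a failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.status, resp.read())
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx/5xx responses.
        return False
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, TLS error, or timeout.
        return False
```

Checking the body as well as the status code catches the case where a proxy or error page answers 200 while the application itself is broken.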

Layer 4: Resource Health

Agent monitoring tracks server resources:

Agent on server → Collect CPU/Memory/Disk → Send to monitoring
Data: CPU 45%, Memory 72%, Disk 60%

Catches: Resource exhaustion before it causes outages
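
To make the agent's job concrete, here is a simplified sketch of collecting the three metrics on a Linux host with the Python standard library. It is not the Wakestack agent; the function names are hypothetical, the memory figure assumes the `/proc/meminfo` format with a `MemAvailable` field, and load average is used only as a rough stand-in for CPU utilization.

```python
import os
import shutil


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def cpu_load_percent() -> float:
    """1-minute load average relative to core count (Unix only, rough proxy)."""
    load_1min, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    return 100.0 * load_1min / cores


def memory_used_percent(meminfo_text: str) -> float:
    """Compute used memory from the text of /proc/meminfo (Linux only)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])  # values are in kB
    return 100.0 * (1 - fields["MemAvailable"] / fields["MemTotal"])
```

On a real host you would read the file first, for example `memory_used_percent(open("/proc/meminfo").read())`, and ship the three numbers to your monitoring backend on a fixed interval.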

Wakestack's Server Monitoring Approach

Component | What It Does
Uptime checks | HTTP, TCP, DNS, Ping from multiple regions
Server agent | CPU, Memory, Disk, Process metrics
Nested hosts | Group monitors by server
Status pages | Communicate status to users

Why This Combination Matters

Scenario: Your API goes down at 2 AM

With uptime-only monitoring:

Alert: api.example.com is down
Action: SSH in, investigate
Time to diagnose: 10+ minutes

With Wakestack's approach:

Alert: api.example.com is down
Dashboard: Server CPU at 98%, runaway process
Action: Kill process, investigate
Time to diagnose: 2 minutes

Setting Up Server Uptime Monitoring

Step 1: Add External Monitors

Create checks for your server's endpoints:

HTTP Monitor:
  URL: https://yourserver.com
  Interval: 1 minute
  Expected: 200 OK

TCP Monitor:
  Host: yourserver.com
  Port: 22 (SSH)
  Interval: 5 minutes

Ping Monitor:
  Host: yourserver.com
  Interval: 1 minute

Step 2: Install Server Agent

Deploy the Wakestack agent on your server:

# Download and install
curl -sSL https://wakestack.co.uk/install.sh | bash
 
# Verify running
systemctl status wakestack-agent
 
# Check logs
journalctl -u wakestack-agent -f

Step 3: Link Monitors to Your Server

In the Wakestack dashboard:

  1. Create a host for your server
  2. Edit each monitor
  3. Set parent host to your server

Step 4: Configure Alert Thresholds

Uptime Alerts:
  - Failures before alert: 2 consecutive
  - Alert channels: Slack, Email
 
Server Alerts:
  - CPU warning: > 80%
  - CPU critical: > 95%
  - Memory warning: > 85%
  - Memory critical: > 95%
  - Disk warning: > 80%
  - Disk critical: > 90%
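
The thresholds above boil down to two small rules: map a metric value to a severity, and only alert on uptime after several failures in a row. The sketch below shows both in Python. The names `THRESHOLDS`, `severity`, and `FailureCounter` are made up for illustration; in practice these settings live in your monitoring tool's configuration rather than in code.

```python
from typing import Dict, Tuple

# (warning, critical) thresholds in percent, copied from the list above.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "cpu": (80.0, 95.0),
    "memory": (85.0, 95.0),
    "disk": (80.0, 90.0),
}


def severity(metric: str, value: float) -> str:
    """Classify a metric reading as "ok", "warning", or "critical"."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"


class FailureCounter:
    """Signals an alert once `required` checks in a row have failed."""

    def __init__(self, required: int = 2) -> None:
        self.required = required
        self.streak = 0

    def record(self, check_passed: bool) -> bool:
        # A passing check resets the streak; a failing one extends it.
        self.streak = 0 if check_passed else self.streak + 1
        return self.streak >= self.required
```

Requiring two consecutive failures trades a slightly slower alert for far fewer false alarms from a single dropped packet or a slow response.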

Step 5: Test Everything

  1. Trigger a test alert (intentionally fail a check)
  2. Verify notifications reach you
  3. Verify server metrics are flowing

What to Monitor on Each Server Type

Web Servers (nginx, Apache)

Check | Type | Why
Homepage | HTTP | User experience
Health endpoint | HTTP | Application status
Port 443 | TCP | SSL termination
Port 80 | TCP | HTTP redirect
CPU/Memory | Agent | Resource health

Application Servers (Node, Python, Java)

Check | Type | Why
/api/health | HTTP | Application up
/api/ready | HTTP | Dependencies ready
Application port | TCP | Service binding
Process exists | Agent | App running
CPU/Memory | Agent | Resource health

Database Servers (PostgreSQL, MySQL)

Check | Type | Why
Database port | TCP | Accepting connections
Query endpoint | HTTP | If exposed via API
CPU | Agent | Query performance
Memory | Agent | Buffer cache
Disk | Agent | Data storage
Disk I/O | Agent | Query latency

Cache Servers (Redis, Memcached)

Check | Type | Why
Cache port | TCP | Accepting connections
Memory | Agent | Cache capacity
CPU | Agent | Operation speed
Process | Agent | Cache running

Server Monitoring Best Practices

1. Monitor from Multiple Locations

A server might be reachable from one region but not another. Use at least 3 geographic regions:

  • US East
  • US West or Europe
  • Asia Pacific
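
One common way to use multi-region results is to declare an outage only when more than one region agrees. The sketch below illustrates that idea. The two-region quorum and the function name `is_down` are assumptions for the example and do not describe how any particular monitoring service decides.

```python
from typing import Mapping


def is_down(results_by_region: Mapping[str, bool], min_failing: int = 2) -> bool:
    """Treat the server as down when at least min_failing regions fail.

    results_by_region maps a region name to True if its check passed.
    """
    failing = sum(1 for passed in results_by_region.values() if not passed)
    return failing >= min_failing
```

With this rule, a failure seen only from one region is more likely a routing problem on that path than a server outage. It is still worth recording, but it may not justify waking someone up.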

2. Set Appropriate Check Intervals

Server Type | Recommended Interval
Production web | 30-60 seconds
Production API | 30-60 seconds
Production DB | 1-2 minutes
Staging/Dev | 5 minutes
Internal tools | 5-10 minutes

3. Use Health Check Endpoints

Don't just check whether the port is open. Create endpoints that verify the application's dependencies and report the status of each:

// /api/health
{
  "status": "healthy",
  "database": "connected",
  "cache": "connected",
  "queue": "connected"
}
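
A health endpoint like this usually runs one small check per dependency and combines the results. The Python sketch below shows that aggregation step only, independent of any web framework. The function name `build_health`, the `"disconnected"` and `"unhealthy"` labels, and the choice of HTTP 503 for failures are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple


def build_health(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run each dependency check and return (HTTP status, JSON-able body)."""
    results = {}
    all_ok = True
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False  # a check that raises counts as a failed dependency
        results[name] = "connected" if ok else "disconnected"
        all_ok = all_ok and ok
    body = {"status": "healthy" if all_ok else "unhealthy", **results}
    return (200 if all_ok else 503), body
```

Your route handler would call something like `build_health({"database": ping_db, "cache": ping_cache, "queue": ping_queue})`, where the three `ping_*` callables are placeholders for your own checks, and return the status code and body as the response. Returning a non-200 status on failure lets a plain HTTP monitor detect the problem without parsing JSON.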

4. Alert on Sustained Issues, Not Spikes

Brief CPU spikes are normal. Set alerts for sustained issues:

Alert if: CPU > 85% for 5+ minutes
Not: CPU > 85% once
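
The "sustained for 5+ minutes" rule can be expressed as a tiny state machine: remember when the value first crossed the threshold and fire only once it has stayed above it long enough. The class below is an illustrative sketch with made-up names, not the logic of any specific monitoring product.

```python
from typing import Optional


class SustainedThreshold:
    """Fires only after a value has stayed above threshold for duration_s."""

    def __init__(self, threshold: float = 85.0, duration_s: float = 300.0) -> None:
        self.threshold = threshold
        self.duration_s = duration_s
        self.breach_started: Optional[float] = None

    def update(self, timestamp: float, value: float) -> bool:
        """Record a sample (timestamp in seconds); return True if alerting."""
        if value <= self.threshold:
            # Any sample at or below the threshold ends the breach.
            self.breach_started = None
            return False
        if self.breach_started is None:
            self.breach_started = timestamp
        return timestamp - self.breach_started >= self.duration_s
```

Feeding it one CPU sample per minute, a single spike never alerts, while five minutes of continuous readings above 85% does.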

5. Set Up Disk Growth Alerts

Don't wait until the disk is 90% full. Use staged thresholds, and track the growth rate so you know how much time each stage leaves you:

Warning at: 70% (plan capacity)
Alert at: 80% (schedule expansion)
Critical at: 90% (immediate action)
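
Growth rate can be estimated by fitting a straight line through recent disk usage samples and projecting forward. The function below is a minimal sketch of that calculation using an ordinary least-squares slope. The name `days_until_full` and the sample format are assumptions, and real disk growth is often not linear, so treat the result as a rough estimate.

```python
from typing import Optional, Sequence, Tuple


def days_until_full(
    samples: Sequence[Tuple[float, float]], full_at: float = 100.0
) -> Optional[float]:
    """Estimate days from the last sample until usage reaches full_at.

    samples are (day, used_percent) pairs in chronological order.
    Returns None if there is too little data or usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        return None
    # Least-squares slope: percent of disk consumed per day.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / var_x
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    return max(0.0, (full_at - last_used) / slope)
```

For example, a disk that grew from 60% to 66% over three days at a steady 2% per day has about 17 days until it is full, or 12 days until it crosses the 90% critical threshold (`full_at=90.0`).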

Common Server Issues and Detection

Issue: Memory Leak

Symptoms:

  • Memory usage slowly climbing
  • Eventually OOM kills or crashes

Detection:

  • Agent monitoring shows memory trend
  • Alert before 95%

Issue: Disk Filling

Symptoms:

  • Logs or data growing unbounded
  • Application errors when full

Detection:

  • Disk monitoring alerts at 80%
  • Time to clean up or expand

Issue: CPU Saturation

Symptoms:

  • Slow responses
  • Request timeouts

Detection:

  • CPU monitoring shows sustained high usage
  • Process list shows culprit

Issue: Zombie Processes

Symptoms:

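  • Stray processes use resources without doing useful work (true zombies in state Z hold only a process-table entry; orphaned or runaway processes are the ones that consume CPU and memory)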
  • Gradual performance degradation

Detection:

  • Process monitoring shows unexpected processes
  • CPU usage without request correlation
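
On Linux, a process's state is the field after the command name in `/proc/<pid>/stat`, and `Z` marks a zombie. The sketch below parses that file and lists zombies. It is Linux-specific, the function names are illustrative, and it finds only true zombies, so runaway processes still need to be spotted through CPU and memory usage.

```python
import os
from typing import List, Tuple


def parse_proc_stat(stat_text: str) -> Tuple[int, str, str]:
    """Return (pid, command, state) from the contents of /proc/<pid>/stat."""
    # The command sits in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    pid = int(stat_text[:open_idx])
    command = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return pid, command, state


def find_zombies(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, command) for every process currently in state Z."""
    zombies = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, command, state = parse_proc_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were reading it
        if state == "Z":
            zombies.append((pid, command))
    return zombies
```

A handful of short-lived zombies is normal. A count that keeps growing usually points to a parent process that never reaps its children.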

Try Wakestack for Server Monitoring

Monitor your servers with external checks and internal metrics.

  • Uptime monitoring from multiple regions
  • Server agent for CPU, memory, disk
  • Nested organization for clarity
  • Free tier to get started

Frequently Asked Questions

How do I monitor server uptime?

Monitor server uptime with two approaches: external checks (HTTP, TCP, ping) to verify accessibility, and internal agent monitoring (CPU, memory, disk) to understand server health.

What's the best uptime monitoring for VPS?

For VPS, use a tool that combines external endpoint checks with an installed agent for server metrics. Wakestack does both in one platform.

Should I use ping or HTTP to check server uptime?

Use HTTP checks for web servers (tests the full application stack). Use ping/TCP for non-HTTP services. For complete visibility, use both alongside server metrics.
