The Palos Publishing Company


Monitor server health with Python

Monitoring server health with Python is an effective way to keep your infrastructure running smoothly and to detect issues before they escalate. With Python’s libraries and scripting capabilities, you can track key performance indicators such as CPU usage, memory consumption, disk space, network activity, and process status. This article outlines practical approaches to building a server health monitoring system in Python, with example code snippets to get started.

Key Metrics to Monitor

  1. CPU Usage: High CPU utilization can indicate processes consuming excessive resources or potential overload.

  2. Memory Usage: Monitoring RAM ensures applications have enough memory and detects leaks or spikes.

  3. Disk Space: Helps prevent failures caused by full disks, which can halt services or corrupt data.

  4. Network Traffic: Tracks bandwidth and detects abnormal spikes that might indicate issues or attacks.

  5. Process Monitoring: Confirms that critical services are running so they can be restarted if necessary.

  6. System Load: Measures average system load to understand overall server stress.
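System load is listed above but has no snippet later in the article, so here is a minimal sketch. It assumes a Unix-like server and uses the standard library's os.getloadavg(); recent psutil versions also provide psutil.getloadavg(). The helper names load_per_core and get_system_load are illustrative, not part of any library. Dividing by the core count makes the number easier to read: a value near or above 1.0 per core suggests the CPUs are saturated.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Normalize a load average by the number of logical CPUs."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def get_system_load() -> dict:
    """Return the 1-, 5-, and 15-minute load averages per core (Unix only)."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    one, five, fifteen = os.getloadavg()
    return {
        "1min": load_per_core(one, cores),
        "5min": load_per_core(five, cores),
        "15min": load_per_core(fifteen, cores),
    }


if __name__ == "__main__":
    for window, value in get_system_load().items():
        print(f"Load per core ({window}): {value:.2f}")
```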


Essential Python Libraries for Server Monitoring

  • psutil: Cross-platform library for retrieving system information such as CPU, memory, disk, and network.

  • subprocess: To execute system commands when needed.

  • socket: For network-related health checks.

  • smtplib/email: To send alert notifications via email.

  • logging: To keep track of events and errors.

  • time/schedule: To run monitoring scripts at regular intervals (schedule is a third-party package).
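As an illustration of the socket entry in the list above, the following sketch checks whether a TCP port accepts connections. The function name is_port_open, the 3-second default timeout, and the port 80 target are assumptions chosen for the example. A successful connection only shows that something is listening, not that the service behind it is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Hypothetical target: a web server expected on the local machine.
    status = "open" if is_port_open("127.0.0.1", 80) else "closed or unreachable"
    print(f"Port 80 is {status}")
```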


Setting Up a Basic Server Health Monitor

Install the psutil library if not already installed:

bash
pip install psutil

Monitoring CPU and Memory Usage

python
import psutil

def get_cpu_usage():
    return psutil.cpu_percent(interval=1)

def get_memory_usage():
    mem = psutil.virtual_memory()
    return mem.percent

cpu = get_cpu_usage()
memory = get_memory_usage()
print(f"CPU Usage: {cpu}%")
print(f"Memory Usage: {memory}%")

Checking Disk Usage

python
import psutil

def get_disk_usage(partition="/"):
    # Percentage of the given partition that is in use
    disk = psutil.disk_usage(partition)
    return disk.percent

disk_usage = get_disk_usage()
print(f"Disk Usage: {disk_usage}%")

Monitoring Network Usage

python
import psutil

def get_network_stats():
    # Counters are cumulative totals since boot, not per-second rates
    net_io = psutil.net_io_counters()
    return net_io.bytes_sent, net_io.bytes_recv

sent, recv = get_network_stats()
print(f"Bytes Sent: {sent}, Bytes Received: {recv}")
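Because psutil.net_io_counters() reports running totals, spotting a traffic spike means comparing two samples taken some time apart. The sketch below shows one way to turn two samples into a rate. NetSample and throughput are hypothetical names introduced for this example, and the sample values are made up.

```python
from typing import NamedTuple, Tuple


class NetSample(NamedTuple):
    timestamp: float  # seconds, for example from time.monotonic()
    bytes_sent: int
    bytes_recv: int


def throughput(prev: NetSample, curr: NetSample) -> Tuple[float, float]:
    """Return (sent, received) rates in bytes per second between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    # max(..., 0) guards against counters that reset or wrap between samples.
    sent_rate = max(curr.bytes_sent - prev.bytes_sent, 0) / elapsed
    recv_rate = max(curr.bytes_recv - prev.bytes_recv, 0) / elapsed
    return sent_rate, recv_rate


# Made-up numbers: two samples taken 5 seconds apart.
before = NetSample(timestamp=100.0, bytes_sent=1_000, bytes_recv=4_000)
after = NetSample(timestamp=105.0, bytes_sent=6_000, bytes_recv=14_000)
sent_rate, recv_rate = throughput(before, after)
print(f"Sent: {sent_rate:.0f} B/s, Received: {recv_rate:.0f} B/s")
# prints: Sent: 1000 B/s, Received: 2000 B/s
```

In a real monitor, each sample would be built from time.monotonic() and the bytes_sent and bytes_recv fields returned by psutil.net_io_counters().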

Monitoring Specific Processes

To check if a critical process (e.g., nginx) is running:

python
import psutil

def check_process(name):
    # Request only the 'name' attribute to keep iteration cheap
    for proc in psutil.process_iter(['name']):
        if proc.info['name'] == name:
            return True
    return False

process_name = "nginx"
if check_process(process_name):
    print(f"{process_name} is running")
else:
    print(f"{process_name} is NOT running")

Automating Alerts

You can configure email alerts when certain thresholds are exceeded.

Example: Send alert if CPU usage exceeds 80%.

python
import smtplib
from email.mime.text import MIMEText

def send_alert(subject, message, to_email):
    # Placeholder credentials: load real ones from the environment or a secrets store
    from_email = "your_email@example.com"
    from_password = "your_password"

    msg = MIMEText(message)
    msg['Subject'] = subject
    msg['From'] = from_email
    msg['To'] = to_email

    # The with-block closes the connection even if login or sending fails
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
        server.login(from_email, from_password)
        server.sendmail(from_email, to_email, msg.as_string())

# Uses get_cpu_usage() defined in the CPU and memory snippet
cpu_threshold = 80
cpu = get_cpu_usage()
if cpu > cpu_threshold:
    send_alert("High CPU Usage Alert", f"CPU usage has reached {cpu}%", "admin@example.com")

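Note: For Gmail, you need an app password, which requires 2-Step Verification on the account. Google has retired the less secure apps setting, so a regular account password will not work.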
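If the threshold check runs on a schedule, a sustained spike would send an email on every run. One common way to avoid that is a cooldown per alert type. The sketch below is an assumption about how you might structure it, not part of the original example: AlertThrottle and its 15-minute default are hypothetical, and the print call stands in for send_alert.

```python
import time
from typing import Dict, Optional


class AlertThrottle:
    """Allow an alert for a given key at most once per cooldown period."""

    def __init__(self, cooldown_seconds: float = 900.0):
        self.cooldown_seconds = cooldown_seconds
        self._last_sent: Dict[str, float] = {}  # alert key -> time of last alert

    def should_send(self, key: str, now: Optional[float] = None) -> bool:
        """Return True and record the time if the cooldown for key has passed."""
        now = time.monotonic() if now is None else now
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_sent[key] = now
        return True


throttle = AlertThrottle(cooldown_seconds=900)
for reading in (85.0, 91.0):  # two back-to-back readings above the 80% threshold
    if reading > 80 and throttle.should_send("cpu"):
        print(f"ALERT: CPU at {reading}%")  # replace with send_alert(...)
```

Only the first reading triggers an alert here, because the second arrives within the cooldown window. Using a separate key per metric keeps a CPU alert from suppressing a disk alert.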


Scheduling Periodic Checks

Use the schedule library to automate monitoring at intervals.

bash
pip install schedule

Example script to run checks every 5 minutes:

python
import schedule
import time

def monitor():
    # Uses the get_* helpers defined in the earlier snippets
    cpu = get_cpu_usage()
    memory = get_memory_usage()
    disk = get_disk_usage()
    print(f"CPU: {cpu}%, Memory: {memory}%, Disk: {disk}%")

schedule.every(5).minutes.do(monitor)

while True:
    schedule.run_pending()
    time.sleep(1)

Building a Dashboard or Logging

For more advanced monitoring, you can log data to a file or database, or visualize metrics using tools like:

  • Grafana + Prometheus (Python scripts export data)

  • Flask/Django web dashboards

  • Matplotlib/Plotly for local visualization
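Before adopting any of the tools above, logging metrics to a file is often the simplest first step. The following sketch writes one JSON object per line using the standard logging module with file rotation. The file name server_health.log, the rotation limits, and the function names are assumptions made for this example.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


def format_metrics(cpu: float, memory: float, disk: float) -> str:
    """Serialize one set of readings as a single JSON line."""
    return json.dumps({"cpu": cpu, "memory": memory, "disk": disk})


def build_metrics_logger(path: str = "server_health.log") -> logging.Logger:
    """Create a logger that rotates the file at about 1 MB, keeping 3 backups."""
    logger = logging.getLogger("server_health")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=3)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_metrics(logger: logging.Logger, cpu: float, memory: float, disk: float) -> None:
    """Write one timestamped metrics line to the log."""
    logger.info(format_metrics(cpu, memory, disk))
```

Inside a scheduled monitor function, you would call build_metrics_logger() once at startup and then log_metrics(logger, cpu, memory, disk) after each set of readings. JSON lines are easy to load later for plotting or for import into a database.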


Conclusion

Python offers a flexible and powerful toolkit for monitoring server health through simple scripts that check vital metrics, alert on anomalies, and help maintain uptime. By customizing and extending these basic examples, you can create a robust monitoring solution tailored to your infrastructure’s needs. Regular monitoring improves reliability, performance, and incident response, making it an essential practice for system administrators and DevOps teams.
