How to Set Up Grafana and Prometheus for Homelab Monitoring
I need to be honest with you before we start. Grafana and Prometheus will teach you more about your infrastructure than any other tool you deploy. They will also consume an alarming number of hours as you chase the perfect dashboard layout, tweak colour schemes, and add panels for metrics you will never actually look at. You have been warned. Now let me show you how to set it up properly.
From the field: This is the monitoring stack I rely on daily. Prometheus scrapes metrics from every host and service I run, and Grafana makes it visual. When something goes wrong at 2am, the dashboards tell me exactly where to look. It is the same stack used in production at companies far larger than my homelab.

Why Metrics Monitoring (When You Already Have Uptime Kuma)
If you followed my Uptime Kuma guide, you already have availability monitoring. Uptime Kuma answers one question: is this service up or down? That is essential, but it is not the whole picture.
Prometheus and Grafana answer different questions:
- Why is this service slow right now?
- Is my disk going to fill up in three days?
- Which container is eating all the CPU?
- Did that kernel update change my memory usage pattern?
- Is my network throughput degrading over time?
Uptime Kuma tells you something is down. Prometheus and Grafana tell you why it went down, and ideally, warn you before it happens. They are complementary, not competing. Run both.
The architecture is simple: Prometheus scrapes metrics from your systems at regular intervals and stores them as time-series data. Grafana queries Prometheus and turns the data into dashboards and alerts. Node Exporter runs on each host and exposes system metrics (CPU, RAM, disk, network) in a format Prometheus understands.
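The "format Prometheus understands" is a plain-text exposition format: one sample per line, with optional labels in braces and the value at the end. A minimal Python sketch of what a scrape returns and how the lines break down — the sample lines are illustrative, and the parser is a simplification that assumes no spaces inside label values:

```python
# Illustrative sample of Prometheus's plain-text exposition format, the kind
# of output Node Exporter serves from its /metrics endpoint. Real output
# contains hundreds of metric families.
SAMPLE = """\
# HELP node_memory_MemAvailable_bytes Memory available for new workloads.
# TYPE node_memory_MemAvailable_bytes gauge
node_memory_MemAvailable_bytes 6.442450944e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 123456.78
"""

def parse_metrics(text: str) -> dict[str, float]:
    """Map each series (metric name plus labels) to its sample value.

    Simplified sketch: assumes label values contain no spaces.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples
```

For example, `parse_metrics(SAMPLE)["node_memory_MemAvailable_bytes"]` returns `6442450944.0` — the same number Grafana will later render as "6.4 GB available".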
Career Context: Prometheus and Grafana are the industry standard for infrastructure monitoring. They are in production at companies of every size, from startups to FAANG. If you are interviewing for any infrastructure role — SRE, DevOps, Platform Engineering, Cloud Engineer — “experience with Prometheus and Grafana” is on nearly every job description. Running them in your homelab gives you genuine, hands-on experience with the same tools enterprise teams use daily. This is not a homelab toy. This is a career skill.
Prerequisites
You will need:
- Docker and Docker Compose installed on your server. See the Docker installation guide if you have not done this yet.
- A machine with at least 2 GB of free RAM. Prometheus is not lightweight when it is storing weeks of metrics. 1 GB for Prometheus, 512 MB for Grafana, and a bit for Node Exporter. On a dedicated monitoring VM, 4 GB is comfortable.
- Some disk space. Prometheus stores metrics on disk. How much depends on retention period and scrape targets. For a small homelab (5-10 targets, 15-day retention), 10 GB is plenty. I will show you how to configure retention later.
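The "10 GB is plenty" figure can be sanity-checked with the rule of thumb from the Prometheus documentation: needed disk is roughly retention time × ingested samples per second × bytes per sample. A hedged sketch, where the series count per host and the compression figure are rough assumptions rather than measurements from any particular homelab:

```python
# Back-of-the-envelope Prometheus disk estimate:
#   disk ≈ retention_seconds × samples_per_second × bytes_per_sample
# The defaults below are rough assumptions, not measured values.

def estimate_disk_gb(targets: int,
                     series_per_target: int = 1500,   # ballpark for Node Exporter
                     scrape_interval_s: int = 15,
                     retention_days: int = 15,
                     bytes_per_sample: float = 2.0):  # TSDB compresses heavily
    samples_per_second = targets * series_per_target / scrape_interval_s
    total_bytes = retention_days * 86400 * samples_per_second * bytes_per_sample
    return total_bytes / 1e9

# 8 targets at 15-day retention comes out around 2 GB, comfortably inside
# the 10 GB suggested above even with generous headroom for growth.
print(f"{estimate_disk_gb(targets=8):.1f} GB")
```

The useful takeaway is the shape of the formula: halving the scrape interval or doubling the retention doubles the disk, so those are the two knobs to reach for first.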
Do not run this on the same Raspberry Pi as everything else. I have seen people try to run Prometheus on a Pi 3 alongside ten other containers and wonder why everything grinds to a halt. Prometheus does real work — it scrapes, stores, and indexes time-series data. A Pi 4 with 4 GB or a Pi 5 can handle it. A Pi 3 will struggle. A mini PC or VM is the better choice.
Step 1: Create the Docker Compose Stack
Create a directory for the entire monitoring stack:
```bash
mkdir -p ~/monitoring/{prometheus,grafana}
cd ~/monitoring
```
Create the docker-compose.yml:
```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      # Enables the /-/reload endpoint so config changes can be applied
      # without restarting the container
      - '--web.enable-lifecycle'

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_ADMIN_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
    depends_on:
      - prometheus

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'

volumes:
  prometheus_data:
  grafana_data:
```
Create a `.env` file for the Grafana admin password:

```
GRAFANA_ADMIN_PASSWORD=your-strong-password-here
```
A few things worth noting in this compose file. The --storage.tsdb.retention.time=30d flag tells Prometheus to keep 30 days of metrics. Adjust this based on your disk space and needs. Node Exporter mounts /proc, /sys, and / as read-only so it can read system metrics without any write access. Grafana has signup disabled because this is your homelab, not a public service.
Step 2: Configure Prometheus
Create the Prometheus configuration file at prometheus/prometheus.yml:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
        labels:
          instance: 'monitoring-host'

  # Add more hosts here as you deploy Node Exporter on them
  # - job_name: 'plato'
  #   static_configs:
  #     - targets: ['192.168.1.32:9100']
  #       labels:
  #         instance: 'plato'
```
The scrape_interval of 15 seconds is the default and works well for homelabs. Going lower generates more data and uses more storage. Going higher means your dashboards show coarser resolution. 15 seconds is the sweet spot unless you have a specific reason to change it.
Pro tip: Notice the labels section. Adding an instance label with a human-readable name makes your Grafana dashboards much more useful than seeing raw IP addresses everywhere. You will thank yourself for this when you have ten hosts being scraped and they all show as 192.168.1.x:9100 otherwise.
Step 3: Start the Stack
```bash
docker compose up -d
```
Verify everything is running:
```bash
docker compose ps
```
You should see three containers running: prometheus, grafana, and node-exporter. Give them 30 seconds to initialise, then check Prometheus is scraping correctly by visiting http://your-server-ip:9090/targets. You should see two targets (prometheus and node-exporter) both showing as “UP” in green.
If either shows as “DOWN,” check the container logs:
```bash
docker logs prometheus
docker logs node-exporter
```
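If you would rather check target health from a script than a browser, Prometheus exposes the same information through its `/api/v1/targets` HTTP API. A sketch, assuming the stack above is running on localhost; the parsing is kept separate from the network call so it can be exercised without a live server:

```python
# Hedged sketch: list unhealthy scrape targets via Prometheus's targets API.
import json
from urllib.request import urlopen

def down_targets(payload: dict) -> list[str]:
    """Return scrape URLs of any active targets whose health is not 'up'."""
    return [t["scrapeUrl"]
            for t in payload["data"]["activeTargets"]
            if t["health"] != "up"]

def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Fetch target status from a live Prometheus server."""
    with urlopen(f"{base_url}/api/v1/targets") as resp:
        return json.load(resp)

# Example usage against a running stack:
#   for url in down_targets(fetch_targets()):
#       print(f"DOWN: {url}")
```

A script like this is handy in cron or a shell alias once you have more targets than fit on one screen of the /targets page.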
Step 4: Connect Grafana to Prometheus
Open Grafana at http://your-server-ip:3000 and log in with the admin credentials from your .env file.
- Go to Connections > Data sources (or the gear icon, then Data Sources, in older versions)
- Click Add data source and select Prometheus
- Set the URL to `http://prometheus:9090` (Docker's internal DNS resolves container names)
- Leave everything else at defaults and click Save & test
You should see “Successfully queried the Prometheus API.” If you get a connection refused error, make sure you used the container name (prometheus) not localhost. Grafana and Prometheus are in the same Docker network, so they communicate by container name.
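If you prefer configuration as code, Grafana can also provision this data source automatically at startup, which survives container rebuilds without any clicking. A sketch, assuming you add a volume mount of a local `./grafana/provisioning` directory to `/etc/grafana/provisioning` in the compose file:

```yaml
# grafana/provisioning/datasources/prometheus.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```

Grafana reads this directory on startup, so the data source exists from the first boot of a fresh volume.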
Step 5: Import Your First Dashboard
This is where the dopamine hits. Do not build a dashboard from scratch on your first attempt. Import the community classic and learn from its structure.
- Go to Dashboards > New > Import
- Enter dashboard ID `1860` and click Load. This is "Node Exporter Full", the most popular Grafana dashboard for Linux host monitoring
- Select your Prometheus data source from the dropdown
- Click Import
You will immediately see panels for CPU usage, memory, disk I/O, network traffic, filesystem usage, system load, and dozens of other metrics. All of this data was already being exposed by Node Exporter and scraped by Prometheus. You just gave it a face.
Other dashboards worth importing:
- ID 893 — Docker container monitoring (if you add cAdvisor, covered below)
- ID 3662 — Prometheus 2.0 Stats (monitor your monitoring)
- ID 14282 — cAdvisor Exporter (clean container metrics view)
The dashboard addiction warning. I promised I would mention this, so here it is. You will spend the next two hours importing dashboards, tweaking panel sizes, changing colour thresholds, and adding gauges for metrics you have never cared about and never will. This is normal. Everyone does it. Get it out of your system, then focus on the panels that actually help you operate your infrastructure. If you are spending more time perfecting dashboards than fixing the problems they reveal, step away from the JSON model editor.
Step 6: Monitor Your Docker Containers
Node Exporter monitors the host. To monitor individual Docker containers, add cAdvisor (Container Advisor) to your stack. Add this to your docker-compose.yml:
```yaml
  # Nested under the existing services: section
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    restart: unless-stopped
    ports:
      - "8081:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    privileged: true
    devices:
      - /dev/kmsg:/dev/kmsg
```
Add a scrape target in prometheus/prometheus.yml:
```yaml
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
```
Restart the stack:

```bash
docker compose up -d

# Reload the Prometheus config without restarting (preserves existing data).
# This endpoint only works if Prometheus was started with --web.enable-lifecycle.
curl -X POST http://localhost:9090/-/reload
```
Now import dashboard 893 or 14282 and you will see per-container CPU, memory, network, and disk I/O metrics. This is genuinely useful — when a container starts misbehaving, you can see exactly when the resource usage changed and correlate it with events.
Step 7: Set Up Alerts
Dashboards are only useful when you are looking at them. Alerts are useful all the time. Grafana has a built-in alerting system that can notify you via email, Telegram, Discord, Slack, and many other channels.
Here are the alerts I recommend starting with:
Disk Usage Alert
Go to Alerting > Alert rules > New alert rule. Set up a Prometheus query:
```promql
(node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 15
```
This fires when your root filesystem has less than 15% free space. Adjust the mountpoint and threshold as needed. I have been bitten by full disks more than any other homelab failure. Docker images, log files, and Prometheus data itself will fill a disk faster than you expect.
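The query above only fires once the disk is already nearly full. To get the "is my disk going to fill up in three days?" warning promised earlier, PromQL's `predict_linear` function extrapolates the recent trend; a sketch, where the 6-hour window and 3-day horizon are starting points to tune, not gospel:

```promql
# Fires when the linear trend of the last 6 hours predicts the root
# filesystem reaches zero free bytes within 3 days (3 * 86400 seconds)
predict_linear(node_filesystem_avail_bytes{mountpoint="/"}[6h], 3 * 86400) < 0
```

Predictive alerts like this give you days of lead time instead of hours, at the cost of occasional false positives when a short-lived burst of writes skews the trend.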
High Memory Usage
```promql
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
```
Fires when available memory drops below 10%. Useful for catching memory leaks in containers that gradually consume everything until the OOM killer starts making decisions you will not like.
Host Down
```promql
up{job="node-exporter"} == 0
```
Fires when Prometheus cannot scrape a target. Simple but critical. If Node Exporter stops responding, the host is either down or the exporter crashed.
For notifications, configure a contact point under Alerting > Contact points. Telegram and Discord work well for homelab alerts. The setup is nearly identical to what I described in the Uptime Kuma guide.
Pro tip: Set a for duration on your alerts (e.g., 5 minutes). This means the condition must be true for five continuous minutes before the alert fires. Without this, you will get alerts for momentary spikes that resolve on their own, and you will start ignoring alerts. Alert fatigue is real and dangerous. Fewer, meaningful alerts beat many noisy ones.
Monitoring Additional Hosts
The real power comes when you monitor your entire homelab, not just the machine running the stack. For each additional host:
- Deploy Node Exporter on that host (either as a Docker container or a system package)
- Add its IP to your Prometheus config under `scrape_configs`
- Reload Prometheus
On a remote host, the quickest way to deploy Node Exporter:
```bash
docker run -d \
  --name node-exporter \
  --restart=unless-stopped \
  --net=host \
  --pid=host \
  -v /:/host:ro,rslave \
  prom/node-exporter:latest \
  --path.rootfs=/host
```
Using --net=host is important here because Node Exporter needs to see the actual network interfaces, not Docker's virtual networking. The metrics would be misleading otherwise.
Then add to your prometheus.yml:
```yaml
  - job_name: 'plato'
    static_configs:
      - targets: ['192.168.1.32:9100']
        labels:
          instance: 'plato'
```
Reload Prometheus and the new host appears in your dashboards within seconds.
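Once you are past a handful of hosts, editing prometheus.yml for every change gets tedious. Prometheus's file-based service discovery (`file_sd_configs`) watches a targets file and picks up changes to it automatically, without a reload. A sketch, assuming you mount a local `./prometheus/targets` directory to `/etc/prometheus/targets` in the compose file:

```yaml
# In prometheus.yml, under scrape_configs:
  - job_name: 'homelab-nodes'
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets/*.json'
```

The targets file itself (e.g. `targets/nodes.json`, a name chosen for this example) is a list of target groups with labels:

```json
[
  {
    "targets": ["192.168.1.32:9100"],
    "labels": { "instance": "plato" }
  }
]
```

Adding a host then becomes a one-line JSON edit, which is also easy to generate from an inventory script or Ansible.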
Troubleshooting
Grafana Shows "No Data"
First, verify the Prometheus data source is working (Data Sources > Prometheus > Save & test). If that passes, check if Prometheus has the data. Go to http://your-server-ip:9090 and run a simple query like up. If Prometheus has no data, check that your scrape targets are UP at /targets. If the targets show errors, Node Exporter is either not running or not reachable from the Prometheus container.
Prometheus Uses Too Much Disk
Reduce retention time with --storage.tsdb.retention.time=15d (or lower). You can also set a size-based limit: --storage.tsdb.retention.size=5GB. Check current disk usage at the Prometheus UI under Status > TSDB Status. If you have many high-cardinality metrics, consider increasing your scrape interval or dropping metrics you do not use with metric_relabel_configs.
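The `metric_relabel_configs` mention deserves a concrete example. The sketch below drops metric families by name before they are written to storage; the regex is hypothetical, so substitute metrics you have confirmed you never query:

```yaml
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
    metric_relabel_configs:
      # Drop series whose metric name matches the regex, before storage
      - source_labels: [__name__]
        regex: 'node_softirqs_total|node_scrape_collector_.*'
        action: drop
```

Note that `metric_relabel_configs` runs after the scrape, so it saves storage and query-time cardinality but not scrape bandwidth.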
Node Exporter Shows Wrong Disk Metrics
The --collector.filesystem.mount-points-exclude flag in the compose file filters out virtual filesystems. Without it, you see dozens of entries for tmpfs, overlay, and other Docker-internal mounts that are not real disks. If you are still seeing unwanted mounts, adjust the regex pattern. The compose file writes it as ^/(sys|proc|dev|host|etc)($$|/) because Docker Compose treats $ as the start of a variable, so $$ escapes to a literal $; the actual regex Node Exporter receives is ^/(sys|proc|dev|host|etc)($|/), which covers most cases.
Cannot Access Grafana After Restart
If you changed GF_SECURITY_ADMIN_PASSWORD in the env file and recreated the container, the password change is only applied on first run. Grafana persists its database in the volume. To reset: docker exec -it grafana grafana-cli admin reset-admin-password newpassword.
Prometheus Cannot Scrape Remote Hosts
Firewall rules. Port 9100 needs to be open on the target host for the Prometheus server to reach it. On Ubuntu: sudo ufw allow from 192.168.1.0/24 to any port 9100. Restrict the source to your monitoring server or LAN subnet -- there is no reason to expose metrics to the internet.
What to Monitor Next
You have system metrics sorted. Here is where to expand:
- Uptime Kuma -- if you have not set it up yet, pair it with Grafana. Uptime Kuma handles availability checks and status pages, Grafana handles deep metrics. They complement each other perfectly.
- Build Your First Homelab -- now that you can monitor everything, this guide helps you plan what to run and where
- Install Docker on Ubuntu 24.04 -- the foundation for everything in this stack, if you jumped straight to monitoring
- Linux Commands That Get You Hired -- the terminal skills you need when the dashboard shows a problem and you need to investigate on the host
- Jellyfin Media Server -- once you are monitoring your infrastructure, start running services on it. Jellyfin is a great next step for media sovereignty.
- Tailscale VPN -- access your Grafana dashboards from anywhere without exposing them to the public internet
One last piece of advice: resist the urge to monitor every metric available. Node Exporter exposes hundreds of metrics. Most of them you will never need. Focus on the four golden signals -- latency, traffic, errors, and saturation. If you can tell at a glance whether your systems are running out of CPU, memory, disk, or network capacity, you have good monitoring. Everything else is refinement.
Watch Out For This
Start with a small number of dashboards and expand. I have seen people import 50 community dashboards on day one and then never look at any of them. Build dashboards for what you actually need to monitor.
Key Takeaways
- Prometheus scrapes and stores metrics, Grafana visualises them, Node Exporter exposes system metrics. Together they form the industry-standard monitoring stack.
- Start with dashboard ID 1860 (Node Exporter Full) rather than building from scratch. Learn from its structure, then customise.
- Add cAdvisor to monitor individual Docker containers. Knowing which container is consuming resources is essential when troubleshooting.
- Set up alerts for disk usage, memory, and host availability at minimum. Dashboards only help when you are looking at them. Alerts help all the time.
- Use the `for` duration on alerts to avoid fatigue from transient spikes. Five minutes is a sensible starting point.
- Deploy Node Exporter on every host you want to monitor. Add the target to Prometheus, reload, and the metrics appear in your dashboards within seconds.
- This is not a homelab toy. Prometheus and Grafana are production tools used by enterprises worldwide. The experience translates directly to infrastructure career roles.
Related Guides
If you found this useful, these guides continue the journey:
- How to Install Uptime Kuma to Monitor Your Homelab -- availability monitoring that complements Grafana's metrics
- Docker Compose for Beginners -- the foundation for deploying Grafana, Prometheus, and exporters
- How to Build Your First Homelab in 2026 -- the complete guide to building your lab environment
- How to Install Proxmox VE -- virtualisation platform to host your monitoring stack
- Linux Fundamentals -- the OS skills underpinning everything in this guide

ReadTheManual is run, written and curated by Eric Lonsdale.
Eric has over 20 years of professional experience in IT infrastructure, cloud architecture, and cybersecurity, but started with PCs long before that.
He built his first machine from parts bought off tables at the local college campus, hoping they worked. He learned on BBC Micros and Atari units in the early 90s, and has built almost every PC he’s used between 1995 and now.
From helpdesk to infrastructure architect, Eric has worked across enterprise datacentres, Azure environments, and security operations. He’s managed teams, trained engineers, and spent two decades solving the problems this site teaches you to solve.
ReadTheManual exists because Eric believes the best way to learn IT is to build things, break things, and actually read the manual. Every guide on this site runs on infrastructure he owns and maintains.

