Observability¶
Metrics, logs, dashboards, alerting, and uptime monitoring.
All services run on the k3s-cluster (monitoring namespace).
Grafana¶
k3s-cluster ·
monitoring
Dashboards and visualisation. Connects to Prometheus, Loki, and InfluxDB to provide unified monitoring views across the whole stack.
Documentation · values.sops.yaml
Prometheus Stack¶
k3s-cluster ·
monitoring
Metrics collection and alerting. Deployed via kube-prometheus-stack, includes Prometheus, Alertmanager, kube-state-metrics, and node exporters.
Documentation · values.sops.yaml
Loki¶
k3s-cluster ·
monitoring
Log aggregation. Collects and indexes logs from all pods and ships them to Grafana for querying with LogQL.
Alloy¶
k3s-cluster ·
monitoring
OpenTelemetry collector and telemetry pipeline. Scrapes metrics, tails logs, and forwards traces to the appropriate backends (Prometheus, Loki).
Uptime Kuma¶
k3s-cluster ·
monitoring
Service uptime monitoring. Checks HTTP endpoints, TCP ports, and DNS records at configurable intervals and sends alerts on downtime.
ntfy¶
k3s-cluster ·
monitoring
Self-hosted push notification server. Receives alerts from Prometheus Alertmanager, Uptime Kuma, and PagerDuty and delivers them to mobile devices.
Documentation · values.sops.yaml
Speedtest Exporter¶
k3s-cluster ·
monitoring
Telegraf instance with an exec input that runs periodic internet speed tests and exposes results as Prometheus metrics for Grafana dashboards.
Unpoller¶
k3s-cluster ·
monitoring
Exports UniFi controller metrics (clients, traffic, device health) to Prometheus for visualisation in Grafana.
Cluster Heartbeat¶
k3s-cluster ·
monitoring
Custom heartbeat job that sends a periodic ping to healthchecks.io. If the ping stops, healthchecks.io fires a PagerDuty alert — confirming the cluster is alive end-to-end.
OctoTrack¶
k3s-cluster ·
monitoring
Octopus Energy electricity usage dashboard. Pulls consumption data from the Octopus Energy API and displays it in Grafana.