Observability

Metrics, logs, dashboards, alerting, and uptime monitoring.

All services run on k3s-cluster, in the monitoring namespace.

Grafana

k3s-cluster · monitoring

Dashboards and visualisation. Connects to Prometheus, Loki, and InfluxDB to provide unified monitoring views across the whole stack.

Documentation · values.sops.yaml

Prometheus Stack

k3s-cluster · monitoring

Metrics collection and alerting. Deployed via the kube-prometheus-stack Helm chart, which includes Prometheus, Alertmanager, kube-state-metrics, and node exporters.

Documentation · values.sops.yaml

Loki

k3s-cluster · monitoring

Log aggregation. Stores logs from all pods, indexes them by label, and serves them to Grafana for querying with LogQL.

Documentation · values.yaml
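
As a rough illustration of what a LogQL query against Loki looks like outside Grafana, the sketch below builds a URL for Loki's `/loki/api/v1/query_range` HTTP endpoint. The service address and the `namespace` label are assumptions; the labels actually available depend on how the log collector is configured.

```python
from urllib.parse import urlencode

# Hypothetical in-cluster address; the real service name and port may differ.
LOKI_URL = "http://loki.monitoring.svc:3100"


def build_query_range_url(base_url: str, logql: str, start_ns: int, end_ns: int,
                          limit: int = 100) -> str:
    """Build a URL for Loki's range-query endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = urlencode({
        "query": logql,
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
    })
    return f"{base_url}/loki/api/v1/query_range?{params}"


# Select log lines from the (assumed) namespace label that contain "error".
url = build_query_range_url(
    LOKI_URL, '{namespace="monitoring"} |= "error"', 0, 1_000_000_000
)
```

Fetching `url` with any HTTP client returns the matching streams as JSON; Grafana issues equivalent requests on behalf of its Explore view and dashboards.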

Alloy

k3s-cluster · monitoring

OpenTelemetry collector and telemetry pipeline. Scrapes metrics, tails logs, and forwards traces, routing each signal to the appropriate backend (metrics to Prometheus, logs to Loki).

Documentation · values.yaml

Uptime Kuma

k3s-cluster · monitoring

Service uptime monitoring. Checks HTTP endpoints, TCP ports, and DNS records at configurable intervals and sends alerts on downtime.

Documentation · values.yaml
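
To make the idea of a TCP port check concrete, here is a minimal sketch of one probe. It is not Uptime Kuma's implementation, only an illustration of the underlying test; the function name and default timeout are hypothetical.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A monitor runs a probe like this on a fixed interval and raises an alert once it fails.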

ntfy

k3s-cluster · monitoring

Self-hosted push notification server. Receives alerts from Prometheus Alertmanager, Uptime Kuma, and PagerDuty and delivers them to mobile devices.

Documentation · values.sops.yaml
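
Publishing to ntfy is a plain HTTP POST to a topic URL, with optional `Title`, `Priority`, and `Tags` headers. The sketch below builds such a request without sending it; the server URL and topic name are placeholders, not the values used in this cluster.

```python
import urllib.request

# Hypothetical server URL and topic; substitute the real ntfy instance.
NTFY_URL = "https://ntfy.example.com"
TOPIC = "cluster-alerts"


def build_notification(base_url, topic, message, title, priority="default", tags=()):
    """Build (but do not send) a POST request that publishes to an ntfy topic."""
    headers = {"Title": title, "Priority": priority}
    if tags:
        # ntfy accepts tags as a comma-separated list.
        headers["Tags"] = ",".join(tags)
    return urllib.request.Request(
        f"{base_url}/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_notification(
    NTFY_URL, TOPIC, "Grafana is unreachable", "Service down",
    priority="high", tags=["warning"],
)
# To deliver it: urllib.request.urlopen(req, timeout=10)
```

Any device subscribed to the topic receives the message as a push notification. A private instance would normally also require an authentication header, which is omitted here.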

Speedtest Exporter

k3s-cluster · monitoring

Telegraf instance with an exec input that runs periodic internet speed tests and exposes results as Prometheus metrics for Grafana dashboards.

values.yaml

Unpoller

k3s-cluster · monitoring

Exports UniFi controller metrics (clients, traffic, device health) to Prometheus for visualisation in Grafana.

Documentation · values.yaml

Cluster Heartbeat

k3s-cluster · monitoring

Custom heartbeat job that sends a periodic ping to healthchecks.io, confirming the cluster is alive end-to-end. If the pings stop, healthchecks.io fires a PagerDuty alert.
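
A heartbeat of this kind can be very small. The sketch below is not the actual job, only a minimal illustration of the pattern: it requests a healthchecks.io ping URL and reports whether the ping was accepted. The check UUID is a placeholder, and the function names are hypothetical.

```python
import urllib.request

# Placeholder check UUID; healthchecks.io assigns a real one per check.
PING_URL = "https://hc-ping.com/00000000-0000-0000-0000-000000000000"


def ping_url(base: str, healthy: bool) -> str:
    """Return the success URL, or the /fail variant to report a failure at once."""
    return base if healthy else f"{base}/fail"


def send_heartbeat(base: str, healthy: bool = True, timeout: float = 10.0) -> bool:
    """Send one ping; return True if the service answered with HTTP 200."""
    try:
        with urllib.request.urlopen(ping_url(base, healthy), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Network and HTTP errors are swallowed so a scheduled job can
        # simply try again on its next run.
        return False
```

Run on a schedule (for example as a Kubernetes CronJob), `send_heartbeat(PING_URL)` keeps the check green. When the cluster or its outbound network path fails, the pings stop arriving and the check eventually goes down.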

OctoTrack

k3s-cluster · monitoring

Octopus Energy electricity usage dashboard. Pulls consumption data from the Octopus Energy API and displays it in Grafana.

Source · values.yaml
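
For orientation, this is roughly what a request for half-hourly consumption data from the Octopus Energy REST API looks like. It is a sketch under assumptions, not OctoTrack's actual code: the API key, MPAN, and meter serial below are placeholders, and the function name is hypothetical.

```python
import base64
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.octopus.energy/v1"


def build_consumption_request(api_key, mpan, serial, period_from, period_to):
    """Build (but do not send) a GET request for electricity consumption."""
    query = urlencode({
        "period_from": period_from,
        "period_to": period_to,
        "page_size": 100,
    })
    url = (
        f"{API_BASE}/electricity-meter-points/{mpan}"
        f"/meters/{serial}/consumption/?{query}"
    )
    # The API uses HTTP Basic auth with the key as username and an empty password.
    token = base64.b64encode(f"{api_key}:".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder credentials and meter identifiers.
req = build_consumption_request(
    "sk_test", "1234567890123", "00A0000000",
    "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z",
)
```

The response is paginated JSON, so a real collector would follow the `next` link until it is empty before writing the readings to a data source that Grafana can query.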