vision
HomeLab on Kubernetes (k3s)
Main objectives
- Create a multi-zone, self-hosted Kubernetes (k3s) cluster
- Migrate workloads
Design requirements
- GitOps, ArgoCD, workflows
- use WireGuard to wire locations together
- domain zone control (Let's Encrypt)
- persistent storage available for all nodes
- metrics/alerting
- secrets
Goals that define basic success
- A healthy multi-zone Kubernetes cluster (k3s)
- IaC in Git (GitOps)
- Secrets store (Vault)
- DNS name issuing/signing (cert-manager with Cloudflare)
- WireGuard
- Metrics/logs collected with an OTLP collector and stored in VictoriaMetrics
End goal of (un)achievable completion
- Can't think of any
the-stack
HomeLab: the stack
What is running today
The drawing board
NixOS hosts with services.k3s.enable (heavily stripped) + WireGuard setup
The GitOps layout eventually matured to the following:
applications
├── applicationset.yaml
├── home-automation
│   ├── esphome
│   ├── hass
│   ├── mosquitto
│   └── zigbee2mqtt
├── mediaserver
│   ├── jellyfin
│   └── transmission
├── narbuto
├── utils
│   └── actual-budget
└── vaultwarden
infrastructure
├── applicationset.yaml
├── controllers
│   ├── vault
│   └── argocd
├── monitoring
│   ├── kube-prometheus
│   └── metrics-server
├── networking
│   ├── adguard
│   ├── cert-manager
│   ├── cilium
│   ├── cloudflared
│   └── gateway
└── storage
    └── openebs
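
The contents of the applicationset.yaml files are not shown here. As a rough sketch only, an ArgoCD ApplicationSet with a Git directory generator could turn each folder in a layout like this into its own Application. The repository URL, the `default` project, and the plugin name `helmfile` are placeholders assumed for illustration, not the real configuration:

```yaml
# Hypothetical sketch: one ArgoCD Application per directory under applications/.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: applications
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://example.com/homelab.git   # placeholder repository
        revision: HEAD
        directories:
          # Matches nested apps such as applications/mediaserver/jellyfin.
          # Top-level apps would need an additional path entry.
          - path: applications/*/*
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: default
      source:
        repoURL: https://example.com/homelab.git   # placeholder repository
        targetRevision: HEAD
        path: '{{path}}'
        plugin:
          name: helmfile   # assumed name of the config management plugin
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        syncOptions:
          - CreateNamespace=true
```

Leaving automated self-healing out of the sync policy is a deliberate choice in this sketch: it lets live changes show up as drift instead of being reverted immediately.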
Control plane and delivery model
- Ingress traffic over Cloudflare tunnel
- Repository layout is split into applications/* and infrastructure/*.
- Deployments are done by ArgoCD with the Helmfile plugin instead of Helm + Kustomize (Cilium and ArgoCD included).
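
For illustration, a helmfile.yaml in one of those directories might look roughly like the following. This is a minimal sketch, not the actual file: the release layout and the Hubble values are assumptions, although the chart repository URL and the value keys are the ones published by the Cilium project.

```yaml
# Hypothetical infrastructure/networking/cilium/helmfile.yaml
repositories:
  - name: cilium
    url: https://helm.cilium.io/

releases:
  - name: cilium
    namespace: kube-system
    chart: cilium/cilium
    values:
      # Inline values; these could also live in a separate values.yaml.
      - hubble:
          relay:
            enabled: true
          ui:
            enabled: true
```

Running `helmfile template` against such a file renders plain manifests, which is what a config management plugin would hand back to ArgoCD.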
Infrastructure
- Networking:
  - Cloudflared
  - Cilium (+ Hubble)
  - Gateway API (instead of Ingress)
  - AdGuard
- Certificates: cert-manager with Cloudflare DNS-01.
- Storage: OpenEBS with ZFS-based storage classes.
- Observability:
  - kube-prometheus stack
  - metrics-server
- Platform control:
  - ArgoCD
  - HashiCorp Vault to render cluster secrets
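
As a sketch of what the certificate and storage pieces can look like, here is a cert-manager ClusterIssuer using the Cloudflare DNS-01 solver and an OpenEBS ZFS StorageClass. All resource names, the email address, the Secret name, and the ZFS pool name are placeholders assumed for the example:

```yaml
# Hypothetical ClusterIssuer: Let's Encrypt via Cloudflare DNS-01.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-cloudflare
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com            # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token   # Secret holding a Cloudflare API token
              key: api-token
---
# Hypothetical StorageClass backed by a ZFS pool through OpenEBS ZFS-LocalPV.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfs
provisioner: zfs.csi.openebs.io
parameters:
  poolname: "tank"    # placeholder name of an existing ZFS pool on the node
  fstype: "zfs"
```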
Routes and domain shape
- A Gateway resource is configured with listeners for the configured domain.
- Services are exposed through HTTPRoute resources.
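
A minimal sketch of that shape, assuming the Cilium GatewayClass and a wildcard certificate: the domain `example.com`, the namespaces, and the certificate Secret name are placeholders, and Jellyfin is used only as an example backend on its default port.

```yaml
# Hypothetical Gateway with one HTTPS listener for the whole domain.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: homelab
  namespace: gateway
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      hostname: "*.example.com"
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: wildcard-example-com   # Secret issued by cert-manager
      allowedRoutes:
        namespaces:
          from: All                      # let routes in other namespaces attach
---
# Hypothetical HTTPRoute exposing one service through that Gateway.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: jellyfin
  namespace: jellyfin
spec:
  parentRefs:
    - name: homelab
      namespace: gateway
  hostnames:
    - jellyfin.example.com
  rules:
    - backendRefs:
        - name: jellyfin
          port: 8096
```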
Application workloads
- Home automation:
  - Home Assistant
  - Zigbee2MQTT
  - Mosquitto
  - ESPHome
- Media:
  - Jellyfin
  - Transmission
  - TODO: Immich
- Utilities/apps:
  - Vaultwarden
  - Actual budget
  - Personal website
the-story
HomeLab: the story
Inspiration/Motivation
Originally, my host and service definitions lived in a Nix monorepo. It was OK for the way I was using it: Traefik wired to services like Plex or Home Assistant, with configuration that never changed. However, if I wanted to play with configuration options, try new versions, or just drop in a service to try it out, the excessive bureaucracy was not very comfortable.
The way I like to work:
- Deploy initial setup
- Fine tune the changes until I see the output I expect
- Catch the drift and commit it in a final version
In the meantime, professionally, I was doing just that.
And when it was time to replace my 10-year-old Intel NUC, I realized that it was the perfect time to make some changes.
Use Nix to keep track of infra. Use GitOps to keep the state of the runtime.
As for the rest, let Kubernetes do its Eventual Consistency thing.
Shopping List
- free Cloudflare account
- a domain or two transferred to Cloudflare
- a couple of N150 boxes. The prices aren't what they used to be though 😭
The Process
Early days
Many things have changed since I decided to do this. At first I just wanted to learn more about Cilium's eBPF magic, and to stay motivated enough to finish, as with most things, I made a project out of it.
As I started putting things together, I wanted to be able to spread my boxes between homes. Additionally, I had to consider the problems that arise from my ISP's CGNAT limitations.
I wanted ArgoCD to pick up secrets from a secret manager, i.e. a self-hosted Vaultwarden instance. After two weeks of deep diving into Bitwarden internals and bending to its limitations, I managed to get things working, only to have a friend ask me: why not Vault?
Another big one was how to add an authentication layer like Authelia or Authentik. I had one on my old setup, but since Cloudflare already offers something similar, I decided to keep it simple and just use Cloudflare's.
Up until now
I have migrated all the workloads from Nix expressions to Helm charts/values. And oh, what a joy it's giving me every day.
Moving Home Assistant was easy. Home Assistant has become a big Pandora's box, and I think it's best to just run it as a Docker image and mount your configuration, compared to all the work that had to be done to package it with Nix.
Most of the services are stateless and can move around the nodes. For the ones that need access to a particular disk or peripheral, nodeSelector does the job:
- Jellyfin mounts local storage and has access to hardware decoding.
- Zigbee stack is able to get to its USB coordinator.
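
As an illustration of the nodeSelector approach, here is a sketch of a Deployment pinned to the node that has the Zigbee coordinator plugged in. The node name, device path, and namespace are assumptions for the example, and the real setup is deployed from Helm values rather than a hand-written manifest:

```yaml
# Hypothetical Deployment pinned to the node with the USB coordinator.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zigbee2mqtt
  namespace: home-automation
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zigbee2mqtt
  template:
    metadata:
      labels:
        app: zigbee2mqtt
    spec:
      nodeSelector:
        kubernetes.io/hostname: node-with-coordinator   # placeholder node name
      containers:
        - name: zigbee2mqtt
          image: koenkk/zigbee2mqtt
          securityContext:
            privileged: true          # needed to open the host device
          volumeMounts:
            - name: coordinator
              mountPath: /dev/ttyACM0
      volumes:
        - name: coordinator
          hostPath:
            path: /dev/ttyACM0        # placeholder device path
            type: CharDevice
```

A custom node label (for example one marking the node that owns the media disk) works the same way and avoids hard-coding a hostname.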
Vault contains cluster-related secrets. Because the secrets shouldn't live longer than the cluster, there is no need for an external secrets manager.
And every time I need to drop in another service, I just template a Helm chart, create an HTTPRoute resource, and watch things come to life.
If I am playing with versions or config files, once I'm happy I check the ArgoCD web UI for the drift, commit the changes, and bam.
What is there for tomorrow
Even though I have been working with these things professionally for many years, I was pleasantly surprised by how rewarding it can be to create your own cluster versus terraforming an AWS-managed one. You're free to make your own choices, adapt to your own use case, and go wild researching and experimenting with unexplored concepts.
As for what comes next: I will probably replace kube-prometheus-stack with something better suited to a non-enterprise environment, like OpenTelemetry + VictoriaMetrics (logs/traces/vmalert), for metrics compatibility and storage retention.
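
None of this exists yet, so the following is only a sketch of the idea: an OpenTelemetry Collector configuration that receives OTLP and forwards metrics to VictoriaMetrics through its Prometheus remote-write endpoint. The service hostname is a placeholder, and the `prometheusremotewrite` exporter assumes the contrib distribution of the collector:

```yaml
# Hypothetical OpenTelemetry Collector config: OTLP in, remote write out.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  prometheusremotewrite:
    # Placeholder in-cluster address of a single-node VictoriaMetrics.
    endpoint: http://victoriametrics.monitoring.svc:8428/api/v1/write

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]
```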