My homelab evolution

One of my hobbies is managing my homelab environment. I run numerous services for entertainment as well as organizational purposes.

My homelab setup went through multiple stages.

Years ago I was managing a 3-node cluster built from office thin clients, running a full Kubernetes distribution provisioned by Kubespray, an Ansible playbook that bootstraps a fully functioning Kubernetes cluster.

Later, having tried NixOS again after years of not using it, I decided to replace my Kubernetes cluster with a single, powerful machine: an AMD Ryzen 5950X with 64 GB of RAM, 16 TB of usable storage on spinning rust managed by ZFS, and an array of flash storage for VMs.

Service containers were managed declaratively and I had multiple VMs and my laptop all configured through the central NixOS repository.

This was a typical service declaration (I had a dozen or so of these):

virtualisation.oci-containers.containers = {
  myservice = {
    image = "docker.io/image:version";
    volumes = [...];
    ports = [...];
  };
};

Volume backups were managed by restic, which backed up encrypted data to a remote storage location. This worked well for quite a while. However, I had always enjoyed running my services on Kubernetes, and since I managed Kubernetes clusters at work, I had gained more experience by then. The idea of having all the primitives available out of the box in a clustered environment seemed pretty appealing.
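On NixOS, such a backup can be declared right alongside the containers. A minimal sketch using the `services.restic.backups` module; the repository URL, paths, and secret file locations here are placeholders, not my actual setup:

```nix
services.restic.backups.volumes = {
  # Hypothetical paths and repository; substitute your own.
  paths = [ "/var/lib/containers/volumes" ];
  repository = "s3:https://backup.example.com/homelab";
  passwordFile = "/run/secrets/restic-password";  # key used to encrypt the repository
  environmentFile = "/run/secrets/restic-s3-env"; # S3 credentials
  initialize = true;                              # create the repository if it does not exist
  timerConfig.OnCalendar = "daily";               # systemd timer schedule
  pruneOpts = [ "--keep-daily 7" "--keep-weekly 4" ];
};
```

This generates a systemd service and timer per backup, so scheduling and retention live in the same repository as the service definitions.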

The Kubernetes ecosystem is mature enough to let you build exactly the kind of environment you need. On top of that, I missed using Longhorn for volume management, with its ability to back up and restore volumes through its UI, as well as having Grafana and Prometheus for monitoring and Alertmanager for making sure my services stay up. Having a Kubernetes cluster also meant I could take worker nodes out of operation for maintenance and replacement.

I replaced the single machine with two mini PCs based on Intel N100 processors, plus a NAS server running TrueNAS with a virtual machine acting as the third node. These mini PCs are amazing. They draw hardly any power: mine average around 10 watts per node! With 16 GB of RAM and 4 CPU cores each, you get serious computing capacity to run self-hosted services at home, plus a little extra to experiment with, all at under 50 watts of average total power usage (bursts take a node up to 15-25 W, and they need a switch to connect them together). They are cheap and ubiquitous; their long-term reliability is yet to be determined.

I was able to utilise my existing NixOS configuration to provision the mini PCs with NixOS and join them to the cluster. With NixOS, configuring a node and adding it to the cluster looks like this:

{
  config,
  lib,
  pkgs,
  ...
}:

{
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.package = pkgs.k3s_1_31;
  services.k3s.serverAddr = "https://maia.internal:6443";
  services.k3s.extraFlags = [
    "--bind-address=10.200.0.8"
    "--node-ip=10.200.0.8"
    "--node-name=vala"
    "--etcd-s3"
    "--etcd-s3-config-secret=k3s-etcd-snapshot-s3-config"
  ];

  environment.systemPackages = [
    pkgs.k3s
    pkgs.openiscsi
  ];
}

Now I can provision my nodes with NixOS, which is what it excels at. My services are fully managed by Kubernetes. My NAS server is connected to the cluster over NFS on a 10 Gbit network (although the mini PCs top out at 2.5 Gbit). My applications are managed through ArgoCD, following GitOps.
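The NFS connection can also be expressed declaratively in NixOS. A sketch, assuming a hypothetical NAS hostname and export path:

```nix
fileSystems."/mnt/tank" = {
  # "nas.internal" and the export path are illustrative placeholders.
  device = "nas.internal:/mnt/tank";
  fsType = "nfs";
  options = [
    "nfsvers=4.2"
    "noatime"
    "x-systemd.automount"        # mount lazily so boot does not block on the NAS
    "x-systemd.idle-timeout=600" # unmount after 10 minutes of inactivity
  ];
};
```

The systemd automount options are a useful guard here: a node can boot and join the cluster even when the NAS is unreachable.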

Adding MinIO to the mix (managed by TrueNAS) enables backing up all Longhorn volumes and the K3s etcd snapshots. Hooking up the Prometheus stack with Grafana enables observability. Now it really does feel like a bulletproof self-hosted service management system and computing environment.

A few downsides of this setup compared to a single beefy node:

  • Dependence on the NAS for storage. If the NAS goes down, some services crash with it.
  • More hardware to manage. Instead of 1 computer there are now 3, plus the switches and wires connecting them all together.
  • Higher resource overhead. K3s is more compute-demanding than running bare Podman/Docker containers.

Managing a Kubernetes cluster at home may seem like overkill (because it is). However, if you already have some experience (or are interested enough to get it), it does not need much maintenance after the initial setup.

Before setting things up, it’s a good idea to think about these aspects upfront:

  • How to access workloads securely
  • How to manage external and internal (LAN-only) ingress
  • The best way to manage secrets
  • How to handle certificates
  • The backup strategy to implement
  • How to set up monitoring

I wish I had! For secrets, I would rather use External Secrets Operator than Vault: with External Secrets Operator I could connect my existing secrets manager, whereas with Vault I have to manage my own instance. The same goes for certificates. At the time, I was interested in using Vault as a certificate authority; now I would rather use Let's Encrypt with a DNS challenge. Adding a custom CA to all devices is rather cumbersome and not even always possible. For example, an app on my LG TV cannot connect to a service because the TV does not trust my internal CA.

Apart from that, I am very happy with how things are set up. I still have my Ingress setup and certificate management left to improve.

It blows my mind how much I can run myself at home on open source software and the relatively cheap commodity compute available today. None of this would be possible without the amazing open source projects that I choose to run and use day to day.