What I learned running K3s on NixOS for 2+ years
I started using NixOS for media management and storage needs. I wish I could say I sat down, did in-depth research, and consciously decided NixOS was the best OS for my needs, but that was not the case. I just read through some blog posts and thought it'd be nice to experiment with NixOS again. I developed my configs for multiple machines over time and was happy with the amount of effort maintaining the codebase took compared to the advantages it gave me. Configuring a user and dotfiles in a single codebase and applying it to both Linux servers and macOS is incredibly powerful.
I use flakes to build my systems and nixos-anywhere to provision new hardware / virtual machines.
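For context, a minimal multi-host flake might look like this (the hostname and module paths are illustrative, not my actual layout):

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }: {
    # One entry per machine; "k3s-node1" is an illustrative hostname.
    nixosConfigurations.k3s-node1 = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./hosts/k3s-node1/configuration.nix ];
    };
  };
}
```

A fresh machine can then be provisioned over SSH with something like `nix run github:nix-community/nixos-anywhere -- --flake .#k3s-node1 root@<ip>`.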
I decided to run my home services on a Kubernetes cluster and had heard great things about K3s. I already had good experience with Kubernetes and a large collection of applications (Helm charts in ArgoCD apps) configured to run on my cluster. After reading a few blog posts on K3s on NixOS I decided to give it a go and had a Kubernetes node running in no time.
K3s on NixOS
K3s came with all the bells and whistles I need - Traefik ingress and an ArgoCD instance, as well as a storage CSI (local-path-provisioner, which allows host-mounted volumes via a CSI plugin).
What I really enjoyed about running K3s is that you can point it at a Kubernetes secret on the cluster containing S3-compatible backend credentials, and it will back up etcd on launch (and on a schedule).
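As a sketch, the NixOS side of this might look like the following, assuming a recent nixpkgs where `services.k3s.extraFlags` accepts a list; the secret name and cron schedule are illustrative:

```nix
# Scheduled etcd snapshots to an S3-compatible backend, with the S3
# credentials read from an in-cluster Kubernetes secret.
services.k3s = {
  enable = true;
  role = "server";
  extraFlags = [
    "--etcd-snapshot-schedule-cron=0 */6 * * *"
    "--etcd-s3"
    "--etcd-s3-config-secret=etcd-s3-config"  # illustrative secret name
  ];
};
```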
Service support for K3s on NixOS is rock solid and regularly kept up to date. With a few caveats, outlined below, it made for a great Kubernetes cluster with the OS defined declaratively.
Longhorn's POSIX-style path expectations
For storage, my preference was Longhorn, so I deployed it with Helm. However, it wouldn't run on NixOS due to non-standard binary paths (Longhorn uses host tooling to manage NFS and iSCSI when mounting volumes). Luckily, as is often the case with NixOS, someone had stumbled on this problem before and posted a solution that applies a Kyverno policy to modify the PATH of the Longhorn containers so they find the binaries they need.
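The Kyverno policy handles the container side; on the host side, a sketch of the prerequisites might look like this (the IQN is illustrative, and option names assume current nixpkgs):

```nix
{ config, pkgs, ... }:
{
  # iSCSI daemon that Longhorn relies on for attaching volumes.
  services.openiscsi = {
    enable = true;
    name = "iqn.2020-01.org.example:${config.networking.hostName}";
  };

  # NFS client support, needed for Longhorn's RWX volumes.
  boot.supportedFilesystems = [ "nfs" ];
  environment.systemPackages = with pkgs; [ nfs-utils openiscsi ];
}
```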
Stateful applications in declarative OS
To add another instance to the cluster I had to specify a static token. With everything in NixOS being declarative, this token had to live in a configuration file.
To form the cluster here’s what I did:
- Defined a static token in the original node's Nix configuration
- Added the same token to the new node
- Applied configuration on both
- Removed static tokens from configuration
It's an awkward process. Later I realised I could manually run the K3s command on the new node, with the token, to join the cluster. Then I'd stop the node and apply my Nix configuration without the hardcoded token; K3s would re-use the token stored on the host during the first (manual) service start. This is where the declarative nature of Nix clashes with inherently stateful applications. Even though my cluster was not exposed to the internet and committing the token to a private repository would not be a big security risk, it was still not desirable.
I realise now the NixOS K3s module does include services.k3s.tokenFile, so you don't have to expose your token in the code (or the Nix store) to make this work!
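A sketch of joining a second server node this way, with the token supplied out-of-band (the secret path would come from a tool like sops-nix; the path and server address are illustrative):

```nix
# Join an existing cluster without the token ever entering the repo
# or the Nix store.
services.k3s = {
  enable = true;
  role = "server";
  serverAddr = "https://10.0.0.10:6443";   # illustrative first-node address
  tokenFile = "/run/secrets/k3s-token";    # provisioned by a secrets tool
};
```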
nixpkgs-unstable and living on the bleeding edge
Upgrading the cluster was as easy as running nixos-rebuild on all nodes. It worked most of the time, but occasionally it would take down nodes in my cluster. This was entirely my fault, as I YOLO'd upgrades quite often. I would run nixos-rebuild on a live system and more often than not it was fine, but occasionally it would starve the node of resources (especially if Nix was compiling a package instead of pulling from the cache), etcd would refuse to serve the cluster, and the node would become inoperable.
I made quite a few mistakes here. First, not cordoning and draining the node before an upgrade. Second, my control plane nodes were running workloads and, on top of that, running on a SATA SSD, so available IOPS were terrible to begin with. Combined with the nodes being mini PCs with an N100 (4 cores), this is often a nail in the coffin for etcd: it has very low tolerance for latency, and pegging the CPU while starving the disk of IOPS is a terrible combination.
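With hindsight, a safer per-node upgrade sequence looks roughly like this (hostnames and the build machine are placeholders):

```shell
# Stop scheduling onto the node and evict its workloads first.
kubectl cordon node1
kubectl drain node1 --ignore-daemonsets --delete-emptydir-data

# Build on a beefier machine, activate on the node over SSH.
nixos-rebuild switch --flake .#node1 \
  --build-host me@buildbox --target-host root@node1

# Let workloads come back once the node is healthy.
kubectl uncordon node1
```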
Updating the flake.nix nixpkgs-unstable input to the latest commit and running nixos-rebuild would sometimes require compiling a package locally because it was not in the binary cache (e.g. MongoDB, Vault, and Terraform are not cached due to their licenses). But even then, some upgrades just seemed quite expensive (I could not figure out why). In hindsight, using a beefier server for builds would have been worth it - it's possible to build Nix store paths on one machine and push them to another - but all I had were crappy mini PCs.
nixpkgs-unstable… again
No one is immune to this, and it's not Nix's fault of course, but the bleeding edge would occasionally introduce head-scratchers. My favourite is Kubernetes issue #130999: ConfigMap SubPath Volume Mount failed on util-linux=2.41.
I upgraded a node and restarted the services, but the Vault agent could no longer mount secrets into containers. It worked on the other nodes, just not this one. I spent a few short evenings troubleshooting until I stumbled on the issue above, which had also been raised in the K3s repository.
Living on nixpkgs-unstable will expose you to these occasional issues, and there's unfortunately no way to avoid them.
Mixing laptop and server configurations
A great thing about NixOS configuration (at least with flakes) is that you can store the configuration for all your computers in a single repository. Define a single flake.nix file, split the configuration into multiple hosts and users, and you can manage anything that runs NixOS or nix-darwin modules in one repository.
However, if a package I depend on for macOS is updated often, I will bump nixpkgs-unstable to the latest revision. That means the next rebuild will upgrade the whole system (even if I just wanted to add a single package). It is certainly possible to define different nixpkgs inputs for different systems, but that duplicates the whole nixpkgs tree and consumes more memory during rebuilds. There are also overlays, but I never got a good grip on them. So I had to carefully manage my flake and plan my nixos-rebuild runs. Luckily, my K3s nodes were relatively slim and it was not a huge problem.
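The separate-inputs approach can be sketched like this; the input and host names are illustrative:

```nix
{
  # Pin laptops and servers to different nixpkgs revisions so a laptop
  # bump doesn't force a full server rebuild. Both track the same branch
  # but are locked independently in flake.lock.
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";         # laptops, bumped often
    nixpkgs-server.url = "github:NixOS/nixpkgs/nixos-unstable";  # servers, bumped deliberately
  };

  outputs = { nixpkgs, nixpkgs-server, ... }: {
    nixosConfigurations.k3s-node1 = nixpkgs-server.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [ ./hosts/k3s-node1.nix ];
    };
    # Laptop hosts and darwinConfigurations would use `nixpkgs` instead.
  };
}
```

A single input can then be bumped on its own with `nix flake update nixpkgs` (or `nix flake lock --update-input nixpkgs` on older Nix) without touching the server pin.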
Conclusion
I had no issues managing a K3s cluster with all machines running NixOS. It allowed me to easily switch a machine's purpose from general-purpose server to NAS to K3s node, and to add extra nodes to form the cluster.
K3s plays nicely with NixOS in general, and the maintainers do a great job keeping it up to date. I would recommend running rebuilds on a different machine, especially if your system is resource-constrained.
Additionally, it makes sense to split your work / personal computer repository from your server configuration repository. Server update cadence is different and requires more consideration. Being a bit fast and loose on a personal computer is less of an issue (simply reboot into the previous configuration - one of the greatest things about NixOS) compared to a server, where production is impacted (even if it's just the home instance of your media server).