Hybrid Talos KubeSpan cluster - control plane nodes with Terraform
Phase 0 - prerequisites: deploying Hetzner control plane instances
In this phase I will deploy Hetzner control plane nodes via Terraform.
I created a Terraform module that allows customizing instance types and other aspects of the control plane nodes. It is a simple module that relies on the Hetzner Cloud (hcloud) Terraform provider.
Here are the resources I will need to provision the control plane stack:
- hcloud_network
- hcloud_network_subnet
- hcloud_firewall
- hcloud_ssh_key
- hcloud_server
Terraform tfvars file
For easier configuration of the module, I created a tfvars file with the following parameters:
root_domain = ""
# Kubernetes API domain name
kubernetes_api_domain = "" # k8s api dns prefix
# Hetzner Cloud configuration for Talos
hcloud_token = ""
hcloud_region = "fsn1"
hcloud_control_plane_type = "cx32"
# Talos configuration
cluster_name = "homelab-talos"
talos_version = "v1.10.1"
control_plane_count = 1
worker_count = 0
# Network configuration
network_cidr = "10.0.0.0/8"
servers_subnet_cidr = "10.0.1.0/24"
My API DNS entry is <kubernetes_api_domain>.<root_domain>. hcloud_token is required to communicate with the hcloud API via the provider.
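For completeness, the module needs matching variable declarations. A minimal sketch of what variables.tf might look like (the descriptions and defaults here are my own assumptions, not taken from the repository):

```hcl
# variables.tf (sketch) - inputs matching terraform.tfvars
variable "root_domain" {
  type        = string
  description = "Base DNS zone for the cluster"
}

variable "kubernetes_api_domain" {
  type        = string
  description = "DNS prefix for the Kubernetes API"
}

variable "hcloud_token" {
  type      = string
  sensitive = true
}

variable "hcloud_control_plane_type" {
  type    = string
  default = "cx32"
}

variable "control_plane_count" {
  type    = number
  default = 1
}

# hcloud_region, cluster_name, talos_version, worker_count,
# network_cidr and servers_subnet_cidr follow the same pattern.
```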
Hcloud network and subnet
I set up a /8 network with 10.0.1.0/24 as the control plane subnet. This is configured via the terraform.tfvars file:
network_cidr = "10.0.0.0/8"
servers_subnet_cidr = "10.0.1.0/24"
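The network and subnet resources themselves are short. A sketch of how they might be declared, with resource names matching the references used by the server resource later on (the network_zone value is my assumption for the fsn1 region):

```hcl
# Private network spanning the whole cluster address space
resource "hcloud_network" "network" {
  name     = "${var.cluster_name}-network"
  ip_range = var.network_cidr
}

# Subnet that the control plane servers attach to
resource "hcloud_network_subnet" "cloud_subnet" {
  network_id   = hcloud_network.network.id
  type         = "cloud"
  network_zone = "eu-central" # assumption: fsn1 is in the eu-central zone
  ip_range     = var.servers_subnet_cidr
}
```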
Hcloud firewall
I need to ensure the firewall rules allow KubeSpan to set up node communication. Here’s what it looks like:
resource "hcloud_firewall" "cluster_firewall" {
  name = "${var.cluster_name}-firewall"

  # Allow internal traffic
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "any"
    source_ips = [var.network_cidr]
  }
  rule {
    direction  = "in"
    protocol   = "udp"
    port       = "any"
    source_ips = [var.network_cidr]
  }

  # Allow SSH from anywhere
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  # Allow Kubernetes API
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "6443"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  # Allow KubeSpan
  rule {
    direction  = "in"
    protocol   = "udp"
    port       = "51820"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  # Allow Talos API
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "50000"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  # Allow Talos API alternate port
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "50001"
    source_ips = ["0.0.0.0/0", "::/0"]
  }

  # ICMP
  rule {
    direction  = "in"
    protocol   = "icmp"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
}
Note that this allows port 22 from anywhere; I only need SSH for the initial bootstrapping. Talos machines do not run an SSH daemon, so having 22 open poses no risk once the control plane nodes are provisioned. I also open ICMP, UDP port 51820 for KubeSpan, TCP 50000/50001 for Talos API communication, and TCP 6443 for Kubernetes API access. In the long run, I’d like to set up Tailscale authentication for the Kubernetes API, which is possible via the Tailscale Kubernetes operator.
Hcloud server and ssh key
# SSH key for the servers
resource "hcloud_ssh_key" "default" {
  name = "${var.cluster_name}-key"
  # TODO: make this configurable
  public_key = file("~/.ssh/id_ed25519.pub")
}

# Control plane nodes
resource "hcloud_server" "control_plane" {
  count = var.control_plane_count

  name        = "${var.cluster_name}-control-plane-${count.index + 1}"
  server_type = var.hcloud_control_plane_type
  # Start with a basic image - we'll install Talos via rescue mode
  image    = "debian-12"
  location = var.hcloud_region

  ssh_keys     = [hcloud_ssh_key.default.id]
  firewall_ids = [hcloud_firewall.cluster_firewall.id]

  delete_protection  = true
  rebuild_protection = true

  network {
    network_id = hcloud_network.network.id
    ip         = cidrhost(var.servers_subnet_cidr, count.index + 1)
  }

  # We're using local provisioners in talos.tf to configure the nodes
  # No need for cloud-init user_data
  depends_on = [hcloud_network_subnet.cloud_subnet]
}
I’ll create a Debian 12 box; it’s only needed for the initial bootstrapping, since Talos will be installed over it via rescue mode.
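To feed the node addresses into the next phase, it helps to export them as outputs. A sketch of what that could look like (the output names here are my own, not from the repository):

```hcl
# Public IPv4 addresses of the control plane nodes
output "control_plane_ips" {
  value = hcloud_server.control_plane[*].ipv4_address
}

# Private addresses on the cluster network
output "control_plane_private_ips" {
  value = [for s in hcloud_server.control_plane : one(s.network[*].ip)]
}
```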
Running plan and apply
I prefer to use a GNUmakefile for the most commonly used commands.
This way, I can run everything from the root folder of the project.
For tofu plan, I run make tf-plan.
For tofu apply, I run:
make tf-apply
or, with auto-approve: make tf-apply AUTO_APPROVE=true
Check the GNUmakefile to see what tf-apply does under the hood.
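The wrapper targets could be sketched roughly like this (a minimal GNUmakefile, assuming the Terraform code lives in a terraform/ subdirectory; the directory layout and TF_DIR variable are my assumptions):

```make
# GNUmakefile (sketch) - wrapper targets for OpenTofu
TF_DIR       ?= terraform
AUTO_APPROVE ?= false

.PHONY: tf-plan tf-apply

tf-plan:
	tofu -chdir=$(TF_DIR) plan

tf-apply:
ifeq ($(AUTO_APPROVE),true)
	tofu -chdir=$(TF_DIR) apply -auto-approve
else
	tofu -chdir=$(TF_DIR) apply
endif
```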
Phase 1 - installing Talos.
In the next section, I’ll describe how to install Talos OS on the control plane nodes.
Code.
The code used to deploy the cluster is available on GitHub - sashkachan/talos-kubespan-bootstrap. I will use this code for the walkthrough of all phases and the configuration required to make it succeed.