Why Talos
Bootstrapping Kubernetes by hand with kubeadm requires 15+ steps and manual certificate management. Talos gives you a declarative cluster that manages its own certificates, API server rotation, and upgrades, all through a single machine config.
The trade-off: Talos is opinionated. You don’t get SSH, a shell, or a package manager, and the kubelet is managed for you through the machine config. But for infrastructure that should just work, this is a feature.
Air-gapped requirement: My homelab can’t reach public registries. Every container pull is redirected through my Harbor mirror. Talos’s registry mirror config makes this seamless.
Talos automatically rotates certificates before they expire. No manual intervention needed for cluster certificate management.
Module Capabilities
The tf-module-proxmox-talos module provisions a complete Talos-based Kubernetes cluster on Proxmox VE in a single Terraform apply:
- Talos Image Factory — generates custom ISOs with specific extensions
- Machine Configuration — generates Talos machine configs with networking
- ISO Upload — downloads and uploads to Proxmox datastore
- Node Provisioning — provisions control plane and worker VMs across host pool
- Cluster Bootstrap — applies machine configs and bootstraps Kubernetes
- Day-0 GitOps — optionally installs Flux or Argo CD during bootstrap
- Registry Mirrors — configures container registry redirects
Quick Start
module "talos_cluster" {
source = "registry.example.com/namespace/tf-module-proxmox-talos/talos"
version = "1.2.1"
configuration = {
cluster = {
name = "prod-k8s"
datastore = { id = "nas", node = "alpha" }
talos = { version = "v1.12.4" }
kubernetes_version = "v1.35.0"
}
host_pool = {
alpha = { datastore_id = "local-lvm" }
charlie = { datastore_id = "local-lvm" }
foxtrot = { datastore_id = "local-lvm" }
}
control_plane_nodes = {
nodes = [
{ size = "control_plane", networks = { dmz = { address = "192.168.62.21/24", gateway = "192.168.62.1" } } }
]
host_pool = ["alpha", "charlie", "foxtrot"]
vip = { enabled = true, address = "192.168.62.20" }
}
worker_nodes = {
nodes = [
{ size = "worker", networks = { dmz = { address = "192.168.62.24/24", gateway = "192.168.62.1" } } }
]
host_pool = ["alpha", "charlie", "foxtrot"]
}
node_size_configuration = {
control_plane = { cpu = 4, memory = 8192, os_disk = 128 }
worker = { cpu = 10, memory = 49152, os_disk = 128, data_disk = 512 }
}
}
}
Talos Image Factory
The module uses Talos’s image factory to generate custom ISOs with specific extensions:
# image.tf
resource "talos_image_factory_schematic" "this" {
schematic = yamlencode({
customization = {
systemExtensions = {
officialExtensions = data.talos_image_factory_extensions_versions.this.extensions_info[*].name
}
}
})
}
The extensions are defined in locals:
locals {
image = {
platform = "nocloud"
customizations = {
base = [
"lldp", # Network topology discovery
"qemu-guest-agent", # Proxmox agent integration
"util-linux-tools", # Core utilities
"iscsi-tools", # iSCSI storage
"nfs-utils" # NFS mounting
]
}
}
}
The generated schematic ID is used to construct the ISO URL:
resource "proxmox_download_file" "talos_iso" {
file_name = "talos-${var.configuration.cluster.name}-${var.configuration.cluster.talos.version}-${data.talos_image_factory_urls.this.schematic_id}.iso"
# Parentheses let the conditional expression span multiple lines
url = (
var.configuration.cluster.talos.iso_mirror != null
? replace(data.talos_image_factory_urls.this.urls.iso, "https://", var.configuration.cluster.talos.iso_mirror)
: data.talos_image_factory_urls.this.urls.iso
)
}
This allows downloading the ISO through a mirror or proxy in air-gapped environments.
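To make the mirror rewrite concrete, here is a minimal, provider-free sketch that can be evaluated with terraform console or terraform apply. The factory URL and its schematic ID are illustrative placeholders, not values produced by the module; only the replace() call follows the snippet above.

```hcl
# Sketch only: factory_iso_url and its schematic ID are illustrative placeholders.
locals {
  factory_iso_url = "https://factory.talos.dev/image/0123abcd/v1.12.4/nocloud-amd64.iso"
  iso_mirror      = "https://proxy.example.com/" # the trailing slash matters

  # Same expression shape as the module: swap the URL scheme for the mirror prefix.
  iso_url = (
    local.iso_mirror != null
    ? replace(local.factory_iso_url, "https://", local.iso_mirror)
    : local.factory_iso_url
  )
}

output "iso_url" {
  # => "https://proxy.example.com/factory.talos.dev/image/0123abcd/v1.12.4/nocloud-amd64.iso"
  value = local.iso_url
}
```

Because only the scheme is replaced, the original host name ends up as the first path segment, so the proxy has to route requests based on that prefix.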
Machine Configuration
Talos machine configuration is generated through the Talos provider:
data "talos_machine_configuration" "configurations" {
cluster_name = var.configuration.cluster.name
cluster_version = var.configuration.cluster.talos.version
# Control plane specific config
machine_type = "controlplane"
# Network configuration
network = {
interfaces = [
for name, network in var.configuration.control_plane_nodes.nodes[0].networks : {
interface = name
dhcp = false
addresses = [network.address]
}
]
}
# Kubernetes configuration
kubernetes = {
version = var.configuration.cluster.kubernetes_version
}
}
The configuration supports:
- Multiple network interfaces per node
- Registry mirrors for all major registries
- Custom CNI (Cilium) configuration
- kube-proxy disable
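For the last two items, the Talos machine config exposes cluster.network.cni.name and cluster.proxy.disabled. The sketch below shows one way such options could be turned into a config patch. The local names and the options map are assumptions modeled on the options block used elsewhere in this post, not the module's actual code.

```hcl
# Sketch only: local names are hypothetical; the YAML keys are Talos machine config fields.
locals {
  options = {
    disable_default_cni = true
    disable_kube_proxy  = true
  }

  cni_patch = yamlencode({
    cluster = {
      network = {
        # "none" tells Talos not to install its default CNI (Flannel)
        cni = { name = local.options.disable_default_cni ? "none" : "flannel" }
      }
      proxy = {
        # Skip kube-proxy, for example when Cilium takes over its role
        disabled = local.options.disable_kube_proxy
      }
    }
  })
}

output "cni_patch" {
  value = local.cni_patch
}
```

A patch like this would typically be passed through the config_patches argument of the Talos provider's machine configuration data source.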
Registry Mirrors
A key feature is container registry mirror configuration:
configuration = {
cluster = {
registry_mirrors = {
"ghcr.io" = {
endpoints = ["https://harbor.example.com/v2/gh"]
override_path = true
}
"registry.k8s.io" = {
endpoints = ["https://harbor.example.com/v2/k8s"]
override_path = true
}
"docker.io" = {
endpoints = ["https://harbor.example.com/v2/dh"]
override_path = true
}
"quay.io" = {
endpoints = ["https://harbor.example.com/v2/qi"]
override_path = true
}
"factory.talos.dev" = {
endpoints = ["https://harbor.example.com/v2/talos"]
override_path = true
}
}
}
}
All container pulls route through my Harbor registry, which is essential for air-gapped homelabs.
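As a rough illustration of how a map like this could end up in the Talos machine config, the sketch below renders it into a machine.registries.mirrors patch. The local names are hypothetical and the translation is an assumption about what a module might do. Newer Talos releases may offer other ways to express mirrors, so treat the field layout as something to check against the Talos version in use.

```hcl
# Sketch only: shows one possible translation, not the module's actual implementation.
locals {
  registry_mirrors = {
    "ghcr.io" = {
      endpoints     = ["https://harbor.example.com/v2/gh"]
      override_path = true
    }
    "docker.io" = {
      endpoints     = ["https://harbor.example.com/v2/dh"]
      override_path = true
    }
  }

  registry_patch = yamlencode({
    machine = {
      registries = {
        mirrors = {
          # One entry per upstream registry host
          for host, mirror in local.registry_mirrors : host => {
            endpoints = mirror.endpoints
            # overridePath stops Talos from appending /v2 to the endpoint itself
            overridePath = mirror.override_path
          }
        }
      }
    }
  })
}

output "registry_patch" {
  value = local.registry_patch
}
```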
Multi-Network Support
The module provisions VMs with multiple network interfaces:
network_devices = [
for network_name, network in each.value.networks : {
name = network_name
enabled = true
bridge = network_name
ipv4 = {
address = network.address
gateway = network.gateway
}
}
]
My production setup uses:
- dmz — frontend network with gateway (192.168.62.0/24)
- vmbr1 — backend network for inter-node communication (192.168.192.0/24)
Cluster Bootstrap
The bootstrap sequence is orchestrated by Terraform:
# 1. Generate machine secrets
resource "talos_machine_secrets" "this" {}
# 2. Apply control plane configuration
resource "talos_machine_configuration_apply" "controlplane" {
for_each = { for idx, node in var.configuration.control_plane_nodes.nodes : idx => node }
node = module.control_plane_virtual_machine[each.key].virtual_machine.id
config = data.talos_machine_configuration.configurations[each.key].machine_config
secrets = talos_machine_secrets.this.secrets
}
# 3. Bootstrap the cluster
resource "talos_machine_bootstrap" "this" {
node = var.configuration.control_plane_nodes.nodes[0].name
config = data.talos_machine_configuration.configurations[0].machine_config
secrets = talos_machine_secrets.this.secrets
}
GitOps Bootstrap
One of the most powerful features is that Flux or Argo CD can be bootstrapped during cluster creation:
configuration = {
cluster = {
gitops = {
provider = "flux" # or "argocd"
namespace = "flux-system"
chart_version = "2.18.2"
bootstrap = {
repo_url = "https://github.com/your-org/applications.git"
revision = "main"
path = "src/k8s/prod"
destination_namespace = "homelab"
}
}
}
}
This does the following:
- Installs Flux during Talos bootstrap (via inline manifest)
- Configures it to sync from the applications-homelab repository
- Lets the cluster start deploying apps immediately after boot
sequenceDiagram
participant TF as Terraform
participant Talos as Talos
participant Flux as Flux
participant GH as GitHub
participant K8s as Kubernetes
TF->>Talos: Apply machine config
Talos->>Talos: Bootstrap control plane
Talos->>Flux: Install Flux CRDs
Flux->>GH: Clone applications-homelab
GH-->>Flux: Return repo contents
Flux->>K8s: Deploy applications
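The "via inline manifest" step relies on the cluster.inlineManifests field of the Talos machine config, which Talos applies while bootstrapping the cluster. The sketch below only shows the shape of such a patch. The manifest is a placeholder Namespace rather than the real Flux install manifests, and the entry name is hypothetical.

```hcl
# Sketch only: the manifest content is a stand-in, not what the module renders.
locals {
  # Placeholder; a real setup would put the rendered GitOps install manifests here.
  gitops_namespace_manifest = yamlencode({
    apiVersion = "v1"
    kind       = "Namespace"
    metadata   = { name = "flux-system" }
  })

  gitops_patch = yamlencode({
    cluster = {
      inlineManifests = [
        {
          name     = "gitops-bootstrap" # hypothetical entry name
          contents = local.gitops_namespace_manifest
        }
      ]
    }
  })
}

output "gitops_patch" {
  value = local.gitops_patch
}
```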
Cilium Integration
For advanced networking, the default CNI can be replaced with bundled Cilium:
configuration = {
cluster = {
# Disable Talos-managed CNI
options = {
disable_default_cni = true
disable_kube_proxy = true
}
# Configure Cilium via helm values
helm_values_override = {
cilium = {
operator = { replicas = 1 }
}
}
}
}
The module uses the Helm provider to template the Cilium manifest:
data "helm_template" "cilium" {
name = "cilium"
repository = "https://cilium.github.io/cilium"
chart = "cilium"
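# Chart version; note this reuses the Talos version string rather than a dedicated Cilium chart version
version = var.configuration.cluster.talos.version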
namespace = "cilium"
# values expects a list of YAML strings
values = [yamlencode(var.configuration.cluster.helm_values_override.cilium)]
}
Node Sizing
The node_size_configuration block keeps definitions DRY:
node_size_configuration = {
control_plane = {
cpu = 4
memory = 8192 # MB
os_disk = 128 # GB
}
worker = {
cpu = 10
memory = 49152 # MB
os_disk = 128
data_disk = 512 # Extra data disk for PVs
}
}
My prod-k8s cluster:
- 3 control plane nodes: 4 vCPU, 8GB RAM, 128GB disk
- 3 worker nodes: 10 vCPU, 48GB RAM, 128GB OS + 512GB data
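The DRY part comes from each node referring to a size by name. A minimal, provider-free sketch of that lookup is shown below. The node list, the name attribute, and the local names are hypothetical, and the module may resolve sizes differently.

```hcl
# Sketch only: node names and local names are hypothetical.
locals {
  node_size_configuration = {
    control_plane = { cpu = 4, memory = 8192, os_disk = 128 }
    worker        = { cpu = 10, memory = 49152, os_disk = 128, data_disk = 512 }
  }

  worker_nodes = [
    { name = "worker-0", size = "worker" },
    { name = "worker-1", size = "worker" },
  ]

  # Merge each node with the size definition it points at
  resolved_workers = [
    for node in local.worker_nodes :
    merge(node, local.node_size_configuration[node.size])
  ]

  # => 98304 (2 x 49152 MB)
  total_worker_memory_mb = sum([for node in local.resolved_workers : node.memory])
}

output "resolved_workers" {
  value = local.resolved_workers
}

output "total_worker_memory_mb" {
  value = local.total_worker_memory_mb
}
```

Changing a size in one place then updates every node that references it.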
Host Pool Scheduling
VMs are distributed across Proxmox nodes via modulo arithmetic:
# In nodes.tf
node_name = var.configuration.control_plane_nodes.host_pool[
each.key % length(var.configuration.control_plane_nodes.host_pool)
]
The same pattern applies to workers. With 3 hosts in the pool and 6 worker indices:
- Worker 0 → alpha (0 % 3)
- Worker 1 → charlie (1 % 3)
- Worker 2 → foxtrot (2 % 3)
- Worker 3 → alpha (3 % 3)
- Worker 4 → charlie (4 % 3)
- Worker 5 → foxtrot (5 % 3)
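This spreads nodes evenly across the hosts whenever the node count is a multiple of the pool size; otherwise the first hosts in the list each receive one extra VM.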
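The mapping above can be reproduced without any providers. The sketch below uses hypothetical local names and a fixed worker count of 6 purely for illustration.

```hcl
# Sketch only: local names and the worker count are illustrative.
locals {
  host_pool    = ["alpha", "charlie", "foxtrot"]
  worker_count = 6

  # Round-robin placement: index modulo pool size selects the host
  placement = {
    for idx in range(local.worker_count) :
    "worker-${idx}" => local.host_pool[idx % length(local.host_pool)]
  }
}

output "placement" {
  # worker-0 and worker-3 => alpha
  # worker-1 and worker-4 => charlie
  # worker-2 and worker-5 => foxtrot
  value = local.placement
}
```

With a worker count of 4 instead, alpha would host two workers while charlie and foxtrot host one each.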
Outputs
The module returns cluster credentials for external use:
output "cluster_credentials" {
value = {
kubeconfig = talos_cluster_kubeconfig.this.kubeconfig
talosconfig = talos_client_configuration.this.talosconfig
# Kubeconfig file is also written locally when debug = true
talosconfig_path = local.talosconfig_path
kubeconfig_path = local.kubeconfig_path
}
}
Credentials are automatically stored in Bitwarden:
resource "bitwarden-secrets_secret" "kubernetes_kubeconfig" {
key = "${local.cluster_name}-kubeconfig"
value = module.kubernetes[0].cluster_credentials.kubeconfig
}
resource "bitwarden-secrets_secret" "kubernetes_talosconfig" {
key = "${local.cluster_name}-talosconfig"
value = module.kubernetes[0].cluster_credentials.talosconfig
}
My Production Configuration
Here’s the actual production YAML configuration:
# configurations/kubernetes/prod-k8s.yaml
cluster:
name: prod-k8s
datastore:
id: nas
node: alpha
talos:
version: v1.12.4
installer_mirror: harbor.example.com/talos
iso_mirror: https://proxy.example.com/
kubernetes_version: v1.35.0
registry_mirrors:
ghcr.io: { endpoints: [https://harbor.example.com/v2/gh], override_path: true }
registry.k8s.io: { endpoints: [https://harbor.example.com/v2/k8s], override_path: true }
docker.io: { endpoints: [https://harbor.example.com/v2/dh], override_path: true }
quay.io: { endpoints: [https://harbor.example.com/v2/qi], override_path: true }
factory.talos.dev: { endpoints: [https://harbor.example.com/v2/talos], override_path: true }
options:
disable_default_cni: true
disable_kube_proxy: true
disable_scheduling_on_control_plane: true
gitops:
provider: flux
bootstrap:
repo_url: https://github.com/your-org/applications.git
path: src/k8s/prod
destination_namespace: homelab
host_pool:
alpha: { datastore_id: local-lvm }
charlie: { datastore_id: local-lvm }
foxtrot: { datastore_id: local-lvm }
control_plane_nodes:
nodes: [...] # 3 control planes
host_pool: [alpha, charlie, foxtrot]
vip: { enabled: true, address: 192.168.62.20 }
worker_nodes:
nodes: [...] # 3 workers
host_pool: [alpha, charlie, foxtrot]
node_size_configuration:
control_plane: { cpu: 4, memory: 8192, os_disk: 128 }
worker: { cpu: 10, memory: 49152, os_disk: 128, data_disk: 512 }
What’s Next
Current areas of exploration:
- Multi-cluster federation — connecting Talos clusters for workload distribution
- Nested Talos — running Talos inside Proxmox for testing
- Observability — centralized logging with Loki and Grafana
What Most People Get Wrong
- “Talos upgrades break clusters”: With proper machine configs and registry mirrors, upgrades are rolling. The immutability is a feature, not a bug.
- “Air-gapped is impossible”: Talos’ registry mirror config + image factory handles this. Your nodes don’t need public internet access.
- “No kubelet means no logging”: Talos has built-in talosctl logs and talosctl metrics. It’s different from Kubernetes logging, not less capable.
When to Use / When NOT to Use
| Use Talos | Stick with kubeadm |
|---|---|
| Want declarative infrastructure | Need full kubelet control |
| Air-gapped environments | Custom init systems required |
| Single apply to cluster | Manual certificate management needed |
The foundation is solid — every cluster can be versioned, reviewed, and rolled back.