Proxmox VE + full Kubernetes (kubeadm) step-by-step

 


A practical, end-to-end walkthrough to deploy a hardened Proxmox VE environment and run a production-grade Kubernetes control plane and worker nodes (kubeadm). Includes HA control-plane, etcd considerations, load-balancing, networking (Calico), storage (Longhorn + option for Ceph), security hardening, backups, and monitoring.

This blog assumes you have basic Linux + virtualization familiarity and a small cluster of physical servers for Proxmox (3+ nodes recommended for HA). Where choices exist I explain trade-offs and give concrete commands and config snippets you can copy.


Outline

  1. Goals & prerequisites

  2. Logical architecture (ASCII diagram)

  3. Prepare Proxmox VE hosts (install & base hardening)

  4. Proxmox networking and storage planning

  5. Create cloud-init VM template for Kubernetes nodes

  6. Deploy VMs for Kubernetes control plane and workers

  7. Provision HA load balancer for kube-api (HAProxy + keepalived)

  8. Install Kubernetes control plane (kubeadm) with HA (stacked vs external etcd)

  9. Join worker nodes & configure CNI (Calico)

  10. Cluster storage (Longhorn) and persistent volumes

  11. Ingress, TLS (cert-manager + Traefik/NGINX), and external DNS

  12. Security hardening (Proxmox + Kubernetes best practices)

  13. Backups, monitoring, logging, and upgrade strategy

  14. Checklist & references (commands and file snippets)


1) Goals & prerequisites

Goals

  • Production-grade Kubernetes using kubeadm (not a lightweight distro).

  • HA control plane (3 masters), HA kube-apiserver fronted by virtual IP/load-balancer.

  • Secure Proxmox host and VMs, segregated management network.

  • Enterprise features: RBAC, NetworkPolicies, TLS, monitoring, PV storage.

Prerequisites

  • 3 (or more) physical servers for Proxmox VE (64-bit CPUs with virtualization support, 32-64 GB+ RAM each recommended).

  • Proxmox VE installed on each server (Debian-based).

  • A management network (private VLAN) for Proxmox/corosync and node management.

  • DNS entries for all cluster hostnames, or at least consistent /etc/hosts entries on every node.

  • SSH access to Proxmox root or an admin user with sudo.

  • Basic packages: curl, jq, ssh, ufw (optional), iptables/nft familiarity.

  • Optional but recommended: Proxmox Backup Server (PBS) for VM backups.
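If a full DNS zone isn't available, consistent /etc/hosts entries on every node are enough to start. A minimal sketch (all IPs and hostnames below are placeholders that follow this guide's examples):

```shell
# append example cluster entries to a hosts file (a temp copy here; on real
# nodes this would be /etc/hosts) -- every IP and hostname is a placeholder
HOSTS=$(mktemp)
cat >> "$HOSTS" <<'EOF'
10.10.0.100  k8s-api.example.com      # kube-apiserver VIP
10.10.0.11   k8s-master-1.example.com k8s-master-1
10.10.0.12   k8s-master-2.example.com k8s-master-2
10.10.0.13   k8s-master-3.example.com k8s-master-3
10.10.0.21   k8s-worker-1.example.com k8s-worker-1
EOF
cat "$HOSTS"
```

Whatever scheme you pick, it must resolve identically from every Proxmox host and every Kubernetes VM.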


2) Logical architecture (simple ASCII)

                    Internet
                        |
            +-- LB (VIP: 203.0.113.10)  <-- keepalived VIP + HAProxy
                        |
    -------------------------
    |    Proxmox Node A     |  <--- runs VMs (k8s-master-1, k8s-worker-1, ...)
    |  (pve-a.example.com)  |
    -------------------------
    |    Proxmox Node B     |  <--- runs VMs (k8s-master-2, k8s-worker-2)
    -------------------------
    |    Proxmox Node C     |  <--- runs VMs (k8s-master-3, k8s-worker-3)
    -------------------------

Each Kubernetes master VM connects to the HAProxy VIP for the API server (TCP 6443).
Storage: Longhorn (K8s PVs) or Ceph (Proxmox-backed).
Monitoring: Prometheus/Grafana. Logging: EFK/OpenSearch.

3) Prepare Proxmox hosts — install & base hardening

Perform on each physical server.

3.1 Install/Update Proxmox

Follow Proxmox install media; after install, on each node:

apt update && apt full-upgrade -y
pveversion

3.2 Secure SSH & root access

Edit /etc/ssh/sshd_config:

PermitRootLogin prohibit-password
PasswordAuthentication no
PermitEmptyPasswords no
ChallengeResponseAuthentication no
UsePAM yes

Add admin user, add your SSH key, and grant sudo:

adduser admin
usermod -aG sudo admin
mkdir -p /home/admin/.ssh
echo "ssh-rsa AAAA…yourkey…" >> /home/admin/.ssh/authorized_keys
chown -R admin:admin /home/admin/.ssh
chmod 700 /home/admin/.ssh
chmod 600 /home/admin/.ssh/authorized_keys

Restart SSH: systemctl restart sshd

3.3 Enable and configure Proxmox firewall

In Proxmox GUI: Datacenter → Firewall → enable. Also enable per-node firewall. Start with conservative rules: allow management subnet only, deny other incoming.

Example datacenter-level rules in /etc/pve/firewall/cluster.fw:

[OPTIONS]
enable: 1

[RULES]
# allow web UI (8006) and SSH only from the management subnet
IN ACCEPT -source 10.10.0.0/24 -p tcp -dport 8006
IN ACCEPT -source 10.10.0.0/24 -p tcp -dport 22
# everything else inbound is dropped by the default input policy

Use the GUI for finer-grained VM-level rules.

3.4 Two-factor authentication

Enable TOTP (Datacenter → Permissions → Two-Factor). Require 2FA for admin accounts.

3.5 Monitoring & Auditing

Configure syslog forwarding (Graylog/ELK/SIEM) from Proxmox:

# /etc/rsyslog.d/99-remote.conf
*.* @@graylog.mgmt.example.com:5140

systemctl restart rsyslog

4) Networking & storage planning in Proxmox

4.1 Network layout

  • vmbr0 — management (Proxmox web UI, migrations) — on private VLAN.

  • vmbr1 — VM public/production network.

  • vmbr2 — storage network (optional, for Ceph/RBD / Longhorn replication).

Ensure corosync uses a private dedicated interface to avoid interference.
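To make the vmbr layout above concrete, here is a sketch of what /etc/network/interfaces might look like on a Proxmox host (written to a temp file for illustration; the NIC names eno1/eno2 and the addresses are assumptions, not from this guide):

```shell
# illustrative Proxmox bridge config; adapt NIC names and subnets to your site
IFACES=$(mktemp)
cat > "$IFACES" <<'EOF'
# vmbr0: management bridge, carries the Proxmox UI and corosync
auto vmbr0
iface vmbr0 inet static
    address 10.10.0.5/24
    gateway 10.10.0.1
    bridge-ports eno1
    bridge-stp off
    bridge-fd 0

# vmbr1: VM production traffic; no host IP needed on this bridge
auto vmbr1
iface vmbr1 inet manual
    bridge-ports eno2
    bridge-stp off
    bridge-fd 0
EOF
grep -c '^auto vmbr' "$IFACES"
```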

4.2 Storage choices

  • For small-medium: Proxmox ZFS on local nodes + Proxmox Backup Server, or use Longhorn inside K8s for PVs.

  • For enterprise-scale: Ceph on separate OSDs integrated with Proxmox (RBD) or external SAN.

Recommendation: Use Longhorn (K8s-native distributed block storage) for simplicity unless you already run Ceph.


5) Create cloud-init VM template for Kubernetes nodes

We use cloud-init to speed provisioning. Create a minimal Ubuntu 22.04/24.04 LTS cloud-init template.

5.1 Example cloud-init userdata

Create user-data for cloud-init:

#cloud-config
preserve_hostname: false
hostname: k8s-node
ssh_authorized_keys:
  - ssh-rsa AAAA…yourkey…
users:
  - name: ubuntu
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ssh-rsa AAAA…yourkey…
chpasswd:
  list: |
    ubuntu:changeme
  expire: false
package_update: true
packages:
  - apt-transport-https
  - ca-certificates
  - curl
  - gnupg
  - lsb-release
runcmd:
  - [ sh, -c, 'swapoff -a' ]
  - [ sh, -c, 'sed -i "/ swap / s/^/#/" /etc/fstab' ]

Important: swap must be disabled for kubelet.
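The swap-disabling runcmd lines can be sanity-checked outside cloud-init; this demo runs the same sed against a sample fstab copy:

```shell
# demonstrate the cloud-init runcmd sed on a sample fstab copy: any line
# containing " swap " gets commented out, other entries are untouched
FSTAB=$(mktemp)
cat > "$FSTAB" <<'EOF'
UUID=1111-2222 /        ext4 defaults 0 1
/swap.img      none     swap sw       0 0
EOF
sed -i '/ swap / s/^/#/' "$FSTAB"
cat "$FSTAB"
```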

Create a Proxmox VM from the Ubuntu cloud image, configure the cloud-init drive via the GUI, convert the VM to a template, and then clone it for masters and workers. (Reference: the Cloudfleet article on Proxmox cloud-init usage.)


6) Deploy VMs for control plane and workers

Suggested VM sizing

  • Master: 4 vCPU, 16–32 GB RAM, 50 GB root disk (adjust to workload), 2 NICs (management + data).

  • Worker: 4 vCPU, 8–16 GB RAM, variable disk.

Create 3 master VMs: k8s-master-1, k8s-master-2, k8s-master-3. Create N worker VMs as needed.

Set static IPs (via cloud-init or in DNS) and verify SSH connectivity.
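Keeping the node inventory in one place pays off later; a small sketch that derives the per-node cloud-init ipconfig strings from a name/IP list (the qm invocations themselves appear in section 14, so they are only printed here; IPs are this guide's example addresses):

```shell
# build cloud-init ipconfig arguments from a simple "name ip" inventory;
# nothing is applied -- the matching qm commands are just echoed for review
GW=10.10.0.1
while read -r NAME IP; do
  echo "qm set <vmid-of-$NAME> --ipconfig0 ip=${IP}/24,gw=${GW}"
done <<'EOF'
k8s-master-1 10.10.0.11
k8s-master-2 10.10.0.12
k8s-master-3 10.10.0.13
EOF
```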


7) HA API server: keepalived + HAProxy (MetalLB serves in-cluster LoadBalancer Services later; the API VIP itself still needs keepalived/HAProxy)

We create an HA pair (or run on separate small VMs) to expose a virtual IP (VIP) for kube-apiserver. The VIP forwards TCP 6443 to all control-plane endpoints. (See the official kubeadm HA topology guidance.)

7.1 Install keepalived + haproxy (example on LB nodes)

apt update && apt install -y keepalived haproxy

keepalived.conf (example for VIP 10.10.0.100):

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretpass
    }
    virtual_ipaddress {
        10.10.0.100/24
    }
}

HAProxy config (only show API forwarding):

frontend k8s_api
    bind *:6443
    mode tcp
    option tcplog
    default_backend k8s_api_back

backend k8s_api_back
    mode tcp
    balance roundrobin
    server master1 10.10.0.11:6443 check fall 3 rise 2
    server master2 10.10.0.12:6443 check fall 3 rise 2
    server master3 10.10.0.13:6443 check fall 3 rise 2

Restart services: systemctl restart keepalived haproxy

Result: VIP (10.10.0.100) becomes the kube-apiserver endpoint.
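keepalived can also demote the MASTER when a vrrp_script health check fails, so the VIP moves if the local HAProxy or apiserver backend dies. A sketch of such a check script (the path, flags, and the vrrp_script wiring are assumptions, not from this guide):

```shell
# write a health-check script keepalived could invoke from a vrrp_script
# block; a nonzero exit tells keepalived to fail the VIP over to the peer
CHECK=$(mktemp)
cat > "$CHECK" <<'EOF'
#!/bin/sh
# probe the locally reachable kube-apiserver through HAProxy;
# -k because the apiserver cert will not match 127.0.0.1
curl -sfk --max-time 3 https://127.0.0.1:6443/healthz >/dev/null || exit 1
EOF
chmod +x "$CHECK"
cat "$CHECK"
```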


8) Install Kubernetes control plane with kubeadm (HA)

We’ll use kubeadm to create a stacked control plane (etcd runs on the masters). For production at scale, consider an external etcd cluster. (See the official kubeadm documentation.)

8.1 Pre-reqs on each master VM

On each master VM (run as root or sudo):

# swap already disabled via cloud-init; ensure apt is fresh
apt update && apt upgrade -y

# install container runtime: containerd (recommended)
apt install -y ca-certificates curl gnupg lsb-release
mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" \
  | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt update
apt install -y containerd.io

# generate default config for containerd and switch to the systemd cgroup
# driver (required so kubelet and containerd agree on cgroup management)
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
systemctl restart containerd
systemctl enable containerd

# kernel modules and sysctl for k8s networking
cat <<EOF | tee /etc/modules-load.d/k8s.conf
br_netfilter
overlay
EOF
modprobe br_netfilter
modprobe overlay
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system

# install kubeadm, kubelet, kubectl from the community-owned repo at
# pkgs.k8s.io (the legacy apt.kubernetes.io repository has been shut down);
# pick the minor version you intend to run
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key \
  | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] \
  https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list
apt update
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl

8.2 kubeadm config for HA (control-plane init)

Create kubeadm-config.yaml (run on the first master):

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "10.10.0.100:6443"   # VIP from HAProxy/keepalived
networking:
  podSubnet: "192.168.0.0/16"              # align with your CNI choice
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: "10.10.0.11"           # master1 IP
  bindPort: 6443
nodeRegistration:
  name: k8s-master-1

Run on master1:

kubeadm init --config=kubeadm-config.yaml --upload-certs

kubeadm init will print:

  • kubeadm join command for additional control-plane nodes (with --control-plane) — save it.

  • kubeadm join worker command (without control-plane) — save it.

Set up kubeconfig for admin:

mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config

8.3 Join remaining control plane nodes

On master2 and master3 run the kubeadm join … --control-plane command printed earlier. Example (replace tokens and addresses):

kubeadm join 10.10.0.100:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane --certificate-key <cert-key>

After joining all control plane nodes, verify from any master:

kubectl get nodes
kubectl get pods -n kube-system

Note about etcd: the above uses stacked etcd (etcd runs as static pods on the masters). For larger enterprise clusters, consider an external etcd cluster or a managed control plane for resilience and easier recovery. (See the official docs.)


9) Join worker nodes & configure CNI (Calico)

9.1 Join workers

On each worker VM, install containerd and kubeadm/kubelet as above (see prereqs). Run the worker kubeadm join token printed earlier (the non-control-plane join command).

kubeadm join 10.10.0.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

Verify node readiness:

kubectl get nodes

9.2 Install a CNI — Calico example

Calico provides networking + NetworkPolicy enforcement and is production-ready.

# the operator manifest is too large for "kubectl apply"; use create
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/custom-resources.yaml

(If you cannot fetch remote manifests in a locked-down environment, download and adapt the manifests.)

After the CNI install, verify that the pods in kube-system and calico-system are Running and that nodes show Ready:

kubectl get pods -n kube-system
kubectl get nodes

10) Cluster storage: Longhorn (recommended for small/medium on-prem)

Longhorn is a Kubernetes-native distributed block storage system.

10.1 Install Longhorn

You can install Longhorn via Helm or the YAML from Longhorn.

kubectl create namespace longhorn-system
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn --namespace longhorn-system

After installation:

  • Configure Longhorn UI (Service type LoadBalancer or NodePort).

  • Check Longhorn > Settings for replica count and node scheduling.

Set default StorageClass to Longhorn so PVCs use it by default.
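Making Longhorn the default uses the standard is-default-class annotation; a sketch (the kubectl line is commented out because it needs a live cluster, and the StorageClass name longhorn is the Helm chart's default):

```shell
# build the patch document that marks a StorageClass as cluster default
PATCH=$(mktemp)
cat > "$PATCH" <<'EOF'
{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}
EOF
# on a live cluster you would then run:
# kubectl patch storageclass longhorn --patch-file "$PATCH"
cat "$PATCH"
```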


11) Ingress, TLS, External DNS

11.1 Install cert-manager

Install via YAML:

kubectl apply --validate=false -f https://github.com/cert-manager/cert-manager/releases/download/v1.15.0/cert-manager.yaml

Create ClusterIssuer for Let’s Encrypt (staging first, then production) or use internal CA.
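A staging ClusterIssuer might look like the sketch below (the email, secret name, and ingress class are placeholders; switch to the production ACME URL once staging issues certificates cleanly):

```shell
# write an illustrative Let's Encrypt *staging* ClusterIssuer manifest;
# apply it with kubectl on a cluster that already runs cert-manager
ISSUER=$(mktemp)
cat > "$ISSUER" <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-staging-key
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
cat "$ISSUER"
```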

11.2 Ingress controller (Traefik or NGINX)

Example NGINX ingress via Helm:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-nginx --create-namespace

Create Ingress resources and use cert-manager Certificate to provision TLS via Let’s Encrypt.

11.3 External DNS

If you have a DNS provider, you can use external-dns to automatically create records when you create Services/Ingress. Configure provider secrets (Cloudflare, AWS Route53, etc.).


12) Security hardening (Proxmox + Kubernetes)

12.1 Proxmox hardening checklist

  • Keep Proxmox & kernel updated; test kernel updates in maintenance windows.

  • Restrict management access (UI + SSH) to mgmt subnet or VPN.

  • Enable 2FA for GUI accounts.

  • Use Proxmox Backup Server for VM backups with encryption.

12.2 Kubernetes hardening checklist

  • Ensure RBAC is enabled (default).

  • Apply Pod Security Standards (PSA) or use OPA/Gatekeeper policies.

  • Use NetworkPolicies to restrict pod-to-pod traffic; default deny for namespaces.

  • Run non-root containers and read-only root FS where possible.

  • Limit service account privileges; use kubectl auth can-i to audit.

  • Use Admission Controllers (ResourceQuota, LimitRanger).

  • Encrypt etcd at rest — ensure --encryption-provider-config is set.

  • Rotate certificates & tokens, use short-lived tokens where possible.

  • Use image scanning (Trivy/Clair) in CI pipeline.

  • Integrate with enterprise identity: OIDC for API server auth (e.g., Dex + AD/LDAP) or use Kubernetes RBAC bound to groups in your IdP.
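A namespace default-deny policy can be as small as the sketch below (the namespace name is a placeholder): an empty podSelector matches every pod, and listing both policyTypes with no rules denies all ingress and egress until more specific allow policies are added.

```shell
# write a default-deny-all NetworkPolicy manifest for one namespace
NP=$(mktemp)
cat > "$NP" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-namespace
spec:
  podSelector: {}          # empty selector = all pods in the namespace
  policyTypes:
    - Ingress
    - Egress
EOF
cat "$NP"
```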

Example enable encryption config (etcd encryption):

Create /etc/kubernetes/encryption-config.yaml:

apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded 32 bytes>
      - identity: {}

Refer to the kubeadm docs for adding --encryption-provider-config to the kube-apiserver static pod manifest (/etc/kubernetes/manifests/kube-apiserver.yaml).
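A suitable key for the aescbc provider is simply 32 random bytes, base64-encoded; generating and sanity-checking one:

```shell
# generate a random 32-byte AES key, base64-encoded, for the aescbc provider
KEY=$(head -c 32 /dev/urandom | base64)
echo "$KEY"
# sanity check: decoding the key must yield exactly 32 bytes
echo "$KEY" | base64 -d | wc -c
```

Paste the output into the secret field of the encryption config, and store a copy somewhere safe: losing the key means losing access to the encrypted Secrets.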

12.3 Network hardening

  • Use Calico network segmentation and egress policies.

  • Use firewall rules on Proxmox to limit management networks; restrict control-plane ports (6443) to LB and admin nets.


13) Backups, monitoring, logging, upgrade plan

13.1 Backups

  • Use etcd snapshotting and off-site backup copies.

  • Back up kubeadm PKI (certs) and /etc/kubernetes/admin.conf.

  • For VMs: use Proxmox Backup Server snapshots & scheduled backups.

  • For cluster state: use Velero for namespaced resource backups and PV snapshot integration (Longhorn supports snapshots).

Velero example:

velero install --provider aws --bucket <bucket> \
  --secret-file ./credentials-velero \
  --use-restic   # on Velero >= 1.10, use --use-node-agent instead

(Configure provider according to environment.)

13.2 Monitoring & logging

  • Monitoring: Prometheus + Grafana (e.g., kube-prometheus-stack via Helm).

  • Logging: EFK (Elasticsearch/Opensearch + Fluentd/Fluentbit + Kibana/Grafana). Consider hosted/central logging.

  • Alerting: Alertmanager integrated with Slack/PagerDuty.

13.3 Upgrades & maintenance

  • Test upgrades in staging.

  • Control-plane first (one master at a time), drain workers for kubelet/kube-proxy upgrades.

  • Use kubeadm upgrade plan and kubeadm upgrade apply.

  • Keep container runtime & OS patched.


14) Quick scripts & useful commands

Create VM from template (pvesh / qm CLI)

Example clone:

# clone template 9000 to vmid 100
qm clone 9000 100 --name k8s-master-1 --full true
qm set 100 --cores 4 --memory 16384 --net0 virtio,bridge=vmbr0
qm resize 100 scsi0 +20G
qm set 100 --ciuser ubuntu --cipassword changeme --ipconfig0 ip=10.10.0.11/24,gw=10.10.0.1
qm start 100

kubeadm: view join commands later

On a control plane node:

kubeadm token create --print-join-command
# or, for a control-plane join, regenerate the certificate key
kubeadm init phase upload-certs --upload-certs

Validate cluster

kubectl get nodes -o wide
kubectl get pods -A
kubectl top nodes                     # requires metrics-server
kubectl get --raw='/readyz?verbose'   # replaces the deprecated "kubectl get cs"

15) Example recovery notes

  • If etcd fails: take regular etcdctl snapshot save backups and restore with etcdctl snapshot restore.

  • If you lose control plane: restore etcd snapshot to fresh control plane nodes, rejoin workers.

  • Document procedure for disaster recovery; practice restores in staging.
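The snapshot step can be scripted for cron; a sketch using the kubeadm default certificate paths (verify the paths and endpoint on your own control-plane nodes before relying on it):

```shell
# write an etcd snapshot script for a stacked-etcd control-plane node;
# paths below match kubeadm defaults, which is an assumption to verify
SNAP=$(mktemp)
cat > "$SNAP" <<'EOF'
#!/bin/sh
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-$(date +%F).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
EOF
chmod +x "$SNAP"
cat "$SNAP"
```

Ship the resulting .db files off the cluster (PBS or object storage) so an etcd loss and a node loss cannot take the backups with them.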


16) Example folder of configs (short list)

  • kubeadm-config.yaml — cluster config

  • haproxy.cfg — API LB config

  • keepalived.conf — VIP config

  • containerd config — tuned for kube

  • encryption-config.yaml — etcd encryption

  • networkpolicy-deny-all.yaml — default deny template

  • pod-security-standards.yaml — PSA templates

  • velero-backup.yaml — Velero schedules


17) Final checklist before “go-live”

  • Proxmox nodes patched and backups configured (PBS).

  • Management network isolated and firewall rules applied.

  • VM templates built with cloud-init; swap disabled.

  • HA API VIP + load balancer verified (failover tested).

  • 3 control-plane nodes online and kubectl get nodes shows Ready.

  • CNI installed and all pods in kube-system are Running.

  • Longhorn (or chosen storage) installed and default StorageClass set.

  • cert-manager + ingress installed; TLS validated for a test app.

  • RBAC policies and NetworkPolicies applied; default deny in test namespaces.

  • Monitoring, logging, alerting integrated; alerts tested.

  • Backup & restore plan documented and tested (Velero + etcd snapshot).

  • Upgrade and node replacement playbook written and rehearsed.


Conclusion

Running full Kubernetes (kubeadm) on VMs inside Proxmox gives you maximum control and enterprise-grade features while keeping the infrastructure flexible. The recipe above is intentionally conservative: HA API via VIP + HAProxy, 3+ control-plane nodes with stacked etcd, Calico for networking, Longhorn for storage, cert-manager + ingress for TLS, and production safety via backups, monitoring, and hardened Proxmox host.

Links & References

Visit my blog for more content:

Virtology Blog
