Automating a Kubernetes Cluster on VMware vSphere with Scripts and Terraform (v1.31)

Introduction

In this post, we’ll walk through how to automate the deployment of a Kubernetes v1.31 cluster in a VMware vSphere environment using scripts, PowerCLI, and Terraform.
This approach follows VMware and Kubernetes best practices, with a focus on automation, consistent configuration, and scalability in both lab and production environments.

Prerequisites

Before starting, ensure you have the following:

  • A functioning vSphere environment (vCenter + ESXi)

  • A Linux VM template with VMware Tools installed (Ubuntu 22.04 LTS is recommended)

  • Terraform and the vSphere provider configured

  • Access to either PowerCLI or govc for API automation

  • Internet access for your Kubernetes nodes


1. Preparing the VM Template

For best results, create a lightweight Ubuntu 22.04 template with cloud-init installed and SSH enabled.

sudo apt update && sudo apt install -y cloud-init open-vm-tools
sudo systemctl enable cloud-init

Shut down the VM and convert it to a vSphere template. This image will be cloned for all control-plane and worker nodes.


2. Automating Deployment with govc and PowerCLI

We’ll use two automation paths: govc, which fits naturally into Linux/macOS shells, and PowerCLI, which is most commonly run from PowerShell on Windows.

govc Script Example (Multi-user Input)

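#!/bin/bash
# Connection settings. Replace the placeholders and avoid committing real passwords.
export GOVC_URL='vcenter.example.local'
export GOVC_USERNAME='admin@vsphere.local'
export GOVC_PASSWORD='yourpassword'
export GOVC_INSECURE=1   # skips TLS verification; lab use only
# Collect the deployment parameters interactively.
read -r -p "Cluster name: " CLUSTER
read -r -p "Datastore: " DATASTORE
read -r -p "Network: " NETWORK
read -r -p "Template name: " TEMPLATE
read -r -p "Number of workers: " WORKERS
# Clone the control-plane node and leave it powered off. The datastore flag is -ds.
govc vm.clone -vm "$TEMPLATE" -folder "K8s" -on=false -net "$NETWORK" -ds "$DATASTORE" "${CLUSTER}-control"
# vm.clone creates one VM per call, so loop to create the workers.
for i in $(seq 1 "$WORKERS"); do
  govc vm.clone -vm "$TEMPLATE" -folder "K8s" -on=false -net "$NETWORK" -ds "$DATASTORE" "${CLUSTER}-worker${i}"
done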

This simple script can be extended to inject guest metadata or to attach the cloud-init ISO that the Terraform module in section 4 also uses.
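Because the worker count comes from interactive input, it can help to validate it and preview the VM names before any clone runs. The following is a minimal sketch, not part of govc: the helper name worker_names is hypothetical, and the "<cluster>-worker<N>" naming is an assumption borrowed from the PowerCLI version of the script.

```shell
#!/bin/bash
# Print one VM name per worker, or fail if the count is not a positive integer.
# worker_names is a hypothetical helper; it does not call govc.
worker_names() {
  local cluster="$1" count="$2" i
  case "$count" in
    ''|*[!0-9]*)
      echo "worker count must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ "$count" -lt 1 ]; then
    echo "worker count must be at least 1" >&2
    return 1
  fi
  for i in $(seq 1 "$count"); do
    echo "${cluster}-worker${i}"
  done
}
```

Calling worker_names "$CLUSTER" "$WORKERS" before the clone step lets the script stop early on a typo instead of failing halfway through provisioning.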

PowerCLI Version

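# Placeholder credentials; consider Get-Credential instead of an inline password.
Connect-VIServer -Server vcenter.example.local -User admin@vsphere.local -Password 'yourpassword'
$clusterName = Read-Host "Enter Cluster Name"
$datastore = Read-Host "Enter Datastore"
$network = Read-Host "Enter Network"
$template = Read-Host "Enter Template"
# New-VM needs a placement target (-VMHost or -ResourcePool); this prompt is an addition to the original script.
$vmHost = Read-Host "Enter ESXi Host"
# Cast to an integer so the range operator below is unambiguous.
$workers = [int](Read-Host "Number of Workers")
New-VM -Name "$clusterName-control" -Template $template -VMHost $vmHost -Datastore $datastore -NetworkName $network
1..$workers | ForEach-Object {
    New-VM -Name "$clusterName-worker$_" -Template $template -VMHost $vmHost -Datastore $datastore -NetworkName $network
}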

Both scripts follow vSphere best practices by separating control-plane and worker creation logic for flexibility and scalability.


3. Cloud-Init User Data File

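Below is a user-data file that automates a Kubernetes 1.31 installation with cloud-init.
It initializes the control plane on the node named k8s-control and runs the join script on any node whose hostname contains "worker". The join script is generated on the control plane, so it must be copied to each worker (for example over SSH) before the join step can succeed.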

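#cloud-config
package_update: true
package_upgrade: true
# kubelet, kubeadm, and kubectl are not in Ubuntu's default repositories;
# the template must already have the pkgs.k8s.io v1.31 apt repository configured.
packages:
  - docker.io
  - apt-transport-https
  - curl
  - kubelet
  - kubeadm
  - kubectl
runcmd:
  - systemctl enable docker
  - systemctl start docker
  - kubeadm config images pull
  # Control plane: initialize the cluster, set up kubeconfig, save a join command.
  - |
    if [ "$(hostname)" = "k8s-control" ]; then
      kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=1.31.0
      mkdir -p /home/ubuntu/.kube
      cp /etc/kubernetes/admin.conf /home/ubuntu/.kube/config
      chown -R ubuntu:ubuntu /home/ubuntu/.kube
      kubeadm token create --print-join-command > /home/ubuntu/join.sh
    fi
  # Workers: runcmd runs under sh, so use a POSIX case instead of [[ =~ ]].
  # join.sh must first be copied over from the control plane.
  - |
    case "$(hostname)" in
      *worker*) bash /home/ubuntu/join.sh ;;
    esac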

This approach initializes the control plane, configures kubeconfig for the ubuntu user, and joins each worker once the join script is available on it.
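The hostname checks in the user-data only work if each VM boots with the expected hostname. One way to arrange that, assuming the NoCloud datasource and a per-node seed ISO, is to pair the user-data with a small meta-data file. The sketch below generates one; make_meta_data is a hypothetical helper and the node name is illustrative.

```shell
#!/bin/bash
# Emit NoCloud meta-data for one node. cloud-init reads local-hostname
# from this file and sets the VM's hostname on first boot.
# make_meta_data is a hypothetical helper, not a cloud-init command.
make_meta_data() {
  local node="$1"
  printf 'instance-id: %s\nlocal-hostname: %s\n' "$node" "$node"
}

# Example (not run here): write the seed files for one node, then build
# an ISO with the volume label "cidata", which NoCloud looks for.
#   make_meta_data k8s-control > meta-data
#   genisoimage -output cloud-init.iso -volid cidata -joliet -rock user-data meta-data
```

Each node needs its own meta-data (and therefore its own ISO) if the hostnames differ, while the user-data file can be shared.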


4. Terraform Module for vSphere + Cloud-Init

Here’s an example Terraform module to deploy the nodes, attach cloud-init ISOs, and start the cluster.

main.tf

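provider "vsphere" {
  user                 = var.vsphere_user
  password             = var.vsphere_password
  vsphere_server       = var.vsphere_server
  allow_unverified_ssl = true # lab use only
}
data "vsphere_datacenter" "dc" {
  name = var.vsphere_datacenter
}
data "vsphere_datastore" "datastore" {
  name          = var.vsphere_datastore
  datacenter_id = data.vsphere_datacenter.dc.id
}
# resource_pool_id must reference a resource pool, not the datacenter.
# var.vsphere_cluster is an added variable naming the target compute cluster.
data "vsphere_compute_cluster" "cluster" {
  name          = var.vsphere_cluster
  datacenter_id = data.vsphere_datacenter.dc.id
}
data "vsphere_network" "network" {
  name          = var.vsphere_network
  datacenter_id = data.vsphere_datacenter.dc.id
}
data "vsphere_virtual_machine" "template" {
  name          = var.vm_template
  datacenter_id = data.vsphere_datacenter.dc.id
}
resource "vsphere_virtual_machine" "control" {
  name             = "k8s-control"
  resource_pool_id = data.vsphere_compute_cluster.cluster.resource_pool_id
  datastore_id     = data.vsphere_datastore.datastore.id
  num_cpus         = 2
  memory           = 4096
  guest_id         = data.vsphere_virtual_machine.template.guest_id
  scsi_type        = data.vsphere_virtual_machine.template.scsi_type
  network_interface {
    network_id = data.vsphere_network.network.id
  }
  # The resource needs a disk block; size it from the template.
  disk {
    label            = "disk0"
    size             = data.vsphere_virtual_machine.template.disks.0.size
    thin_provisioned = data.vsphere_virtual_machine.template.disks.0.thin_provisioned
  }
  clone {
    template_uuid = data.vsphere_virtual_machine.template.id
  }
  # cloud-init.iso must already exist at this path on the datastore.
  cdrom {
    datastore_id = data.vsphere_datastore.datastore.id
    path         = "cloud-init.iso"
  }
}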

You can duplicate the resource block for workers, changing the VM name and compute parameters.
Terraform provisions the VMs and attaches cloud-init.iso, which must already be uploaded to the datastore. On first boot, cloud-init reads the ISO and starts cluster initialization.
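The provider block reads its credentials from input variables. A common way to supply them without writing secrets into .tf files is Terraform's TF_VAR_<name> environment variables. The sketch below checks that they are exported before Terraform runs; the helper name missing_tf_vars is hypothetical, and the variable names in the example mirror the var.* references in main.tf.

```shell
#!/bin/bash
# Print the name of every TF_VAR_<name> environment variable that is unset
# or empty, and return non-zero if at least one is missing.
# Only exported variables count, because Terraform reads the environment.
missing_tf_vars() {
  local name var missing=0
  for name in "$@"; do
    var="TF_VAR_${name}"
    if [ -z "$(printenv "$var")" ]; then
      echo "$var"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here):
#   export TF_VAR_vsphere_user='admin@vsphere.local'
#   export TF_VAR_vsphere_server='vcenter.example.local'
#   read -r -s -p "vSphere password: " TF_VAR_vsphere_password
#   export TF_VAR_vsphere_password
#   missing_tf_vars vsphere_user vsphere_password vsphere_server \
#     && terraform init && terraform plan
```

Pass whichever variable names your own configuration declares; the three shown are only the provider credentials.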


5. Post-Deployment Configuration

After the nodes finish booting:

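  1. Validate the cluster (nodes stay NotReady until a CNI plugin is installed in the next step):

    kubectl get nodes
  2. Install a CNI plugin (Flannel example):

    kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
  3. Install the vSphere CSI driver for dynamic storage provisioning. The driver expects the vSphere CPI and a vCenter credentials secret to be in place first, so check the driver documentation for the manifest that matches your release:

    kubectl apply -f https://github.com/kubernetes-sigs/vsphere-csi-driver/releases/latest/download/vsphere-csi-driver.yaml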
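The validation in step 1 can be scripted rather than checked by eye. This is a minimal sketch, assuming the default column layout of kubectl get nodes (node status in the second column); count_not_ready is a hypothetical helper.

```shell
#!/bin/bash
# Read `kubectl get nodes --no-headers` output on stdin and print how many
# nodes have a status that does not start with "Ready".
# "Ready,SchedulingDisabled" is treated as ready; "NotReady" is not.
count_not_ready() {
  awk '$2 !~ /^Ready/ { n++ } END { print n + 0 }'
}

# Example (not run here):
#   kubectl get nodes --no-headers | count_not_ready
```

An empty input also yields 0, so a wrapper script should check that kubectl itself succeeded before trusting the count.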

6. Best Practices Recap

  • Use separate networks for management, storage, and workload traffic.

  • Keep the control plane highly available: run an odd number of control-plane nodes (three or more) behind a load-balanced API endpoint.

  • Back up etcd regularly.

  • Use the vSphere CSI and CPI integrations for full VMware compatibility.

  • Automate cluster updates and rotate tokens periodically.
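As an illustration of the etcd backup recommendation, here is a hedged sketch for a kubeadm control-plane node. It assumes etcdctl is installed on the node, that etcd runs in kubeadm's default stacked layout (hence the certificate paths), and that it runs as root. The function names and the /var/backups/etcd directory are hypothetical choices, not Kubernetes conventions.

```shell
#!/bin/bash
# Build a timestamped snapshot path such as <dir>/etcd-YYYYmmdd-HHMMSS.db.
# The default directory is an illustrative choice.
snapshot_path() {
  local dir="${1:-/var/backups/etcd}"
  echo "${dir}/etcd-$(date +%Y%m%d-%H%M%S).db"
}

# Save an etcd snapshot using kubeadm's default certificate locations.
# Nothing calls this function here; invoke it on a control-plane node.
backup_etcd() {
  local target
  target="$(snapshot_path "$1")"
  mkdir -p "$(dirname "$target")"
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save "$target"
}
```

Running backup_etcd from a cron job or systemd timer, and copying the resulting file off the node, covers the "regularly" part of the recommendation.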


Author’s Note:
This article is part of my ongoing Virtology series on virtualization and automation.
