Introduction
This document outlines the deployment of a Kubernetes v1.31.9 HA cluster on Ubuntu 24.04 using Kubespray v2.27.1.
Objective: establish a solid lab environment with minimal resources. It can grow into production later by adding more worker nodes and migrating to dedicated control-plane nodes.
Topology: 3-node HA architecture (control plane, etcd, and worker roles on all nodes).
Installation with Ansible
Prerequisites
Required: SSH access from the jump host to all cluster machines using an SSH key defined in the SSH config, with no password prompts (see the sketch after the host list below).
# Run in jump host
# Don't install Ansible from the OS package or PPA -- it is too new for Kubespray:
# add-apt-repository --yes --update ppa:ansible/ansible
# apt install ansible -y
# [ERROR]: Task failed: Action failed: Ansible must be between 2.16.4 and 2.17.0 exclusive - you have 2.19.5
# Use a Python venv instead (set up below):
apt update && apt install python3-venv python3-pip -y
Host information: All using Ubuntu 24.04
# Jump host that can SSH to these VMs
172.25.110.137 kienlt-lab-machine-1
172.25.110.16 kienlt-lab-machine-2
172.25.110.235 kienlt-lab-machine-3
172.25.110.76 kienlt-lab-utilities
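For example, a minimal key setup on the jump host could look like this (the ed25519 key and ssh-copy-id are assumptions; any key distribution method works as long as password-less root SSH to each node succeeds):
# Run in jump host -- minimal sketch, assuming root login on the nodes
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N ""
for h in 172.25.110.137 172.25.110.16 172.25.110.235; do ssh-copy-id root@$h; done
# Quick check: must succeed without any password prompt
ssh root@kienlt-lab-machine-1 hostname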
Installation step by step:
# Run in jump host
# Clone && Init
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray/ && git checkout v2.27.1 # Release date: Jul 28 2025
cp -rfp inventory/sample inventory/kienlt-cluster
# Install venv for ansible
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
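Quick sanity check that the venv's Ansible is the one in use (it should fall inside the range from the error shown earlier, not the 2.19.x from the PPA):
# Should point into ./venv/bin and report a supported ansible-core version
which ansible
ansible --version | head -n1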
Edit the file inventory/kienlt-cluster/inventory.ini:
[all]
node1 ansible_host=172.25.110.137 ip=172.25.110.137
node2 ansible_host=172.25.110.16 ip=172.25.110.16
node3 ansible_host=172.25.110.235 ip=172.25.110.235
[kube_control_plane]
node1
node2
node3
[etcd]
node1
node2
node3
[kube_node]
node1
node2
node3
[k8s_cluster:children]
kube_control_plane
kube_node
After that, run a ping test. I'm using the root user, so there's no need for --become.
ansible all -i inventory/kienlt-cluster/inventory.ini -m ping
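The output should look roughly like this for each node:
node1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}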
Check the default Kubernetes version and network plugin:
cat inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-cluster.yml |grep -E "kube_version|kube_network_plugin"
kube_version: v1.31.9
kube_network_plugin: calico
kube_network_plugin_multus: false
CNI - Cilium configuration
Ok, for this lab I'll replace Calico with Cilium, so change kube_network_plugin: calico to kube_network_plugin: cilium. Everything else stays at its defaults; the default kube_service_addresses and kube_pods_subnet are fine for me since this is a lab xD
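You can make that change by hand, or with a quick one-liner like this:
# Switch the CNI plugin from calico to cilium, then verify
sed -i 's/^kube_network_plugin: calico/kube_network_plugin: cilium/' inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-cluster.yml
grep ^kube_network_plugin: inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-cluster.yml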
Also enable cilium_kube_proxy_replacement: strict and the Hubble options in inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-net-cilium.yml:
cilium_kube_proxy_replacement: strict
cilium_hubble_install: true
cilium_enable_hubble_ui: "{{ cilium_enable_hubble }}"
Cluster Bootstrap
Run this, grab a coffee, and wait for it to finish. Estimated time: about 15-20 minutes.
ansible-playbook -i inventory/kienlt-cluster/inventory.ini cluster.yml
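Optionally, keep the full run log around in case something fails halfway through:
ansible-playbook -i inventory/kienlt-cluster/inventory.ini cluster.yml 2>&1 | tee cluster-install.log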
If everything works as expected xD
PLAY RECAP ***********************************************************************************************************************************************************************************************************
node1 : ok=683 changed=137 unreachable=0 failed=0 skipped=1122 rescued=0 ignored=3
node2 : ok=638 changed=129 unreachable=0 failed=0 skipped=1007 rescued=0 ignored=3
node3 : ok=640 changed=129 unreachable=0 failed=0 skipped=1005 rescued=0 ignored=3
Sunday 21 December 2025 14:01:09 +0700 (0:00:00.089) 0:11:59.291 *******
Set up kubectl on the jump host
Kubeconfig setup
mkdir -p $HOME/.kube
# Get file from node1 to Jump Host
ssh root@kienlt-lab-machine-1 "cat /etc/kubernetes/admin.conf" > $HOME/.kube/config
chmod 600 $HOME/.kube/config
sed -i 's/127.0.0.1/kienlt-lab-machine-1/g' $HOME/.kube/config
Network check: nc -vz kienlt-lab-machine-1 6443
Kubectl setup:
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/amd64/kubectl"
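# (Optional) verify the binary checksum before installing; the .sha256 file comes from the same dl.k8s.io path
# (you could also fetch v1.31.9 instead to match the cluster's patch version exactly)
curl -LO "https://dl.k8s.io/release/v1.31.0/bin/linux/amd64/kubectl.sha256"
echo "$(cat kubectl.sha256)  kubectl" | sha256sum --check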
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
# Check
kubectl cluster-info
Expected output from kubectl cluster-info xD
Kubernetes control plane is running at https://kienlt-lab-machine-1:6443
Set up kubectl alias and auto-completion
echo 'alias k="kubectl"' >> ~/.bashrc
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
Testing time
k get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready control-plane 13m v1.31.9
node2 Ready control-plane 13m v1.31.9
node3 Ready control-plane 13m v1.31.9
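Since kube-proxy replacement is enabled, it's also worth asking Cilium itself. A rough check (the cilium DaemonSet and the hubble-ui service names/namespace are assumptions based on a default Kubespray/Cilium deployment; adjust if yours differ):
# Should report KubeProxyReplacement as enabled/strict on the agent
k -n kube-system exec ds/cilium -- cilium status | grep -i kubeproxyreplacement
# Hubble UI, if it was deployed: port-forward and open http://localhost:8081
k -n kube-system port-forward svc/hubble-ui 8081:80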
Install K9S: The best CLI UI for K8s
curl -sS https://webinstall.dev/k9s | bash
sudo ln -s ~/.local/bin/k9s /usr/local/bin/k9s
k9s version
Upgrade cluster with Kubespray (Optional for lab xD)
Don't upgrade unless you have some experience! It took me 2 hours to debug and roll back! xD
From what I understand, you need to check out a newer branch/tag that supports the Kubernetes version you want. For example, I want to upgrade from v1.31.9 to v1.32.0, and the current tag is v2.27.1. List the recent 2.2x tags with:
git tag -l "v2.2*"|tail -n 10
v2.24.3
v2.25.0
v2.25.1
v2.26.0
v2.27.0
v2.27.1
v2.28.0
v2.28.1
v2.29.0
v2.29.1
So let's pick Kubespray v2.28.1 for testing the cluster upgrade. How do we know which Kubernetes versions it supports? (Supported versions are now resolved dynamically from the checksums file.)
git checkout v2.28.1
cat roles/kubespray_defaults/vars/main/checksums.yml|grep kube -A3
kubelet_checksums:
arm64:
1.32.8: sha256:d5527714fac08eac4c1ddcbd8a3c6db35f3acd335d43360219d733273b672cce
1.32.7: sha256:b862a8d550875924c8abed6c15ba22564f7e232c239aa6a2e88caf069a0ab548
--
kubectl_checksums:
arm:
1.32.8: sha256:ed54b52631fdf5ecc4ddb12c47df481f84b5890683beaeaa55dc84e43d2cd023
1.32.7: sha256:c5416b59afdf897c4fbf08867c8a32b635f83f26e40980d38233fad6b345e37c
--
kubeadm_checksums:
arm64:
1.32.8: sha256:8dbd3fa2d94335d763b983caaf2798caae2d4183f6a95ebff28289f2e86edf68
1.32.7: sha256:a2aad7f7b320c3c847dea84c08e977ba8b5c84d4b7102b46ffd09d41af6c4b51
But for this upgrade test we only go to v1.32.0. Change it in inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-cluster.yml.
# Remember: no "v" prefix, just the semantic version!
kube_version: 1.32.0
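Double-check that the value really changed and that the target version exists in the checksums file shown above:
grep ^kube_version inventory/kienlt-cluster/group_vars/k8s_cluster/k8s-cluster.yml
# Should return kubelet/kubectl/kubeadm checksum lines if 1.32.0 is supported by this tag
grep -F "1.32.0:" roles/kubespray_defaults/vars/main/checksums.yml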
Activate the venv and install the dependencies for the new version
source venv/bin/activate
# Make sure ansible-core is compatible with Kubespray v2.28.1
pip install -r requirements.txt
After that, let's rock!
ansible-playbook -i inventory/kienlt-cluster/inventory.ini upgrade-cluster.yml
The output may show some Cilium-related issues. I don't want to spend too much time investigating them, because Cilium is still working xD!!!
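Either way, a quick check that the upgrade actually landed:
k get nodes
# All three nodes should now be Ready and report VERSION v1.32.0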
In case you still want a clean cluster
For reset: git checkout v2.27.1
Change cilium_kube_proxy_replacement: strict to cilium_kube_proxy_replacement: "true"
Change kube_version: v1.31.9
ansible-playbook -i inventory/kienlt-cluster/inventory.ini reset.yml
ansible all -i inventory/kienlt-cluster/inventory.ini -m shell -a "rm -rf /etc/cni /opt/cni /var/lib/etcd /var/lib/kubelet /etc/kubernetes"
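Before reinstalling, a quick sanity check that the nodes are really clean (the || true keeps Ansible from flagging stopped services or missing directories as failures):
ansible all -i inventory/kienlt-cluster/inventory.ini -m shell -a "systemctl is-active kubelet || true; ls /etc/kubernetes 2>/dev/null || true"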
Install again
ansible-playbook -i inventory/kienlt-cluster/inventory.ini cluster.yml
ssh root@kienlt-lab-machine-1 "cat /etc/kubernetes/admin.conf" > $HOME/.kube/config
chmod 600 $HOME/.kube/config
sed -i 's/127.0.0.1/kienlt-lab-machine-1/g' $HOME/.kube/config
k get nodes