Upgrade Tanzu Kubernetes Grid Multicloud 1.1.3 to 1.2
This is a quick example of upgrading Tanzu Kubernetes Grid (multicloud) 1.1.3 to 1.2. Here, TKGm is running on vSphere.
For those not familiar, TKG:
provides a consistent, upstream-compatible implementation of Kubernetes that is tested, signed, and supported by VMware. Tanzu Kubernetes Grid is central to many of the offerings in the VMware Tanzu portfolio.
TKGm, as I call it, can be deployed onto a growing list of infrastructures and provides the same Kubernetes no matter where it runs. 1.2 supports vSphere, Azure, and AWS as host infrastructure, with more to be added over time.
What’s new in 1.2?
- Moving from a separate load balancer (HAProxy) VM to kube-vip for the control plane endpoint
- New default CNI: Antrea
- Addition of Harbor as a shared service
- Backup and restore management clusters with Velero
And more!
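Once a 1.2 cluster is up, a couple of these changes are easy to spot. A quick sketch (pod names can vary by release):
# Antrea agent/controller pods replace Calico in kube-system.
$ kubectl get pods -n kube-system | grep antrea
# kube-vip runs as a static pod on the control plane nodes, replacing
# the separate HAProxy load balancer VM.
$ kubectl get pods -n kube-system | grep kube-vip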
Upgrade from 1.1.3 to 1.2 (on vSphere)
The documentation for this process is great, and I’m mostly just repeating what it shows. Best to follow those docs, but sometimes having an example is nice.
Initially I have the 1.1.3 tkg CLI installed.
$ tkg version
Client:
Version: v1.1.3
Git commit: 0e8e58f3363a1d4b4063b9641f44a3172f6ff406
I’m just running one management and one workload cluster.
$ tkg get cluster --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES
my-cluster default running 1/1 2/2 v1.18.6+vmware.1 <none>
tkg-mgmt tkg-system running 1/1 1/1 v1.18.6+vmware.1 <none>
So the first step is to download the new 1.2 CLI as well as three OVAs; all of these artifacts are available from VMware.
- Install the new CLI first.
- Next, upload the three OVAs into vSphere and mark them as templates (see the sketch after the list for a scripted option):
- Kubernetes v1.19.1: Photon v3 Kubernetes v1.19.1 OVA
- Kubernetes v1.18.8: Photon v3 Kubernetes v1.18.8 OVA
- Kubernetes v1.17.11: Photon v3 Kubernetes v1.17.11 OVA
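For reference, both the CLI install and the OVA uploads can be scripted. A rough sketch, not the exact commands I ran: the filenames and template names below are placeholders, and the govc commands assume GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD are set.
# Unpack the downloaded CLI binary and put it on the PATH.
$ gunzip tkg-linux-amd64-v1.2.0-vmware.1.gz
$ chmod +x tkg-linux-amd64-v1.2.0-vmware.1
$ sudo mv tkg-linux-amd64-v1.2.0-vmware.1 /usr/local/bin/tkg
$ tkg version
# Upload an OVA and mark it as a template, as an alternative to the vSphere UI.
$ govc import.ova -name photon-3-kube-v1.19.1 photon-3-kube-v1.19.1.ova
$ govc vm.markastemplate photon-3-kube-v1.19.1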
First we upgrade the management cluster.
- I’m conservative, so I copied the old config files first.
$ cp -rp ~/.tkg/ ~/.tkg-pre-1.2-upgrade
- List the management cluster.
$ tkg get management-cluster
It seems that the TKG settings on this system are out-of-date. Proceeding on this command will cause them to be backed up and overwritten by the latest settings.
Do you want to continue? [y/N]: y
the old providers folder /home/ubuntu/.tkg/providers is backed up to /home/ubuntu/.tkg/providers-20201102220133-xryjaxet
The old bom folder /home/ubuntu/.tkg/bom is backed up to /home/ubuntu/.tkg/bom-20201102220133-sk8je1f4
MANAGEMENT-CLUSTER-NAME CONTEXT-NAME STATUS
tkg-mgmt * tkg-mgmt-admin@tkg-mgmt Success
- Make sure you’re using the management cluster context.
$ kubectl config use-context tkg-mgmt-admin@tkg-mgmt
Switched to context "tkg-mgmt-admin@tkg-mgmt".
- Label the management cluster (a new requirement in 1.2):
$ kubectl label -n tkg-system cluster.cluster.x-k8s.io/tkg-mgmt cluster-role.tkg.tanzu.vmware.com/management="" --overwrite=true
cluster.cluster.x-k8s.io/tkg-mgmt labeled
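You can sanity-check that the label landed:
$ kubectl get cluster tkg-mgmt -n tkg-system --show-labels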
- Run the upgrade of the mgmt cluster.
$ tkg upgrade management-cluster tkg-mgmt
SNIP!
Patching MachineDeployment with the kubernetes version v1.19.1+vmware.2...
Waiting for kubernetes version to be updated for worker nodes...
Management cluster 'tkg-mgmt' successfully upgraded to TKG version 'v1.2.0' with kubernetes version 'v1.19.1+vmware.2'
- Now the mgmt cluster has been upgraded.
$ tkg get cluster --include-management-cluster
NAME NAMESPACE STATUS CONTROLPLANE WORKERS KUBERNETES ROLES
my-cluster default running 1/1 2/2 v1.18.6+vmware.1 <none>
tkg-mgmt tkg-system running 1/1 1/1 v1.19.1+vmware.2 management
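As an extra check, the nodes themselves should report the new version (node names will differ):
# Each node's VERSION column should now show v1.19.1+vmware.2.
$ kubectl get nodes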
- Now we can list the available Kubernetes versions. Note 1.19! Nice.
$ tkg get kubernetesversions
VERSIONS
v1.17.11+vmware.1
v1.17.3+vmware.2
v1.17.6+vmware.1
v1.17.9+vmware.1
v1.18.2+vmware.1
v1.18.3+vmware.1
v1.18.6+vmware.1
v1.18.8+vmware.1
v1.19.1+vmware.2
- Finally, upgrade the workload cluster.
$ tkg upgrade cluster my-cluster
Logs of the command execution can also be found at: /tmp/tkg-20201102T223108680260342.log
Upgrading workload cluster 'my-cluster' to kubernetes version 'v1.19.1+vmware.2'. Are you sure? [y/N]: y
Validating configuration...
Verifying kubernetes version...
Retrieving configuration for upgrade cluster...
Create InfrastructureTemplate for upgrade...
Upgrading control plane nodes...
Patching KubeadmControlPlane with the kubernetes version v1.19.1+vmware.2...
Waiting for kubernetes version to be updated for control plane nodes
Upgrading worker nodes...
Patching MachineDeployment with the kubernetes version v1.19.1+vmware.2...
Waiting for kubernetes version to be updated for worker nodes...
Cluster 'my-cluster' successfully upgraded to kubernetes version 'v1.19.1+vmware.2'
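To poke around in the upgraded workload cluster, grab its kubeconfig with tkg get credentials (the context name follows the usual &lt;cluster&gt;-admin@&lt;cluster&gt; pattern):
$ tkg get credentials my-cluster
$ kubectl config use-context my-cluster-admin@my-cluster
$ kubectl get nodes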
Well, that was pretty easy.
OPTIONAL: Create a New Cluster
Here I create a new cluster. Note the use of --vsphere-controlplane-endpoint-ip, new in 1.2, which lets you set the Kubernetes API IP address so you can, presumably, pre-set the DNS entry for your end users.
$ tkg create cluster --plan dev 1-2-cluster --vsphere-controlplane-endpoint-ip 10.0.6.10
Logs of the command execution can also be found at: /tmp/tkg-20201103T173324591159647.log
Validating configuration...
Creating workload cluster '1-2-cluster'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...
Waiting for addons installation...
Workload cluster '1-2-cluster' created
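As a quick check, the new cluster's kubeconfig should point straight at the endpoint IP we passed in. A sketch; the jsonpath assumes the cluster entry is named 1-2-cluster in your kubeconfig:
$ tkg get credentials 1-2-cluster
# Should print something like https://10.0.6.10:6443
$ kubectl config view -o jsonpath='{.clusters[?(@.name=="1-2-cluster")].cluster.server}'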
That’s it!
VSPHERE_TEMPLATE Issue
Due to the way I had originally deployed TKGm, I had set VSPHERE_TEMPLATE in the TKG config file, and to deploy a new 1.2 cluster I needed to comment that out. Most users won’t have this set, AFAIK, and it’s a simple config file change.
$ tkg create cluster --plan dev 1-2-cluster --vsphere-controlplane-endpoint-ip 10.0.6.10
Logs of the command execution can also be found at: /tmp/tkg-20201103T172950440339415.log
Validating configuration...
Error: : workload cluster configuration validation failed: vSphere config validation failed: vSphere template kubernetes version validation failed: unable to get or validate VSPHERE_TEMPLATE for given k8s version: incorrect VSPHERE_TEMPLATE (/Datacenter/vm/photon-3-kube-v1.18.6_vmware.1) specified for Kubernetes version (v1.19.1+vmware.2). TKG CLI will autodetect the correct VM template to use, so VSPHERE_TEMPLATE should be removed unless required to disambiguate among multiple matching templates
Detailed log about the failure can be found at: /tmp/tkg-20201103T172950440339415.log
So I commented out the option:
$ grep -i template ~/.tkg/config.yaml
VSPHERE_HAPROXY_TEMPLATE: /Datacenter/vm/photon-3-haproxy-v1.2.4-vmware.1
# VSPHERE_TEMPLATE will be autodetected based on the kubernetes version. Please use VSPHERE_TEMPLATE only to override this behavior
#VSPHERE_TEMPLATE: /Datacenter/vm/photon-3-kube-v1.18.6_vmware.1
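If you’d rather script that edit, a one-liner like this would do it (assuming the default config path):
$ sed -i 's|^VSPHERE_TEMPLATE:|#VSPHERE_TEMPLATE:|' ~/.tkg/config.yaml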