vSphere with Kubernetes on VCF 4.0.1 Consolidated Architecture – Part 1

This 4-part blog series explores vSphere with Kubernetes on a VCF 4.0.1 Consolidated Architecture. We will configure the Management WLD of a Consolidated Architecture to support K8S Workload Management, walk through the Supervisor Cluster (TKG Service and CAPI/CAPW), deploy TKG compute clusters, and look at Tier-0 considerations for NSX-T as well as the TKG CLI. Here is the breakdown:

Part 1

Upgrade VCF 4.0.0 to 4.0.1
Management WLD readiness for Workload Management
NSX-T Edge Cluster Architecture Walkthrough
Enable Workload Management using the VCF 4.0.1 Workflow

Part 2

Walkthrough of the Supervisor Cluster (TKG Service and CAPI/CAPW)
NSX-T Objects created
NSX-T Tier-0 Considerations
Content Library Subscription
Tenancy Model and Supervisor Namespaces
Storage Classes and CNS-CSI
Accessing CLI tools and deploying a sample native pod

Part 3

Deploying TKG compute clusters through the declarative API of the TKG Service for vSphere
NSX-T Objects created
Deploying a sample workload on TKG
Scaling out TKG
Upgrading TKG

Part 4

Using the TKG CLI
Adding Supervisor Cluster as the management cluster in TKG CLI
Generating a TKG Cluster Config
Creating a TKG Cluster using TKG CLI
Scaling out TKG using TKG CLI
Upgrading TKG using TKG CLI

Let’s get started:

Environment details

The current environment is a 4-node Consolidated VCF stamp running version 4.0.0. It has a vSAN Management WLD cluster, which is set up during the SDDC bring-up process. All the hosts are L2 uniform. Application Virtual Networks (AVN) was enabled during bring-up, so an NSX-T 3.0 Tier-0 Gateway is deployed and eBGP peering is established with the two Dell EMC S5248-ON Leaf Switches. The infrastructure services (AD, DNS, NTP, DHCP) are hosted outside of the VCF environment.

Since the infra is running on VCF 4.0.0, it needs to be upgraded to 4.0.1 for Workload Management Support.

Upgrading VCF 4.0.0 to 4.0.1

The necessary upgrade bundles are already downloaded. Note that to upgrade from VCF 4.0.0 to 4.0.1, we require an intermediate upgrade to 4.0.0.1.
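The ordering constraint above can be sketched as a small lookup. This is only an illustration: the `NEXT_HOP` table and the `upgrade_path` helper are hypothetical names, and the table encodes nothing beyond the 4.0.0 to 4.0.0.1 to 4.0.1 path described in this post. It does not call SDDC Manager.

```python
from typing import List

# Hypothetical hop table: encodes only the path described in this post.
NEXT_HOP = {
    "4.0.0": "4.0.0.1",
    "4.0.0.1": "4.0.1",
}


def upgrade_path(current: str, target: str) -> List[str]:
    """Return the versions to apply, in order, to get from current to target."""
    path = []
    version = current
    while version != target:
        if version not in NEXT_HOP:
            raise ValueError(f"no known upgrade hop from {version} towards {target}")
        version = NEXT_HOP[version]
        path.append(version)
    return path


# Each entry is one bundle to apply after its pre-check has passed.
for step in upgrade_path("4.0.0", "4.0.1"):
    print(f"apply bundle for {step}")
```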

The upgrade bundles should now be available under the Management WLD. Once the pre-checks for the WLD have passed, we can install them.

Let’s apply update 4.0.0.1 after the pre-checks have succeeded.

And now apply upgrade 4.0.1.

There were also some cumulative updates to NSX-T, vCenter and ESXi. The post-update versions are:

VCF – 4.0.1, NSX-T – 3.0.1, vCenter – 7.0b, ESXi – 7.0b

Management WLD Readiness

For a greenfield deployment of 4.0.1, the Management WLD cluster is already compliant for enabling the Workload Management feature. Since we upgraded from VCF 4.0.0, we need to perform two manual steps to make it compatible, as explained in Cormac Hogan’s post below. Thanks to Cormac Hogan for this tip.

https://cormachogan.com/2020/05/26/vsphere-with-kubernetes-on-vcf-4-0-consolidated-architecture

Also, note that stretched management WLD clusters are unsupported.

Enable trust on the NSX-T Compute manager for authentication.

Add a “WCPReady” tag to the NSX-T Edge Cluster.

Now, the Management WLD Cluster should be compatible for K8S Workload Management.
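For anyone scripting the tagging step rather than using the UI, here is a minimal sketch of the change to the Edge Cluster object. NSX-T tags are `scope`/`tag` pairs. The post only names the tag itself, so the scope value used below is an assumption to verify against the referenced post, and the edge cluster object is a made-up, trimmed example. The helper `with_wcp_ready_tag` is hypothetical and makes no API call; it only prepares the object.

```python
import copy
import json


def with_wcp_ready_tag(edge_cluster: dict, scope: str) -> dict:
    """Return a copy of an edge-cluster object carrying the WCPReady tag exactly once."""
    updated = copy.deepcopy(edge_cluster)
    tags = updated.setdefault("tags", [])
    # Idempotent: do not add a second copy if the tag is already present.
    if not any(t.get("tag") == "WCPReady" for t in tags):
        tags.append({"scope": scope, "tag": "WCPReady"})
    return updated


# Made-up edge cluster object, trimmed to the fields relevant here.
edge_cluster = {"display_name": "example-edge-cluster", "tags": []}

# Assumption: the scope string is not stated in this post; check it before use.
tagged = with_wcp_ready_tag(edge_cluster, scope="Created for")
print(json.dumps(tagged, indent=2))
```

Working on a copy keeps the original object intact, which makes it easy to diff the two before sending anything back to NSX-T.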

NSX-T 3.0 Edge Cluster Architecture

The NSX-T 3.0 Management Cluster is deployed as part of the VCF bring-up workflow. Since we selected the option to enable AVN (Application Virtual Networks), the workflow also deploys an NSX-T Edge Cluster and a Tier-0 Gateway (Active-Active) with eBGP peering to the Leaf Switches. The consolidated VCF architecture is thus a shared Management, Compute and Edge Cluster. I have already covered the Edge Cluster architecture in my previous posts, so please take a look. Even though they were written for a Compute WLD, the same applies to the Management WLD as well.

https://vxplanet.com/2020/04/25/nsx-t-3-0-edge-cluster-automated-deployment-and-architecture-in-vcf-4-0-part-1/

https://vxplanet.com/2020/05/02/nsx-t-3-0-edge-cluster-automated-deployment-and-architecture-in-vcf-4-0-part-2/

Note that for Workload Management use cases, there should be only one Edge Cluster per Overlay Transport Zone (i.e., per Supervisor Cluster).

A quick summary:

The architecture is Single-NVDS Multi-TEP on Converged VDS host networking.
Because the Converged VDS carries the host TEP interfaces, the Edge TEPs and host TEPs are placed on separate, routable VLANs. This is a requirement whenever Edges are deployed on a host vSphere C-VDS that has a TEP interface.
Named Teaming Policies are used to achieve deterministic eBGP peering over specific uplinks to the Leaf switches.
A Tier 0 Gateway is instantiated on this Edge cluster and eBGP is used for dynamic route exchange with the Leaf switches (peering over two VLANs) based on the configuration input.

Enabling K8S Workload Management

The workflow to enable K8S Workload Management can be kicked off from SDDC Manager, which runs a pre-validation to ensure a successful deployment.

We require 4 subnets, two of which (Ingress and Egress) must be routable Overlay subnets:

Pod CIDR – A block is carved out of this defined pool to be used for namespaces (pod networking) as well as for TKG VMs deployed by the TKG Service for vSphere.
Service CIDR – Used for the ClusterIP services. These are implemented as distributed load balancers in NSX-T.
Ingress CIDR (Overlay Routable) – A /32 IP is carved out of this defined pool and used as the VIP for the L4 load balancers for ingress into the KubeAPI and deployed workloads.
Egress CIDR (Overlay Routable) – A /32 IP is carved out of this defined pool and is used for SNAT rules for egress access for pods and TKG clusters.
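Before entering the four subnets into the workflow, it can help to sanity-check the plan offline. The sketch below uses Python's standard `ipaddress` module to confirm that the ranges do not overlap and to show what "carving out" a block or a /32 looks like. All CIDR values are hypothetical examples, and the /28 block size for a namespace is an assumption made for illustration, not something this post states.

```python
import ipaddress
from itertools import combinations

# Hypothetical example ranges; substitute the subnets planned for your environment.
subnets = {
    "pod": ipaddress.ip_network("10.244.0.0/21"),
    "service": ipaddress.ip_network("10.96.0.0/24"),
    "ingress": ipaddress.ip_network("192.168.100.0/27"),  # must be routable
    "egress": ipaddress.ip_network("192.168.100.32/27"),  # must be routable
}


def find_overlaps(nets):
    """Return the name pairs of any two subnets that overlap."""
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


overlaps = find_overlaps(subnets)
print("overlapping subnets:", overlaps or "none")

# A block carved from the Pod CIDR (assumed /28 here, purely for illustration).
namespace_block = next(iter(subnets["pod"].subnets(new_prefix=28)))
print("first pod block:", namespace_block)

# A single /32 taken from the Ingress pool, as would be used for a VIP.
first_vip = ipaddress.ip_network(f"{next(iter(subnets['ingress'].hosts()))}/32")
print("first ingress VIP:", first_vip)
```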

Start the validation process for the management WLD cluster

Once validation succeeds, we can complete the workflow in the vSphere Client.

We should see the Management WLD cluster as compatible.

The Supervisor Control Plane nodes are deployed in VM form factor. Select the sizing for the control plane nodes. This should not be larger than the vCenter form factor.

Each Supervisor Control plane VM has 2 network interfaces.

eth0 – is VLAN backed and attached to the ESXi Management network. This is where the control plane VM talks to the ESXi workers as well as to infrastructure services like vCenter, NSX-T, CSI provisioning, DNS, NTP etc.
eth1 – is attached to an Overlay LS for communication with pods and deployed guest TKG clusters.

We have to reserve 5 IPs for eth0 from the ESXi management network. Three go to the Supervisor control plane VMs (one each), one is for the Supervisor VIP, and the last one is reserved for lifecycle management and repair operations.
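A quick way to check the reservation before filling in the wizard is sketched below. It assumes the five addresses form a consecutive block beginning at a chosen starting IP, and it verifies that the whole block sits inside the management subnet. The helper `reserve_control_plane_ips`, the example addresses, and the role-to-address order are illustrative assumptions; this post does not state which address ends up with which role.

```python
import ipaddress


def reserve_control_plane_ips(start_ip: str, mgmt_network: str, count: int = 5):
    """Return `count` consecutive IPs from start_ip, all usable hosts in mgmt_network."""
    network = ipaddress.ip_network(mgmt_network)
    start = ipaddress.ip_address(start_ip)
    ips = [start + i for i in range(count)]
    unusable = (network.network_address, network.broadcast_address)
    bad = [ip for ip in ips if ip not in network or ip in unusable]
    if bad:
        raise ValueError(f"addresses not usable in {network}: {[str(ip) for ip in bad]}")
    return ips


# Hypothetical management subnet and starting address.
reserved = reserve_control_plane_ips("172.16.10.50", "172.16.10.0/24")

# Role order is illustrative only; the actual mapping is decided by the platform.
roles = ["control plane VM 1", "control plane VM 2", "control plane VM 3",
         "Supervisor VIP", "lifecycle / repair"]
for role, ip in zip(roles, reserved):
    print(f"{ip}  {role}")
```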

Provide the subnet details that were defined earlier.

Note: Ingress and Egress CIDRs are Overlay based and routable. They are NOT VLAN based. This is one of the questions I get asked most often.

Attach storage policies for the control plane VMs, pod ephemeral disks and image cache. This is vSAN by default, or any supplemental storage attached to the Management WLD.

Click on Finish. This will now enable the Kubernetes control plane on the hypervisor layer.

Once the deployment finishes, we will have a 7-node Kubernetes cluster, which is called the Supervisor Cluster.

Master nodes – are 3 X Supervisor Control Plane nodes deployed in VM form factor.
Worker nodes – are the 4 Management WLD ESXi hosts.

The Supervisor control plane runs the TKG Services (CAPI/CAPW, Virtual Machine Operator) to deploy TKG compute clusters in a Supervisor namespace. It also runs the pod service, which lets us run workloads as native pods directly on the hypervisor layer in a Supervisor namespace. We will go through the Supervisor Cluster in more detail in Part 2.