NSX-T Architecture in vSphere with Tanzu – Part 1 – Per TKG Tier1 vs Per Namespace Tier1

New Year Wishes!!!

Now that we have covered the concepts, deployment, basic operations and maintenance of vSphere with Tanzu and TKG clusters in the previous articles, it’s time to walk through the NSX-T architecture used in the platform. At the time of writing, we have two networking options available for vSphere with Tanzu (since vSphere 7.0 U1):

  • vSphere Networking (DVS based) – doesn’t support vSphere Pods and has no built-in load balancer.
  • NSX-T networking

VCF doesn’t support the DVS-based vSphere networking; only NSX-T is supported. This blog series and the previous ones use NSX-T as the networking stack.

If you have missed the previous articles, you can read them here. Even though they were written using the VCF 4.0.1 Consolidated architecture, the concepts and deployment workflow still apply to the vSphere option as well.

Part 1 : https://vxplanet.com/2020/07/04/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-1/

Part 2 : https://vxplanet.com/2020/06/30/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-2-supervisor-cluster/

Part 3 : https://vxplanet.com/2020/07/02/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-3-tkg-compute-clusters/

Part 4 : https://vxplanet.com/2020/07/04/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-4-tkg-cli/

The NSX-T architecture got a new look when vSphere 7.0 Update 1c was released two weeks back. The architecture changed from a per-TKG cluster Tier-1 Gateway model to a per-Supervisor namespace Tier-1 Gateway model. We will cover both architectures in this blog post, and continue the series with architectures for multiple Supervisor clusters, Proxy-ARP gateways and Edge node networking, so there is a lot to read 😊 Here is the breakdown:

Part 1 – Per-TKG cluster Tier 1 vs Per-Supervisor namespace Tier 1 Architecture

Part 2 – Multi-Supervisor Clusters with Shared Tier 0 and Dedicated Tier 0 Gateway

Part 3 – Dedicated Tier 1 Edge Clusters

Part 4 – Proxy ARP Gateways

Part 5 – Edge node networking

Let’s get started:

Per-TKG Cluster Tier 1 Gateway model

This was the architecture in use until the release of vSphere 7.0 Update 1c (Dec 17, 2020). In this model, a dedicated Tier-1 Gateway is instantiated for each TKG cluster that is spun up in a Supervisor namespace. Irrespective of the Supervisor namespace, each TKG cluster gets a dedicated L4 load balancer for ingress access (to the Kube API and the deployed workloads) and a dedicated /32 SNAT IP for north-south egress connectivity.

[Figure: Per-TKG cluster Tier-1 Gateway architecture prior to vSphere 7.0 Update 1c – click HERE for HQ image]

  • Supervisor Control Plane VMs have their eth1 interface attached to a dedicated NSX-T segment (e.g., Segment-1001). Eth0 attaches to the DVS management port group to talk to the ESXi worker nodes, vCenter, NSX Manager and for cluster heartbeat.
  • Every namespace gets a unique NSX-T segment, not just the Supervisor namespaces. This means that the system namespaces, which host the control plane pods residing on the Supervisor Control Plane VMs, also get dedicated NSX-T segments (e.g., Segments 1002-1020).
  • All these segments (1001-1020) attach to a Tier-1 Gateway called the Cluster T1 Gateway. In the figure it is labelled as “Tier-1 Gateway for the supervisor cluster”.


  • If the integrated Harbor registry is enabled, the Harbor pods are deployed as native pods in the System Registry Supervisor namespace. A dedicated logical segment is created for the System Registry namespace and attaches to the same Supervisor Cluster Tier-1 Gateway.
  • The load balancer provisioned on this Cluster Tier-1 Gateway provides ingress access to the Supervisor cluster Kube API, the Harbor UI and the applications deployed as native pods in all the Supervisor namespaces. All services of type “LoadBalancer” are realized as L4 virtual servers and services of type “Ingress” are realized as L7 virtual servers on this load balancer. This can also be seen as a shared Tier-1 Gateway.
  • Creating subsequent Supervisor namespaces means new logical segments getting attached to the shared Supervisor Cluster Tier-1 Gateway.
  • Each namespace gets a dedicated /32 SNAT IP for egress. This includes the system namespaces as well. Mapping SNAT IPs to namespaces is good for traceability purposes.
  • Each TKG cluster, irrespective of the Supervisor namespace, gets a dedicated Tier-1 Gateway. The load balancer provisioned on this T1 Gateway provides ingress access to the Kube API and the deployed workloads in the TKG cluster. A minimal example of what triggers this is sketched after this list.
  • Creating subsequent TKG clusters means new logical segments and Tier-1 Gateways.
  • Each TKG cluster gets a dedicated /32 SNAT IP for north-south egress.
  • There is no SNAT for east-west communication.
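To make the per-TKG cluster model concrete, here is a minimal sketch of a TanzuKubernetesCluster manifest. The cluster name, namespace, VM classes and storage class are hypothetical; the point is that applying a manifest like this in a Supervisor namespace is what causes NCP to instantiate the dedicated Tier-1 Gateway, logical segments and load balancer for that cluster in this architecture.

```yaml
# Hypothetical TKG cluster manifest applied against the Supervisor cluster.
# In the per-TKG cluster model, each such cluster gets its own Tier-1 Gateway,
# logical segments, L4 load balancer and /32 SNAT IP.
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TanzuKubernetesCluster
metadata:
  name: tkg-cluster-01            # hypothetical cluster name
  namespace: demo-namespace       # hypothetical Supervisor namespace
spec:
  distribution:
    version: v1.18                # resolves to a matching Tanzu Kubernetes release
  topology:
    controlPlane:
      count: 3
      class: best-effort-small    # VM class published to the namespace
      storageClass: vsan-default-storage-policy   # hypothetical storage class
    workers:
      count: 3
      class: best-effort-small
      storageClass: vsan-default-storage-policy
```

Once such a cluster is up, its Kube API is reachable through an L4 virtual server on that cluster’s dedicated load balancer.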

For more information on the NSX-T objects in this architecture, please read through my previous articles as those were based on the per-TKG cluster Tier 1 architecture.

Part 2 : https://vxplanet.com/2020/06/30/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-2-supervisor-cluster/

Part 3 : https://vxplanet.com/2020/07/02/vsphere-with-kubernetes-on-vcf-4-0-1-consolidated-architecture-part-3-tkg-compute-clusters/

Here are the takeaways from this architecture:

  • The Supervisor namespace is the unit of multitenancy in vSphere with Tanzu. The per-TKG cluster Tier-1 NSX-T architecture doesn’t map well to this namespace-based multitenancy in terms of NSX-T tenant boundaries, as the namespace objects are scattered across different Tier-1 Gateways. For example, native pods in a Supervisor namespace attach to the Cluster Tier-1 Gateway, but the TKG clusters in the same Supervisor namespace attach to different dedicated Tier-1 Gateways.
  • Within a namespace tenant, workloads can have different ingress/egress points. For example, native pods in all Supervisor namespaces (tenants) have ingress/egress via the Cluster Tier-1 Gateway, but the pods in TKG clusters have ingress/egress via their own dedicated T1 Gateways.
  • All system namespaces have dedicated segments and SNAT rules, which could be simplified.
  • This architecture consumes more IPs from the egress SNAT pools.
  • There is no 1:1 mapping between namespace tenants and SNAT egress IPs, because the TKG clusters deployed in a Supervisor namespace have their own dedicated SNAT IPs.

This architecture is replaced by the per-namespace Tier-1 Gateway model since vSphere 7.0 Update 1c. To upgrade the topology from the per-TKG cluster Tier-1 model to the per-namespace Tier-1 model, we have to upgrade NSX-T, vCenter Server, and all vSphere with Tanzu components. The workflow also upgrades the NSX Container Plugin (NCP), which in turn migrates the topology to the per-Supervisor namespace Tier-1 model.

Per-Supervisor Namespace Tier 1 Gateway model

This architecture was introduced in vSphere 7.0 Update 1c. In this model, a dedicated Tier-1 Gateway is instantiated for each Supervisor namespace that is created on the Supervisor cluster. Each TKG cluster deployed in the Supervisor namespace shares the dedicated Tier-1 Gateway of the namespace. All ingress to the TKG clusters (Kube API and workloads) and native pods happens through the load balancer on this dedicated Tier-1 Gateway. All TKG clusters and native pods share the same SNAT IP for north-south egress connectivity.

[Figure: Per-Supervisor namespace Tier-1 Gateway architecture in vSphere 7.0 Update 1c – click HERE for HQ image]

  • Supervisor Control Plane VMs have their eth1 interface attached to a dedicated NSX-T segment (e.g., Segment-1001). Eth0 attaches to the DVS management port group to talk to the ESXi worker nodes, vCenter, NSX Manager and for cluster heartbeat.
  • All the control plane pods in the system namespaces use “hostNetwork” as the “podNetwork”. The pods share the host networking stack and get the same IP address as the host (Supervisor Control Plane VM) on which they reside (a minimal illustration of this follows). All the system namespaces are grouped together as a shared system resource. Since the Supervisor Control Plane VMs are already attached to an NSX-T segment (eth1), we won’t require additional NSX-T segments for the system namespaces. This reduces the number of NSX-T objects and simplifies the topology.
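For illustration, the snippet below shows what “hostNetwork” means at the pod spec level. It is a generic, hypothetical pod definition, not one of the actual system pods, but the effect is the same: the pod gets the IP of the Supervisor Control Plane VM it runs on instead of an IP from an NSX-T segment.

```yaml
# Hypothetical pod spec illustrating the hostNetwork behaviour described above.
apiVersion: v1
kind: Pod
metadata:
  name: example-control-plane-pod         # hypothetical name
  namespace: kube-system
spec:
  hostNetwork: true                        # pod shares the node's network stack,
                                           # so its IP equals the Supervisor VM's IP
  containers:
    - name: app
      image: registry.example.com/app:1.0  # hypothetical image
```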


  • The logical segment where the Supervisor Control Plane VMs are attached connects to a Tier-1 Gateway called the Cluster T1 Gateway. In the figure it is labelled as “Tier-1 Gateway for the supervisor cluster”. Other Supervisor namespace segments won’t connect to this Cluster Tier-1 Gateway.
  • Each Supervisor namespace gets a dedicated logical segment which attaches to a dedicated Tier-1 Gateway, called the Namespace Tier-1 Gateway.


  • For the integrated Harbor registry, a dedicated logical segment is created and attaches to a dedicated Tier-1 Gateway. Ingress access to the Harbor UI is via the load balancer virtual server provisioned on this Tier-1 Gateway.


  • The load balancer provisioned on the Namespace Tier-1 Gateway provides ingress access to the TKG cluster Kube API and the applications deployed in the TKG cluster namespaces. All services of type “LoadBalancer” are realized as L4 virtual servers on this load balancer via the cloud provider component, which uses the NSX plugin in the Supervisor namespace. Services of type “Ingress” require Nginx or Contour at present. A minimal service example is sketched below, and an Ingress sketch follows at the end of this walkthrough.
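As a quick sketch of the LoadBalancer path described above, the service below (hypothetical names, deployed inside a TKG cluster in the namespace) would be realized as an L4 virtual server on the Namespace Tier-1 load balancer, with an external IP allocated from the ingress CIDR configured for Workload Management.

```yaml
# Hypothetical service of type LoadBalancer deployed in a TKG cluster.
# The cloud provider / NSX integration realizes it as an L4 virtual server
# on the load balancer attached to the Namespace Tier-1 Gateway.
apiVersion: v1
kind: Service
metadata:
  name: web-svc                 # hypothetical service name
  namespace: default
spec:
  type: LoadBalancer
  selector:
    app: web                    # hypothetical pod label
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
```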


  • Creating subsequent Supervisor namespaces means new logical segments and new Tier-1 Gateways.
  • Each namespace gets a dedicated SNAT IP for egress. This egress IP is shared by all the TKG clusters deployed within the namespace, so there is a 1:1 mapping between a namespace tenant and its SNAT egress IP.
  • There is no SNAT for east-west communication.


  • TKG clusters won’t get a dedicated Tier-1 Gateway in this architecture; instead they share the resources available within the Supervisor namespace.
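Since services of type “Ingress” are not realized on the NSX load balancer in this model (see the note above), here is a minimal, hypothetical Ingress sketch. It assumes an ingress controller such as Contour or Nginx is already deployed in the TKG cluster (typically exposed through its own LoadBalancer service on the Namespace Tier-1 load balancer), that a backing service named web-svc exists, and that the cluster serves the networking.k8s.io/v1 API.

```yaml
# Hypothetical Ingress handled by an in-cluster controller (Contour/Nginx),
# not by the NSX load balancer on the Namespace Tier-1 Gateway.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
  namespace: default
spec:
  rules:
    - host: web.corp.local             # hypothetical FQDN
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-svc          # hypothetical backing service
                port:
                  number: 80
```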

Here are the takeaways from this architecture:

  • This architecture has an improved tenant mapping between Supervisor cluster namespaces and the logical networking (Tier-1 Gateways in NSX-T).
  • We have consistency in per-namespace tenant-based ingress/egress traffic paths. All ingress/egress to/from Supervisor namespace workloads (native pods as well as TKG clusters) happens through their dedicated Namespace Tier-1 Gateways.
  • System namespaces use the Supervisor Control Plane VM “hostNetwork” as the “podNetwork” and don’t have dedicated logical segments and SNAT rules, which reduces the number of NSX-T objects and simplifies the topology.
  • There is a 1:1 mapping between the Supervisor namespace and the SNAT egress IP. This gives simplified options for policy-based routing on the physical switch fabric, including blacklisting/whitelisting tenants.

Upgrading the topology

To upgrade the topology from the per-TKG cluster Tier-1 model to the per-Supervisor namespace Tier-1 model, we have to upgrade NSX-T, vCenter Server, and all vSphere with Tanzu components. At a high level, this is the workflow:

  • Upgrade NSX-T 3.0 to NSX-T 3.1
  • Upgrade vCenter Server from 7.0 Update 1 to 7.0 Update 1c
  • Upgrade ESXi hosts in the Supervisor cluster to 7.0 Update 1c
  • Perform the Supervisor namespace update
  • Upgrade the TKG clusters

When we upgrade the Supervisor Cluster to version 7.0 Update 1c, the NSX Container Plug-in (NCP) is also upgraded, which in turn migrates the networking topology to the per-namespace Tier-1 model.

More guidance on the upgrade procedure is available in the VMware official documentation : https://docs.vmware.com/en/VMware-vSphere/7.0/vmware-vsphere-with-tanzu/GUID-BEAF45D2-9ABA-46CA-ABE8-52A6B94AF085.html

Time to wrap up!!! I hope this article was informative. Will see you in Part 2.

Thanks for reading.

Continue reading? Here are the other parts of this series:

Part 2 : https://vxplanet.com/2021/01/05/nsx-t-architecture-in-vsphere-with-tanzu-part-2-multisupervisor-shared-t0-vs-dedicated-t0/

Part 3 : https://vxplanet.com/2021/02/04/nsx-t-architecture-in-vsphere-with-tanzu-part-3-dedicated-tier-1-edge-clusters/

Part 4 : https://vxplanet.com/2021/02/12/nsx-t-architecture-in-vsphere-with-tanzu-part-4-proxy-arp-gateways/

Part 5 : https://vxplanet.com/2021/03/03/nsx-t-architecture-in-vsphere-with-tanzu-part-5-edge-node-networking/

 

