NSX-T Tier1 SR Placement and the effect on Northbound ECMP – Part 3

This is the third and final part of the blog series on NSX-T Tier 1 Gateway SR Construct placement and its influence on North-South routing and ECMP. In case you missed the previous parts, you can find them here:

Part 1 -> https://vxplanet.com/2019/10/26/nsx-t-tier1-sr-placement-and-the-effect-on-northbound-ecmp-part-1/

Part 2 -> https://vxplanet.com/2019/10/28/nsx-t-tier1-sr-placement-and-the-effect-on-northbound-ecmp-part-2/

Before we proceed, I would like to thank Oliver Ziltener for giving clarity on ECMP behavior with different topologies, as well as spending time to review a couple of my articles and provide feedback.

Let’s revisit some of the observations we made from the two scenarios in Parts 1 & 2.

Part 1 dealt with the scenario where the T1 Gateway didn’t have an SR Construct. We noticed that:

  • Tier 0 DR-SR ECMP was achieved on the Compute Transport Nodes for Segments attached to the T1 Gateway
  • Both Tier 0 SR Constructs were leveraged for Northbound routing
  • All available Tier 0 eBGP Uplink ECMP paths were utilized for Northbound routing
  • As Tier 0 SR Constructs were scaled out (up to 8), more ECMP paths became available for the Segments connected to the Tier 1 Gateway

Part 2 dealt with the Scenario where T1 Gateway had an SR Construct leveraging a shared T0-T1 Edge Cluster. We noticed that:

  • Tier 0 DR-SR ECMP was NOT available on the Compute Transport Nodes for Segments attached to the T1 Gateway. This is because the traffic has already been routed to the Edge node for T1 SR lookup and thereafter all upstream routing happened locally on the Edge node (holding the Active T1 SR Construct).
  • Both Tier 0 SR Constructs were NOT leveraged for Northbound routing. Only the Edge node hosting the Active T1 SR Construct was involved in Northbound routing.
  • All available Tier 0 eBGP Uplink ECMP paths were NOT utilized for Northbound routing. eBGP Uplink ECMP was only utilized from the Active T1 SR Edge node; the Passive T1 SR Edge node was not utilized.
  • As Tier 0 SR Constructs were scaled out (up to 8), the available Northbound ECMP paths remained the same. Any additional ECMP paths were NOT available to the Segments on the T1 Gateway

In this article, we will focus on Scenario 3 – a Tier 1 Gateway with an SR Construct on a dedicated Edge Cluster. As in the previous articles, we will not cover the effect on East-West routing here; that will be dealt with in a separate article. Let’s get started.

Tier 1 Gateway with an SR Construct on Dedicated Edge Cluster: North-South Routing and ECMP

The sketch below shows the traffic pattern for a Northbound flow from a Segment connected to the T1 Gateway, which has its SR Construct on the dedicated Edge nodes.

[Click here for HQ Image] Sketch1


And the sketch below shows the traffic pattern for a Southbound flow from the Leaf Switches to a Segment connected to the same T1 Gateway.

[Click here for HQ Image] Sketch2


As discussed previously, note that routing happens closest to the source of traffic: for Northbound flows that is the Compute Transport Nodes, and for Southbound flows it is the Edge nodes. As such, the return path of a flow can take a different route than the forward path.

Topology Details

The logical topology is the same as in Part 2, except that the SR Construct of the Tier 1 Gateway is deployed on a dedicated Edge Cluster. This Edge Cluster is configured only for the Overlay Transport Zone; all egress to the outside world goes through the separate Tier 0 Edge Cluster.

  • Tier0 Gateway is deployed in Active-Active mode with 4 uplinks on an Edge Cluster with 2 Edge nodes.
  • Each Edge node hosts two uplinks – one on VLAN 60 and the other on VLAN 70
  • The T0 Gateway establishes eBGP peering with two Dell EMC S5048-ON Leaf switches. We will have 4 eBGP peerings in total.
  • T0 Gateway uses eBGP ECMP on the uplinks and Inter-SR Routing between the SR nodes.
  • A single Tier 1 Gateway is deployed with an SR Construct and uplinked to the Tier 0 Gateway. The SR Construct leverages the dedicated Edge cluster.
  • A Logical Segment on 172.21.0.0/24 attached to the Tier 1 Gateway
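For reference, the eBGP ECMP and Inter-SR Routing settings above correspond to the `ecmp` and `inter_sr_ibgp` flags of the Tier-0 BGP configuration in the NSX-T Policy API. A minimal illustrative PATCH body is shown below; the gateway ID, locale-service ID and AS number are placeholders, not values from this lab:

```
PATCH /policy/api/v1/infra/tier-0s/<t0-gw-id>/locale-services/default/bgp
{
  "enabled": true,
  "ecmp": true,
  "inter_sr_ibgp": true,
  "local_as_num": "65001"
}
```

The same settings are available in the UI under the Tier-0 Gateway’s BGP section.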

Northbound Traffic behavior

As shown in Sketch1, the Tier 1 DR Construct has two interfaces – one attached to the Logical Segment (over auto-plumbed VNI 71701) and the other attached to its SR Construct sitting on the Edge node (over auto-plumbed VNI 71707). The T1 DR Construct next-hops to its SR Construct on the dedicated Edge cluster.

Traffic from the Compute Segment (172.21.0.0/24) attached to the T1 Gateway is tunneled to the Edge node hosting the Active T1 SR Construct for Northbound reachability. The T1 DR lookup happens on the Compute Transport Node, and a second T1 DR lookup is avoided at the Edge node (hosting the Active T1 SR).

Let’s see the next-hops of the T1 DR and T1 SR Constructs.

This is the forwarding table of the T1 DR, which is next-hopping to its Active SR Construct on the dedicated Edge node.

[Screenshot: T1 DR forwarding table]

This is the forwarding table of the T1 SR, next-hopping to the Tier 0 DR Construct. The T0 DR is available locally on the Edge node, so this lookup happens locally and traffic doesn’t need to leave the Edge node for this hop.

[Screenshot: T1 SR forwarding table]

The T0 SR Constructs sit on a separate T0 Edge Cluster, so Northbound traffic needs to be tunneled to the T0 Edge nodes for the T0 SR lookup. The T0 DR on the dedicated T1 Edge node has two default routes pointing to the two T0 SR Constructs on the T0 Edge nodes, achieving T0 DR-SR ECMP between the T1 and T0 Edge nodes. In this way, ECMP is regained by introducing a dedicated T1 Edge Cluster into the design.

This T0 DR-SR ECMP is scalable. If we introduce additional T0 SR Constructs on the T0 Gateway, the T0 DR forwarding table on the dedicated T1 Edge cluster is updated with default routes to the new T0 SR Constructs, thereby increasing the number of ECMP paths from the T1 Edge cluster.

This is the forwarding table of the T0 DR Construct on the Tier 1 Edge Cluster. The next-hops are the T0 SR Constructs on the separate T0 Edge cluster.

[Screenshot: T0 DR forwarding table on the T1 Edge node]
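As a side note, these forwarding tables can be read straight from the Edge node CLI. A rough sketch of the workflow is below; the VRF ID and the vrf-context prompt are illustrative and will differ per deployment:

```
bggwedge03> get logical-routers        # list the DR/SR instances with their VRF IDs
bggwedge03> vrf 5                      # enter the VRF of the T0 DR (ID is deployment-specific)
bggwedge03(tier0_dr)> get forwarding   # forwarding table, including the two default routes
```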

From the T0 SR Construct, egress traffic is routed via its eBGP uplinks leveraging eBGP ECMP. 

[Screenshot: eBGP ECMP routes on the T0 SR]

Overall, we get a total of 2 × 2 = 4 ECMP paths for Northbound routing in this topology, at the cost of an additional dedicated T1 Edge Cluster.
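This path arithmetic generalizes: the Northbound fan-out is the number of T0 SR Constructs multiplied by the eBGP uplinks per SR. The sketch below is a simplified model that assumes ECMP is enabled under the BGP process, all uplinks are healthy, and each T0 SR hosts the same number of uplinks:

```python
def northbound_ecmp_paths(t0_sr_count: int, uplinks_per_sr: int) -> int:
    """Total egress paths for Segments behind the T1 Gateway: the T0 DR on
    the dedicated T1 Edge node load-balances across every T0 SR, and each
    T0 SR load-balances across its own eBGP uplinks."""
    return t0_sr_count * uplinks_per_sr

# Topology in this article: 2 T0 Edge nodes with 2 uplinks each
assert northbound_ecmp_paths(2, 2) == 4
# Scaling the T0 Gateway out to the 8-SR maximum grows the fan-out further
assert northbound_ecmp_paths(8, 2) == 16
```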

Let’s do a Traceflow to see the Flows in action:

Our source is a VM attached to the Logical Segment (172.21.0.0/24) on the T1 Gateway. The destination is a machine outside the NSX-T environment. The T1 Edge Cluster has 2 Edge nodes – bggwedge03 & bggwedge04. The Active T1 SR Construct is on bggwedge03. The T0 Edge Cluster has 2 Edge nodes – bggwedge01 & bggwedge02.

[Screenshot: Traceflow output]

As you can see, once the T1 DR lookup is completed on the Compute Transport node (esx02.orange.local), the traffic is tunneled to the Active Edge node (bggwedge03) on the dedicated T1 Edge cluster. The T1 SR and T0 DR lookups happen locally on bggwedge03, and the traffic is then tunneled to the T0 Edge Cluster for the T0 SR lookup. Traffic egresses to external networks via the eBGP ECMP uplinks.

To demonstrate T0 DR-SR ECMP, we will do another Traceflow, this time to a different destination. The other T0 Edge node should get involved in the flow.

[Screenshot: second Traceflow via the other T0 Edge node]

Southbound Traffic behavior

As shown in Sketch2, depending on how eBGP ECMP behavior is configured on the Leaf Switches, traffic can ingress into the T0 SR Constructs over the 4 different paths. Each Edge node performs the local T0 SR and DR routing lookups. To perform the T1 SR lookup, Southbound traffic is tunneled to the dedicated T1 Edge Cluster (to the Active Edge node). Once the T1 DR lookup is completed, the traffic is again tunneled to the Compute Transport node to reach the destination Segment.
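The Leaf Switches pin each flow to one of the 4 ingress paths using a flow hash. The sketch below is a simplified stand-in (the actual hash algorithms on the switches and in NSX are implementation-specific, and the addresses and ports are made up) that shows why a flow sticks to one path while its return flow can land on a different one:

```python
import hashlib

def pick_path(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              proto: str, num_paths: int) -> int:
    """Flow-hash ECMP sketch: a stable hash over the 5-tuple maps every
    packet of a flow to the same next hop, avoiding per-flow reordering."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % num_paths

# Forward flow from the 172.21.0.0/24 Segment (illustrative addresses)
fwd = pick_path("172.21.0.10", "198.51.100.7", 40000, 443, "tcp", 4)
# The return flow hashes the reversed 5-tuple, so the Leaf Switches may
# select a different one of the 4 ingress paths (asymmetric return path)
rev = pick_path("198.51.100.7", "172.21.0.10", 443, 40000, "tcp", 4)
```

Because both directions are hashed independently, forward and return traffic of the same session can traverse different Edge nodes, which is exactly why Inter-SR Routing matters for asymmetric failures.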

Let’s do a Traceflow from one of the Edge Uplinks to a VM on Segment 172.21.0.0/24 attached to the Tier 1 Gateway and confirm this.

[Screenshot: Southbound Traceflow output]

Summary:

  • Routing always happens closer to the Source
    • For Northbound, it is the Compute Transport nodes
    • For Southbound, it is the Edge Nodes.
  • For a T1 Gateway with an SR Construct on a dedicated Edge Cluster:
    • Tier 0 DR-SR ECMP is achieved on the dedicated T1 Edge cluster to the dedicated T0 Edge Cluster for Segments attached to the T1 Gateway
    • Both Tier 0 SR Constructs are leveraged for Northbound routing
    • All available Tier 0 eBGP Uplink ECMP is utilized for North bound routing (Provided ECMP is enabled under the BGP Process)
    • As Tier 0 SR Constructs are scaled out (up to 8), more ECMP paths become available for the Segments connected to the Tier 1 Gateway
    • For any asymmetric failures on the Edge Uplinks, Inter-SR Routing can be utilized here.
    • ECMP benefits come with the introduction of dedicated Edge cluster for T1 SR Construct.

Personally, I like having dedicated Edge clusters for the T1 and T0 Gateways, where one Edge cluster is responsible for managing stateful services and the other for centralized routing. Ultimately, the right design depends on your requirements.

This concludes the 3-part blog series. I hope the articles were informative. 

Thanks for reading.

