NSX-T introduced support for BGP AS-Multipath-Relax in version 2.4. It helps achieve ECMP across eBGP Leaf Switch peers whose routes carry different ASNs in the AS-path attribute but have the same AS-path length. As mentioned in my previous article on ECMP routing, certain requirements must be met to achieve ECMP between the T0 Edges and the Leaf Switches. Let's revisit them.
First, the following BGP attributes of the routes on the SR Components of the Edge nodes must match:
- Weight (set locally on T0 Gateway)
- Local Preference (set locally on T0 Gateway)
- AS path, both the ASNs in it and its length (both leaf switches should be in the same BGP ASN). This requirement can be relaxed with AS-Multipath-Relax, which is what this article covers.
- Origin Code (of received routes from the Leaf switches)
- MED (of routes advertised from the Leaf switches)
- IGP Metric to reach the Leaf switch. It is directly connected, so the metric should be 0.
Second, ECMP should be enabled on the BGP process on the Leaf switches, so that ingress traffic from the Leaf switches to the NSX-T networks can use multiple NSX-T Edges.
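The attribute-matching requirement above can be sketched in Python as follows. This is a simplified illustration, not the logic that NSX-T or any BGP implementation actually runs. The `BgpRoute` class and the `ecmp_eligible` function are hypothetical names, and the next-hop addresses and attribute values are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BgpRoute:
    """Simplified view of one BGP path; field names are illustrative only."""
    next_hop: str
    weight: int
    local_pref: int
    as_path: Tuple[int, ...]
    origin: str       # "i" (IGP), "e" (EGP) or "?" (incomplete)
    med: int
    igp_metric: int   # metric to reach the next hop


def ecmp_eligible(a: BgpRoute, b: BgpRoute) -> bool:
    """Return True when two paths match on every attribute in the list above."""
    return (
        a.weight == b.weight
        and a.local_pref == b.local_pref
        and a.as_path == b.as_path      # same ASNs, and therefore same length
        and a.origin == b.origin
        and a.med == b.med
        and a.igp_metric == b.igp_metric
    )


# Placeholder values: both leaf switches in the same ASN.
via_leaf1 = BgpRoute("192.0.2.1", 0, 100, (65500,), "i", 0, 0)
via_leaf2 = BgpRoute("192.0.2.2", 0, 100, (65500,), "i", 0, 0)
print(ecmp_eligible(via_leaf1, via_leaf2))            # True

# Same route, but the second leaf sits in a different ASN.
via_leaf2_other_asn = BgpRoute("192.0.2.2", 0, 100, (65200,), "i", 0, 0)
print(ecmp_eligible(via_leaf1, via_leaf2_other_asn))  # False
```

The second case is the situation this article deals with: everything matches except the ASN in the AS path, so the strict comparison rejects the pair.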
Before we proceed, I recommend reading my previous article on Tier0 ECMP Routing (https://vxplanet.com/2019/07/27/nsx-t-tier0-ecmp-routing-explained/), because some of the configuration in this post builds on that article.
Let’s get started.
T0 Active-Active Gateway and BGP Configuration with L3 Leaf Switches
This is a summary of the deployed configuration.
The BGP Configuration on the T0 Gateway is the same as the one covered in my earlier post “NSX-T Tier0 ECMP Routing Explained” (link above), with the following changes.
- I have removed the VLT Configuration from the leaf switches. They now work as standalone Leaf switches peering with the Spines.
- Each Leaf switch is in a separate BGP ASN. Leaf 1 is in ASN 65500 and Leaf 2 is in ASN 65200.
- There is an eBGP link between the Leaf switches to get a redundant path to reach NSX-T networks in case of an Edge Uplink failure.
- Similar to earlier configuration, T0 Gateway has 4 Uplinks – Two on Edge node 1 and two on Edge node 2.
- T0 SR Component on Edge node 1 peers with Leaf Switch 1 (on BGP ASN 65500) over VLAN 60 and with Leaf Switch 2 (on BGP ASN 65200) over VLAN 70
- T0 SR Component on Edge node 2 peers with Leaf Switch 1 (on BGP ASN 65500) over VLAN 70 and with Leaf Switch 2 (on BGP ASN 65200) over VLAN 60
- ECMP with Inter-SR Routing is enabled without AS-Multipath-Relax
Let's verify the BGP peering status from the Edge nodes. Each T0 SR Component will have three BGP relationships.
- eBGP peering with Leaf Switch 1 (on ASN 65500)
- eBGP peering with Leaf Switch 2 (on ASN 65200)
- iBGP Inter-SR peering (on ASN 65400)
Let's verify the BGP peering status from the Leaf Switches. Each Leaf Switch will have three BGP relationships.
- eBGP peering with T0 SR1 on Edge node 1 (on ASN 65400)
- eBGP peering with T0 SR2 on Edge node 2 (on ASN 65400)
- eBGP peering with the other Leaf Switch (for Leaf Switch 1, this is Leaf Switch 2 on ASN 65200)
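The peering layout described in this section can be modelled as a small Python sketch to double-check the expected neighbor count on each node. The node labels (`T0-SR1`, `Leaf1`, and so on) and the `neighbors` helper are names I made up for illustration. The ASNs come from the topology above, and the Leaf-to-Spine peerings are left out because they are not part of the lists above.

```python
# ASN of each node, as described in the topology above.
ASN = {"T0-SR1": 65400, "T0-SR2": 65400, "Leaf1": 65500, "Leaf2": 65200}

# BGP sessions as pairs of nodes (Leaf-to-Spine sessions omitted).
SESSIONS = [
    ("T0-SR1", "Leaf1"),
    ("T0-SR1", "Leaf2"),
    ("T0-SR2", "Leaf1"),
    ("T0-SR2", "Leaf2"),
    ("T0-SR1", "T0-SR2"),   # inter-SR link
    ("Leaf1", "Leaf2"),     # inter-leaf link
]


def neighbors(node):
    """List (peer, peer ASN, session type) for every session of a node."""
    result = []
    for a, b in SESSIONS:
        if node in (a, b):
            peer = b if a == node else a
            # Same ASN on both ends means iBGP, otherwise eBGP.
            kind = "iBGP" if ASN[peer] == ASN[node] else "eBGP"
            result.append((peer, ASN[peer], kind))
    return result


for node in ASN:
    print(node, neighbors(node))
```

Each node should come out with three neighbors: two eBGP and one iBGP for each T0 SR, and three eBGP for each Leaf Switch.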
I have used the same networks as in my previous post. I am not covering the advertisement configuration in this post.
There are two Tier 1 Gateways attached to the T0 Gateway, and they advertise the below subnets in BGP:
The Leaf switches advertise the below customer networks:
Egress / Ingress traffic pattern with ECMP ON but with AS-Multipath-Relax OFF
We can see that each T0 SR Component has three entries in its BGP table for the customer networks advertised from the Leaf Switches: one via Leaf Switch 1 (ASN 65500), a second via Leaf Switch 2 (ASN 65200) and a third via the iBGP inter-SR link.
Since the ECMP requirements are not met (the AS paths differ), the BGP process selects only one best eBGP path and places it in the routing table. Let’s verify this from both Edge nodes.
This shows that when an Active-Active T0 Gateway peers with multiple Leaf ASNs with ECMP turned ON and AS-Multipath-Relax turned OFF, ECMP is not achieved and each Edge node forwards traffic over a single uplink only.
Egress / Ingress traffic pattern with both ECMP and AS-Multipath-Relax turned ON
Let’s turn on the “AS-Multipath-Relax” feature under the T0 BGP Configuration.
Enabling the Toggle button will add the configuration “bgp bestpath as-path multipath-relax” under the BGP Configuration of both SR Components on the Edge nodes. We can verify this from the Running-Configuration of the SR Component.
Once the feature is enabled, the ASNs in the AS path are ignored for ECMP calculations, but the AS-path length must still be the same. In our case, both leaf switches are one hop away, so we should be able to achieve ECMP. Let's verify this from the Edge nodes.
Now we can see that ECMP is achieved through both Leaf Switches, even though they are in different ASNs.
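A minimal Python sketch of the difference between the strict and the relaxed AS-path comparison is shown below. It is a simplified illustration, not the code a BGP implementation runs. The `as_path_matches` function is a hypothetical name, and the example paths assume the customer networks originate on the leaf switches themselves, so that each path holds a single ASN.

```python
from typing import Sequence


def as_path_matches(path_a: Sequence[int], path_b: Sequence[int],
                    multipath_relax: bool) -> bool:
    """Compare two AS paths for ECMP purposes (simplified)."""
    if multipath_relax:
        # Relaxed: only the number of ASNs in the path has to match.
        return len(path_a) == len(path_b)
    # Strict: the ASNs must be identical and in the same order.
    return list(path_a) == list(path_b)


via_leaf1 = [65500]   # customer network learned from Leaf Switch 1
via_leaf2 = [65200]   # same network learned from Leaf Switch 2

print(as_path_matches(via_leaf1, via_leaf2, multipath_relax=False))   # False
print(as_path_matches(via_leaf1, via_leaf2, multipath_relax=True))    # True

# A longer path, for example one learned across the inter-leaf link,
# still fails the relaxed check because the lengths differ.
print(as_path_matches([65500], [65200, 65500], multipath_relax=True))  # False
```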
Enabling ECMP on the DellEMC S5048-ON Leaf Switches
To see how to enable ECMP on DellEMC S5048-ON Leaf Switches, please visit my earlier post: https://vxplanet.com/2019/07/27/nsx-t-tier0-ecmp-routing-explained/
We don’t need to enable AS-Multipath-Relax on the leaf switches, because the two links from each Leaf go to T0 SR Components (Edges) that are in the same ASN. Each Leaf should therefore be able to ECMP to the NSX-T networks.
In this way, we achieved ECMP for both egress and ingress traffic. This shows that when an Active-Active T0 Gateway peers with multiple ASNs, ECMP can be achieved only by turning on the AS-Multipath-Relax feature. With it enabled, both Edge nodes forward traffic using all the uplinks.
I hope this post was informative. Thanks for reading.