NSX-T Edge nodes come in two form factors, VM and Baremetal, both leveraging Intel DPDK (Data Plane Development Kit) acceleration for the Transport and Uplink networks. Deciding which form factor to use depends on our use case requirements, and it is good to understand the workload traffic behavior and virtualized services requirements before finalizing the Edge deployment form factor. This is because the form factors have different upper limits, with Baremetal Edges having the highest.
Some key areas to look at before deciding the Edge deployment formfactor are:
- North bound traffic pattern
- Any L2 Bridging requirements
- Number of ECMP paths
- External storage access requirements for the overlay VMs
- Number of NSX services to be deployed (NAT, Load balancers, Edge firewalls etc)
- Future scalability of the platform
- Upper limits on the capabilities
- SSL Offloading on the NSX Load balancers
- Edge Performance
- Quick failure detection and failover
- Number of uplinks / Edge node
- Plans to have a dedicated Edge cluster or rack?
- Networking / Design simplicity
This article compares the Baremetal and VM form factors for the Edge nodes. I hope it gives you a good understanding before choosing the right Edge deployment type for your workloads. Let’s get started.
VM Form factor : Edge VMs can be deployed in either of two ways:
- On vSphere DVS (dedicated Edge cluster or shared Management-Edge cluster)
- On NSX-T NVDS (shared Compute-Edge cluster)
Edge VMs are deployed with 4 vNICs: one for management, one for the Transport network (Geneve encapsulation), and the last two for Uplinks. All these vNICs attach to vSphere DVS port groups or to N-VDS VLAN logical segments on the ESXi host networking. Each Edge node (when configured as a Transport node) has an Overlay and an Uplink N-VDS, which attach to the vNICs, which in turn attach to the DVS port groups or N-VDS VLAN segments. This can be thought of as a nested N-VDS networking scenario, and it adds a bit of networking design complexity. The sketch below of Edge VM networking on a DVS/N-VDS depicts this scenario.
More details on deploying Edge VMs can be found in my other articles below:
Deploying Edge VMs on vSphere DVS :
Deploying Edge VMs on NVDS :
Baremetal Form factor : Baremetal Edges can be deployed in one of three ways:
- Single-NVDS Multi-TEP design
- Multi-NVDS Multi-TEP design
- Multi-NVDS Single-TEP design (not recommended since version 2.4)
The number of pNICs needed for Baremetal Edges varies with the design we adopt. Unlike VM Edges, the Overlay and Uplink N-VDS attach directly to the pNICs of the Baremetal Edge. No nested N-VDS networking happens in this case, which simplifies the networking design. This is how the Single-NVDS Multi-TEP Baremetal Edge networking looks:
More details on deploying Baremetal Edges can be found in my other articles below:
Single-NVDS Multi-TEP Baremetal Edges :
Multi-NVDS Single-TEP Baremetal Edges :
Both Baremetal and VM Edges support DPDK acceleration for the Transport and Uplink networks. Baremetal Edges currently support up to 8 DPDK interfaces.
Uplink BGP ECMP
Edge VMs are usually deployed with 4 vNICs and have only 2 uplink interfaces. Each uplink attaches to a VLAN port group or N-VDS logical segment on the host networking, where the VLAN tagging is applied. This means a T0 Gateway SR construct can get a maximum of only 2 BGP ECMP uplink paths per Edge VM. With a two-node Edge VM cluster, the maximum number of BGP uplink ECMP paths is 4.
Baremetal Edges, on the other hand, overcome this limitation because the VLAN tagging for the T0 Gateway uplinks is applied directly on NSX-T logical segments. The pNICs of the Baremetal Edge attach to tagged interfaces on the leaf switches. We could have up to 8 BGP ECMP uplink paths per Baremetal Edge node, which is the currently supported upper limit. In practice we rarely need 8 uplinks, but it illustrates the difference in uplink BGP ECMP paths.
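The uplink math above can be sketched as a small helper. The per-node figures (2 uplinks for an Edge VM, up to 8 for a Baremetal Edge) come from this article; the clamp at 8 reflects the NSX-T supported maximum for ECMP paths on a Tier-0, so treat this as an illustration rather than a sizing tool.

```python
# Rough helper illustrating the uplink ECMP limits discussed above.
# Per-node limits: 2 for Edge VMs, up to 8 for Baremetal Edges.

PER_NODE_UPLINKS = {"vm": 2, "baremetal": 8}
T0_ECMP_MAX = 8  # NSX-T supported maximum ECMP paths on a Tier-0

def max_ecmp_paths(form_factor: str, nodes_in_cluster: int) -> int:
    """Maximum BGP ECMP uplink paths a T0 can use with this Edge cluster."""
    raw = PER_NODE_UPLINKS[form_factor] * nodes_in_cluster
    return min(raw, T0_ECMP_MAX)

print(max_ecmp_paths("vm", 2))         # two Edge VMs -> 4 paths
print(max_ecmp_paths("baremetal", 2))  # capped at the T0 maximum of 8
```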
For more details on ECMP, please visit my article in the below link:
Deployment Mode and Throughput
VM Form factor : In most cases, the deployment mode is Active-Active to take advantage of ECMP and higher throughput. In Active-Active mode, stateful NSX services are not supported (for example, you can have only Reflexive NAT on the T0 Gateway). This forces a choice between better throughput and support for stateful services. Active-Passive mode supports stateful services, but with reduced throughput.
Baremetal Form factor : With release 2.4, 25G/40G DPDK NIC support is available for Baremetal Edges. We can achieve better throughput in Active-Passive mode with teaming (LACP) for the Edge uplinks, and still take advantage of stateful services. A single Edge (1x25G/40G uplink, or 2x25G/40G uplinks in LACP) could satisfy the north-bound traffic requirements, and the second Edge can take over when the first instance fails.
For VM Edges, the dot1q tagging for the T0 Gateway uplinks is applied at the host networking level (DVS or host N-VDS) and not at the Edge VM N-VDS. For Baremetal Edges, the dot1q tagging is applied directly at the Edge's own N-VDS level.
Deployment Cost and Hardware
VM Form factor : Can leverage the existing ESXi clusters for deployment. This can be a dedicated Edge cluster, a shared Management and Edge cluster, or a shared Compute and Edge cluster.
Baremetal Form factor : Adds to CapEx, as it requires dedicated hardware that is listed in the compatibility matrix below:
BFD and Failover Convergence
VM Form factor : When deployed with BFD, failover and convergence take approximately 3 seconds.
Baremetal Form factor : Provides much tighter BFD timers and sub-second convergence times. During a failover, the other Edge node can take over in less than a second (~750 ms). This is true for convergence as well.
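The difference comes down to BFD arithmetic: a peer is declared down after (transmit interval x detect multiplier) with no replies. The timer values below are illustrative assumptions, not NSX-T defaults; the point is that DPDK lets Baremetal Edges run much tighter timers, which is why they detect failures in well under a second.

```python
# BFD declares a peer down after (tx interval x detect multiplier)
# without a reply. Timer values below are assumptions for illustration.

def bfd_detection_time_ms(tx_interval_ms: int, multiplier: int) -> int:
    """Worst-case time to detect a dead peer, in milliseconds."""
    return tx_interval_ms * multiplier

vm_edge = bfd_detection_time_ms(tx_interval_ms=1000, multiplier=3)   # 3000 ms
baremetal = bfd_detection_time_ms(tx_interval_ms=250, multiplier=3)  # 750 ms
print(vm_edge, baremetal)
```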
Having a dedicated Baremetal Edge cluster makes for a simpler Edge cluster design than a dedicated vSphere Edge cluster with VM Edge nodes.
Host Upgrades and maintenance
Admins need to consider Edge VM availability before performing host maintenance activities, which adds a dependency to the host maintenance procedure.
Unlike with VM Edges, host upgrades and maintenance can be performed independently of Baremetal Edges. NSX-T Baremetal Edges are deployed outside the vSphere platform, so there is no direct vSphere dependency.
Edges for KVM hosts
If the NSX-T platform is KVM-only, Baremetal Edges are the only option. Edges in VM form factor can be deployed only on ESXi hosts, not on KVM hosts.
Overlay-VLAN L2 Bridging
It is recommended to use Baremetal Edge clusters for Overlay-VLAN L2 bridging to achieve better throughput for the bridged network by leveraging DPDK acceleration. Having bridge instances on the VM Edges for high data transfers could lead to performance bottlenecks.
NSX Loadbalancer SSL Offloading
It is recommended to use Baremetal Edges for NSX Load balancer SSL offloading, as they support a higher TPS. SSL offloading is resource intensive, and VM Edges could become a performance bottleneck.
Upper limits for NSX-T Services
Baremetal Edges have higher upper limits than the VM form factor for NSX services like load balancers, VPN, NAT, etc. We can get the configuration maximums here:
NSX-T Loadbalancer instances
VM Form factor : A Large-sized Edge VM supports 1 Large load balancer instance, or 4 Medium instances, or 40 Small instances.
Baremetal Form factor : A Baremetal Edge supports much higher numbers than VM Edges: 18 Large load balancer instances, or 75 Medium instances, or 750 Small instances.
NSX-T Loadbalancer Pool members
Baremetal Edges support as many as 30,000 pool members, compared to 7,500 on Large form factor VM Edges.
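The sizing figures above can be wrapped in a quick capacity check. The numbers are the ones quoted in this article and they vary by NSX-T release, so confirm against the VMware configuration maximums tool before sizing; the check also simplifies by treating each instance size independently rather than modeling shared Edge resources.

```python
# Quick capacity check using the per-form-factor load balancer instance
# limits quoted above. Figures vary by NSX-T release; confirm against the
# VMware configuration maximums tool. Sizes are treated independently here,
# a simplification of how Edge resources are actually shared.

LB_CAPACITY = {
    "edge_vm_large": {"large": 1, "medium": 4, "small": 40},
    "baremetal":     {"large": 18, "medium": 75, "small": 750},
}

def fits(form_factor: str, large: int = 0, medium: int = 0, small: int = 0) -> bool:
    """True if the requested load balancer mix fits within the quoted limits."""
    cap = LB_CAPACITY[form_factor]
    return large <= cap["large"] and medium <= cap["medium"] and small <= cap["small"]

print(fits("edge_vm_large", medium=6))  # False: exceeds 4 Medium instances
print(fits("baremetal", medium=6))      # True
```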
For NFV workloads, it is recommended to use Baremetal Edges as per VMware Reference Architecture.
Dedicated Management interface
Edge VMs require a dedicated management interface. NSX-T 2.4 introduced the option for Baremetal Edge nodes to run management on the fast-path NICs, no longer requiring a dedicated management NIC.
There could be more comparison points, but this is all I have for now. To conclude, Baremetal Edges provide better performance with sub-second convergence, faster failover, greater throughput, higher upper limits, and design simplicity. For production deployments with throughput-demanding applications, it is good to use Baremetal Edges, considering the future scalability of the platform and support for a growing number of NSX-T services. Workload nature and traffic patterns can vary over time and cannot be predicted accurately, which might turn Edges in VM form factor into performance bottlenecks at a future point. If that happens, we will have a tough time migrating services from an Edge VM cluster to a Baremetal Edge cluster.
I hope this post was informative.
Thanks for reading
5 thoughts on “NSX-T Edges – Baremetal vs VM Comparison”
Hi Harikrishnan T
Your blog is not exactly accurate.
- Please update your diagram. VTEP stands for VXLAN, but here we are talking about a Geneve Tunnel End Point (TEP).
- We have support for deployments with only a single N-VDS; 3 N-VDS have typically been used so far, but a single N-VDS is supported.
- Bare metal can have more than 4 DPDK interfaces; the current limit is 8.
- Your bare metal design is something we at VMware don’t recommend when we have 4 interfaces. We use multi-TEP for the Geneve traffic so that we can achieve redundancy when a single link fails, instead of failing over to the standby Edge.
- In the bare metal design we do the dot1q tagging at the segment level; this is unclear in your diagram.
- We support 40G interfaces for bare metal as well, not only 25G.
- This statement is wrong: “It is recommended to attach VM Edges to vSphere DVS/VSS and not to the N-VDS”; just think about how VCF is deployed and how we do that there.
- Today (June 2019) we don’t support vMotion of VM-based Edge nodes. By the end of 2019, this should be fully supported.
Thanks, lots of information in your feedback. Really appreciate it. I will make the necessary modifications to the content over the weekend.
I have now made the necessary corrections. Thanks for your feedback
Hi sir, I hope you are doing well, and thank you for your good post. I have a question: for bare metal servers such as Kubernetes cluster nodes, Docker hosts, or Linux servers, is there any solution with NSX-T?
For Kubernetes, NSX-T can be used as the CNI. The NSX-T Container Plugin (NCP) for Kubernetes talks to the NSX-T Manager cluster and provisions the necessary pod segments and Tier-1 gateways for Kubernetes. We can also configure bare metal Linux servers and Windows Server 2016 for NSX-T for micro-segmentation use cases.
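To make the NCP flow above concrete, here is a minimal sketch of querying the segments NCP has realized via the NSX-T Policy REST API (GET /policy/api/v1/infra/segments). The manager host name and credentials are placeholders, and a real call would also need proper TLS certificate handling; the sketch only builds the authenticated request without sending it.

```python
# Hedged sketch: build (but do not send) an authenticated NSX-T Policy API
# request that would list segments, e.g. the pod segments NCP created.
# Host name and credentials below are placeholders.

import base64
import urllib.request

def build_segments_request(manager: str, user: str, password: str) -> urllib.request.Request:
    """Build a Basic-auth request for the Policy API segments endpoint."""
    url = f"https://{manager}/policy/api/v1/infra/segments"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

req = build_segments_request("nsx-mgr.example.com", "admin", "VMware1!")
print(req.full_url)
# Sending it would return JSON whose "results" list describes each segment:
#   json.load(urllib.request.urlopen(req))["results"]
```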