VCF 4.0 Federation Explained

VCF Multi-Instance Management (Federation), introduced in version 3.9, allows multiple VCF instances to be managed from a single pane of glass. Each VCF instance that is part of a Federation has visibility into the other Cloud Foundation member instances, their workload domains, and their used and available resources: CPU, memory, and storage. In addition to individual instance resources, each Federation member also sees an aggregated pool of resources available across the Federation. This helps a VCF admin at one site make informed decisions about workload placement by looking at resource utilization across the Federation.

VCF Federation supports both standard and consolidated deployment architectures.

The first VCF instance to join the Federation becomes the Federation Controller. Additional Controller instances can be added for resiliency. Three Controller instances are required to make up a quorum for HA, and a maximum of three Controller instances is currently supported. When selecting Controller instances, it is good practice to choose them from different availability zones (AZs). An AZ can be defined at the rack level, at the datacenter level, or by geographical location. Selecting two Controller instances from the same AZ gives no protection against the failure of that AZ.
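To make the quorum and AZ reasoning concrete, here is a minimal sketch. It assumes a simple majority quorum (which is how Zookeeper ensembles behave); the function names and the "stacked" placement are my own illustrations and not part of any VCF API.

```python
from collections import Counter
from typing import Dict


def quorum_size(controllers: int) -> int:
    """Smallest strict majority of a controller cluster."""
    return controllers // 2 + 1


def tolerated_failures(controllers: int) -> int:
    """Number of controllers that can fail while a majority survives."""
    return controllers - quorum_size(controllers)


def survives_any_az_failure(controller_azs: Dict[str, str]) -> bool:
    """True if losing any single AZ still leaves a majority of controllers."""
    total = len(controller_azs)
    per_az = Counter(controller_azs.values())
    return all(total - lost >= quorum_size(total) for lost in per_az.values())


# One controller per AZ, as in the lab used in this post.
spread = {"VCF01": "AZ01", "VCF02": "AZ02", "VCF03": "AZ03"}
# Hypothetical placement with two controllers in the same AZ.
stacked = {"VCF01": "AZ01", "VCF02": "AZ01", "VCF03": "AZ02"}

print(quorum_size(3), tolerated_failures(3))
print(survives_any_az_failure(spread), survives_any_az_failure(stacked))
```

With three controllers the majority is two, so one controller can be lost. Placing two controllers in one AZ means that a failure of that AZ takes out the majority.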

Only controller instances can:

  • Invite other member instances to a federation
  • Dismantle a Federation.

The Controller instances also run a Kafka message broker. A three-Controller cluster in the Federation makes up a 3-node Kafka broker cluster, which provides persistent log storage for the data generated by the Federation member instances. Each Federation member is a Kafka producer as well as a Kafka consumer. Members publish data as events to topics created on the Kafka broker cluster. Topics have a replication factor of 3, which provides resiliency against a Controller instance failure. Topics are divided into partitions, which is where the events (data) actually reside. A leader broker is elected for each partition using a round-robin algorithm; this leader is responsible for ensuring that the replication factor is met for the data in that partition. Federation members subscribe to these topics, consume the messages (data), and update their Federation dashboards.
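The round-robin leader placement and replication factor described above can be sketched as follows. This is a deliberately simplified model with hypothetical function names: real Kafka starts the rotation at a random broker and elects new leaders from the in-sync replica set, neither of which is modelled here.

```python
from typing import Dict, List


def assign_partitions(brokers: List[int], partitions: int,
                      replication_factor: int) -> Dict[int, dict]:
    """Spread partition leaders over brokers in round-robin order."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the broker count")
    layout = {}
    for p in range(partitions):
        # The first replica is the leader; the rest are followers.
        replicas = [brokers[(p + i) % len(brokers)]
                    for i in range(replication_factor)]
        layout[p] = {"leader": replicas[0], "replicas": replicas}
    return layout


def leader_after_failure(entry: dict, failed_broker: int) -> int:
    """Pick the first surviving replica as the new leader (simplified)."""
    survivors = [b for b in entry["replicas"] if b != failed_broker]
    if not survivors:
        raise RuntimeError("no replica left to take over the partition")
    return survivors[0]


# Three brokers, one per Controller instance.
layout = assign_partitions([1, 2, 3], partitions=3, replication_factor=3)
for partition, entry in layout.items():
    print(partition, entry)
print("new leader for partition 0:", leader_after_failure(layout[0], 1))
```

Because every partition has a replica on all three brokers, losing one Controller instance still leaves two copies of the data and a broker that can take over as leader.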

Only the Controller instances run the Kafka broker cluster. All Federation members, including the Controllers, are Kafka producers and consumers. The Controller cluster also runs a Zookeeper ensemble that maintains the Kafka broker configuration and the Kafka cluster quorum. All of this is wrapped up as a system service called ‘Pantheon’. We will discuss this in more detail later.

In this blog post on VCF 4.0 Federation, we will:

  • Deploy the first Controller instance and create a Federation
  • Deploy the second and third Controller instances to make up a quorum for HA
  • Look behind the scenes at how the Kafka messaging system is set up
  • View the aggregated and individual resource availability and utilization dashboards
  • Dismantle the Federation

Federation Instances

For this blog post, I have created three VCF instances across three availability zones. I am using the consolidated architecture to keep the footprint small, with logical tags to represent the geographical boundaries.

  • VCF01 – at Bangalore, India (AZ01), with a 4-node Management WLD
  • VCF02 – at Reading, UK (AZ02), with a 4-node Management WLD
  • VCF03 – at New York City, US (AZ03), with a 4-node Management WLD

Deploying the First Controller Instance and Enabling Federation

Log on to the VCF01 instance and, from the Multi-Instance Management tab, click Create Federation.


Set the Federation Name and a logical Tag as the Member name. Make sure the FQDN matches the SDDC manager instance FQDN.


Wait 5-10 minutes for the controller fabric and the Kafka messaging system to initialize.


The Federation dashboard comes up and displays a summary of workload domains, hosts, resource details, and patch status.


We can switch the dashboard to view the aggregated resource summary across the Federation.


Deploying the Second Controller Instance

Only Controllers can send out invites to other instances to join a Federation. Let's invite VCF02 to join.


Since we are adding VCF02 as a Controller instance, make sure to enable the checkbox. 


An invitation URL will be generated which can be sent to the VCF02 administrator to join.


The VCF02 administrator could also use the Join Federation option, but this requires the unique token that was generated with the invitation, which we can copy at this step.


Let’s use the Join Federation wizard for VCF02.



If successful, we should see both Controller instances showing up in the Federation dashboard. We should also see the aggregated inventory details and resources across the Federation.


Deploying the Third Controller Instance

The procedure is similar to that for VCF02, but this time we can create the invite from either of the existing Controller instances. Let's create it from VCF02. Unlike above, we will use the invitation link to join VCF03 to the Federation.


Clicking the link takes us to the VCF03 Join Federation wizard with the Controller and Access Token fields populated.


If successful, we should see all three Controller instances showing up in the Federation dashboard.


From a Federation member instance, we should be able to log in to other member instances if we have access.


Individual and Aggregated Capacity

From the Federation dashboard of any member instance, we should be able to see the individual resource capacity of each member instance as well as the aggregated capacity. This helps a VCF admin make informed decisions about workload placement, because the inventory and the total and available resources across the Federation are now visible in one place.
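As a rough illustration of how the individual and aggregated views relate, the sketch below sums per-member capacity into a Federation-wide total and picks the member with the most free CPU. The resource names and all figures are made up for the example; they are not taken from my lab or from the SDDC Manager API.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Capacity:
    total: float
    used: float

    @property
    def available(self) -> float:
        return self.total - self.used


def aggregate(members: Dict[str, Dict[str, Capacity]]) -> Dict[str, Capacity]:
    """Sum total and used capacity per resource across all members."""
    totals: Dict[str, Capacity] = {}
    for resources in members.values():
        for name, cap in resources.items():
            agg = totals.setdefault(name, Capacity(0, 0))
            agg.total += cap.total
            agg.used += cap.used
    return totals


# Hypothetical capacity figures for the three instances.
members = {
    "VCF01": {"cpu_ghz": Capacity(200, 80), "memory_gb": Capacity(1024, 512)},
    "VCF02": {"cpu_ghz": Capacity(200, 50), "memory_gb": Capacity(1024, 256)},
    "VCF03": {"cpu_ghz": Capacity(200, 120), "memory_gb": Capacity(1024, 768)},
}

federation = aggregate(members)
best = max(members, key=lambda m: members[m]["cpu_ghz"].available)
print("aggregated CPU available:", federation["cpu_ghz"].available)
print("most free CPU:", best)
```

The aggregated view answers "is there room anywhere in the Federation?", while the per-member view answers "where exactly should the workload go?".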


Kafka Messaging System


As discussed at the beginning of the article, each Controller instance runs the following Kafka components:

  • Kafka Broker – Persistent log storage for the inventory data for the federation member instances
  • Zookeeper – To maintain the cluster quorum as well as store the cluster configuration.
  • Schema Registry – Messages published by each member instance are in Avro format. The Schema Registry component stores the schemas of the Avro messages. It is queried whenever data is written or read.

All member instances in the Federation are Kafka producers and consumers. They all publish their inventory data to four Kafka topics created by the system: fabric, tennant, tennantInstructions and tennantReports. Each topic has a single partition with a replication factor of 3. Each partition has a leader that serves reads and writes. Writes are asynchronous, and the leader takes care of syncing the data to its replica peers to satisfy the replication factor. Once a member instance has persisted its inventory data to a Kafka topic, other member instances can subscribe to this data and update their Federation dashboards. The messages are in Avro format and have a schema attached so that they conform to a defined structure. The Schema Registry component takes care of storing the schemas, and it is queried whenever a message is written or read. The Schema Registry itself is stateless, because the schemas are stored as topics in the Kafka broker cluster.
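The publish and subscribe flow can be sketched with a tiny in-memory stand-in for a single-partition topic. This is not the Kafka client API and not how Pantheon is implemented: the class, the required-field check (a crude substitute for an Avro schema), and the event fields are all hypothetical. Only the topic name fabric comes from the system.

```python
from typing import Dict, Iterable, List


class TopicLog:
    """Append-only, single-partition log with per-consumer offsets."""

    def __init__(self, name: str, required_fields: Iterable[str]) -> None:
        self.name = name
        self.required_fields = set(required_fields)
        self.records: List[dict] = []
        self.offsets: Dict[str, int] = {}  # consumer name -> next offset

    def publish(self, producer: str, event: dict) -> int:
        """Validate the event, append it, and return its offset."""
        missing = self.required_fields - set(event)
        if missing:
            raise ValueError(f"event is missing fields: {sorted(missing)}")
        self.records.append({"producer": producer, **event})
        return len(self.records) - 1

    def poll(self, consumer: str) -> List[dict]:
        """Return the records this consumer has not seen yet."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:]
        self.offsets[consumer] = len(self.records)
        return batch


# Hypothetical event shape; the real schemas live in the Schema Registry.
fabric = TopicLog("fabric", required_fields=["domains", "hosts"])
fabric.publish("VCF01", {"domains": 1, "hosts": 4})

first = fabric.poll("VCF02")   # VCF02 sees the event published by VCF01
second = fabric.poll("VCF02")  # nothing new since the last poll
print(first, second)
```

Each consumer tracks its own offset, so every member reads the full stream independently and can rebuild its dashboard from the persisted log.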

This entire process runs under a service called ‘pantheon’. Each member in the Federation runs this service, and it must be active for the member to be part of the Federation.


We can use the Confluent CLI located under /opt/vmware/vcf/pantheon/confluent/bin to query the status of the Kafka components as well as the service-specific logs.


To get the path of the data and logs of the services managed by the current confluent run:


Pantheon uses RocksDB as its persistent state store, which is located at /opt/vmware/vcf/pantheon/data/kafka-streams.


Pantheon logs are located at /var/log/vmware/vcf/pantheon/


We can use the kafka-topics CLI under the Confluent home directory to list and describe the topics.


As discussed earlier, we can see a replication factor of 3 for the topics.

Now let’s use the Zookeeper CLI to gather information about the Kafka cluster and brokers. For this, I have downloaded and extracted the Zookeeper standalone binaries to /home/vcf.


Using zkCli, we can see the Kafka broker endpoints, the cluster ID, the topics, and so on, as shown below:


Dismantling a Federation

A Federation can be dismantled only once all the other member instances have left the Federation. Also, dismantling can be performed only from a Controller instance.
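The two rules above can be captured in a small model. This is only my own sketch of the rules as described in this post, with a hypothetical class and method names; it is not how SDDC Manager implements the check.

```python
from typing import Dict


class Federation:
    """Toy model of Federation membership and the dismantle rules."""

    def __init__(self, name: str, first_controller: str) -> None:
        self.name = name
        # member name -> True if the member is a Controller instance
        self.members: Dict[str, bool] = {first_controller: True}
        self.dismantled = False

    def join(self, member: str, controller: bool = False) -> None:
        self.members[member] = controller

    def leave(self, member: str) -> None:
        del self.members[member]

    def can_dismantle(self, requester: str) -> bool:
        """Only the last remaining member, if it is a Controller, may dismantle."""
        return list(self.members.items()) == [(requester, True)]

    def dismantle(self, requester: str) -> None:
        if not self.can_dismantle(requester):
            raise PermissionError(
                "all other members must leave, and the requester "
                "must be a Controller instance")
        self.members.clear()
        self.dismantled = True


fed = Federation("demo-federation", "VCF01")  # hypothetical Federation name
fed.join("VCF02", controller=True)
fed.join("VCF03", controller=True)

blocked = fed.can_dismantle("VCF03")  # other members are still present
fed.leave("VCF01")
fed.leave("VCF02")
fed.dismantle("VCF03")                # VCF03 is now the last Controller
print(blocked, fed.dismantled)
```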

Let’s remove VCF01 from the federation.


From the last Controller instance in the Federation, we can now dismantle the Federation.


VCF Federation is now successfully dismantled and this marks the end of this blog post.

I hope the article was informative.

Thanks for reading
