Growth Beyond Minimal Viable Platform

Organic growth from the Minimal Viable Platform (MVP) to a more complex system is expected and encouraged. The following sections offer recommendations for transforming the platform to higher levels of fault-tolerant architecture.

Single Cluster Stretched Across Three Racks

Diagram: a single cluster stretched across three racks.

This second-stage design (MVP2) improves on the capacity of the system and the resiliency of its storage.

Key items that change are:

  • Two more physical racks are added, bringing the starter kit system into a three-rack, three-availability-zone (AZ) configuration.
  • The maximum number of hosts for this architecture increases from 64 to 192, raising the maximum number of powered-on VMs to 40,000.

Items unchanged compared to MVP:

  • Management functions do not change location.

Transforming to a three-rack configuration aligns several HA features from different parts of the system. The pair of hosts in each rack can be designated a vSAN fault domain. When a vSAN cluster spans multiple racks or blade server chassis, fault domains protect shared storage against rack or chassis failure. Adding one or more hosts to each fault domain further increases the high availability of shared storage, which is a key contributor to improved platform resiliency and a significant improvement over the single-rack MVP.
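
Fault domain assignment can also be automated through the vSphere API. The following is a minimal pyVmomi sketch, not an official procedure: the vCenter address, credentials, inventory path, rack-encoding host names, and fault domain names are all assumptions for illustration.

    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim
    import ssl

    # Assumed connection details -- replace with values for your environment.
    si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                      pwd='secret', sslContext=ssl._create_unverified_context())
    cluster = si.RetrieveContent().searchIndex.FindByInventoryPath(
        'datacenter/host/stretched-cluster')  # assumed inventory path

    # Build one vSAN host config entry per host, naming its fault domain after
    # its rack. Assumes host names encode the rack, e.g. esx-rack-1-01.example.com.
    vsan_specs = []
    for host in cluster.host:
        rack = next((r for r in ('rack-1', 'rack-2', 'rack-3') if r in host.name), None)
        if rack:
            vsan_specs.append(vim.vsan.host.ConfigInfo(
                hostSystem=host,
                faultDomainInfo=vim.vsan.host.ConfigInfo.FaultDomainInfo(name=rack)))

    # A single cluster reconfigure assigns every host to its rack-aligned fault domain.
    WaitForTask(cluster.ReconfigureComputeResource_Task(
        spec=vim.cluster.ConfigSpecEx(vsanHostConfigSpec=vsan_specs), modify=True))
    Disconnect(si)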

vSphere Host Groups are defined to align with the three racks of hosts, which then comprise three Availability Zones. These Availability Zones are then used by TAS for VMs and TKGI to improve resiliency. The Host Groups feature pins application containers' resource usage to a cluster-aligned aggregation of compute and memory without requiring you to manually balance the entire cluster. This is akin to using Resource Pools in the cluster, but without sub-segmenting the available compute and memory using shares.
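
Host group creation can be scripted the same way. This sketch continues the one above, reusing its si connection and cluster lookup; the group names and rack-based host selection are again illustrative. Each BOSH AZ definition can then reference one of these host group names.

    from pyVim.task import WaitForTask
    from pyVmomi import vim

    # cluster is the object located in the previous sketch.
    group_specs = []
    for rack in ('rack-1', 'rack-2', 'rack-3'):
        members = [h for h in cluster.host if rack in h.name]
        group_specs.append(vim.cluster.GroupSpec(
            operation='add',
            info=vim.cluster.HostGroup(name=rack + '-hosts', host=members)))

    # Create all three host groups in one reconfigure; each becomes the anchor
    # for one Availability Zone.
    WaitForTask(cluster.ReconfigureComputeResource_Task(
        spec=vim.cluster.ConfigSpecEx(groupSpec=group_specs), modify=True))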

If the intent is to dedicate the entire installation to application containers, then Host Groups are less valuable because you can simply designate the entire cluster as an AZ. However, if you want to blend application containers with any other kind of workload, or even run two different application container technologies at once, Host Groups segment capacity well without the constraint of resource shares.

Migration from the single-rack architecture to three racks is easy to complete with TAS for VMs: modify the vSphere host group definitions and perform a bosh recreate operation. The PaaS deploys all non-singleton components evenly across all AZs as long as the instance count of a given component is evenly divisible by the number of AZs. Review and adjust the number of non-singleton component instances to achieve the resiliency you need. For example, review the number of Gorouters you are running and use a count evenly divisible by the number of AZs you have defined, three in this model.
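
The even-spread rule is simple arithmetic, and a few lines of Python make it concrete. The component names and counts below are illustrative, not recommendations:

    # Instance counts to sanity-check against the number of AZs.
    components = {'gorouter': 4, 'diego_cell': 9, 'doppler': 6}
    azs = 3

    for name, count in components.items():
        base, leftover = divmod(count, azs)
        if leftover:
            print(f'{name}: {count} instances spread unevenly '
                  f'({leftover} AZ(s) get {base + 1}, the rest get {base}); '
                  f'consider {count + azs - leftover} instead')
        else:
            print(f'{name}: {count} instances spread evenly ({base} per AZ)')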

Multiple Cluster Design

Diagram: TA4V multiple cluster design.

This third-stage design (MVP3) is production-ready, with ample high availability and redundant capacity for full-time enterprise use. This size is appropriate for combined TKGI and TAS for VMs deployments. If required, you may use host groups (or Resource Pools, with or without host groups) to organize components within each cluster's capacity. There is no practical reason to scale horizontally (more clusters/AZs) beyond this model, as the upper limits of aggregate compute and memory far exceed the capability of the software components that run on it. If true exascale capacity is desired, a better approach is to deploy three of these models in parallel rather than one larger one.

Key items that change are:

  • High Availability (HA) of the IaaS layer is significantly improved with the addition of new vSAN domains and increased host capacity.
  • High Availability (HA) of the PaaS layer is improved with a separation of management functions from applications and a stronger alignment of PaaS HA (AZs) to IaaS HA constructs, including vSphere HA and vSAN fault tolerance.
  • A new cluster is deployed strictly for management functions, isolating them from resource use/abuse in the application clusters and also isolating the HA that supports them.
  • A second and third application cluster are deployed (for a total of four clusters, including the management cluster), resulting in three clusters dedicated to applications.
  • All non-singleton PaaS components are deployed in multiples of three to take advantage of three Availability Zones. HA features of both the IaaS and PaaS are aligned and operating in the AZs.

Migrating directly from the MVP1 model to this multi-cluster architecture will be a challenge, because all management VMs from vSphere, NSX-T Data Center, and the PaaS must move to a dedicated cluster rather than blend in with the application containers running on TKGI and TAS for VMs. A fresh installation of at least the PaaS layer makes this model the easiest to install, as it gives you the opportunity to place management components in the management cluster and evacuate any pre-existing clusters of everything other than application components.

Migrating from the Single Cluster/Three Racks (MVP2) model to this one also involves relocating management functions from the application cluster to the new management cluster. This can be accomplished as long as networks and storage are shared between the current and the new management cluster: build the new management cluster so that you can use vMotion and Storage vMotion to migrate objects to it. Alternatively, consolidate all management functions onto a single rack's hosts before adding new capacity. Either way, the goal is to evacuate all management functions from the racks targeted for applications in favor of placement on the new management cluster. How best to accomplish this depends on the architecture of your current environment.
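
If scripting suits your environment, a combined vMotion and Storage vMotion can be driven with a single relocate call. This is a minimal pyVmomi sketch, assuming shared networks between clusters; the VM name, host and datastore names, and inventory paths are hypothetical, and real code should handle errors and task states.

    from pyVim.connect import SmartConnect, Disconnect
    from pyVim.task import WaitForTask
    from pyVmomi import vim
    import ssl

    si = SmartConnect(host='vcenter.example.com', user='administrator@vsphere.local',
                      pwd='secret', sslContext=ssl._create_unverified_context())
    search = si.RetrieveContent().searchIndex

    # Hypothetical inventory objects -- substitute your own.
    vm = search.FindByInventoryPath('datacenter/vm/ops-manager')
    dest_host = search.FindByInventoryPath('datacenter/host/mgmt-cluster/esx-m01.example.com')
    dest_ds = next(ds for ds in dest_host.datastore if ds.name == 'mgmt-vsan')

    # One RelocateSpec with host, pool, and datastore moves compute and storage together.
    spec = vim.vm.RelocateSpec(
        host=dest_host,
        pool=dest_host.parent.resourcePool,  # root resource pool of the management cluster
        datastore=dest_ds)
    WaitForTask(vm.RelocateVM_Task(spec=spec))
    Disconnect(si)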

TKGI Multiple Cluster Design

Diagram: TKGI multiple cluster design.