vSphere Reference Architecture

This topic describes a reference architecture for Ops Manager and any runtime products, including VMware Tanzu Application Service for VMs (TAS for VMs) and Tanzu Kubernetes Grid Integrated Edition (TKGI), on vSphere. It builds on the common base architectures described in Platform Architecture and Planning.

For additional requirements and installation instructions for Ops Manager on vSphere, see Installing Ops Manager on vSphere.

Overview

The vSphere reference architecture for the VMware Tanzu Application Service for VMs (TAS for VMs) and Tanzu Kubernetes Grid Integrated Edition (TKGI) runtimes is based on software-defined networking (SDN) infrastructure. vSphere offers NSX-T and NSX-V to support SDN infrastructure.

Ops Manager supports these configurations for vSphere deployments:

  • TAS for VMs on vSphere with NSX-T

  • TAS for VMs on vSphere with NSX-V

  • TAS for VMs on vSphere without NSX

  • TKGI on vSphere with NSX-T

  • TKGI on vSphere without NSX-T

TAS for VMs on vSphere with NSX-T

These sections describe the reference architecture for TAS for VMs on vSphere with NSX-T deployments. They also provide requirements and recommendations for deploying TAS for VMs on vSphere with NSX-T, such as network, load balancing, and storage capacity requirements and recommendations.

TAS for VMs on vSphere with NSX-T supports the following SDN features:

  • Virtualized, encapsulated networks and encapsulated broadcast domains

  • VLAN exhaustion avoidance with the use of virtualized Logical Networks

  • DNAT/SNAT services to create separate, non-routable network spaces for the TAS for VMs installation

  • Load balancing services to pass traffic through Layer 4 to pools of platform routers at Layer 7

  • SSL termination at the load balancer at Layer 7 with the option to forward on at Layer 4 or 7 with unique certificates

  • Virtual, distributed routing and firewall services native to the hypervisor

Architecture

The diagram below illustrates the reference architecture for TAS for VMs on vSphere with NSX-T deployments:

TAS for VMs deployments with NSX-T are deployed with three clusters and three availability zones (AZs).

An NSX-T Tier-0 router is on the front end of the TAS for VMs deployment. This router is a central logical router into the TAS for VMs platform. You can configure static or dynamic routing using BGP from the routed IP backbone through the Tier-0 router, which runs on the NSX-T Edge gateway.

Several Tier-1 routers, such as the router for the TAS for VMs and infrastructure subnets, connect to the Tier-0 router.
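
The following conceptual sketch illustrates the BGP peering described above. This is not NSX-T's native configuration format, and the AS numbers and neighbor addresses are illustrative:

    # Conceptual sketch of Tier-0 BGP peering (illustrative values,
    # not NSX-T's native configuration format):
    tier0:
      bgp:
        local_as: 65001            # AS number assigned to the Tier-0 router
        neighbors:
        - address: 203.0.113.1     # upstream router on the routed IP backbone
          remote_as: 65000
        - address: 203.0.113.2     # second peer for redundancy
          remote_as: 65000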

NSX-T Container Plugin Requirement

TAS for VMs deployments require the VMware NSX-T Container Plugin for Ops Manager to enable the SDN features available through NSX-T.

The NSX-T Container Plugin enables a container networking stack and integrates with NSX-T.

Note: To use NSX-T with TAS for VMs, the NSX-T Container Plugin must be installed, configured, and deployed at the same time as the TAS for VMs tile. To download the NSX-T Container Plugin, go to the VMware NSX-T Container Plug-in page on VMware Tanzu Network.

Networking

These sections describe networking requirements and recommendations for TAS for VMs on vSphere with NSX-T deployments.

Routable IPs

The Tier-0 router must have routable external IP address space to advertise on the BGP network with its peers. Select a network range for the Tier-0 router with enough space so that the network can be divided between these two uses (a hypothetical split is sketched after this list):

  • Routing incoming and outgoing traffic
  • DNATs and SNATs, load balancer VIPs, and other Ops Manager components
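
For illustration, a single routable /24 on the Tier-0 router might be divided between these two uses. The addresses and key names below are hypothetical:

    # Hypothetical split of a routable /24 advertised by the Tier-0 router:
    routable_block:  10.10.10.0/24
    routing:         10.10.10.0/25    # incoming and outgoing traffic
    nat_and_vips:    10.10.10.128/25  # DNATs/SNATs, load balancer VIPs,
                                      # and other Ops Manager components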

Note: Compared to NSX-V, NSX-T consumes much more address space for SNATs.

DNS

TAS for VMs requires a system domain, app domain, and several wildcard domains.

For more information about DNS requirements for TAS for VMs, see Domain Names in Platform Planning and Architecture.

Load Balancing

The load balancing requirements and recommendations for TAS for VMs on vSphere with NSX-T deployments are:

  • You must configure NSX-T load balancers for the Gorouters.

    • The domains for the TAS for VMs system and apps must resolve to the load balancer VIP.
    • You must assign either a private or a public IP address to the domains for the TAS for VMs system and apps.
  • VMware recommends that you configure Layer 4 NSX-T load balancers for the Gorouters. With Layer 4 load balancers, traffic passes through the load balancers and SSL is terminated at the Gorouters. This approach reduces processing overhead.

    Note: It is possible to use Layer 7 load balancers and terminate SSL at the load balancers. However, VMware does not recommend this approach because it adds processing overhead.

  • Any TCP Gorouters and SSH Proxies within the platform also require NSX-T load balancers.

  • Layer 4 and Layer 7 NSX-T load balancers are created automatically during app deployment.

Networking, Subnets, and IP Spacing

The requirements and recommendations related to networks, subnets, and IP spacing for TAS for VMs on vSphere with NSX-T deployments are:

  • TAS for VMs requires statically-defined networks to host TAS for VMs component VMs.

  • The client side of an NSX-T deployment uses a series of non-routable address blocks when using DNAT/SNAT at the Tier-0 interface.

  • The reference architecture for TAS for VMs on vSphere with NSX-T deployments uses a pattern in which all networks are allocated on /24 (8-bit) boundaries, with the network octet incrementing sequentially, as sketched after this list.

  • NSX-T dynamically assigns TAS for VMs org networks and adds a Tier-1 router. These org networks are automatically instantiated based on a non-overlapping block of address space. You can configure the block of address space in the NCP Configuration section of the NSX-T tile in Ops Manager. The default is /24. This means that every org in TAS for VMs is assigned a new /24 network.
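
As a sketch of the sequential /24 pattern described above, a BOSH cloud config might define the infrastructure and TAS for VMs networks as follows. This is a minimal example; the network names, logical switch names, and address ranges are illustrative:

    # Sketch of BOSH cloud config networks on sequential /24 boundaries:
    networks:
    - name: infrastructure
      type: manual
      subnets:
      - range: 192.168.1.0/24
        gateway: 192.168.1.1
        azs: [az1, az2, az3]
        reserved: [192.168.1.1-192.168.1.9]
        cloud_properties:
          name: infra-ls           # NSX-T logical switch backing this network
    - name: tas
      type: manual
      subnets:
      - range: 192.168.2.0/24      # the next sequential /24
        gateway: 192.168.2.1
        azs: [az1, az2, az3]
        reserved: [192.168.2.1-192.168.2.9]
        cloud_properties:
          name: tas-ls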

For more information about TAS for VMs subnets, see Required Subnets in Platform Architecture and Planning Overview.

High Availability

For information about high availability (HA) requirements and recommendations for TAS for VMs on vSphere, see High Availability in Platform Architecture and Planning Overview.

Shared Storage

Ops Manager requires shared storage. You can allocate networked storage to the host clusters following one of two common approaches: horizontal or vertical. The approach you follow reflects how your data center arranges its storage and host blocks in its physical layout.

Horizontal Shared Storage

With the horizontal shared storage approach, you grant all hosts access to all datastores and assign a subset to each Ops Manager installation.

For example, with six datastores ds01 through ds06, you grant all nine hosts (three clusters of three hosts) access to all six datastores. You then provision your first Ops Manager installation to use stores ds01 through ds03 and your second Ops Manager installation to use ds04 through ds06.

Vertical Shared Storage

With the vertical shared storage approach, you grant each cluster its own datastores, creating a cluster-aligned storage strategy. vSphere VSAN is an example of this architecture.

For example, with six datastores ds01 through ds06, you assign datastores ds01 and ds02 to a cluster, ds03 and ds04 to a second cluster, and ds05 and ds06 to a third cluster. You then provision your first Ops Manager installation to use ds01, ds03, and ds05, and your second Ops Manager installation to use ds02, ds04, and ds06.

With this arrangement, all VMs in the same installation and cluster share a dedicated datastore.
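
The two layouts can be summarized as follows. This is an illustrative mapping, not a product configuration format:

    # Horizontal: all hosts access all datastores; each installation uses a subset.
    horizontal:
      installation-1: [ds01, ds02, ds03]
      installation-2: [ds04, ds05, ds06]
    # Vertical: each cluster owns two datastores; installations span clusters.
    vertical:
      cluster-1: {installation-1: ds01, installation-2: ds02}
      cluster-2: {installation-1: ds03, installation-2: ds04}
      cluster-3: {installation-1: ds05, installation-2: ds06}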

Storage Capacity

VMware recommends these storage capacity allocations for production and non-production TAS for VMs environments:

  • Production environments: Configure at least 8 TB of data storage. You can configure this as either one 8 TB store or a number of smaller volumes that sum to 8 TB. Heavily used environments may require significantly more storage to accommodate new code and buildpacks.

  • Non-production environments: Configure 4 to 6 TB of data storage.

Note: The latest versions of Ops Manager validated for this reference architecture do not support vSphere Storage Clusters. List datastores in the vSphere tile by their native names, not by the name that vCenter creates for the storage cluster.

Note: If a datastore is part of a vSphere Storage Cluster that uses Storage DRS (sDRS), you must disable Storage vMotion on any datastores used by Ops Manager. Otherwise, Storage vMotion activity can rename independent disks and cause BOSH to malfunction. For more information, see Migrating Ops Manager to a New Datastore in vSphere.

For more information about general storage requirements and recommendations for TAS for VMs, see Storage in Platform Architecture and Planning Overview.

SQL Server

An internal MySQL database is sufficient for use in production environments.

However, an external database provides more control over database management for large environments that require multiple data centers.

For information about configuring system databases on TAS for VMs, see Configure System Databases in Configuring TAS for VMs.

Security

For information about security requirements and recommendations for TAS for VMs deployments, see Security in Platform Architecture and Planning Overview.

Blobstore Storage

VMware recommends that you use these blobstore storages for production and non-production TAS for VMs environments:

  • Production environments: Use an S3 storage appliance as the blobstore (see the sketch after this list).
  • Non-production environments: Use an NFS/WebDAV blobstore.
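
As an illustration, an external S3-compatible blobstore requires an endpoint, credentials, and buckets for buildpacks, droplets, packages, and resources. The sketch below uses hypothetical key names; you set the actual fields in the File Storage pane of the TAS for VMs tile:

    # Hypothetical sketch of S3 blobstore settings (key names illustrative):
    s3_blobstore:
      endpoint: https://s3.example.com        # S3-compatible storage appliance
      access_key_id: ((s3_access_key))
      secret_access_key: ((s3_secret_key))
      buckets:
        buildpacks: tas-buildpacks
        droplets: tas-droplets
        packages: tas-packages
        resources: tas-resources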

Note: In non-production environments, an NFS/WebDAV blobstore can be the primary consumer of storage, and it must be actively maintained. TAS for VMs deployments that use an NFS/WebDAV blobstore experience downtime during events such as storage upgrades or migrations to new disks.

For more information about blobstore storage requirements and recommendations, see Configure File Storage in Configuring TAS for VMs for Upgrades.

TAS for VMs on vSphere with NSX-V

These sections describe the reference architecture for TAS for VMs on vSphere with NSX-V deployments. They also provide requirements and recommendations for deploying TAS for VMs on vSphere with NSX-V, such as network, load balancing, and storage capacity requirements and recommendations.

TAS for VMs on vSphere with NSX-V enables services provided by NSX on the TAS for VMs platform, such as an Edge Services Gateway (ESG), load balancers, firewall services, and NAT/SNAT services.

Architecture

The diagram below illustrates the reference architecture for TAS for VMs on vSphere with NSX-V deployments.

TAS for VMs deployments with NSX-V are deployed with three clusters and three AZs.

TAS for VMs deployments with NSX-V also include an NSX-V Edge router on the front end. You can install the NSX-V Edge router as an ESG or as a distributed logical router (DLR).

The Edge router is a central logical router into the TAS for VMs platform. You can configure VLAN routing from the routed backbone into NSX-V through the Edge router.

Compared to NSX-T architecture, NSX-V architecture does not use Tier-1 routers to connect the central router to the various subnets for the TAS for VMs deployment.

For more information about using ESG on vSphere, see Using Edge Services Gateway on VMware NSX.

Networking

These sections describe networking requirements and recommendations for TAS for VMs on vSphere with NSX-V deployments.

Routable IPs

You must assign routable external IPs on the server side, such as routable IPs for NATs and load balancers, to the Edge router.

DNS

TAS for VMs requires a system domain, app domain, and several wildcard domains.

For more information about DNS requirements for TAS for VMs, see Domain Names in Platform Planning and Architecture.

Load Balancing

The load balancing requirements and recommendations for TAS for VMs on vSphere with NSX-V deployments are:

  • NSX-V includes an Edge router that you can deploy as an ESG. The ESG provides load balancing and is configured to route to the TAS for VMs platform.

  • VMware recommends that you configure Layer 4 NSX-V load balancers for the Gorouters. With Layer 4 load balancers, traffic passes through the load balancers and SSL is terminated at the Gorouters. This approach reduces processing overhead.

    Note: It is possible to use Layer 7 load balancers and terminate SSL at the load balancers. However, VMware does not recommend this approach because it adds processing overhead.

  • The domains for the TAS for VMs system and apps must resolve to the load balancer. You must assign either a private or a public IP address to the domains for the TAS for VMs system and apps.

  • Any TCP routers and SSH Proxies also require NSX-V load balancers.

  • VMware recommends that you configure external load balancers in front of the Edge router. For example, you can configure an F5 external load balancer.

Networks, Subnets, and IP Spacing

For information about network, subnet, and IP space planning requirements and recommendations, see Required Subnets in Platform Architecture and Planning Overview.

High Availability

For information about HA requirements and recommendations for TAS for VMs on vSphere, see High Availability in Platform Architecture and Planning Overview.

Shared Storage

TAS for VMs requires shared storage. You can allocate networked storage to the host clusters following one of two common approaches: horizontal or vertical. The approach you follow reflects how your data center arranges its storage and host blocks in its physical layout.

For information about horizontal and vertical shared storage, see Shared Storage.

Storage Capacity

VMware recommends these storage capacity allocations for production and non-production TAS for VMs environments:

  • Production environments: Configure at least 8 TB of data storage. You can configure this as either one 8 TB store or a number of smaller volumes that sum to 8 TB. Heavily used environments may require significantly more storage to accommodate new code and buildpacks.

  • Non-production environments: Configure 4 to 6 TB of data storage.

Note: The latest versions of Ops Manager validated for this reference architecture do not support vSphere Storage Clusters. List datastores in the vSphere tile by their native names, not by the name that vCenter creates for the storage cluster.

Note: If a datastore is part of a vSphere Storage Cluster that uses Storage DRS (sDRS), you must disable Storage vMotion on any datastores used by Ops Manager. Otherwise, Storage vMotion activity can rename independent disks and cause BOSH to malfunction. For more information, see How to Migrate Ops Manager to a New Datastore in vSphere.

For more information about general storage requirements and recommendations for TAS for VMs, see Storage in Platform Architecture and Planning Overview.

SQL Server

An internal MySQL database is sufficient for use in production environments.

However, an external database provides more control over database management for large environments that require multiple data centers.

For information about configuring system databases on TAS for VMs, see Configure System Databases in Configuring TAS for VMs.

Security

For information about security requirements and recommendations for TAS for VMs on vSphere deployments, see Security in Platform Architecture and Planning Overview.

Blobstore Storage

VMware recommends that you use these blobstore storages for production and non-production TAS for VMs environments:

  • Production environments: Use an S3 storage appliance as the blobstore.
  • Non-production environments: Use an NFS/WebDAV blobstore.

Note: In non-production environments, an NFS/WebDAV blobstore can be the primary consumer of storage, and it must be actively maintained. TAS for VMs deployments that use an NFS/WebDAV blobstore experience downtime during events such as storage upgrades or migrations to new disks.

For more information about blobstore storage requirements and recommendations, see Configure File Storage in Configuring TAS for VMs for Upgrades.

TAS for VMs on vSphere without NSX

These sections describe the architecture for TAS for VMs on vSphere without software-defined networking deployments.

Note: This architecture was validated for earlier versions of TAS for VMs. However, it has not been validated for TAS for VMs v2.10.

Networking

Without an SDN, IP allocations all come from routed network space. Discussions and planning within your organization are essential to acquiring the necessary amount of IP space for a TAS for VMs deployment with future growth considerations. This is because routed IP address space is a premium resource, and adding more later is difficult, costly, and time-consuming.

Below is a best-guess layout for IP space utilization in a single TAS for VMs deployment:

  • Infrastructure - /28 (16 addresses)

  • TAS for VMs deployment - /23 (512 addresses)
    This size depends almost entirely on the estimated container capacity. It can be smaller, but VMware does not recommend using a larger size for a single deployment.

  • Services - /23 (512 addresses)
    This size depends almost entirely on the estimated service capacity. Resize as necessary.

Isolation Segments

Isolation segments can help satisfy IP address space needs in a routed network design. You can build smaller groups of Gorouters and Diego Cells aligned to a particular service, and smaller groups use less IP address space.

TKGI on vSphere with NSX-T

These sections describe the reference architecture for TKGI on vSphere with NSX-T deployments. They also provide requirements and recommendations for deploying TKGI on vSphere with NSX-T, such as network, load balancing, and storage capacity requirements and recommendations.

Architecture

The diagram below illustrates the reference architecture for TKGI on vSphere with NSX-T deployments.

TKGI deployments with NSX-T are deployed with three clusters and three AZs.

An NSX-T Tier-0 router is on the front end of the TKGI deployment. This router is a central logical router into the TKGI platform. You can configure static or dynamic routing using BGP from the routed IP backbone through the Tier-0 router.

Several Tier-1 routers, such as the router for the infrastructure subnet, connect to the Tier-0 router. New Tier-1 routers are created on-demand as new clusters and namespaces are added to TKGI.

Note: The TKGI on vSphere with NSX-T architecture supports multiple master nodes for TKGI v1.2 and later.

Networking

These sections describe networking requirements and recommendations for TKGI on vSphere with NSX-T deployments.

Load Balancing

The load balancing requirements and recommendations for TKGI on vSphere with NSX-T deployments are:

  • Use standard NSX-T load balancers. Layer 4 and Layer 7 NSX-T load balancers are created automatically during app deployment.

  • Use both Layer 4 and Layer 7 load balancers:

    • Use Layer 7 load balancers for ingress routing.
    • Use Layer 4 load balancers for LoadBalancer services. This allows you to terminate SSL at the load balancers, which reduces processing overhead.
  • NSX-T provides ingress routing natively. You can also use a third-party service for ingress routing, such as Istio or Nginx. You run the third-party ingress routing service as a container in the cluster.

  • If you use a third-party ingress routing service, you must:

    • Create wildcard DNS entries to point to the service.
    • Define domain information for the ingress routing service in the manifest of the TKGI on vSphere deployment. For example:

      apiVersion: extensions/v1beta1
      kind: Ingress
      metadata:
        name: music-ingress
        namespace: music1
      spec:
        rules:
        - host: music1.pks.domain.com
          http:
            paths:
            - path: /.*
              backend:
                serviceName: music-service
                servicePort: 8080
      
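    After you define the manifest, you can create the ingress resource by applying it with kubectl apply.
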
  • When you push a TKGI on vSphere deployment with a service type set to LoadBalancer, NSX-T automatically creates a new virtual IP address (VIP) for the deployment on the existing load balancer for that namespace. You must specify a listening port and a translation port in the service, a name for tagging, and a protocol. For example:

    apiVersion: v1
    kind: Service
    metadata:
      ...
    spec:
      type: LoadBalancer
      ports:
      - port: 80
        targetPort: 8080
        protocol: TCP
        name: web
    
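    After the service deploys, kubectl get service displays the external IP address that NSX-T allocates for the load balancer.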

Routable IPs

The routable IP requirements and recommendations for TKGI with NSX-T deployments are:

  • Deployments with TKGI NSX-T ingress: VMware recommends a /25 network. The Tier-0 router must have routable external IP address space to advertise on the BGP network with its peers.

    Select a network range for the Tier-0 router with enough space so that the network can be divided between these two uses:

    • Routing incoming and outgoing traffic
    • DNATs and SNATs, load balancer VIPs, and other Ops Manager components

      Note: Compared to vSphere deployments with NSX-V, TKGI on vSphere with NSX-T consumes much more address space for SNATs.

  • Deployments with several load balancers: VMware recommends a /23 network for deployments that use several load balancers. These deployments consume much more address space for load balancer VIPs because Kubernetes service types allocate IP addresses frequently. To accommodate the higher consumption, allow for four times the address space.

Networks, Subnets, and IP Spacing

These considerations and recommendations apply to networks, subnets, and IP spacing for TKGI on vSphere with NSX-T deployments:

  • Allocate a large network block for TKGI clusters and pods:

    • TKGI clusters: Configure a 172.24.0.0/14 network block.
    • TKGI pods: Configure a 172.28.0.0/14 network block.

      NSX-T creates /24 address blocks from these /14 networks by default each time a new cluster or namespace is created. You can configure this CIDR range for TKGI in Ops Manager. See the sketch after this list.
  • When deploying TKGI with Ops Manager, you must allow a block of address space for the dynamic networks that TKGI deploys for each namespace. Keeping this address space distinct makes it easier to trace which dynamically created networks relate to each service.

  • When a new TKGI cluster is created, TKGI creates a new /24 network from TKGI cluster address space.

  • When a new app is deployed, new NSX-T Tier-1 routers are generated and TKGI creates a /24 network from the TKGI pods network.

  • Allocate a large IP block in NSX-T for Kubernetes pods, for example, a /14 network. NSX-T creates /24 address blocks from this block by default. This CIDR range is configurable in Ops Manager.
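
To illustrate the scale of these defaults: each /14 block contains 2^(24-14) = 1,024 /24 subnets, so the default blocks accommodate up to 1,024 clusters and 1,024 namespaces before the address space is exhausted. The key names below are illustrative:

    # Illustrative NSX-T IP blocks from which TKGI carves /24 subnets:
    nodes_ip_block: 172.24.0.0/14   # one /24 per Kubernetes cluster
    pods_ip_block:  172.28.0.0/14   # one /24 per namespace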

For more information, see Networks in Platform Architecture and Planning Overview.

Multi-Tenancy

For TKGI on vSphere with NSX-T deployments, networks are created dynamically for both TKGI clusters and pods.

To accommodate these dynamically-created networks, VMware recommends that you use multiple clusters, rather than a single cluster with multiple namespaces.

Multiple clusters provide additional features such as security, customization on a per-cluster basis, privileged containers, failure domains, and version choice. Namespaces should be used as a naming construct and not as a tenancy construct.

Master Nodes

The TKGI on vSphere with NSX-T architecture supports multiple master nodes for TKGI v1.2 and later.

You can define the number of master nodes per plan in the TKGI tile in Ops Manager. Use an odd number of master nodes so that etcd can form a quorum. For example, a three-node etcd cluster keeps quorum after losing one node, and a five-node cluster after losing two.

VMware recommends that you have at least one master node per AZ for HA and disaster recovery.

High Availability

For information about HA requirements and recommendations, see High Availability in Platform Architecture and Planning Overview.

Storage Capacity

TKGI on vSphere supports static persistent volume provisioning and dynamic persistent volume provisioning.

For more information about storage requirements and recommendations, see PersistentVolume Storage Options on vSphere.

Security

For information about security requirements and recommendations, see Security in Platform Architecture and Planning Overview.

TKGI on vSphere without NSX-T

You can deploy TKGI without NSX-T. To do so, select Flannel as your container network interface in the Networking pane of the TKGI tile.

Select from networks already identified in Ops Manager to deploy the TKGI API and TKGI-provisioned Kubernetes clusters.