vSphere Reference Architecture

This guide provides reference architectures for Pivotal Cloud Foundry (PCF), including Pivotal Application Service (PAS) and Pivotal Container Service (PKS), on vSphere.

Overview

Pivotal validates the reference architectures described in this topic against multiple production-grade usage scenarios.

This document does not replace the installation instructions provided in the PCF on vSphere Requirements topic. Instead, it gives examples of how to apply those instructions to real-world production environments.

The following PCF product versions are validated for this reference architecture:

  • vSphere 6.5
  • VMware NSX-T 2.2
  • Pivotal Cloud Foundry Operations Manager 2.2 or later
  • PAS 2.2 or later
  • PKS 1.1.5 or later

Base vSphere Reference Architecture

This reference architecture includes VMware vSphere and NSX-T, a software-defined network virtualization platform that runs on VMware ESXi hosts.

The reference architecture supports capacity growth at the vSphere and PCF levels as well as installation security through the NSX-T firewall. It allocates a minimum of three servers to each cluster and spreads PCF components across three clusters for high availability (HA). For information about HA in PCF, see High Availability in Cloud Foundry.

To use all features listed in this topic for your PAS installation, you must have Advanced or above licensing from VMware for NSX-T. For PKS, the required licensing is included by default.

If you want to deploy PCF without NSX-T, see vSphere Reference Architecture, which includes NSX-V design considerations.

Non-SDN Deployments

This section provides information about VLAN and container-to-container networking considerations for PCF deployments that do not use software-defined networking (SDN).

VLAN Considerations

Without an SDN environment, you draw network address space and broadcast domains from the greater data center pool. For this approach, you need to deploy PAS and PKS in a particular alignment.

Diagram: PAS without NSX-T

When designing a non-SDN environment, ensure that VLANs and address space for PAS and PKS do not overlap. In addition, note that non-SDN PCF environments have the following characteristics:

  • Load balancing is performed external to the PCF installation.
  • Client SSL termination must happen at the load balancer, at the Gorouters, or both. Ensure that your certificates support your chosen approach.
  • Firewalling is performed external to the PCF installation.

Container-to-Container Networking and CNI Considerations

Changing your container-to-container networking strategy after deployment is possible. However, with NSX-T, you need both the VMware NSX-T Container Plug-In for PCF tile and NSX-T infrastructure. For information about deploying PAS on vSphere with NSX-T internal networking using the VMware NSX-T Container Plug-In for PCF tile, see Deploying PAS with NSX-T Networking.

Migrating from a non-SDN environment to an SDN-enabled solution is possible but best considered as a greenfield deployment. Inserting an SDN layer under an active PCF installation is disruptive.

PAS Without SDN

You can use PAS without an SDN overlay. For more information, see PCF on vSphere Requirements.

PKS Without SDN

NSX-T SDN is included in PKS by default, along with the associated licensing. Alternatively, you can use Flannel, a built-in network stack that handles container networking.

If you want to deploy PKS without NSX-T, select Flannel as your container networking interface in the Networking pane of the PKS tile. For information about configuring your container networking interface, see the Networking section in Installing PKS on vSphere.

In addition, you must define networks for deploying PKS and PKS-provisioned Kubernetes clusters. For information about defining these networks in Ops Manager, see the Create Networks Page section in Configuring Ops Manager on vSphere.

SDN-Enabled Deployments With NSX-T

This section provides information about network requirements, routing, and load balancing for PAS and PKS deployments that use SDN.

Network Requirements for PAS

PAS requires a number of statically defined networks to host its main components. For the tenant side of an NSX-T deployment, define a series of non-routable address blocks for the NSX-T routers to manage, as follows:

Note: These static networks have a smaller host address space assigned than in reference designs for PCF v1.xx.

  • Infrastructure: 192.168.1.0/24
  • Deployment: 192.168.2.0/24
  • Services: 192.168.3.0/24

    The Services network can be used with other Ops Manager tiles that you install in addition to PAS. Some of these tiles may require on-demand network capacity. Pivotal recommends adding a network for each tile that needs on-demand resources and pairing the two in the tile’s configuration. For more information, see OD-Services# below.

  • OD-Services#: 192.168.4.0 - 192.168.9.0 in /24 segments

    For example, the Redis for PCF tile asks for Network and Services Network. The first one is for placing the broker, and the second one is for deploying worker VMs to support the service. In this scenario, you can deploy a new network OD-Services1 and instruct Redis for PCF to use the Services network for the broker and the OD-Services1 network for the workers. The next tile can use the Services network for the broker and a new OD-Services2 network for workers and so on.

  • Isolation Segments: 192.168.10.0 - 192.168.63.0 in /24 segments

    Isolation segments can be added to an existing PAS installation. This range of address space is used when you add one or more segments. A /24 network in this range should be deployed for each new isolation segment. The capacity of this address bank (54 /24 networks of 254 usable addresses each) is sufficient for more than 50 isolation segments, each with more than 250 VMs.

  • PAS Dynamic Orgs: 192.168.128.0/17 = 128 orgs/foundation

    Dynamically assigned Org networks are attached to automatically generated NSX-T Tier-1 routers. Instead of defining these networks in Ops Manager, the operator provides a non-overlapping block of address space. This is configurable in the NCP pane of the VMware NSX-T Container Plug-In for PCF tile in Ops Manager. Every Org receives a new /24 network.

This reference follows the addressing pattern of previous reference architectures. However, all networks now break on the /24 boundary, and the third octet is numerically sequential (1, 2, 3, and so on).
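
Ops Manager ultimately renders these statically defined networks as BOSH subnet definitions. The following is a minimal, illustrative sketch for the Deployment network only; the gateway, reserved range, AZ names, and logical switch name are assumptions rather than values prescribed by this reference.

networks:
- name: pas-deployment
  type: manual
  subnets:
  - range: 192.168.2.0/24
    gateway: 192.168.2.1                  # assumed gateway on the T1 router
    reserved: [192.168.2.1-192.168.2.10]  # assumed range held back from BOSH
    azs: [az1, az2, az3]                  # assumed AZ names defined elsewhere
    cloud_properties:
      name: PAS-Deployment                # assumed NSX-T logical switch name

The Infrastructure, Services, and OD-Services# networks follow the same pattern with their respective /24 ranges.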

Network Requirements for PKS

When deploying PKS through Ops Manager, you must allocate a block of address space for the dynamic networks that PKS deploys per namespace.

Network requirements for PKS are as follows:

  • PKS Clusters: 172.24.0.0/14
  • PKS Pods: 172.28.0.0/14

Summary

The complete addressing strategy for both PAS and PKS is as follows:

  • Infrastructure: 192.168.1.0/24
  • Deployment: 192.168.2.0/24
  • Services: 192.168.3.0/24
  • OD-Services#: 192.168.4.0 - 192.168.9.0 in /24 segments
  • Isolation Segments: 192.168.10.0 - 192.168.63.0 in /24 segments
  • Undefined: 192.168.64.0 - 192.168.127.0
  • PAS Dynamic Orgs: 192.168.128.0/17
  • PKS Clusters: 172.24.0.0/14
  • PKS Pods: 172.28.0.0/14

Network Requirements for External Routing

Routable external IPs on the provider side, for example, for NATs, PAS orgs, and load balancers, are assigned to the T0 router, which is located in front of the PCF installation. There are two approaches to assigning blocks of address space for this purpose:

  • Without PKS or with PKS Ingress: The T0 router needs some routable address space to advertise on the BGP network with its peers. Select a network range with ample address space that can be split into two logical uses: one part is advertised as a route for traffic, and the other is used for aligning T0 DNATs and SNATs, load balancers, and other functions. Note that SNATs consume much more address space than they did with NSX-V.

  • With PKS No Ingress: Compared to the approach above, this approach consumes much more address space for load balancer VIPs. Therefore, allow for four times the address space, because Kubernetes services allocate addresses frequently.

Plan for a /25 of provider routable address space, or a /23 for PKS without ingress (a /23 provides four times the addresses of a /25).

NSX-T handles the routing between a T0 router and any T1 routers associated with it.

PAS with NSX-T

Expanding PAS with SDN features is best considered as a greenfield effort. Inserting an SDN layer under a working PAS installation is non-trivial and likely triggers a rebuild. NSX-T constructs that can be used by PAS include the following:

  • Logical networks (encapsulated broadcast domains)
  • VLAN exhaustion avoidance through the use of logical networks
  • Routing services and NAT/SNAT to network-fence the PCF installation
  • Load balancing services to pool systems such as Gorouters
  • SSL termination at the load balancer
  • Distributed routing and firewall services at the hypervisor

Diagram: PAS with NSX-T

The VMware NSX-T Container Plug-In for PCF tile provides a container networking stack in place of the built-in Silk solution and interlocks with NSX-T already deployed on the IaaS. You cannot use this tile unless NSX-T is already established on the IaaS, and these technologies cannot be retrofitted into an established PAS or PKS installation.

To deploy PAS with NSX-T, you need to provide the following in the VMware NSX-T Container Plug-In for PCF tile:

  • NSX Manager host
  • Username and password
  • NSX Manager CA certificate
  • PAS foundation name, which must match the tag added to the T0 router, external NAT pool, and IP block
  • Subnet prefix, which controls the size of each org network and defaults to /24 (254 usable addresses)

In addition, you must select Enable SNAT for Container Networks in the tile. For more information about configuring PAS with NSX-T networking, see Deploying PAS with NSX-T Networking.

Each new job, such as an isolation segment, is placed on a broadcast domain, or logical switch, connected to a T1 router that acts as the gateway for that network. This approach provides a DNAT and SNAT control point and a firewall boundary.

The Services network is an exception: it shares a T1 router with any associated OD-Services# networks. Because these networks are considered part of the same system, no NATs or firewall rules are needed between them.

Load Balancing for PAS

Without NSX-T, you need to choose a suitable load balancer to send traffic to the Gorouters and other systems. Installations approaching production level typically use external load balancing from hardware appliance vendors or other network-layer solutions.

With NSX-T, load balancing is available in the SDN layer. Each load balancer is a logical entity tied to resources in the Edge Cluster and aligned to the network or networks represented by a T1 router. It functions as a Layer 4 load balancer. SSL termination is available on the NSX-T load balancer. However, Pivotal recommends passing SSL through to your Gorouters.

Common deployments of load balancing in PAS are as follows:

  • HTTP/HTTPS traffic to and from Gorouters
  • TCP traffic to and from TCP routers
  • MySQL proxies
  • Diego Brain

NSX-T load balancers can support many VIPs. Consider deploying one load balancer per network (T1) and one-to-many VIPs on that load balancer per job. Edge Cluster resources are consumed for load balancing. When designing your deployment, consider how many load balancers are needed and how much capacity is available for them.

BOSH can manage the members of the server pools for the NSX-T load balancers using NS Groups.
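
For example, a VM extension in the BOSH cloud config along the following lines can place Gorouter VMs into an NS Group that the load balancer's server pool references. This is a sketch only: the extension and NS Group names are assumptions, and the exact cloud properties available depend on your BOSH vSphere CPI version.

vm_extensions:
- name: gorouter-ns-group
  cloud_properties:
    nsxt:
      ns_groups:
      - pas-gorouter-pool    # assumed NS Group referenced by the NSX-T LB server pool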

PKS with NSX-T

NSX-T SDN is included in PKS by default, along with the associated licensing. To use NSX-T SDN and its dynamic constructs, you need to provide the following information when configuring PKS:

  • NSX Manager host
  • Username and password
  • NSX Manager CA certificate
  • T0 router to connect dynamically created namespace networks to
  • NSX IP block from which to pull networks for pods per namespace
  • NSX IP block from which to pull networks for each new cluster
  • Floating IP pool ID created for externally facing IPs (NATs and VIPs)

For more information about configuring PKS with NSX-T networking, see Installing and Configuring PKS with NSX-T Integration.

Diagram: PKS with NSX-T

New T1 routers are deployed on demand as new namespaces are added to PKS. Because namespaces can grow rapidly, allocate a large Pod IP block, such as a /14, in NSX-T; NSX-T then provisions /24 address blocks from it for new namespaces (a /14 yields 1,024 /24 networks). Reference the Pod IP block in your PKS tile configuration.

PKS v1.1.5 and later supports multiple master nodes. The number of master nodes is defined per plan in the PKS tile. You must use an odd number of master nodes to allow etcd to form a quorum. Pivotal recommends using at least 1 master node per AZ for HA and disaster recovery.

With NSX-T SDN, networks for both PKS clusters and pods are created dynamically. Pivotal recommends using multiple clusters instead of a single cluster with multiple namespaces. Multiple clusters provide security, customization per cluster, privileged containers, failure domains, and version choice.

Ingress Routing and Load Balancing for PKS

This section provides information about ingress routing (Layer 7) and load balancing (Layer 4) for your PKS deployment.

Ingress Routing

NSX-T provides a native ingress router. Third-party options, such as Istio or NGINX, run as containers in the cluster.

Wildcard DNS entries are needed to point at the ingress service, similar to how they point at Gorouters in PAS. Domain information for ingress is defined in the manifest of your Kubernetes deployment. See the example below.

Note: HTTPS with NSX-T-provided ingress is not currently supported.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: music-ingress
  namespace: music1
spec:
  rules:
  - host: music1.pks.domain.com
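    # A wildcard DNS entry (for example, *.pks.domain.com) must resolve to the
    # ingress service VIP for this host to be reachable.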
    http:
      paths:
      - path: /.*
        backend:
          serviceName: music-service
          servicePort: 8080
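
The ingress above routes traffic for music1.pks.domain.com to a backing service named music-service on port 8080. A minimal sketch of such a service is shown below; the pod selector label (app: music) is a hypothetical assumption, not a value defined by PKS or NSX-T.

apiVersion: v1
kind: Service
metadata:
  name: music-service
  namespace: music1
spec:
  selector:
    app: music        # assumed pod label for the application backing the ingress
  ports:
  - port: 8080        # port referenced by servicePort in the ingress
    targetPort: 8080  # container port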

Load Balancing

When pushing a Kubernetes deployment with type set to LoadBalancer, NSX-T automatically creates a new VIP for the deployment on the existing load balancer for that namespace.

You need to specify a listening and translation port in the service, along with a name for tagging. You also specify a protocol to use. See the example below.

apiVersion: v1
kind: Service
metadata:
  ...
spec:
  type: LoadBalancer
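  # NSX-T assigns a VIP for this service on the namespace's existing load balancer.
  # "port" is the listening port on the VIP, "targetPort" is the container port
  # traffic is translated to, and "name" is used for tagging.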
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: web

Storage Design

Shared storage is a requirement for PCF. You can allocate networked storage to the host clusters following one of two common approaches, horizontal or vertical. The approach you follow should reflect how your data center arranges its storage and host blocks in its physical layout.

Highly redundant storage systems, such as hyperconverged systems, are the optimal choice for a fully HA PaaS. In addition, hyperconverged systems scale with capacity growth and are aligned to the AZ strategy of PCF.

  • Vertical: You grant each cluster its own dedicated datastores, creating a cluster-aligned storage strategy. vSphere VSAN is an example of this architecture. The vertical alignment is the preferred choice for PCF because it matches the AZ model of the PaaS. Loss of storage in a cluster constitutes loss of an AZ, which is a recoverable failure when at least two other AZs are operational.

    For example, with six datastores ds01 through ds06, you assign datastores ds01 and ds02 to your first cluster, ds03 and ds04 to your second cluster, and ds05 and ds06 to your third cluster. You then instruct your first PCF installation to use ds01, ds03, and ds05 and your second PCF installation to use ds02, ds04, and ds06. In this arrangement, all VMs in the same installation and cluster share a dedicated datastore. A configuration sketch of this assignment appears after this list.

  • Horizontal: You grant all hosts access to all datastores and assign a subset to each installation. If the horizontal alignment is used, loss of storage can potentially impact all the AZs and cause non-recoverable damage. For this approach, strong redundancy design at the storage array is recommended.

    For example, with six datastores ds01 through ds06, you grant all nine hosts access to all six datastores. You then instruct your first PCF installation to use stores ds01 through ds03 and your second PCF installation to use ds04 through ds06.
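
Under the hood, BOSH constrains a foundation to its assigned datastores through vSphere CPI datastore patterns. The following is a minimal sketch of the relevant director properties for the first installation in the vertical example above; the vCenter address, user, and data center name are placeholders. The horizontal approach uses the same mechanism with a different list of datastores.

properties:
  vcenter:
    address: vcenter.example.com          # placeholder
    user: pcf-service-account             # placeholder
    datacenters:
    - name: dc01                          # placeholder
      datastore_pattern: '^(ds01|ds03|ds05)$'
      persistent_datastore_pattern: '^(ds01|ds03|ds05)$'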

Note: If a datastore is part of a vSphere storage cluster using sDRS (Storage DRS), you must disable the s-vMotion feature on any datastores used by PCF. Otherwise, s-vMotion activity can rename independent disks and cause BOSH to malfunction. For more information, see How to Migrate PCF to a New Datastore in vSphere.

To improve the resiliency of your deployment, you can use separate storage for the PCF management plane (Ops Manager and the BOSH Director), as well as separate storage for the blobstore.

Storage Capacity

Pivotal recommends the following capacity allocation for PAS installations:

  • For production use, at least 8 TB of data storage, either as one 8-TB store or a number of smaller volumes adding up to 8 TB. Frequent development may require significantly more storage to accommodate new code and buildpacks.

  • For small installations, 4-6 TB of data storage is recommended.

The primary consumer of storage is the NFS or WebDAV blobstore.

Note: PCF does not currently support using vSphere storage clusters with the versions of PCF validated for this reference architecture. Datastores should be listed in the vSphere tile by their native name, not the cluster name created by vCenter for the storage cluster.

When planning storage allocation for PKS installations based on Pod and Node needs, consider the following chart.

Chart: PKS storage design

Compute and HA Considerations

In PAS, the consolidation ratio for containers should follow a conservative 4:1 ratio of vCPUs to pCPUs. You can use an even more conservative 2:1 ratio.

A standard Diego cell is sized 4 vCPU x 16 GB RAM (XL) by default. At a 4:1 ratio, this gives you a 16-vCPU budget, or about 12 containers.

If you want to run more containers in a cell, scale the vCPUs accordingly. However, high core count VMs become increasingly hard to schedule on the pCPU without high physical core counts in the socket.

HA considerations include the following:

  • PAS is not considered HA until at least two AZs are defined. Pivotal recommends using three AZs for HA.

    PCF achieves redundancy through the AZ construct and the loss of an AZ is not considered catastrophic. BOSH Resurrector can replace lost VMs as needed to repair a foundation. Foundation backup and restoration is accomplished externally by BOSH Backup and Restore (BBR). For information about BBR, see Backing Up and Restoring Pivotal Cloud Foundry.

  • PKS has no inherent HA capabilities to design for. To support PKS, design HA at the IaaS, storage, power, and access layers.

Scaling and Capacity Management

This section provides information about scaling and capacity management for PCF.

Small PCF Installations

A small-sized PCF foundation looks as follows:

  • 1 vSphere cluster/1 AZ
  • 2 resource pools: 1 for NSX components and 1 for PAS or PKS
  • 3 hosts minimum for vSphere HA (4 hosts for vSphere VSAN)
  • Shared storage/VSAN
  • 1 NSX Manager
  • 3 NSX controllers
  • 2 large edge VMs in a cluster
  • PCF: Ops Manager, the BOSH Director, and Small Footprint Runtime

This approach is intended for designing a proof of concept or development-only system. Small Footprint Runtime has no upgrade path to the standard PAS tile.

Medium PCF Installations

A medium-sized PCF foundation looks as follows:

  • 2 vSphere clusters/2 AZs
  • 3 resource pools: 1 for NSX components and 2 for PAS (or 1 for PKS)
  • 3 hosts minimum for vSphere HA (4 hosts for vSphere VSAN) per cluster
  • Shared storage/VSAN
  • 1 NSX Manager
  • 3 NSX controllers
  • 2 large edge VMs in a cluster
  • PCF: Ops Manager, the BOSH Director, and PAS/PKS

For this design, Pivotal recommends replacing Small Footprint Runtime with PAS. The second AZ doubles compute capacity and expands the NSX-T footprint.

Production-Ready PCF Installations

A production-ready PCF foundation looks as follows:

  • 3 vSphere clusters/3 AZs
  • 4 resource pools: 1 for NSX components and 3 for PAS (or 1 for PKS)
  • 3 hosts minimum for vSphere HA (4 hosts for vSphere VSAN) per cluster
  • Shared storage/VSAN
  • 1 NSX Manager
  • 3 NSX controllers
  • 3 large edge VMs in a cluster
  • PCF: Ops Manager, the BOSH Director, and PAS/PKS

PAS and PKS with NSX-T

A fully meshed installation of PCF v2.2 or later designed using the recommendations provided in this topic looks as follows:

Note: You should not share an Ops Manager and BOSH Director installation between PKS and PAS in a production environment.

Diagram: PAS and PKS with NSX-T

Common components are the NSX T0 router and the associated T1 routers. This approach ensures that any cross-traffic between PKS and PAS apps stays within the bounds of the T0. This also provides a one-stop access point to the whole installation, which simplifies deployment automation for multiple, identical installations.

Note that each design, green PKS and orange PAS, has its own Ops Manager and BOSH Director installation. You should not share an Ops Manager and BOSH Director installation between PKS and PAS.

AZs are aligned to vSphere clusters, with resource pools as an optional organizational tool to place multiple foundations into the same capacity. You can align PKS to any AZ or cluster. Keeping PKS and PAS in completely separate AZs is not required.

You can use resource pools in vSphere clusters as AZ constructs to stack different installations of PCF. As server capacity continues to increase, for example, with customers commonly deploying servers with 768 GB of RAM or more, dedicating independent server clusters to a single installation becomes inefficient.

To allow for maximum capacity growth, consider using one NSX-T installation per foundation.

If you want to split the PAS and PKS installations into separate network trees, behind separate T0 routers, ensure that approach meets your needs by reviewing VMware’s recommendations for T0 to T0 routing. For more information, see Reference Design Guide for PAS and PKS with VMware NSX-T Data Center in the VMware documentation.
