AWS Reference Architecture

This guide presents a reference architecture for Pivotal Cloud Foundry (PCF) on Amazon Web Services (AWS). This architecture is valid for most production-grade PCF deployments using three availability zones (AZs).

See PCF on AWS Requirements for general requirements for running PCF and specific requirements for running PCF on AWS.

PCF Reference Architectures

A PCF reference architecture describes a proven approach for deploying Pivotal Cloud Foundry on a specific IaaS, such as AWS, that meets the following requirements:

  • Secure
  • Publicly accessible
  • Includes common PCF-managed services such as MySQL, RabbitMQ, and Spring Cloud Services
  • Can host at least 100 app instances, and can scale to far more

Pivotal provides reference architectures to help you determine the best configuration for your PCF deployment.

Base AWS Reference Architecture

The following diagram provides an overview of a reference architecture deployment of PCF on AWS using three AZs.

[Diagram: AWS overview architecture]

Note: Each AWS subnet must reside entirely within one AZ. As a result, a multi-AZ deployment topology requires a subnet for each AZ.

Base Reference Architecture Components

The following table lists the components that are part of a base reference architecture deployment on AWS with three AZs.

Component | Reference Architecture Notes
Domains & DNS | CF Domain Zones and routes in use by the reference architecture include: domains for *.apps and *.sys (required); a route for Ops Manager (required); a route for SSH access to app containers (optional). Using Route 53 to manage domains is optional.
Ops Manager | Deployed on one of the three public subnets and accessible by FQDN or through an optional jumpbox.
BOSH Director | Deployed on the infrastructure subnet.
Elastic Load Balancers - HTTP, HTTPS, and SSL | Required. Load balancer that handles incoming HTTP, HTTPS, and SSL traffic and forwards it to the Gorouters. Deployed on all three public subnets.
Elastic Load Balancers - SSH | Optional. Load balancer that provides SSH access to app containers. Deployed on all three public subnets, one per AZ.
Gorouters | Accessed through the HTTP, HTTPS, and SSL Elastic Load Balancers. Deployed on all three Pivotal Application Service (PAS) subnets, one per AZ.
Diego Brains | Required. However, the SSH container access functionality is optional and is enabled through the SSH Elastic Load Balancers. Deployed on all three PAS subnets, one per AZ.
TCP Routers | Optional feature for TCP routing. Deployed on all three PAS subnets, one per AZ.
CF Database | Reference architecture uses AWS RDS. Deployed on all three RDS subnets, one per AZ.
Storage Buckets | Reference architecture uses 4 S3 buckets: buildpacks, droplets, packages, and resources.
Service Tiles | Deployed on all three service subnets, one per AZ.
Service User & Roles | One IAM role and one IAM user are recommended: the IAM role for Terraform, and the IAM user for Ops Manager and BOSH. Admin Role: Terraform uses this IAM role to provision the required AWS resources as well as an IAM user. IAM User: This IAM user, with IAM security credentials (access key ID and secret access key), is automatically provisioned with access restricted to the resources that PCF needs.
EC2 Instance Quota | The default EC2 instance quota on a new AWS subscription allows only around 20 EC2 instances, which is not enough to host a multi-AZ deployment. The recommended quota for EC2 instances is 100. AWS requires instance quota tickets to include Primary Instance Types, which should be t2.micro.
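
As a rough illustration of the Domains & DNS row, the following sketch lists the records a deployment might need and the component each one points at. The base domain `pcf.example.com` and the host labels `opsman` and `ssh` are hypothetical names chosen for this example; the guide only specifies that *.apps and *.sys domains and an Ops Manager route are required, and that an SSH route is optional.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsRecord:
    name: str       # fully qualified record name
    target: str     # component that should receive the traffic
    required: bool  # whether the reference architecture requires it


def plan_dns_records(base_domain: str) -> list[DnsRecord]:
    """Return the domains and routes listed in the Domains & DNS row.

    The "opsman" and "ssh" labels are assumptions made for illustration.
    """
    return [
        DnsRecord(f"*.apps.{base_domain}", "HTTP/HTTPS/SSL load balancer", True),
        DnsRecord(f"*.sys.{base_domain}", "HTTP/HTTPS/SSL load balancer", True),
        DnsRecord(f"opsman.{base_domain}", "Ops Manager", True),
        DnsRecord(f"ssh.sys.{base_domain}", "SSH load balancer", False),
    ]


if __name__ == "__main__":
    for record in plan_dns_records("pcf.example.com"):
        status = "required" if record.required else "optional"
        print(f"{record.name:28} -> {record.target} ({status})")
```

Whether these records live in Route 53 or another DNS provider does not change the list, since using Route 53 is optional.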

Network Objects

The following table lists the network objects in this reference architecture.

Network Object | Notes | Estimated Number
External Public IPs | One per deployment, assigned to Ops Manager. | 1
Virtual Private Cloud (VPC) | One per deployment. A PCF deployment exists within a single VPC and a single AWS region, but should distribute PCF jobs and instances across 3 AWS AZs to ensure a high degree of availability. | 1
Subnets | The reference architecture requires the following subnets:
  • 1 x (/24) infrastructure (BOSH Director) subnet
  • 3 x (/24) public subnets (Ops Manager, Elastic Load Balancers, NAT instances), one per AZ
  • 3 x (/20) PAS subnets (Gorouters, Diego Cells, Cloud Controllers, etc.), one per AZ
  • 3 x (/20) services subnets (RabbitMQ, MySQL, Spring Cloud Services, etc.), one per AZ
  • 3 x (/24) RDS subnets (Cloud Controller DB, UAA DB, etc.), one per AZ.
Route Tables | This reference architecture requires 4 route tables: one for the public subnets, and one per AZ for the private subnets across the 3 AZs. Consult the following list:

  • PublicSubnetRouteTable: This route table enables ingress and egress routes between the Internet and Ops Manager and the NAT Gateway, through the Internet gateway.
  • PrivateSubnetRouteTable: This route table enables egress routing to the Internet through the NAT Gateway for the BOSH Director and PAS.
For more information, see the Terraform script that creates the route tables and the script that performs the route table association.

Note: An EC2 instance that has a public IP address and sits on a subnet that routes through an Internet gateway is accessible from the Internet through that public IP address; Ops Manager is one example. PAS needs Internet access because it uses an S3 bucket as a blobstore.

Security Groups | The reference architecture requires 5 Security Groups. For more information, see the Terraform Security Group rules script. The following table describes the Security Group ingress rules:

Note: The Elastic Load Balancer listens on the extra port 4443 because it does not support WebSocket connections on its HTTP/HTTPS listeners.

Security Group | Port | From CIDR | Protocol | Description
OpsMgrSG | 22 | | TCP | Ops Manager SSH access
OpsMgrSG | 443 | | TCP | Ops Manager HTTPS access
VmsSG | ALL | VPC_CIDR | ALL | Open up connections among BOSH-deployed VMs
MysqlSG | 3306 | VPC_CIDR | TCP | Enable network access to RDS
ElbSG | 4443 | | TCP | WebSocket connection to Loggregator endpoint
SshElbSG | 2222 | | TCP | SSH connection to containers
Load Balancers | PCF on AWS requires the Elastic Load Balancer, which can be configured with multiple listeners to forward HTTP/HTTPS/TCP traffic. Two Elastic Load Balancers are recommended: PcfElb, which forwards traffic to the Gorouters, and PcfSshElb, which forwards traffic to the Diego Brain SSH proxy. For more information, see the Terraform load balancers script.

The following table describes the required listeners for each load balancer:
ELB | Instance/Port | LB Port | Protocol | Description
PcfElb | gorouter/80 | 80 | HTTP | Forward traffic to Gorouters
PcfElb | gorouter/80 | 443 | HTTPS | SSL termination and forward traffic to Gorouters
PcfElb | gorouter/80 | 4443 | SSL | SSL termination and forward traffic to Gorouters
PcfSshElb | diego-brain/2222 | 2222 | TCP | Forward traffic to Diego Brain for container SSH connections
Each ELB is bound to a health check that monitors its back-end instances:
  • PcfElb checks the health on Gorouter port 80 with TCP
  • PcfSshElb checks the health on Diego Brain port 2222 with TCP
Jumpbox | Optional. Provides a way of accessing different network components. For example, you can configure it with your own permissions and then set it up to access Pivotal Network to download tiles. Using a jumpbox is particularly useful in IaaSes where Ops Manager does not have a public IP address. In these cases, you can SSH into Ops Manager or any other component through the jumpbox. | 1
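
The subnet layout in the Subnets row can be sketched with Python's standard `ipaddress` module. This is only an illustration of one way to carve the ranges: the VPC CIDR `10.0.0.0/16` and the AZ names are hypothetical, and only the subnet counts and prefix lengths come from the table. Each subnet is assigned to exactly one AZ, since an AWS subnet cannot span AZs.

```python
import ipaddress
from itertools import combinations

# Assumptions for illustration only: this VPC CIDR and these AZ names are
# hypothetical. The tiers, counts, and prefix lengths come from the table.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")
AZS = ["us-west-2a", "us-west-2b", "us-west-2c"]

# (tier, prefix length, number of subnets)
TIERS = [
    ("pas", 20, 3),
    ("services", 20, 3),
    ("infrastructure", 24, 1),
    ("public", 24, 3),
    ("rds", 24, 3),
]


def carve_subnets(vpc, tiers, azs):
    """Allocate subnets sequentially from the VPC range, largest blocks first."""
    plan = []
    cursor = int(vpc.network_address)
    for tier, prefix, count in sorted(tiers, key=lambda t: t[1]):
        size = 2 ** (32 - prefix)
        for i in range(count):
            # Round up to the next address aligned to this block size.
            cursor = (cursor + size - 1) // size * size
            net = ipaddress.ip_network((cursor, prefix))
            if not net.subnet_of(vpc):
                raise ValueError(f"{vpc} is too small for {tier} subnet {i}")
            plan.append((tier, azs[i % len(azs)], net))
            cursor += size
    return plan


if __name__ == "__main__":
    plan = carve_subnets(VPC_CIDR, TIERS, AZS)
    # Sanity check: no two subnets may overlap.
    assert not any(a.overlaps(b) for (_, _, a), (_, _, b) in combinations(plan, 2))
    for tier, az, net in plan:
        print(f"{tier:15} {az:11} {net}")
```

Allocating the /20 blocks before the /24 blocks keeps every subnet aligned without leaving gaps; a real deployment may prefer a different ordering or reserve space for growth.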

Integrate PCF with Customer Data Center through VPN

At times, applications on PCF need to access on-premise data. The connection between an AWS VPC and an on-premise datacenter is made through VPN peering. When employing VPN peering, there are several points to consider:

  1. Assign routable IP addresses with the following in mind:

    • It may not be realistic to request multiple routable /22 address spaces, due to IP exhaustion.
    • Using different VPC address spaces can cause snowflake deployments and present difficulties in automation.
    • Only make the load balancer, NAT devices, and Ops Manager routable.
    • PCF components can route egress through a NAT instance. As a result, operators do not need to assign routable IP addresses to PCF components.
  2. Inbound traffic from the datacenter should come through an internal load balancer.

  3. Outbound traffic to the datacenter should go through AWS NAT instances.
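
A back-of-the-envelope calculation may help show why the list above suggests making only a few components routable. The sketch below counts the IPv4 addresses in the subnets listed under Network Objects and compares them with a single /22 block. It assumes that the load balancers, NAT devices, and Ops Manager all sit in the public subnets, as the subnet list indicates; the `10.0.0.0/22` block is a hypothetical example used only to obtain the size of a /22.

```python
import ipaddress
import math

# Subnet counts and prefix lengths as listed in the Network Objects table:
# tier -> (number of subnets, prefix length)
SUBNETS = {
    "infrastructure": (1, 24),
    "public": (3, 24),
    "pas": (3, 20),
    "services": (3, 20),
    "rds": (3, 24),
}


def addresses(count: int, prefix: int) -> int:
    """Total IPv4 addresses in `count` subnets of the given prefix length."""
    return count * 2 ** (32 - prefix)


total = sum(addresses(count, prefix) for count, prefix in SUBNETS.values())
public_only = addresses(*SUBNETS["public"])
# Hypothetical block, used only to get the size of a /22.
one_slash_22 = ipaddress.ip_network("10.0.0.0/22").num_addresses
blocks_needed = math.ceil(total / one_slash_22)

print(total)          # 26368 addresses across all 13 subnets
print(public_only)    # 768 addresses in the 3 public subnets
print(one_slash_22)   # 1024 addresses in one /22
print(blocks_needed)  # 26 /22 blocks to make every subnet routable
```

Under these assumptions, the public subnets alone fit inside one routable /22, while addressing every subnet routably would take about 26 such blocks.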

[Diagram: AWS VPN]
