Pivotal Cloud Cache for PCF v1.3

Application Development

Design Patterns

The Inline Cache

An inline cache places the caching layer between the app and the backend data store.

Diagram: inline caching pattern

The app performs CRUD (CREATE, READ, UPDATE, DELETE) operations on its data. The app’s implementation of the CRUD operations results in cache operations that break down into cache lookups (reads) and/or cache writes.

The algorithm for a cache lookup quickly returns the cache entry when the entry is in the cache. This is a cache hit. If the entry is not in the cache, it is a cache miss, and code on the cache server retrieves the entry from the backend data store. In the typical implementation, the entry returned from the backend data store on a cache miss is written to the cache, such that subsequent cache lookups of that same entry result in cache hits.

The implementation for a cache write typically creates or updates the entry within the cache. It also creates or updates the data store in one of the following ways:

  • Synchronously, in a write-through manner
  • Asynchronously, in a write-behind manner

Developers design the server code to implement this inline-caching pattern.
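
For illustration, here is a minimal sketch of a server-side cache loader that implements the read path of the inline pattern. The Customer value class and the CustomerDataStore data-access object are assumptions, and the write path (a cache writer for write-through, or an asynchronous event queue for write-behind) is omitted.

    import org.apache.geode.cache.CacheLoader;
    import org.apache.geode.cache.CacheLoaderException;
    import org.apache.geode.cache.LoaderHelper;

    // Invoked by the server on a cache miss; the returned value is written to the region.
    public class CustomerCacheLoader implements CacheLoader<String, Customer> {

        private final CustomerDataStore dataStore = new CustomerDataStore(); // hypothetical backend data store access

        @Override
        public Customer load(LoaderHelper<String, Customer> helper) throws CacheLoaderException {
            // helper.getKey() is the key that missed in the cache.
            return dataStore.findById(helper.getKey());
        }

        @Override
        public void close() {
            // Release data store resources if needed.
        }
    }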

The Look-Aside Cache

The look-aside pattern of caching places the app in charge of communication with both the cache and the backend data store.

Diagram: look-aside caching pattern

The app performs CRUD (CREATE, READ, UPDATE, DELETE) operations on its data. That data may be

  • in both the data store and the cache
  • in the data store, but not in the cache
  • not in either the data store or the cache

The app’s implementation of the CRUD operations results in cache operations that break down into cache lookups (reads) and/or cache writes.

The algorithm for a cache lookup returns the cache entry when the entry is in the cache. This is a cache hit. If the entry is not in the cache, it is a cache miss, and the app attempts to retrieve the entry from the data store. In the typical implementation, the entry returned from the backend data store is written to the cache, such that subsequent cache lookups of that same entry result in cache hits.

The look-aside pattern of caching leaves the app free to implement whatever behavior it chooses when the data store does not contain the entry.
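
As a minimal sketch in client code, a look-aside read might look like the following; the Region, the Customer value class, and the CustomerDataStore data-access object are assumptions for illustration:

    // Look-aside read: check the cache first, fall back to the data store on a miss.
    Customer readCustomer(Region<String, Customer> customers,
                          CustomerDataStore dataStore,
                          String id) {
        Customer customer = customers.get(id);         // cache lookup
        if (customer == null) {                        // cache miss
            customer = dataStore.findById(id);         // the app reads the backend data store directly
            if (customer != null) {
                customers.put(id, customer);           // populate the cache so later lookups hit
            }
            // If the data store has no entry either, the app decides how to proceed.
        }
        return customer;
    }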

The algorithm for a cache write implements one of these:

  • The entry is either updated or created within the data store, and the entry is updated within or written to the cache.
  • The entry is either updated or created within the backend data store, and the copy currently within the cache is invalidated.

Note: SDG (Spring Data GemFire) supports the look-aside pattern, as detailed at Configuring Spring’s Cache Abstraction.
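
As a sketch of that support, Spring’s caching annotations express the look-aside read and both write variants declaratively. The ProductService, the ProductRepository, and the products region name below are assumptions for illustration:

    import org.springframework.cache.annotation.CacheEvict;
    import org.springframework.cache.annotation.CachePut;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.stereotype.Service;

    @Service
    public class ProductService {

        private final ProductRepository repository;  // hypothetical access to the backend data store

        public ProductService(ProductRepository repository) {
            this.repository = repository;
        }

        @Cacheable("products")  // look-aside read: a hit returns the cached value, a miss runs the method and caches the result
        public Product findProduct(String id) {
            return repository.findById(id);
        }

        @CachePut(cacheNames = "products", key = "#product.id")  // write variant 1: update the data store and write the entry to the cache
        public Product saveProduct(Product product) {
            return repository.save(product);
        }

        @CacheEvict(cacheNames = "products", key = "#product.id")  // write variant 2: update the data store and invalidate the cached copy
        public Product updateProduct(Product product) {
            return repository.save(product);
        }
    }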

Bidirectional Replication Across a WAN

Two PCC service instances may be connected across a WAN to form a single distributed system with asynchronous communication. The cluster within each PCC service instance hosts the same region. Updates to either PCC service instance are propagated across the WAN to the other PCC service instance. The distributed system achieves eventual consistency for the region and handles the write conflicts that occur when a single region entry is modified in both PCC service instances at the same time.

In this active-active system, an external entity implements load-balancing by directing app connections to one of the two service instances. If one of the PCC service instances fails, apps may be redirected to the remaining service instance.

This diagram shows multiple instances of an app interacting with one of the two PCC service instances, cluster A and cluster B. Any change made in cluster A is sent to cluster B, and any change made in cluster B is sent to cluster A.

Diagram: Bidirectional WAN replication pattern

Blue-Green Disaster Recovery

Two PCC service instances may be connected across a WAN to form a single distributed system with asynchronous communication. An expected use case propagates all changes to a region’s data from the cluster within one service instance (the primary) to the other. The replicate increases the fault tolerance of the system by acting as a hot spare. If an entire data center or availability zone fails, apps connected to the failed site can be redirected by an external load-balancing entity to the replicate, which takes over as the primary.

In this diagram, cluster A is primary, and it replicates all data across a WAN to cluster B.

Diagram: WAN replication pattern

If cluster A fails, cluster B takes over.

Diagram: WAN replicate becomes primary

CQRS Pattern Across a WAN

Two PCC service instances may be connected across a WAN to form a single distributed system that implements a CQRS (Command Query Responsibility Segregation) pattern. Within this pattern, commands are operations that change state, where state is represented by region contents. All region operations that change state are directed to the cluster within one PCC service instance. The changes are propagated asynchronously to the cluster within the other PCC service instance via WAN replication, and that other cluster provides only query access to the region data.

This diagram shows an app that may update the region within the PCC service instance of cluster A. Changes are propagated across the WAN to cluster B. The app bound to cluster B may only query the region data; it will not create entries or update the region.

Diagram: CQRS WAN replication pattern

Region Design

Cached data are held in GemFire regions. Each entry within a region is a key/value pair. The choice of key and region type affects the performance of the design. There are two basic types of regions: partitioned and replicated. The distinction between the two types is based on how entries are distributed among the servers that host the region.

Keys

Each region entry must have a unique key. Use a key of a simple type such as String, Integer, or Long. Experienced designers have a slight preference for String over Integer or Long. Using a String key enhances the development and debugging environment by permitting the use of the REST API (Swagger UI), which works only with String types.
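
For example, with a client region of type Region<String, Order> named orders (the region and the Order value class are illustrative):

    orders.put("order-10234", newOrder);       // create or update the entry under a String key
    Order order = orders.get("order-10234");   // look up the entry by its String key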

Partitioned Regions

A partitioned region distributes region entries across servers by using hashing. The hash of a key maps an entry to a bucket. A fixed number of buckets are distributed across the servers that host the region.

Here is a diagram that shows a single partitioned region (highlighted) with very few entries to illustrate partitioning.

Diagram: Partitioned Region

A partitioned region is the proper type of region to use when one or both of these situations exist:

  • The region holds vast quantities of data. There may be so much data that you need to add more servers to scale the system up. PCC can be scaled up without downtime; to learn more, see Updating a Pivotal Cloud Cache Service Instance.

  • Operations on the region are write-heavy, meaning that there are a lot of entry updates.

Redundancy adds fault tolerance to a partitioned region. Here is that same region, but with the addition of a single redundant copy of each entry. The buckets drawn with dashed lines are redundant copies. Within the diagram, the partitioned region is highlighted.

Diagram: Partitioned Region with Redundancy

With one redundant copy, the GemFire cluster can tolerate a single server failure or a service upgrade without losing any region data. With one fewer server, GemFire revises which server holds the primary copy of each entry.

A partitioned region without redundancy permanently loses data during a service upgrade or if a server goes down. All entries hosted in the buckets on the failed server are lost.

Partitioned Region Types for Creating Regions on the Server

Region types associate a name with a particular region configuration. The type is used when creating a region. Although more region types than these exist, use one of these types to ensure that no region data is lost during service upgrades or if a server fails. These partitioned region types are supported:

  • PARTITION_REDUNDANT Region entries are placed into the buckets that are distributed across all servers hosting the region. In addition, GemFire keeps and maintains a declared number of redundant copies of all entries. The default number of redundant copies is 1.
  • PARTITION_REDUNDANT_HEAP_LRU Region entries are placed into the buckets that are distributed across all servers hosting the region. GemFire keeps and maintains a declared number of redundant copies. The default number of redundant copies is 1. As a server (JVM) reaches a heap usage of 65% of available heap, the server destroys entries as space is needed for updates. The least recently used entry in the bucket where a new entry lives is the one chosen for destruction.
  • PARTITION_PERSISTENT Region entries are placed into the buckets that are distributed across all servers hosting the region, and all servers persist all entries to disk.
  • PARTITION_REDUNDANT_PERSISTENT Region entries are placed into the buckets that are distributed across all servers hosting the region, and all servers persist all entries to disk. In addition, GemFire keeps and maintains a declared number of redundant copies of all entries. The default number of redundant copies is 1.
  • PARTITION_REDUNDANT_PERSISTENT_OVERFLOW Region entries are placed into the buckets that are distributed across all servers hosting the region, and all servers persist all entries to disk. In addition, GemFire keeps and maintains a declared number of redundant copies of all entries. The default number of redundant copies is 1. As a server (JVM) reaches a heap usage of 65% of available heap, the server overflows entries to disk when it needs to make space for updates.
  • PARTITION_PERSISTENT_OVERFLOW Region entries are placed into the buckets that are distributed across all servers hosting the region, and all servers persist all entries to disk. As a server (JVM) reaches a heap usage of 65% of available heap, the server overflows entries to disk when it needs to make space for updates.
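
For example, within a gfsh session connected to the cluster, a command of the following form creates a partitioned region with one redundant copy; the region name is illustrative:

    gfsh> create region --name=customers --type=PARTITION_REDUNDANT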

Replicated Regions

Here is a replicated region with very few entries (four) to illustrate the distribution of entries across servers. For a replicated region, all servers that host the region have a copy of every entry.

Diagram: Replicated Region

GemFire maintains copies of all region entries on all servers. GemFire takes care of distribution and keeps the entries consistent across the servers.

A replicated region is the proper type of region to use when one or more of these situations exist:

  • The region entries do not change often. Each write of an entry must be propagated to all servers that host the region. As a consequence, performance suffers when there are many concurrent writes, because each write triggers subsequent writes to all other servers hosting the region.
  • The overall quantity of entries is not so large as to push the limits of memory space for a single server. The PCF service plan sets the server memory size.
  • The entries of a region are frequently accessed together with entries from other regions. The entries in the replicated region are always available on the server that receives the access request, leading to better performance.

Replicated Region Types for Creating Regions on the Server

Region types associate a name with a particular region configuration. These replicated region types are supported:

  • REPLICATE All servers hosting the region have a copy of all entries.
  • REPLICATE_HEAP_LRU All servers hosting the region have a copy of all entries. As a server (JVM) reaches a heap usage of 65% of available heap, the server destroys entries as it needs to make space for updates.
  • REPLICATE_PERSISTENT All servers hosting the region have a copy of all entries, and all servers persist all entries to disk.
  • REPLICATE_PERSISTENT_OVERFLOW All servers hosting the region have a copy of all entries. As a server (JVM) reaches a heap usage of 65% of available heap, the server overflows entries to disk as it needs to make space for updates.
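
For example, within a gfsh session connected to the cluster, a command of the following form creates a replicated region; the region name is illustrative:

    gfsh> create region --name=products --type=REPLICATE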

Persistence

Persistence adds a level of fault tolerance to a PCC service instance by writing all region updates to local disk. Disk data, and hence region data, is not lost upon cluster failures that exceed redundancy failure tolerances. Upon cluster restart, regions are reloaded from disk, avoiding the slower restart method of reacquiring data from a database of record.

Creating a region with one of the region types that includes PERSISTENT in its name causes the instantiation of local disk resources with these default properties:

  • Synchronous writes. All updates to the region generate operating system disk writes to update the disk store.
  • The disk size is part of the instance configuration. See Configure Service Plans for details on setting the persistent disk types for the server. Choose a size that is at least twice as large as the expected maximum quantity of region data, with an absolute minimum size of 2 GB. Region data includes both the keys and their values.
  • Warning messages are issued when a 90% disk usage threshold is crossed.

Regions as Used by the App

The client accesses regions hosted on the servers by creating a client cache and client regions. The type of the client region determines if data is only on the servers or if it is also cached locally by the client in addition to being on the servers. Locally cached data can introduce consistency issues, because region entries updated on a server are not automatically propagated to the client’s local cache.

Client region types associate a name with a particular client region configuration.

  • PROXY forwards all region operations to the servers. No entries are locally cached. Use this client region type unless there is a compelling reason to use one of the other types. Use this type for all Twelve-Factor apps to ensure that processes remain stateless. Not caching any entries locally prevents the app from accidentally caching state.
  • CACHING_PROXY forwards all region operations to the servers, and entries are locally cached.
  • CACHING_PROXY_HEAP_LRU forwards all region operations to the servers, and entries are locally cached. Locally cached entries are destroyed when the app’s usage of cache space causes its JVM to hit the threshold of being low on memory.
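
For illustration, here is a minimal sketch of a client creating its cache and a PROXY region with the GemFire client API. The locator host and port are placeholders that would come from the service binding, and the authentication setup that a PCC service instance requires is omitted:

    import org.apache.geode.cache.Region;
    import org.apache.geode.cache.client.ClientCache;
    import org.apache.geode.cache.client.ClientCacheFactory;
    import org.apache.geode.cache.client.ClientRegionShortcut;

    public class ClientSetup {
        public static void main(String[] args) {
            // Connect to the cluster through a locator; host and port are placeholders.
            ClientCache cache = new ClientCacheFactory()
                    .addPoolLocator("10.0.8.4", 55221)
                    .create();

            // PROXY: no local caching; every region operation is forwarded to the servers.
            Region<String, String> test = cache
                    .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                    .create("test");

            test.put("1", "one");
            System.out.println(test.get("1"));

            cache.close();
        }
    }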

An Example to Demonstrate Region Design

Assume that on servers, a region holds entries representing customer data. Each entry represents a single customer. With an ever-increasing number of customers, this region data is a good candidate for a partitioned region.

Perhaps another region holds customer orders. This data also naturally maps to a partitioned region. The same could apply to a region that holds order shipment data or customer payments. In each case, the number of region entries continues to grow over time, and updates are often made to those entries, making the data somewhat write heavy.

A good candidate for a replicated region would be the data associated with the company’s products. Each region entry represents a single product. There are a limited number of products, and those products do not often change.

As the client app grows beyond the most simplistic cases of data representation, the PCC service instance hosts all of these regions so that the app can access all of them. Operations on customer orders, shipments, and payments all require product information. The product region benefits from having all of its entries available on every server in the cluster, again pointing to a region type choice of a replicated region.
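
Expressed as gfsh commands, with the region names taken from this example, the design might be created as follows:

    gfsh> create region --name=customers --type=PARTITION_REDUNDANT
    gfsh> create region --name=orders --type=PARTITION_REDUNDANT
    gfsh> create region --name=products --type=REPLICATE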

Connect a Sample Java Client App to PCC

  1. Clone the sample Java app at https://github.com/cf-gemfire-org/cloudcache-sample-app.git to use in this demonstration of how to connect an app to a service instance.
  2. Create a region named test as described in Create Regions with gfsh.
  3. Update the manifest in manifest.yml by replacing service0 with the name of your service instance.
  4. Update the gradle.properties with your username and password for the Pivotal Commercial Maven Repository at https://commercial-repo.pivotal.io.
  5. Run

     cf push -f PATH-TO-MANIFEST

     to deploy the app. Replace PATH-TO-MANIFEST with the path to your modified manifest file.
  6. After the app starts, there will be an entry of (“1”, “one”) in the test region.
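
One way to verify the entry, assuming a gfsh session that is connected to the service instance, is a lookup of the key:

    gfsh> get --region=/test --key=1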
