About Highly Available Clusters
Page last updated:
This topic describes how highly available (HA) clusters work in MySQL for Pivotal Platform and contains information to help you decide whether to use HA clusters.
In an HA cluster topology, a cluster consists of three nodes, where each node contains the same set of data synchronized across nodes. Data is replicated across nodes simultaneously or not written at all if replication fails even on a single node.
With three nodes, HA clusters are highly available and automatically resilient to failure. This is because if a node loses contact with the other two nodes, the other two nodes can remain in contact and continue to accept transactions. Because the number of nodes is odd, the two nodes can assume they represent the primary component. For more information about cluster availability, see Percona XtraDB Cluster: Quorum and Availability of the cluster in the Percona XtraDB Cluster documentation.
In HA topology, the MySQL for Pivotal Platform service uses the following:
Three Switchboard proxies that are co-located on the MySQL nodes. The proxies gracefully handle failure of nodes which enables fast failover to other nodes within the cluster. For more information, see MySQL Proxy.
A jumpbox VM called
mysql-monitorfor monitoring, diagnosing and backing up MySQL nodes. You can run the
mysql-diagtool on the
mysql-monitorVM to view the status of your cluster nodes. For more information about
mysql-diag, see Running mysql-diag.
The following diagram shows an HA cluster in three availability zones (AZs):
For more information about other availability options supported by MySQL for Pivotal Platform, see Availability Options.
Before deciding to use HA clusters for MySQL for Pivotal Platform service plans, consider the following requirements.
When calculating IaaS usage, you must take into account that each HA instance requires three VMs. Therefore, the resources used for an HA plan must be tripled. For more information, see Resource Usage Planning for On-Demand Plans.
To minimize impact of an availability zone (AZ) outage and to remove single points of failure, Pivotal recommends that you provision three AZs if using HA deployments. With three AZs, nodes are deployed to separate AZs.
For more information, see About Availability Zones.
In addition to the standard networking rules needed for MySQL for Pivotal Platform, the operator must ensure HA cluster specific network rules are also configured.
For information about HA cluster specific networking rules, see Required Networking Rules for Highly Available Cluster Plans.
When deployed with HA topology, MySQL for Pivotal Platform runs three nodes. This cluster arrangement imposes some limitations which do not apply to single node or leader-follower MySQL database servers. For more information about the difference between single node, leader-follower, and HA cluster topologies, see Availability Options.
HA clusters perform validations at startup and during runtime to prevent you from using MySQL features that are not supported. If a validation fails during startup, the server is halted and throws an error. If a validation fails during runtime the operation is denied and throws an error.
These validations are designed to ensure optimal operation for common cluster setups that do not require experimental features and do not rely on operations not supported by HA clusters.
For more information about HA limitations, see Percona XtraDB Cluster Limitations in the Percona XtraDB Cluster documentation.
HA clusters only supports the InnoDB storage engine. The InnoDB is the default storage engine for new tables. Pre-existing tables that are not using InnoDB are at risk because they are not replicated within a cluster.
Large DDL statements can lock all schemas. This can be mitigated by using the Rolling Schema Upgrade (RSU) method. For instructions on how to use the RSU method, see Using a rolling schema upgrade (RSU) in MySQL for VMware Tanzu in the VMware Tanzu Support knowledge base.
Large transaction sizes can inhibit the performance of the cluster and apps using the cluster because they are buffered in memory.
You cannot execute a DML statement in parallel with a DDL statement, if both statements affect the same tables. If they are expected in parallel, the DML and DDL statements are both applied immediately to the cluster, causing errors.
All tables must have a primary key. You can use multi-column primary keys. This is because HA clusters replicate using row based replication and ensure unique rows on each instance.
Explicit table locking is not supported. For more information, see EXPLICIT TABLE LOCKING in the Percona XtraDB Cluster documentation.
By default, auto-increment variables are not sequential and each node has gaps in IDs. This prevents auto-increment replication conflicts across the cluster. For more information, see wsrep_auto_increment_control in the Percona XtraDB Cluster documentation.
Users cannot change the session behavior of auto-increment variables. This is because the HA cluster controls the behavior of these variables. For example, if an app runs
SET SESSION auto_increment_increment, the cluster ignores the change.
In InnoDB, some transactions can cause deadlocks. You can minimize deadlocks in your apps by rewriting transactions and SQL statements. For more information about deadlocks, see Deadlocks in InnoDB and How to Minimize and Handle Deadlocks in MySQL documentation.
Table partitioning can cause performance issues in the cluster. This is as a result of the implicit table locks that are used when running table partition commands.
- Round-trip latency between database nodes must be less than five seconds. Latency exceeding this results in a network partition. If more than half of the nodes are partitioned, the cluster loses quorum and becomes unusable until manually bootstrapped. For more information about bootstrapping HA clusters, see Bootstrapping.
This section describes some defaults that MySQL for Pivotal Platform applies to HA components.
State Snapshot Transfer Process
When a new node is added to or rejoins a cluster, a primary node from the cluster is designated as the state donor. The new node synchronizes with the donor through the State Snapshot Transfer (SST) process.
MySQL for Pivotal Platform uses Xtrabackup for SST, which lets the state donor node to continue accepting reads and writes during the transfer. HA clusters default to using rsync to perform SST, which is fast, but blocks requests during the transfer. For more information about Percona XtraBackup SST Configuration in the Percona XtraDB Cluster documentation.
Large Data File Splitting Enabled
By default, HA clusters splits large
Load Data commands into small manageable units.
This deviates from the standard behavior for MySQL.
For more information, see wsrep_load_data_splitting
in the Percona XtraDB Cluster documentation.
Maximum Transaction Sizes
These are the maximum transaction sizes set for HA clusters:
The maximum write-set size allowed in HA clusters is 2 GB. For more information, see wsrep_max_ws_size.
The maximum number of rows each write-set can contain defaults to no limit. For more information, see wsrep_max_ws_rows.
MySQL for Pivotal Platform uses a proxy to send client connections to the healthy MySQL database cluster nodes in a highly available cluster plan. Using a proxy gracefully handles failure of nodes, enabling fast, failover to other nodes within the cluster. When a node becomes unhealthy, the proxy closes all connections to the unhealthy node and re-routes all subsequent connections to a healthy node.
The proxy used in MySQL for Pivotal Platform is Switchboard. Switchboard was developed to replace HAProxy as the proxy tier for the high availability cluster for MySQL databases in MySQL for Pivotal Platform.
Switchboard offers the following features:
MySQL Server Access
MySQL clients communicate with nodes through this network port. These connections are automatically passed through to the nodes.
Switchboard and API
Operators can connect to Switchboard to view the state of the nodes.
For more information about monitoring proxy health status, see Monitoring Node Health.