Known Issues

Bootstrap and Rejoin-Unsafe Errand Issue

There is an issue in versions 1.8.0 - 1.8.2 in which the bootstrap and rejoin-unsafe errands fail with an error like this:

Error 100: Unable to render jobs for instance group 'rejoin-unsafe'. Errors are:
   - Unable to render templates for job 'rejoin-unsafe'. Errors are:
     - Error filling in template 'config.yml.erb' (line 8: Can't find link 'arbitrator')

This is a bug in those releases. It will be addressed in an upcoming release. In the meantime, it may be necessary to perform these steps manually:

  • For bootstrap, you must use the manual bootstrap instructions to restore access to the cluster.
  • For rejoin-unsafe, you must follow these steps:
    1. Log into the node which has tripped the interruptor and become root.
    2. monit stop mariadb_ctrl
    3. rm -rf /var/vcap/store/mysql
    4. /var/vcap/jobs/mysql/bin/pre-start
    5. monit start mariadb_ctrl
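The rejoin-unsafe steps above can be collected into a short script. This is a sketch only: DRY_RUN=echo prints each command for review instead of executing it; clear the variable to actually run the (destructive) recovery as root on the tripped node.

```shell
#!/bin/sh
# Sketch of the manual rejoin-unsafe recovery, run as root on the node
# that tripped the interruptor. DRY_RUN=echo prints commands instead of
# executing them; set DRY_RUN= to really run (this wipes local data).
DRY_RUN=echo
set -e
$DRY_RUN monit stop mariadb_ctrl             # stop the database process
$DRY_RUN rm -rf /var/vcap/store/mysql        # discard this node's data
$DRY_RUN /var/vcap/jobs/mysql/bin/pre-start  # re-initialize the data dir
$DRY_RUN monit start mariadb_ctrl            # rejoin the cluster
```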

Compile fails in environments that do not have access to the Internet

In MySQL for Pivotal Cloud Foundry (PCF) 1.8.0-Edge.5 and 1.8.0-Edge.6, there is a regression that causes the compile stage to fail when installing the tile in environments that do not have access to the Internet. We regret the error.

MySQL Backups to AWS S3 limited to Standard region

In MySQL for PCF 1.7, backups are only sent to AWS S3 buckets that have been created in the US Standard region, “us-east-1.” This limitation has been resolved in 1.8.0-Edge.2 and later.

Elastic Runtime HTTPS-only feature

Support for the Experimental HTTPS-only feature is broken in MySQL for PCF versions 1.6.X and earlier. The HTTPS-only feature works as designed in MySQL for PCF 1.7.0 and later.

Accidental deletion of a Service Plan

If and only if the Operator does all of these steps in sequence, a plan will become “unrecoverable”:

  1. Click the trash-can icon in the Service Plan screen
  2. Create a new plan with the exact same name
  3. Click the ‘Save’ button on the same screen
  4. Return to the Ops Manager top-level, and click 'Apply Changes’

After clicking 'Apply Changes’, the deploy will eventually fail with the error:

Server error, status code: 502, error code: 270012, message: Service broker catalog is invalid: Plan names must be unique within a service

This unfortunate situation is unavoidable; once the Operator has committed via 'Apply Changes’, the original plan cannot be recovered. For as long as service instances of that plan exist, you may not enter a new plan of the same name. At this point, the only workaround is to create a new plan with the same specifications, but specify a different name. Existing instances will continue to appear under the old plan name, but new instances will need to be created using the new plan name.

If you have performed steps 1 and 2, but not step 3, there is no problem. Do not click the 'Save' button. Simply return to the Installation Dashboard; any accidental changes will be discarded.

If you have performed steps 1, 2, and 3, do not click 'Apply Changes'. Instead, return to the Installation Dashboard and click the 'Revert' button. Any accidental changes will be discarded.

Changing service plan definition

In MySQL for PCF versions 1.7.0 and earlier, there is only one service plan. Changing the definition of that plan (the number of megabytes of storage, the number of connections, or both) means that any new service instances will be created with those characteristics.

There is a bug in MySQL for PCF versions 1.6.3 and earlier: changing the plan does not change existing service instances. Existing instances continue to be governed by the plan constraints that were in effect at the time they were created. This is true regardless of whether or not an operator runs cf update-service.

There is a workaround for this bug, which will be resolved in future releases of MySQL for PCF. For the change to take effect on existing service instances, you must trigger it by issuing a PATCH request directly to the service broker: curl -v -k -X PATCH

  • SYSTEM.DOMAIN is defined in Ops Manager, under Elastic Runtime’s Settings tab, in the Cloud Controller entry.
  • BROKER_CREDS_USERNAME and BROKER_CREDS_PASSWORD are defined in Ops Manager, under MySQL for PCF’s Credentials tab, in the Broker Auth Credentials entry.
  • To get each SERVICE_INSTANCE_ID, run cf service INSTANCE --guid; the command prints the instance's GUID. For example:
  $ cf service acceptDB --guid

Run this for each service instance to be updated.
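Putting the pieces together, the broker call looks roughly like the following. This is a sketch under assumptions: the p-mysql.SYSTEM.DOMAIN hostname and the /v2/service_instances/GUID endpoint path are inferred from Service Broker API conventions, and every value shown is a placeholder for your own. DRY_RUN=echo prints each command instead of sending it.

```shell
#!/bin/sh
# Sketch only -- the hostname and endpoint path are assumptions based on
# the Service Broker API; substitute your real values from Ops Manager.
DRY_RUN=echo                                # clear to actually send
SYSTEM_DOMAIN="example.com"                 # Cloud Controller system domain
BROKER_USER="broker"                        # Broker Auth Credentials username
BROKER_PASS="secret"                        # Broker Auth Credentials password
# One GUID per instance, from: cf service INSTANCE --guid
for GUID in 11111111-2222-3333-4444-555555555555; do
  $DRY_RUN curl -v -k -X PATCH \
    -u "${BROKER_USER}:${BROKER_PASS}" \
    "https://p-mysql.${SYSTEM_DOMAIN}/v2/service_instances/${GUID}"
done
```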

Furthermore, if you have changed the maximum-connections constraint, you must also update the setting for each bound application's user directly from the MySQL console. Follow these steps:

  1. SSH into your Ops Manager Director using these instructions.
  2. Run bosh deployments to discover the name of your MySQL for PCF deployment.
  3. Run bosh ssh using your MySQL for PCF’s deployment name. Example: bosh ssh mysql-partition-9d32f5601988152e869b/0
  4. Run /var/vcap/packages/mariadb/bin/mysql -u root -p.
    • The root user’s password is defined in Ops Manager, under MySQL for PCF’s Credentials tab.
  5. Issue this MySQL command: UPDATE mysql.user SET mysql.user.max_user_connections=NEW_MAX_CONN_VALUE WHERE mysql.user.User NOT LIKE '%root%' ;
    • Make sure to change NEW_MAX_CONN_VALUE to whatever new setting you’ve chosen.
  6. exit;
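Steps 4 through 6 above can be sketched as a single non-interactive invocation. NEW_MAX_CONN_VALUE is a placeholder for your chosen limit, and FLUSH PRIVILEGES is added because direct edits to the mysql.user grant table do not take effect until the grant tables are reloaded. DRY_RUN=echo prints the command instead of running it.

```shell
#!/bin/sh
# Sketch of steps 4-6, run on the MySQL node. The -p flag prompts for
# the root password from Ops Manager. DRY_RUN=echo prints the command;
# clear it to execute.
DRY_RUN=echo
NEW_MAX_CONN_VALUE=40   # placeholder -- use your chosen limit
SQL="UPDATE mysql.user SET max_user_connections=${NEW_MAX_CONN_VALUE} WHERE User NOT LIKE '%root%'; FLUSH PRIVILEGES;"
$DRY_RUN /var/vcap/packages/mariadb/bin/mysql -u root -p -e "$SQL"
```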

Proxies may write to different MySQL masters

All proxy instances use the same method to determine cluster health. However, certain conditions may cause the proxy instances to route to different nodes, for example after brief cluster node failures.

This could be an issue for tables that receive many concurrent writes. Multiple clients writing to the same table could obtain locks on the same row, resulting in a deadlock. One commit will succeed and all others will fail and must be retried. This can be prevented by configuring your load balancer to route connections to only one proxy instance at a time.

Number of proxy instances cannot be reduced

Once the product is deployed with operator-configured proxy IPs, the number of proxy instances cannot be reduced, nor can the configured IPs be removed from the Proxy IPs field. If the product is instead deployed without proxy IPs at first, IPs later added to the Proxy IPs field are used only for additional proxy instances; scaling down is unpredictably permitted, and the first proxy instance can never be assigned an operator-configured IP.

Backups Metadata

In MySQL for PCF 1.7.0, both the compressed and encrypted fields show as N in the backup metadata file. This is because MySQL for PCF implements compression and encryption outside of the tool used to generate the file. This is a known defect and will be corrected in future releases.

MyISAM tables

The clustering plugin used in this release (Galera) does not support replication of MyISAM tables. However, the service does not prevent the creation of MyISAM tables. When a MyISAM table is created, the table is created on every node (DDL statements are replicated), but data written to a node is not replicated to the others. If the persistent disk of the node that received the writes is lost, the MyISAM data on it is lost as well. To change a table from MyISAM to InnoDB, follow this guide.

Max user connections

When the max_user_connections property is updated for an existing plan, connections that are currently open are not affected. For example, if you have decreased the limit from 40 to 20, users with 40 open connections will keep all 40 open. To force the change upon users with open connections, an operator can restart the proxy job; this causes the connections to reconnect and stay within the new limit. Otherwise, whenever a connection above the limit is reset it will not be able to reconnect, so the number of connections will gradually converge on the new limit.

Long SST transfers

We provide a database_startup_timeout property in the manifest, which specifies how long to wait for the initial SST to complete (the default is 150 seconds). If the SST takes longer than this, the job reports as failing. Versions of cf-mysql-release before v23 have a flaw in the startup script: it does not kill the mysqld process in this case. When monit restarts the job, it sees that mysqld is still running and exits without writing a new pidfile, so the job continues to report as failing. The only way to fix this is to SSH onto the failing node, kill the mysqld process, and re-run monit start mariadb_ctrl.
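The manual fix described above amounts to the following sketch for releases before v23. The PID shown is a placeholder (in practice you would look it up with pgrep), and DRY_RUN=echo prints the commands for review instead of executing them on the node.

```shell
#!/bin/sh
# Sketch of the manual recovery for a stuck SST (pre-v23 releases),
# run on the failing node. DRY_RUN=echo prints instead of executing.
DRY_RUN=echo
MYSQLD_PID="12345"                 # placeholder; in practice: pgrep -x mysqld
$DRY_RUN kill "$MYSQLD_PID"        # stop the orphaned mysqld process
$DRY_RUN monit start mariadb_ctrl  # let monit bring the job back up
```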

Long-running queries can be interrupted by load balancer timeout

A connection that is waiting for results will appear to some load balancers as an idle connection. These long-running queries may be interrupted if they exceed the load balancer’s idle timeout. The following error is typical of such an interruption:

Lost connection to MySQL server during query

For example, AWS's Elastic Load Balancer has a default idle timeout of 60 seconds, so if a query takes longer than this, the MySQL connection is severed and an error is returned.

To prevent these timeouts, increase the idle timeout duration accordingly.
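For example, on a classic AWS Elastic Load Balancer the idle timeout can be raised with the AWS CLI. The load balancer name and the 3600-second value below are placeholders, and DRY_RUN=echo prints the command instead of calling AWS.

```shell
#!/bin/sh
# Sketch: raise the idle timeout on a classic AWS ELB so long-running
# queries are not cut off. Name and timeout value are placeholders.
DRY_RUN=echo   # clear to actually call AWS
$DRY_RUN aws elb modify-load-balancer-attributes \
  --load-balancer-name my-mysql-elb \
  --load-balancer-attributes '{"ConnectionSettings":{"IdleTimeout":3600}}'
```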
