Sunday, September 14, 2014

Open-Source High-Availability with MySQL Database Fabric

MySQL Fabric is an integrated system for managing a collection of MySQL servers and is the framework on which high-availability and sharding is built. MySQL Fabric is open-source and is intended to be extensible, easy to use, and support procedure execution even in the presence of failure, an execution model we call resilient execution.

MySQL (My Sequel) is one of the world's most widely used open-source relational database management system (RDBMS) . MySQL is a relational database management system (RDBMS), and ships with no GUI tools to administer MySQL databases or manage data contained within the databases.The official set of MySQL front-end tools, MySQL Workbench is actively developed by Oracle, and is freely available for use.

Though MySQL began as a low-end alternative to more powerful proprietary databases, it has gradually evolved to support higher-scale needs as well. There are however limits to how far performance can scale on a single server ('scaling up'), so on larger scales, multi-server. MySQL ('scaling out') deployments are required to provide improved performance and reliability. A typical high-end configuration can include a powerful master database which handles data write operations and is replicated to multiple slaves that handle all read operations. The master server synchronizes continually with its slaves so in the event of failure a slave can be promoted to become the new master, minimizing downtime. Further improvements in performance can be achieved by caching the results from database queries in memory using memcached, or breaking down a database into smaller chunks called shards which can be spread across a number of distributed server clusters.

Ensuring high availability requires a certain amount of redundancy in the system. For database systems, the redundancy traditionally takes the form of having a primary server acting as a master, and using replication to keep secondaries available to take over in case the primary fails. This means that the "server" that the application connects to is in reality a collection of servers, not a single server. In a similar manner, if the application is using a sharded database, it is in reality working with a collection of servers, not a single server. In this case, a collection of servers is usually referred to as a farm. MySQL Fabric - an integrated framework for managing farms of MySQL servers with support for both high-availability and sharding. 

MySQL Fabric is an extensible framework for managing farms of MySQL Servers. Two features have been implemented - High Availability (HA) and scaling out using data sharding. These features can be used in isolation or in combination.

Introduction to Fabric :

To take advantage of Fabric, an application requires an augmented version of a MySQL connector which accesses Fabric using the XML-RPC protocol. MySQL Connectors are used by the application code to access the database(s), converting instructions from a specific programming language to the MySQL wire protocol, which is used to communicate with the MySQL Server processes. A ‘Fabric-aware’ connector stores a cache of the routing information that it has received from the mysqlfabric process and then uses that information to send transactions or queries to the correct MySQL Server. Currently the three supported Fabric-aware MySQL connectors are for PHP, Python and Java (and in turn the Doctrine and Hibernate Object-Relational Mapping frameworks).

Fabric manages sets of MySQL servers that have Global Transaction Identifiers (GTIDs) enabled to check and maintain consistency among servers. Sets of servers are called high-availability groups. Information about all of the servers and groups is managed by a separate MySQL instance, which cannot be a member of the Fabric high-availability groups. This server instance is called the backing store.

Fabric organizes servers in high-availability groups for managing different shards. For example, if standard asynchronous replication is in use, Fabric may be configured to automatically monitor the status of servers in a group. If the current master in a group fails, it elects a new one if a server in the group can become a master.

Besides the high-availability operations such as failover and switchover, Fabric also permits shard operations such as shard creation and removal.

Fabric is written in Python and includes a special library that implements all of the functionality provided. To interact with Fabric, a special utility named mysqlfabric provides a set of commands you can use to create and manage groups, define and manipulate sharding, and much more.

Both features are implemented in two layers:

The mysqlfabric process which processes any management requests. When using the HA feature, this process can also be made responsible for monitoring the master server and initiating failover to promote a slave to be the new master should it fail.

MySQL Fabric-aware connectors store a cache of the routing information that it has fetched from MySQL Fabric and then uses that information to send transactions or queries to the correct MySQL Server.

MySQL Fabric provides high availability and database sharding for MySQL Servers

High Availability

HA Groups are formed from pools of two or more MySQL Servers; at any point in time, one of those servers is the Primary (MySQL Replication master) and the others are Secondaries (MySQL Replication slaves). The role of a HA Group is to ensure that access to the data held within that group is always available.

While MySQL Replication allows the data to be made safe by duplicating it, for a HA solution two extra components are needed and MySQL Fabric provides these:
1) Failure detection and promotion - MySQL Fabric monitors the Primary within the HA group and should that server fail then it selects one of the Secondaries and promotes it to be the Primary
2) Routing of database requests - The routing of writes to the Primary and load balancing reads across the slaves is transparent to the application, even when the topology changes during failover.

Adding MySQL Servers to Create a HA Farm: At this point, MySQL Fabric is up and running but it has no MySQL Servers to manage. This figure shows the what the configuration will look like once MySQL Servers have been added to create a HA server farm.
Three MySQL Servers will make up the managed HA group – each running on a different machine
Sharding - Scaling out

When nearing the capacity or write performance limit of a single MySQL Server (or HA group), MySQL Fabric can be used to scale-out the database servers by partitioning the data across multiple MySQL Server "groups". Note that a group could contain a single MySQL Server or it could be a HA group.

The administrator defines how data should be sharded between these servers; indicating which table columns should be used as shard keys and whether HASH or RANGE mappings should be used to map from those keys to the correct shard as described below:

1) HASH: A hash function is run on the shard key to generate the shard number. If values held in the column used as the sharding key don’t tend to have too many repeated values then this should result in an even partitioning of rows across the shards.

2) RANGE: The administrator defines an explicit mapping between ranges of values for the sharding key and shards. This gives maximum control to the user of how data is partitioned and which rows should be co-located.

When the application needs to access the sharded database, it sets a property for the connection that specifies the sharding key – the Fabric-aware connector will then apply the correct range or hash mapping and route the transaction to the correct shard.

If further shards/groups are needed then MySQL Fabric can split an existing shard into two and then update the state-store and the caches of routing data held by the connectors. Similarly, a shard can be moved from one HA group to another.

The steps that follow evolve that configuration into one containing two shards as shown in the following figure.

Another HA group (group_id-2) is created, from three newly created MySQL Servers then one of the servers is promoted to be the Primary. At this point, the new HA group exists but is missing the application schema and data. Before allocating a shard to the group, a reset master needs to be executed on the Primary for the group (this is required because changes have already been made on that server – if nothing else, to grant permissions for one or more users to connect remotely). The mysqlfabric group lookup_server command is used to first check which of the three servers is currently the Primary. The next step is to split the existing shard, specifying the shard id and the name of the HA group where the new shard will be stored. Python code adds some new rows to the subscribers table  and the tables property for the connection is set  and the key to the value of the sub_no column for that table – this is enough information for the Fabric-aware connector to choose the correct shard/HA group and then the fact that the mode property is set to fabric. MODE_READWRITE further tells the connector that the transaction should be sent to the Primary within that HA group.The mysql client can then be used to confirm that the new data has also been partitioned between the two shards/HA groups.

Current Limitations

1) Sharding is not completely transparent to the application. While the application need not be aware of which server stores a set of rows and it doesn’t need to be concerned when that data is moved, it does need to provide the sharding key when accessing the database.
2) All transactions and queries need to be limited in scope to the rows held in a single shard, together with the global (non-sharded) tables. For example, Joins involving multiple shards are not supported.Because the connectors perform the routing function, the extra latency involved in proxy-based solutions is avoided but it does mean that Fabric-aware connectors are required – at the time of writing these exist for PHP, Python and Java
3) The MySQL Fabric process itself is not fault-tolerant and must be restarted in the event of it failing. Note that this does not represent a single-point-of-failure for the server farm (HA and/or sharding) as the connectors are able to continue routing operations using their local caches while the MySQL Fabric process is unavailable.