
Why Choose a Galera Based Clustering Solution for MySQL

Author: Wesley Lifford | 3 min read | February 2, 2016

You need your application to stay up and available through a software or OS crash, network issues, or hardware failure. Simple master/slave replication is not highly available, since failover introduces some delay in your application’s availability even under the best of circumstances. Master/master also has its share of downsides (primary key collisions and data drift, to name a couple), and network problems amplify them. You could also use GTID-based replication for data integrity across the replication chain, together with some of the MySQL utilities for faster failover and failback. Each option has trade-offs of its own. If you are looking for a pure synchronous HA solution, Galera based clustering eliminates these additional layers and provides one.

Galera based clustering offers synchronous replication, an active-active multi-master topology (read and write to any node), automatic membership control (nodes join the cluster automatically and are dropped upon failure), parallel row-level replication, and no replication lag, all while providing a standard MySQL look and feel.

Galera based clustering works best when given a minimum of 3 data nodes. Yes, it is possible to run with only 2 data nodes and an arbitrator daemon on a 3rd node, but this configuration leaves you vulnerable to an outage when one of the data nodes suddenly has to assume all client requests because of a problem on the only other data node.
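
The reasoning behind the three-member minimum can be sketched with a little majority arithmetic. The snippet below is only an illustration, assuming every member (data node or arbitrator) carries an equal vote and that the cluster keeps serving only while a strict majority of members can see each other; the function names are made up for this example.

```python
def has_quorum(cluster_size: int, reachable: int) -> bool:
    """Return True if `reachable` members form a strict majority of `cluster_size`."""
    if not 0 <= reachable <= cluster_size:
        raise ValueError("reachable must be between 0 and cluster_size")
    return reachable * 2 > cluster_size


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of members that can be lost while a majority survives."""
    return (cluster_size - 1) // 2


# Compare a few cluster sizes (assumption: one vote per member).
for size in (2, 3, 5):
    print(size, "members tolerate", tolerated_failures(size), "failure(s)")
```

Under these assumptions, two data nodes plus an arbitrator still count as three voting members, so quorum survives the loss of one data node. The remaining data node then carries the whole client load, which is the weakness described above.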

Overall, Galera based clustering is a robust and solid solution for high availability. Running on a 3 node Galera based cluster allows you to distribute writes and reads as you see fit while also allowing you to perform maintenance or schema upgrades without downtime.

Of course, this flexibility also comes with a price. You’ll need at least 3 nodes and a way to load balance the connections to the cluster. Because writes have to be committed on all 3 nodes (take network latency into account), the overall system may be slightly slower than standalone MySQL. The entire cluster will be as slow as its slowest node.
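
As a rough mental model of that latency cost, a commit cannot finish before the slowest peer has answered. The sketch below is a deliberately simplified toy model, not a description of Galera internals; the function name and the sample numbers are assumptions chosen for illustration.

```python
from typing import Sequence


def estimated_commit_ms(local_ms: float, peer_round_trip_ms: Sequence[float]) -> float:
    """Toy model: local work plus the round trip to the slowest peer."""
    slowest = max(peer_round_trip_ms, default=0.0)
    return local_ms + slowest


# Two peers on the same LAN versus one peer in a remote data center.
print(estimated_commit_ms(2.0, [0.5, 0.7]))
print(estimated_commit_ms(2.0, [0.5, 40.0]))
```

In this model, adding a single distant or overloaded node raises the commit time for every write, regardless of which node the client is connected to.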

Currently, InnoDB is the only storage engine supported with Galera based replication, and a primary key must be defined on each table. Writes to tables of other storage engines are not replicated. The exception is DDL, which is replicated at the statement level, so changes made to the mysql system tables through statements such as CREATE USER do reach the other nodes. This means any MyISAM tables need to be converted to InnoDB before using Galera based clustering.
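
A pre-migration check for these two requirements might look like the sketch below. It assumes you have already collected each table's schema, name, storage engine, and whether it has a primary key (for example from `information_schema`); the `TableInfo` type, the function name, and the sample tables are hypothetical.

```python
from typing import Iterable, List, NamedTuple

# System schemas are skipped; only application tables are checked.
SYSTEM_SCHEMAS = {"mysql", "information_schema", "performance_schema", "sys"}


class TableInfo(NamedTuple):
    schema: str
    name: str
    engine: str
    has_primary_key: bool


def galera_readiness_issues(tables: Iterable[TableInfo]) -> List[str]:
    """Return suggested SQL or notes for tables that are not Galera-ready."""
    issues: List[str] = []
    for t in tables:
        if t.schema in SYSTEM_SCHEMAS:
            continue
        if t.engine.lower() != "innodb":
            issues.append(f"ALTER TABLE `{t.schema}`.`{t.name}` ENGINE=InnoDB;")
        if not t.has_primary_key:
            issues.append(f"-- `{t.schema}`.`{t.name}` needs a primary key")
    return issues


# Hypothetical inventory of tables.
sample = [
    TableInfo("shop", "orders", "InnoDB", True),
    TableInfo("shop", "logs", "MyISAM", False),
    TableInfo("mysql", "user", "MyISAM", True),
]
for line in galera_readiness_issues(sample):
    print(line)
```

The output is meant to be reviewed by hand rather than executed blindly, since choosing a sensible primary key depends on the table's data.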

Galera cluster also limits transaction size. Writesets are processed in memory, so large transactions will impact performance. The default limits are 128K rows and 1GB per transaction (the wsrep_max_ws_rows and wsrep_max_ws_size settings). These can be adjusted, but for best practices and performance, large transactions should be broken up into smaller chunks.
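
One common way to break up a large change is to walk the primary key in fixed-size ranges and commit each range as its own transaction. The sketch below only generates the ranges; the table name `events`, the integer `id` column, and the batch size are assumptions for illustration.

```python
from typing import Iterator, Tuple


def id_batches(first_id: int, last_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) id ranges covering first_id..last_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield start, end
        start = end + 1


# Each statement would be run and committed separately (hypothetical table).
for start, end in id_batches(1, 25_000, 10_000):
    print(f"DELETE FROM events WHERE id BETWEEN {start} AND {end};")
```

Keeping each batch well below the row and size limits keeps individual writesets small, at the cost of the overall change no longer being atomic.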

There are currently two forks of MySQL that you can use to set up a Galera based cluster: Percona or MariaDB. Deciding which to use is a matter of personal preference.
