OpenStack High Availability: MySQL Cluster

By Brian Seltzer | June 12, 2013

Summary: This article shows how to cluster MySQL on CentOS 6.4 using pacemaker, corosync and drbd.

Note: Since writing this article, I’ve decided that providing HA for OpenStack using pacemaker is overly complicated and prone to failure. Instead, I’ve now adopted a simpler approach using HAProxy for load balancing and MySQL replication with Galera.  Please visit my new article: OpenStack High Availability – Controller Stack. However, this article still describes a viable way to achieve MySQL HA, so feel free to read on…

I’m in the midst of an OpenStack deployment, and my plan is to deliver a resilient infrastructure, one that can withstand the occasional server failure. The goal is to provide adequate redundancy for the key components of the OpenStack infrastructure, such as MySQL, keystone, quantum server, etc. The compute nodes themselves can’t be made highly available. If one pops, the VMs on it will die. However, we can set things up so that those VMs can be restarted on another node.

What’s really important is that the shared cloud services stay running. If your quantum network fails, you’ve lost your whole cloud. If your dashboard and API access go down, nobody can deploy. If your cinder service goes offline, you’ll lose access to all your volumes. So, it’s a good idea to make these services redundant.

If you read the OpenStack documentation, it refers to the open source clustering technologies: pacemaker, corosync, and drbd. The documentation provides some sparse configuration information, but it’s far from a complete walk-through. If you follow the links to the primary web sites for these tools, you’ll learn more, but still, I found it difficult to figure it all out. I had to search the web and read a lot of articles before I had any success.

The first step in building a highly available OpenStack environment is to build a MySQL cluster. All of the OpenStack services require access to MySQL (or some other database service), so it better be up and running. Once we’ve got the MySQL cluster built, we’ll have the clustering components installed that we can use for other OpenStack services as well (which I’ll cover in future articles).

The Environment

OK, so I’ve built two CentOS 6.4 servers which will be my controller nodes, running all of the OpenStack services except for compute, which will run on dedicated compute nodes. Though these servers have multiple NICs, I’ll keep things simple and pretend they only have one NIC. In reality, I’ll be using the NIC that is allocated to my management VLAN, which is appropriate for OpenStack services to access the MySQL database. My management network is 192.168.1.0/24, and my two servers are:

  • Node1: 192.168.1.10
  • Node2: 192.168.1.11

I’ll assign a third IP address, 192.168.1.12, which will be the Virtual IP address (VIP) of the MySQL service. This IP will be active on whichever cluster node is currently active, and will move to the other node during a failover. This address will be used to access MySQL.
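Once the cluster resources are configured later in this article, you can see which node currently holds the VIP by looking for the address on each node’s management interface. A quick check (assuming the standard iproute tools on CentOS 6):

ip addr show | grep 192.168.1.12

The node that prints a matching line is the one currently serving MySQL.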

The MySQL data must reside on storage that is accessible by both nodes. This means that it must reside on either shared storage (SAN or NFS), or on storage that is replicated between the nodes (drbd). I’ll show how to do both.

Aside from the shared storage and IP address, the cluster functionality comes from pacemaker and corosync. Pacemaker is a cluster resource manager. It provides the tools to define and group cluster resources, such as the storage, file systems, IP addresses and services, and to specify which resources must run together, what order to bring them online, etc. It monitors the state of the resources and brings resources online and offline as needed to keep them running. Corosync is a messaging service that enables the cluster nodes to communicate with each other and discover if a node has failed. DRBD (Distributed Replicated Block Device) provides a replicated storage volume.

Building the Cluster

OK, the first thing we need to do is add a yum repository that contains the cluster packages, then install the packages. So, on both nodes (do everything below on both nodes unless otherwise noted), we type the following:

yum -y install wget
rpm -Uvh http://elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm
yum -y install drbd84-utils kmod-drbd84 --enablerepo=elrepo
yum -y install pacemaker corosync cluster-glue

Next, we’ll need to alter the firewall rules to allow drbd, corosync and MySQL access. To do this, we’ll edit the file: /etc/sysconfig/iptables. Towards the end of the file, but before any REJECT statements, we add the following lines:


-A INPUT -p udp -m state --state NEW -m multiport --dports 5404,5405 -j ACCEPT
-A INPUT -m tcp -p tcp --dport 7788 -j ACCEPT
-A INPUT -m tcp -p tcp --dport 3306 -j ACCEPT

Then we restart the firewall by typing the following command:


service iptables restart
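To confirm the new rules actually took effect, you can list the running ruleset and look for the ports we just opened (a quick sanity check, not strictly required):

iptables -L INPUT -n | grep -E '5404|5405|7788|3306'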

Next, we edit our /etc/corosync/corosync.conf file to look like this:


totem {
	version: 2
	secauth: off
	threads: 0
	interface {
		ringnumber: 0
		bindnetaddr: 192.168.1.0
		mcastaddr: 226.94.1.1
		mcastport: 5405
		ttl: 1
	}
}

logging {
	fileline: off
	to_stderr: no
	to_logfile: yes
	to_syslog: yes
	logfile: /var/log/cluster/corosync.log
	debug: off
	timestamp: on
	logger_subsys {
		subsys: AMF
		debug: off
	}
}

amf {
	mode: disabled
}

service {
        # Load the Pacemaker Cluster Resource Manager  
        ver:       1
        name:      pacemaker
}

aisexec {
        user:   root
        group:  root
}

Note that the bindnetaddr is the network address of my management network, and that the mcastaddr and mcastport must be unique on the network; otherwise this cluster will interact with other clusters (that would be bad!).

Now we’ll set the cluster services to start automatically when the servers boot:


chkconfig --level 3 corosync on
service corosync start
chkconfig --level 3 pacemaker on
service pacemaker start

OK, remember: every step we’ve done so far, we’ve done on both nodes, right? If not, go back and do everything on the second node. Now, once corosync and pacemaker are running on both nodes, we can check to see if corosync can see both nodes by typing:

corosync-objctl runtime.totem.pg.mrp.srp.members

You should see reference to the IP addresses of both nodes.
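Another quick check (assuming the corosync 1.x tools installed above) is to ask corosync for its ring status; it should report the node’s own ring address as active with no faults:

corosync-cfgtool -s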

OK, for the next part, we’ll install MySQL and the crm utility, which we will use to create the pacemaker configuration. Unfortunately, the crm utility is missing from the repositories that we already have, and the repository that provides it contains conflicting versions of our cluster software, so we’ll add it, use it, then disable it. Kinda messy. We type the following commands:

yum install mysql-server
wget -P /etc/yum.repos.d/ http://download.opensuse.org/repositories/network:/ha-clustering/CentOS_CentOS-6/network:ha-clustering.repo
yum install crmsh

Then we disable the new repo. To do this, edit the file /etc/yum.repos.d/network:ha-clustering.repo, find the line enabled=1 and change it to enabled=0. Again, the steps above are done on both nodes.
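If you’d rather script that edit, a one-liner along these lines (a quick sketch; double-check the filename, since wget may have saved the repo file under a slightly different name) does the same thing on both nodes:

# flip enabled=1 to enabled=0 in the ha-clustering repo file
sed -i 's/^enabled=1/enabled=0/' /etc/yum.repos.d/network:ha-clustering.repo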

Next, we set some cluster settings which are specific to our two-node cluster. This is done on only one node:


crm configure
property no-quorum-policy="ignore" pe-warn-series-max="1000" pe-input-series-max="1000" pe-error-series-max="1000" cluster-recheck-interval="5min"
property stonith-enabled=false
commit
exit

DRBD Replicated Storage

You can skip these steps if you’ll be using SAN storage. DRBD is used to replicate data between disks or LVM volumes that are not shared (i.e. locally attached). In my case, I’ve got a second hard disk /dev/sdb on each node. To configure drbd, we edit the /etc/drbd.conf file like so:

global { usage-count no; }
resource mysql {
  protocol C;
  startup { wfc-timeout 0; degr-wfc-timeout     120; }
  disk { on-io-error detach; } # or panic, ...
  net {  cram-hmac-alg "sha1"; shared-secret "mySecret"; }
  syncer { rate 10M; }
  on node1.behindtheracks.com {
    device /dev/drbd0;
    disk /dev/sdb;
    address 192.168.1.10:7788;
    meta-disk internal;
  }
  on node2.behindtheracks.com {
    device /dev/drbd0;
    disk /dev/sdb;
    address 192.168.1.11:7788;
    meta-disk internal;
  }
}
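Before initializing anything, it’s worth having drbdadm parse the file and echo back what it understood; if there’s a typo in the resource definition, this is where it will complain (a quick sanity check on both nodes):

drbdadm dump mysql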

Then, we type some drbd commands to initialize the replica:


drbdadm create-md mysql
modprobe drbd
drbdadm up mysql

And then on one node only:


drbdadm -- --force primary mysql
mkfs.xfs /dev/drbd0
drbdadm secondary mysql

Replication should begin. To see the status of the replication, type the following:


cat /proc/drbd

It’s a good idea to wait for replication to be completed before proceeding to the next step. It should only take a minute or two since all we have is an empty file system at this point.
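If you’d rather not keep re-running that command, something like watch will refresh the status every couple of seconds; you’re waiting for both sides to show UpToDate/UpToDate:

watch -n2 cat /proc/drbd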

Installing MySQL

OK now, on both nodes, let’s make sure MySQL is installed (if you already installed it in the previous step, yum will simply report that there’s nothing to do):

yum install mysql-server

Then we’ll add the VIP to our configuration by adding the following line to the /etc/my.cnf on both nodes, under the [mysqld] section:


bind-address = 192.168.1.12

Configuring the Cluster Resources

OK, now we configure the cluster resources in pacemaker. This part is done on only one node. We’re going to add resources for the VIP, the file system, the MySQL service, and the drbd device. We then group and order the resources. If we’re using drbd storage, then we enter the following:


crm configure
primitive p_ip_mysql ocf:heartbeat:IPaddr2 params ip="192.168.1.12" cidr_netmask="24" op monitor interval="30s"
primitive p_drbd_mysql ocf:linbit:drbd params drbd_resource="mysql" op start timeout="90s" op stop timeout="180s" op promote timeout="180s" op demote timeout="180s" op monitor interval="30s" role="Slave" op monitor interval="29s" role="Master"
primitive p_mysql lsb:mysqld op monitor interval="20s" timeout="10s" op start timeout="120s" op stop timeout="120s"
primitive p_fs_mysql ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/var/lib/mysql" fstype="xfs" options="noatime" op start timeout="60s" op stop timeout="180s" op monitor interval="60s" timeout="60s"
group g_mysql p_ip_mysql p_fs_mysql p_mysql 
ms ms_drbd_mysql p_drbd_mysql meta notify="true" clone-max="2"
colocation c_mysql_on_drbd inf: g_mysql ms_drbd_mysql:Master
order o_drbd_before_mysql inf: ms_drbd_mysql:promote g_mysql:start
commit
exit

If we’re using SAN storage, then we don’t need the drbd resource, and the configuration gets quite a bit simpler. In my case, I established an iSCSI target on /dev/sdb and created a partition /dev/sdb1 and formatted it as ext4. So the configuration becomes:


crm configure
primitive p_ip_mysql ocf:heartbeat:IPaddr2 params ip="192.168.1.12" cidr_netmask="24" op monitor interval="30s"
primitive p_mysql lsb:mysqld op monitor interval="20s" timeout="10s" op start timeout="120s" op stop timeout="120s"
primitive p_fs_mysql ocf:heartbeat:Filesystem params device="/dev/sdb1" directory="/var/lib/mysql" fstype="ext4" options="noatime" op start timeout="60s" op stop timeout="180s" op monitor interval="60s" timeout="60s"
group g_mysql p_ip_mysql p_fs_mysql p_mysql 
commit
exit

Once the configuration has been committed, the cluster will bring up the resources, and if all goes well, you should have MySQL running on one of the nodes, accessible via the VIP address. To check the cluster status, type the following command:

crm_mon -1
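You can also confirm on the active node that mysqld is really listening on the VIP (a quick check using net-tools, which is installed by default on CentOS 6):

netstat -tlnp | grep 3306

You should see mysqld bound to 192.168.1.12:3306.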

Final MySQL Configuration

You may have a number of configuration steps that you perform on MySQL in your environment, so these steps may not apply to you at all. In any case, I would be remiss to leave them out. For this, we should ssh into the active node. No need to determine which node is active; just ssh to the VIP address, then type the following commands:

mysql_secure_installation
mysql -u root -p
grant all on *.* to 'root'@'%' identified by 'yourpassword';
quit;
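With that grant in place, it’s worth verifying that MySQL is reachable through the VIP from another machine on the management network (a quick test, assuming the mysql client is installed there):

mysql -h 192.168.1.12 -u root -p -e 'select version();'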

That’s it. You can test the cluster by rebooting the active node. If you run crm_mon -1 a few times from the other node, you will see the resources start back up on the surviving node. Enjoy your highly available MySQL service.  Stay tuned for future articles about OpenStack high availability as my deployment proceeds!
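If you’d rather not reboot a node just to test failover, the crm shell offers a gentler version of the same test: put the active node into standby, watch the resources move, then bring it back online (substitute your own node name as shown by crm_mon):

# run from either node; resources will move to the other node
crm node standby node1.behindtheracks.com
crm_mon -1
# bring the node back into the cluster when you're done
crm node online node1.behindtheracks.com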


2 thoughts on “OpenStack High Availability: MySQL Cluster”

  1. ehung lu

    Hello Brian:

    We have deployed the MySQL HA successfully on two physical computers.

    Then we tried to deploy the MySQL HA the same way on two VMs on OpenStack, but we ran into some trouble.

    The two VMs synchronize and create a virtual IP, but VMs in the same domain get no response from the virtual IP via the ‘ping’ command.

    Could you give me some advice?

    Thanks in advance.

    BR,

    ehung lu

    1. Brian Seltzer

      A quick test to see if your VMs’ firewalls are blocking the pings: on both VMs, type sudo service iptables stop, then try the ping again. If the ping works after that, then you need to adjust your firewall rules.

