High availability is a must-have for enterprise deployments of OpenStack. I don't mean high availability of individual virtual machines (instances), or of individual compute nodes (hypervisor hosts); I mean high availability of the OpenStack controller stack. True, if the stack falls over, that doesn't necessarily mean your instances stop working (especially if you've carefully placed your networking and storage components on the compute nodes), but we should build the stack so it doesn't fall over. That means deploying the controller stack with no single points of failure. Two of everything.
When I speak of the controller stack, I’m lumping together everything that doesn’t run on compute nodes. This includes the database, message queue service, web portal, API services and object storage. We’ll need to build at least two of each of these.
For each component, we'll need to determine whether it is stateless or stateful. Stateless components are entirely transactional and don't store any data, so there's no need to cluster or replicate anything between the two instances; the OpenStack API services, for example, are stateless. Stateful services store data or other state information, and so will require some form of clustering or replication to keep that data in sync between the two instances.
Next, we need to determine whether each component will run in an active-active or active-passive configuration. An active-active configuration provides scalability as well as high availability, so we'll deploy components that way wherever possible. This decision also drives the techniques we use to provide high availability. For example, we could build a MySQL cluster using Pacemaker, but that would be active-passive. Instead, we'll build an active-active MySQL service using Galera.
The diagram shows a total of ten hosts where all of the components are deployed. You could consolidate these onto fewer hosts, but bear in mind that all of these can be virtual machines, so in fact we can deploy these on just two physical servers. To achieve true high availability, we’ll want to place one of each component on physical host A and the other on physical host B, so that if a physical server fails, you still have one surviving instance of every component on the remaining server.
At the top of the stack is a pair of HAProxy nodes. This is our load balancer layer. A virtual IP address (VIP) is defined that we'll use as the endpoint address for all of the services in the stack. The VIP is active on only one HAProxy node at a time, which means HAProxy runs in an active-passive configuration; however, HAProxy performance is very good, so this won't create a bottleneck. The keepalived service moves the VIP to the other node if the active node fails.
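To give a feel for how the VIP failover works, here's a minimal keepalived.conf sketch for the active HAProxy node. The interface name, virtual_router_id, VIP (192.168.1.100), and password are all placeholders you'd adjust for your environment; the second node gets the same block with state BACKUP and a lower priority.

    vrrp_instance VI_1 {
        state MASTER            # use BACKUP on the second HAProxy node
        interface eth0          # interface that carries the VIP (placeholder)
        virtual_router_id 51
        priority 101            # lower this (e.g. 100) on the backup node
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass changeme  # placeholder shared secret
        }
        virtual_ipaddress {
            192.168.1.100       # the VIP all services are reached through
        }
    }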
Next we've got our MySQL database servers. As I mentioned before, we'll use Galera to do bidirectional (multi-master) replication, which enables us to use either node for database access, and we'll use the load balancers to distribute load across the nodes.
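Here's a rough sketch of the relevant my.cnf settings on one of the Galera nodes. The provider path, node addresses, and cluster name are assumptions for illustration, and the first node has to be bootstrapped separately (with an empty gcomm:// address or a --wsrep-new-cluster flag, depending on the Galera packaging).

    [mysqld]
    binlog_format=ROW
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    wsrep_provider=/usr/lib64/galera/libgalera_smm.so           # path varies by distro
    wsrep_cluster_name="openstack"
    wsrep_cluster_address="gcomm://192.168.1.11,192.168.1.12"   # both database nodes
    wsrep_node_address="192.168.1.11"                           # this node's address
    wsrep_sst_method=rsync

A corresponding haproxy.cfg stanza (again with placeholder addresses) fronts both nodes on the VIP:

    listen galera
        bind 192.168.1.100:3306
        mode tcp
        balance leastconn
        option tcpka
        server mysql1 192.168.1.11:3306 check
        server mysql2 192.168.1.12:3306 check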
The OpenStack API services are mostly stateless, so we can simply load balance them; no replication or clustering tools are required. RabbitMQ has native clustering capability, and we'll enable that; the OpenStack services that use RabbitMQ can be configured to point at multiple RabbitMQ instances. The Glance image service will require shared storage so that both instances can access the stored images, so we'll use our highly available Swift object store for Glance image storage.
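As a sketch of what that looks like in practice (hostnames rabbit1/rabbit2 and the exact option names depend on your OpenStack release, so treat this as illustrative): the second RabbitMQ node joins the first, which requires a matching Erlang cookie on both nodes, and queues are mirrored across the cluster.

    # on rabbit2, after copying /var/lib/rabbitmq/.erlang.cookie from rabbit1
    rabbitmqctl stop_app
    rabbitmqctl join_cluster rabbit@rabbit1
    rabbitmqctl start_app
    # mirror queues across both brokers
    rabbitmqctl set_policy ha-all '^' '{"ha-mode":"all"}'

The OpenStack services are then pointed at both brokers, e.g. in nova.conf on releases that use the rabbit_hosts option:

    [DEFAULT]
    rabbit_hosts = rabbit1:5672,rabbit2:5672
    rabbit_ha_queues = true

And Glance is pointed at Swift, e.g. in glance-api.conf (option names assumed for an older release; credentials are placeholders):

    [DEFAULT]
    default_store = swift
    swift_store_auth_address = http://192.168.1.100:5000/v2.0/   # Keystone, reached via the VIP
    swift_store_user = service:glance
    swift_store_key = GLANCE_PASS                                 # placeholder
    swift_store_create_container_on_put = True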
The last component is the Swift object storage service. The Swift storage nodes are designed to replicate using rsync, so high availability is baked into that component. The Swift proxy is stateless, so we can load balance that piece.
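Load balancing the proxy is just one more HAProxy stanza on the VIP. A sketch, with placeholder addresses, assuming the healthcheck middleware is in the proxy server's pipeline:

    listen swift_proxy
        bind 192.168.1.100:8080
        mode http
        balance roundrobin
        option httpchk GET /healthcheck
        server swiftproxy1 192.168.1.31:8080 check
        server swiftproxy2 192.168.1.32:8080 check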
Building the Stack
How do we build this? There are a lot of steps involved, but each piece is relatively simple. The hard part is getting your head around the whole design, and hopefully this article helps with that. Once you see what you're trying to build, you can tackle each piece, one at a time. The major steps are:

- Set up the two HAProxy load balancer nodes, with keepalived managing the VIP.
- Build the active-active MySQL cluster with Galera.
- Build the RabbitMQ cluster.
- Deploy the Swift proxy and storage nodes.
- Deploy the OpenStack API services behind the load balancers, with Glance using Swift for image storage.
- Add the compute nodes.
Once we're done and have added a bunch more compute nodes, we should end up with an architecture that looks like this:
I don’t really get into the router in this series. The assumption is that you have two routed subnets to work with. If you’re trying to build this at home or in a minimal test lab, you can build yourself a virtual router. You can learn how I’ve done this in my test lab.
For those of you who read my older article about using Pacemaker clustering, I'll say that I did successfully build the stack as an active-passive configuration using Pacemaker, but I found it overly complicated and prone to failure. The active-active solution is, in my opinion, much simpler, more resilient, more scalable, and therefore more suitable for the enterprise.