High availability refers to the practice of keeping online resources available through node failure or system maintenance. This guide will demonstrate a method for using two Linodes to keep a website online, even when the node initially hosting it is powered off. IP failover, Heartbeat 3.0, Pacemaker 1.1, and Apache 2.2 will be used for this example configuration.
As high availability is a complex topic with many methods available for achieving various goals, it should be noted that the method discussed here may not be appropriate for some use cases. However, it should provide a good foundation for developing a customized HA solution, and this configuration would work well for a static site. To support a dynamic website, you might want to implement additional features, such as distributed shared storage for your database files or daemons. While such a setup is beyond the scope of this guide, it will be covered in a future tutorial.
This guide assumes you have two active Linodes on your account, and that both are freshly deployed Fedora 13 instances. If you only have one Linode on your account, you may add another by clicking the "Add a Linode to this Account" link on the "Linodes" tab of the Linode Manager.
Important: Both Linodes must reside in the same datacenter for IP failover (a required component of this guide) to work. Future HA guides will address combining the principles demonstrated in this tutorial with cross-datacenter clustering techniques.
Throughout this document, the following terms are used:
You should substitute your own values for these terms wherever they are found in this document.
Choose one Linode to serve as the "primary" node. Log into it via SSH as root and edit its /etc/hosts file to resemble the following:
File: /etc/hosts (on primary node)
127.0.0.1 localhost.localdomain localhost
12.34.56.78 ha1.example.com ha1
98.76.54.32 ha2.example.com ha2
Remember to substitute your primary and secondary Linode's IP addresses for 12.34.56.78 and 98.76.54.32, respectively, along with appropriate hostnames for each. You will find the IP addresses for your Linodes on their "Remote Access" tabs in the Linode Manager.
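Before moving on, you may want to confirm that the hosts file maps each short hostname to the address you intended. The following is a minimal sketch rather than part of the original procedure; the function name hosts_lookup is hypothetical, and it assumes a whitespace-separated hosts file like the one shown above.

```shell
# Print the IP address that a hosts-format file maps to a given name.
# Prints nothing and returns 1 if the name is not present.
# (hosts_lookup is a hypothetical helper name used for illustration.)
hosts_lookup() (
    file=$1
    name=$2
    awk -v name="$name" '
        /^[ \t]*#/ { next }                # skip comment lines
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
)

# Example usage on either node:
#   hosts_lookup /etc/hosts ha1
#   hosts_lookup /etc/hosts ha2
```

If either lookup prints nothing, the corresponding entry is missing or misspelled.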
For the sake of simplicity, it is recommended that you keep the short hostnames assigned as ha1 and ha2. Next, issue the following commands to generate SSH keys for the root user on each VPS, synchronize their SSH host keys, set their hostnames, and allow passwordless logins from each to the other. SSH host key synchronization will prevent issues with key checking later on, which might otherwise occur should you need to perform an SSH login via a DNS name pointing to a floating IP while the secondary node is serving your content. You will be prompted to assign passphrases to the SSH keys; this is optional, and you may skip this step by pressing the "Enter" key.
ssh-keygen -t rsa
scp ~/.ssh/id_rsa.pub root@ha2:/root/ha1_key.pub
ssh root@ha2 "ssh-keygen -t rsa"
ssh root@ha2 "echo \`cat ~/ha1_key.pub\` >> ~/.ssh/authorized_keys2"
ssh root@ha2 "rm ~/ha1_key.pub"
scp root@ha2:/root/.ssh/id_rsa.pub /root
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys2
rm -f ~/id_rsa.pub
scp /etc/ssh/ssh_host* root@ha2:/etc/ssh/
rm -f ~/.ssh/known_hosts
ssh root@ha2 "service sshd restart"
scp /etc/hosts root@ha2:/etc/hosts
echo "HOSTNAME=ha1" >> /etc/sysconfig/network
hostname "ha1"
ssh root@ha2 "echo \"HOSTNAME=ha2\" >> /etc/sysconfig/network"
ssh root@ha2 "hostname \"ha2\""
By default, when Linodes are booted, DHCP is used to assign IP addresses. This works fine for cases where a Linode will only have one IP address, as DHCP will always assign that IP to the Linode. If a Linode has or may have multiple IPs assigned to it, as is the case with this configuration, an explicit static configuration is required.
On the primary Linode, edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file to resemble the following, making sure the values entered match those shown on the "Remote Access" tab for the primary Linode:
File: /etc/sysconfig/network-scripts/ifcfg-eth0 (on primary node)
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=12.34.56.78
NETMASK=255.255.255.0
GATEWAY=12.34.56.1
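A mistyped address or gateway in this file can leave a Linode unreachable once networking restarts, so it may be worth a quick sanity check first. The sketch below is an illustration only: the helper names (ip_to_int, same_subnet, ifcfg_get, check_ifcfg) are hypothetical, and it assumes simple KEY=value lines with dotted-quad values that have no leading zeros. It checks that IPADDR and GATEWAY fall within the same network under NETMASK.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1                      # split the address on dots
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if address $1 and gateway $3 share a network under netmask $2.
same_subnet() (
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    gw=$(ip_to_int "$3")
    [ $(( ip & mask )) -eq $(( gw & mask )) ]
)

# Print the value of KEY ($2) from an ifcfg-style file ($1).
ifcfg_get() {
    sed -n "s/^$2=//p" "$1" | tr -d '"' | tail -n 1
}

# Check the IPADDR, NETMASK and GATEWAY values found in an ifcfg file.
check_ifcfg() (
    ip=$(ifcfg_get "$1" IPADDR)
    mask=$(ifcfg_get "$1" NETMASK)
    gw=$(ifcfg_get "$1" GATEWAY)
    same_subnet "$ip" "$mask" "$gw"
)

# Example usage:
#   check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0 && echo "looks consistent"
```

The same check can be run against the secondary Linode's file once it has been edited.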
Issue the following command to restart networking on the primary Linode:
service network restart
On the primary Linode, edit the /etc/resolv.conf file to resemble the following. Replace 11.11.11.11 and 22.22.22.22 with the DNS servers listed on the Linode's "Remote Access" tab in the Linode Manager.
File: /etc/resolv.conf (on primary node)
nameserver 11.11.11.11
nameserver 22.22.22.22
options rotate
On the secondary Linode, edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file to resemble the following, making sure the values entered match those shown on the "Remote Access" tab for the secondary Linode:
File: /etc/sysconfig/network-scripts/ifcfg-eth0 (on secondary node)
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=98.76.54.32
NETMASK=255.255.255.0
GATEWAY=98.76.54.1
On the primary Linode, issue the following commands to copy the /etc/resolv.conf file to the secondary Linode and restart networking:
scp /etc/resolv.conf root@ha2:/etc/
ssh root@ha2 "service network restart"
First, add a second IP address to your primary Linode by navigating to the "Extras" tab for this Linode and selecting one additional IP as an extra. After purchasing the additional IP, visit the "Remote Access" tab for the primary Linode and make a note of the newly added IP address. This will serve as your "floating" IP.
Next, navigate to the "Remote Access" tab for the secondary Linode and click the "IP Failover linkage" link in the bottom left of the screen. Check the box next to the newly added IP address and click the "Submit" button. You will see a message above the IP list informing you that IP failover configuration for this Linode has been updated.
Important(1): After configuring IP failover linkage, reboot both Linodes from their "Dashboard" tabs in the Linode Manager. This will allow the new IP address to be routed properly, and will allow you to verify that both Linodes come back up properly.
Important(2): Before proceeding, add a DNS entry for your example site ("catalog.example.com" in our case), pointing to the newly assigned "floating" IP address. Please note that DNS may take some time to propagate fully; for testing purposes, you may wish to add an entry to your workstation's /etc/hosts file (Mac OS X or Linux), pointing the test site to the floating IP. Microsoft Windows users may add an entry to their local workstation's hosts file as well, although its location will vary according to the version of Windows installed. As a third option, you may add an entry to your LAN's local DNS server to point the site to the floating IP.
On the primary Linode, issue the following commands to install Heartbeat, Pacemaker, and Apache 2. The second set of commands will ensure that the same packages are installed on the secondary Linode as well.
yum update -y
yum install heartbeat pacemaker httpd -y
chkconfig httpd off
mkdir -p /srv/www/example.com/catalog/public_html
mkdir /srv/www/example.com/catalog/logs
ssh root@ha2 "yum update -y"
ssh root@ha2 "yum install heartbeat pacemaker httpd -y"
ssh root@ha2 "chkconfig httpd off"
ssh root@ha2 "mkdir -p /srv/www/example.com/catalog/public_html"
ssh root@ha2 "mkdir /srv/www/example.com/catalog/logs"
After issuing the commands listed above, the required packages will be installed. Additionally, the system startup links for Apache will be removed on both Linodes, as Pacemaker will be responsible for starting and stopping the web server daemon as necessary.
On the primary Linode, create a file named /etc/httpd/conf.d/catalog.example.com.conf with the following contents. Substitute the "floating" address for 55.55.55.55 in the example shown below. In this example, the site "catalog.example.com" is being made highly available.
File: /etc/httpd/conf.d/catalog.example.com.conf (on primary node)
NameVirtualHost 55.55.55.55
<VirtualHost 55.55.55.55:80>
ServerAdmin support@example.com
ServerName catalog.example.com
DocumentRoot /srv/www/example.com/catalog/public_html/
ErrorLog /srv/www/example.com/catalog/logs/error.log
CustomLog /srv/www/example.com/catalog/logs/access.log combined
</VirtualHost>
On the primary Linode, issue the following command to copy the Apache configuration file to the secondary Linode:
scp /etc/httpd/conf.d/catalog.example.com.conf root@ha2:/etc/httpd/conf.d/
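Because Pacemaker will be starting Apache on whichever node is active, a missing document root or log directory on one node may only surface during a failover. As a rough sketch (the function names vhost_dirs and check_vhost_dirs are hypothetical, and paths containing spaces are not handled), the following lists the directories a virtual host file depends on and reports any that are missing.

```shell
# Print the directories a virtual host file depends on: the DocumentRoot
# itself, plus the parent directory of each ErrorLog and CustomLog target.
vhost_dirs() {
    awk '
        $1 == "DocumentRoot" { print $2 }
        $1 == "ErrorLog" || $1 == "CustomLog" {
            sub("/[^/]*$", "", $2)     # strip the file name, keep the directory
            print $2
        }
    ' "$1" | sort -u
}

# Report each missing directory; return non-zero if any are missing.
check_vhost_dirs() (
    status=0
    for dir in $(vhost_dirs "$1"); do
        [ -d "$dir" ] || { echo "missing: $dir"; status=1; }
    done
    exit "$status"
)

# Example usage (run on each node):
#   check_vhost_dirs /etc/httpd/conf.d/catalog.example.com.conf
```

Running it on both Linodes gives some assurance that the copied configuration will work on either node.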
On the primary Linode, create a test page showing content being served from ha1:
File: /srv/www/example.com/catalog/public_html/index.html (on primary node)
<html>
<head>
<title>Test page served from ha1</title>
</head>
<body>
<h1>Test page served from ha1</h1>
</body>
</html>
On the secondary Linode, create a test page showing content being served from ha2:
File: /srv/www/example.com/catalog/public_html/index.html (on secondary node)
<html>
<head>
<title>Test page served from ha2</title>
</head>
<body>
<h1>Test page served from ha2</h1>
</body>
</html>
In this example, you've configured different pages to be served from each node, depending on which is active. This is good for testing, but in a production setup you'd want to host exactly the same files on each node, making sure to synchronize them any time changes were made to the site.
On the primary Linode, create a file named /etc/ha.d/ha.cf with the following contents. Replace 98.76.54.32 with the statically assigned IP address of the secondary Linode.
File: /etc/ha.d/ha.cf (on primary node)
logfile /var/log/heartbeat.log
logfacility local0
keepalive 2
deadtime 15
warntime 5
initdead 120
udpport 694
ucast eth0 98.76.54.32
auto_failback on
node ha1
node ha2
use_logd no
crm respawn
On the secondary Linode, create a file named /etc/ha.d/ha.cf with the following contents. Replace 12.34.56.78 with the statically assigned IP address of the primary Linode.
File: /etc/ha.d/ha.cf (on secondary node)
logfile /var/log/heartbeat.log
logfacility local0
keepalive 2
deadtime 15
warntime 5
initdead 120
udpport 694
ucast eth0 12.34.56.78
auto_failback on
node ha1
node ha2
use_logd no
crm respawn
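The two ha.cf files differ only in their "ucast" line, which names the other node's address. One way to avoid copy-and-paste drift between them is to generate both from a single template. The sketch below assumes the settings shown above; render_ha_cf is a hypothetical helper name, not a Heartbeat tool.

```shell
# Print an ha.cf matching this guide's settings, with the peer's
# statically assigned IP address ($1) in the ucast line.
render_ha_cf() {
    cat <<EOF
logfile /var/log/heartbeat.log
logfacility local0
keepalive 2
deadtime 15
warntime 5
initdead 120
udpport 694
ucast eth0 $1
auto_failback on
node ha1
node ha2
use_logd no
crm respawn
EOF
}

# Example usage:
#   render_ha_cf 98.76.54.32 > /etc/ha.d/ha.cf    # on the primary node
#   render_ha_cf 12.34.56.78 > /etc/ha.d/ha.cf    # on the secondary node
```

Each node points at its peer, so the primary uses the secondary's address and vice versa.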
On the primary Linode, create the file /etc/ha.d/authkeys with the following contents. Make sure to change "CHANGEME" to a strong password consisting of letters and numbers.
File: /etc/ha.d/authkeys (on primary node)
auth 1
1 sha1 CHANGEME
Issue the following commands to set proper permissions on this file, copy it to the secondary Linode, and start the Heartbeat service on both nodes:
chmod 600 /etc/ha.d/authkeys
service heartbeat start
scp /etc/ha.d/authkeys root@ha2:/etc/ha.d/
ssh root@ha2 "chmod 600 /etc/ha.d/authkeys"
ssh root@ha2 "/etc/init.d/heartbeat start"
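If you would rather not invent the shared secret by hand, it can be generated from random data instead. This is a sketch under a few assumptions: the function names generate_secret and write_authkeys are hypothetical, and the sha1sum and head utilities are assumed to be installed. The resulting secret consists of letters and numbers only, as recommended above.

```shell
# Print a 40-character hexadecimal secret derived from 32 random bytes.
generate_secret() {
    head -c 32 /dev/urandom | sha1sum | cut -d ' ' -f 1
}

# Write an authkeys file ($1) using the sha1 method with a fresh secret,
# readable and writable only by its owner.
write_authkeys() (
    umask 077
    printf 'auth 1\n1 sha1 %s\n' "$(generate_secret)" > "$1"
    chmod 600 "$1"
)

# Example usage on the primary node, before copying the file to ha2:
#   write_authkeys /etc/ha.d/authkeys
```

The file must still be identical on both nodes, so generate it once on the primary Linode and copy it to the secondary as shown above.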
It should be noted that unless you have a different editor set via the "EDITOR" environment variable, the cluster resource manager will use vim as its editing environment. If you would prefer to use nano instead, you may set this permanently by issuing the following commands on both Linodes:
export EDITOR=/bin/nano
echo "export EDITOR=/bin/nano" >> ~/.bashrc
For the purposes of these instructions, it will be assumed that you are using vim as your editor. On the primary Linode, issue the following command to start the cluster resource manager in "edit" mode:
crm configure edit
You will be presented with information resembling the following. If you don't see anything, enter ":q" to quit the editor and wait a minute before restarting it.
node $id="b4956911-457e-41eb-95cb-9d302149d386" ha1
node $id="f6d3d9ee-50a6-484f-a5a7-a08b3f78501d" ha2
property $id="cib-bootstrap-options" \
dc-version="1.1.1-972b9a5f68606f632893fceed658efa085062f55" \
cluster-infrastructure="Heartbeat"
To begin editing your configuration, press the "i" key. To leave edit mode, press "Ctrl+c". To quit without saving any changes, press ":" and enter "q!". To save changes and quit, press ":" and enter "wq".
Insert the following lines in between the second "node" line at the top of the configuration and the "property" line at the bottom. Important: Be sure to replace both instances of 55.55.55.55 with the "floating" IP address used earlier.
primitive apache2 lsb:httpd \
op monitor interval="5s"
primitive ip1 ocf:heartbeat:IPaddr2 \
params ip="55.55.55.55" nic="eth0:0"
primitive ip1arp ocf:heartbeat:SendArp \
params ip="55.55.55.55" nic="eth0:0"
group WebServices ip1 ip1arp apache2
order arp_after_ip inf: ip1:start ip1arp:start
order web_after_ip inf: ip1arp:start apache2:start
colocation ip_with_arp inf: ip1 ip1arp
colocation web_with_ip inf: apache2 ip1
Change the "property" section to resemble the following excerpt. You'll be adding an "expected-quorum-votes" entry because your cluster only has two nodes, as well as adding the lines for "stonith-enabled" and "no-quorum-policy". Don't forget the trailing "\" after the "cluster-infrastructure" line.
property $id="cib-bootstrap-options" \
dc-version="1.1.1-972b9a5f68606f632893fceed658efa085062f55" \
cluster-infrastructure="Heartbeat" \
expected-quorum-votes="1" \
stonith-enabled="false" \
no-quorum-policy="ignore"
Add the following excerpt after the "property" section:
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Your complete configuration should resemble the following:
node $id="b4956911-457e-41eb-95cb-9d302149d386" ha1
node $id="f6d3d9ee-50a6-484f-a5a7-a08b3f78501d" ha2
primitive apache2 lsb:httpd \
op monitor interval="5s"
primitive ip1 ocf:heartbeat:IPaddr2 \
params ip="55.55.55.55" nic="eth0:0"
primitive ip1arp ocf:heartbeat:SendArp \
params ip="55.55.55.55" nic="eth0:0"
group WebServices ip1 ip1arp apache2
order arp_after_ip inf: ip1:start ip1arp:start
order web_after_ip inf: ip1arp:start apache2:start
colocation ip_with_arp inf: ip1 ip1arp
colocation web_with_ip inf: apache2 ip1
property $id="cib-bootstrap-options" \
dc-version="1.1.1-972b9a5f68606f632893fceed658efa085062f55" \
cluster-infrastructure="Heartbeat" \
expected-quorum-votes="1" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
After making these changes, press "Ctrl+c" and enter ":wq" to save the configuration and exit the editor.
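Since the example address 55.55.55.55 appears in more than one place, it is easy to leave one instance unreplaced. As a small illustrative check (count_placeholders is a hypothetical helper name), the following counts the lines of its input that still contain the placeholder; it can be fed the saved configuration from "crm configure show".

```shell
# Count input lines that still contain the placeholder address.
# An alternative placeholder may be passed as the first argument.
count_placeholders() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -F "${1:-55.55.55.55}" || true
}

# Example usage:
#   crm configure show | count_placeholders
#   count_placeholders < /etc/httpd/conf.d/catalog.example.com.conf
```

A result of 0 means no placeholder lines remain in that input.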
On the primary Linode, issue the command crm_mon to start the cluster monitor. You'll see output resembling the following:
============
Last updated: Tue Jun 29 13:32:08 2010
Stack: Heartbeat
Current DC: ha2 (f6d3d9ee-50a6-484f-a5a7-a08b3f78501d) - partition with quorum
Version: 1.1.1-972b9a5f68606f632893fceed658efa085062f55
2 Nodes configured, 1 expected votes
1 Resources configured.
============
Online: [ ha1 ha2 ]
Resource Group: WebServices
ip1 (ocf::heartbeat:IPaddr2): Started ha1
ip1arp (ocf::heartbeat:SendArp): Started ha1
apache2 (lsb:httpd): Started ha1
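For scripting or quick checks, the same status can be read non-interactively with crm_mon's one-shot mode ("crm_mon -1"). The following sketch assumes resource lines formatted as in the sample output above; resource_node is a hypothetical helper name.

```shell
# Read crm_mon output on stdin and print the node on which the named
# resource ($1) is started. Returns non-zero if no such line is found.
resource_node() {
    awk -v res="$1" '
        NF >= 3 && $1 == res && $(NF - 1) == "Started" { print $NF; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example usage:
#   crm_mon -1 | resource_node apache2
```

If the output format of your Pacemaker version differs from the sample, the field positions used here would need adjusting.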
At this point, you should be able to visit your highly available site in your web browser. Depending on which node your resources were initially started on, you'll see the test page hosted on either ha1 or ha2. In this example, the clustered resources are started on ha1. To move them to ha2, issue the following command on either node:
crm resource move WebServices ha2
Within a few seconds, the resources will be stopped on the initial node and started on the other one. You can confirm this by refreshing your test site in your browser. To move the resources back to ha1, simply issue the following command:
crm resource move WebServices ha1
At this point, you should be able to shut down the Linode hosting your resources and watch them automatically migrate to the other Linode (provided you have crm_mon running in a terminal on the still-active Linode). Note that because "resource-stickiness" is set at "100", resources will stay wherever you manually migrated them until you manually move them to another node. This can be helpful in cases where you need to perform maintenance on a node, but don't want services resuming on it until you're ready. Congratulations, you've successfully implemented a basic web services high availability configuration!
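To watch a failover from your workstation rather than from a browser, you could poll the site and report which node answered. The sketch below relies on the two test pages created earlier, whose titles name the serving node; served_from is a hypothetical helper name, and the commented loop assumes curl is installed on your workstation.

```shell
# Read an HTML page on stdin and print the node named in its <title>,
# as written in this guide's test pages. Prints nothing for other pages.
served_from() {
    sed -n 's/.*<title>Test page served from \([^<]*\)<\/title>.*/\1/p'
}

# Example usage: poll once per second while shutting down the active node.
#   while true; do
#       curl -s http://catalog.example.com/ | served_from
#       sleep 1
#   done
```

During the switchover you should expect a brief run of empty results before the other node's name starts appearing.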
You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.
This guide is licensed under a Creative Commons Attribution-NoDerivs 3.0 United States License.
Last edited by Phil Paradis on Tuesday, May 17th, 2011 (r1919).