Thread: Automatic failover with heartbeat and an Elastic IP
Automatic failover with heartbeat and an Elastic IP
Posted:
Jan 7, 2009 9:59 PM PST
Hello! Since I could not find this elsewhere on the internet, I thought I would post the solution that we came up with for doing automatic failover using heartbeat and Elastic IPs. It is not perfect, as Elastic IP remapping does not seem to propagate instantly, but it is a step in the right direction for high availability.
The first trick is that you need to write an init script (my example is below) that sets up the Elastic IP. It has to support start, stop, and status, as per:
http://www.linux-ha.org/resource
From there, you make this "resource" (in heartbeat speak) something that your heartbeat setup acquires, via the haresources file:
http://www.linux-ha.org/haresources
The next trick is that you need to use the ucast option in the heartbeat config to talk to the peer, since broadcast and multicast are not allowed on EC2.
Then it is just a matter of setting up heartbeat on the nodes that you want to do failover.
This was tested on Ubuntu 8.10.
Here are my example configs:
/etc/ha.d/ha.cf
- copy to both machines, but update the ucast IP on each one to point at the opposite peer
--
logfacility daemon # Log to syslog as facility "daemon"
node domU-XX-XX-XX-XX domU-XX-XX-XX-XX # List our cluster members
keepalive 1 # Send one heartbeat each second
deadtime 10 # Declare nodes dead after 10 seconds
ucast eth0 10.254.18.XXX # internal IP of the peer
auto_failback no # Don't fail back automatically
--
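Since the only difference between the two copies of ha.cf is the ucast peer address, the file can be generated instead of edited by hand. This is a minimal sketch, not part of the original setup: render_ha_cf is a hypothetical helper name, and the node names and IP in the usage comment are placeholders. Inline comments are left out of the generated file to keep the output simple.

```shell
#!/bin/bash
# Sketch: print an ha.cf for one node of a two-node cluster.
# render_ha_cf is a hypothetical helper, not something heartbeat ships.
# Arguments: <node A name> <node B name> <internal IP of the OTHER node>
render_ha_cf() {
  local node_a=$1 node_b=$2 peer_ip=$3
  cat <<EOF
logfacility daemon
node $node_a $node_b
keepalive 1
deadtime 10
ucast eth0 $peer_ip
auto_failback no
EOF
}

# Example with placeholder values; run on each node with the other node's IP:
# render_ha_cf domU-AA domU-BB 10.0.0.2 > /etc/ha.d/ha.cf
```

The node names should match what `uname -n` prints on each machine, as in the hand-written config above.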
/etc/ha.d/haresources
- copied to both machines as is
--
domU-XX-XX-XX-XX elastic-ip
--
/etc/ha.d/authkeys
- copied to both machines
--
auth 1
1 sha1 yourSecretKey
--
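One possible way to create the authkeys file with a random key rather than a hand-typed secret is sketched below. write_authkeys is a hypothetical helper name. The sketch also restricts the file to mode 600, on the assumption that your heartbeat build rejects an authkeys file readable by other users. Generate the file once and copy that same file to both nodes, since the key has to match.

```shell
#!/bin/bash
# Sketch: write an authkeys file with a random sha1 key, readable by owner only.
# write_authkeys is a hypothetical helper, not part of heartbeat.
write_authkeys() {
  local path=$1 key
  # 20 random bytes rendered as 40 lowercase hex characters
  key=$(head -c 20 /dev/urandom | od -An -tx1 | tr -d ' \n')
  # Create the file under a restrictive umask so it is never world-readable
  ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$path" )
  chmod 600 "$path"
}

# Example: write_authkeys /etc/ha.d/authkeys
```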
/etc/init.d/elastic-ip
- depends on the "aws" script available at
http://timkay.com/aws/. This definitely needs work to check for errors on the API calls, but it does the trick for testing. You will need to copy this to both nodes and set MY_ID to the instance ID of the local machine.
--
#!/bin/bash
# Heartbeat resource script: maps an Elastic IP to this instance.
MY_ID="i-XXXXXXX" # different for each node
ELASTIC_IP="174.129.253.XXX"
case "$1" in
  start)
    aws associate-address "$ELASTIC_IP" -i "$MY_ID" > /dev/null
    echo "$0 started"
    ;;
  stop)
    aws disassociate-address "$ELASTIC_IP" > /dev/null
    echo "$0 stopped"
    ;;
  status)
    # The grep pipeline succeeds only if this IP is mapped to this instance
    if aws describe-addresses | grep -F "$ELASTIC_IP" | grep -qF "$MY_ID"; then
      echo "$0 OK"
    else
      echo "$0 FAIL"
    fi
    ;;
  *)
    echo "Usage: $0 {start|stop|status}" >&2
    exit 1
    ;;
esac
--
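Because the remapping does not seem to take effect instantly and the script above does not check its API calls, one option is to poll until the mapping is actually visible before reporting success. The following is only a sketch under those assumptions: is_mapped and wait_until are hypothetical helper names, and the attempt count and delay in the usage comment are arbitrary. It reuses the same grep check as the status action.

```shell
#!/bin/bash
# Sketch: helpers for confirming that an Elastic IP mapping has taken effect.
# Both function names are hypothetical, not part of heartbeat or the aws script.

# is_mapped <elastic-ip> <instance-id>
# Reads an address listing on stdin and succeeds if some line mentions both
# the IP and the instance ID (the same test the status action uses).
is_mapped() {
  grep -F -- "$1" | grep -qF -- "$2"
}

# wait_until <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts run out.
wait_until() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 0; i < attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Possible use inside the start action (untested against the real API):
# check_mine() { aws describe-addresses | is_mapped "$ELASTIC_IP" "$MY_ID"; }
# aws associate-address "$ELASTIC_IP" -i "$MY_ID" > /dev/null
# wait_until 30 2 check_mine || { echo "$0 FAIL"; exit 1; }
```

Keep the total wait well under whatever timeout your heartbeat setup applies to resource scripts.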
To test, make the adjustments to the configs and init script as described, then start heartbeat on both servers. The preferred node will take over the Elastic IP, and if you interrupt service on that host (for example with /etc/init.d/heartbeat stop), the Elastic IP will be remapped to the secondary node! Pretty cool!
Hope this helps somebody.
-Alex Polvi
Co-Founder, Overcast.me
Re: Automatic failover with heartbeat and an Elastic IP
Posted:
Jul 28, 2009 7:15 AM PDT
in response to: A. Polvi
Thanks, this has been a great help.
One issue: when I start heartbeat on my primary server, which already has the Elastic IP address, I get these "CRITICAL" errors. I don't think there really is a problem, but I would be interested to know if there is a way to deal with this.
root@am3:/etc/ha.d/resource.d# /etc/init.d/heartbeat start
Starting High-Availability services:
2009/07/28_13:57:02 CRITICAL: Resource elastic-ip is active, and should not be!
2009/07/28_13:57:02 CRITICAL: Non-idle resources can affect data integrity!
2009/07/28_13:57:02 info: If you don't know what this means, then get help!
2009/07/28_13:57:02 info: Read the docs and/or source to /usr/share/heartbeat/ResourceManager for more details.
CRITICAL: Resource elastic-ip is active, and should not be!
CRITICAL: Non-idle resources can affect data integrity!
info: If you don't know what this means, then get help!
info: Read the docs and/or the source to /usr/share/heartbeat/ResourceManager for more details.
2009/07/28_13:57:02 CRITICAL: Non-idle resources will affect resource takeback!
2009/07/28_13:57:02 CRITICAL: Non-idle resources may affect data integrity!
Done.
Re: Automatic failover with heartbeat and an Elastic IP
Posted:
Jul 31, 2009 1:46 AM PDT
in response to: A. Polvi
Someone needs to figure out a way to nip this problem in the bud.
Re: Automatic failover with heartbeat and an Elastic IP
Posted:
Aug 11, 2009 12:06 AM PDT
in response to: allman2009
IP is the primary protocol in the Internet Layer of the Internet Protocol Suite and has the task of delivering distinguished protocol datagrams (packets) from the source host to the destination host solely based on their addresses. For this purpose the Internet Protocol defines addressing methods and structures for datagram encapsulation. The first major version of the addressing structure, now referred to as Internet Protocol Version 4 (IPv4), is still the dominant protocol of the Internet, although its successor, Internet Protocol Version 6 (IPv6), is being deployed actively worldwide.
The design principles of the Internet protocols assume that the network infrastructure is inherently unreliable at any single network element or transmission medium and that it is dynamic in terms of availability of links and nodes. No central monitoring or performance measurement facility exists that tracks or maintains the state of the network. For the benefit of reducing network complexity, the intelligence in the network is purposely mostly located in the end nodes of each data transmission, cf. the end-to-end principle. Routers in the transmission path simply forward packets to the next known local gateway matching the routing prefix for the destination address.
In addition to issues of reliability, this dynamic nature and the diversity of the Internet and its components provide no guarantee that any particular path is actually capable of, or suitable for, performing the data transmission requested, even if the path is available and reliable. One of the technical constraints is the size of data packets allowed on a given link. An application must assure that it uses proper transmission characteristics. Some of this responsibility also lies in the upper layer protocols between the application and IP. Facilities exist to examine the maximum transmission unit (MTU) size of the local link, as well as for the entire projected path to the destination when using IPv6. The IPv4 internetworking layer has the capability to automatically fragment the original datagram into smaller units for transmission. In this case, IP provides re-ordering of fragments delivered out of order.