Monday, January 28, 2008

Load Balancing and High Availability with Ultra Monkey

This post borrows heavily from the following sources:



For instructions on installing the prerequisites, see: http://linux-ha.org/download

Introduction


One of the wonders of working with a server product is answering the questions: How do we load balance? How do we make the server highly available? How do we make the load balancers themselves highly available? After some investigation, we settled on a set of Linux utility wrappers collectively called 'Ultra Monkey'. First things first: what is Ultra Monkey?


Ultra Monkey wraps the underlying Linux Virtual Server (LVS) and clustering utilities so that using them becomes largely a matter of configuration, through two packages: ldirectord (the Linux Director Daemon) and Heartbeat. In essence, Ultra Monkey gives you a software-based way of achieving load balancing and high availability without needing to configure switches and routers, which pleases me no end.


Time for a little background: we are trying to get to a state where Spring Ring can be run with multiple nodes. Spring Ring is essentially a high-level API for the SIP protocol, so a deployment includes the application built on top of those APIs. The first problem to solve is getting the nodes load balanced; the first image shows an overview of how we set up our servers.


Before we start, the application nodes need to have their default gateway set to the load balancer's IP or to the virtual IP, depending on whether you run a single load balancer or the high availability solution. This may hinder some set-ups, as the nodes essentially need to be on the same subnet as the default gateway.

Kernel Interaction


To get some fundamentals working, some kernel options need to be set. Edit the file /etc/sysctl.conf and add the following options to the file:



# Enables packet forwarding

net.ipv4.ip_forward = 1

# Stops stickiness being an issue

net.ipv4.vs.expire_nodest_conn = 1



Running the following command will apply these options immediately:


/sbin/sysctl -p


We'll also want to make sure this runs when the OS boots, so edit /etc/init.d/boot.local and add the '/sbin/sysctl -p' command to the bottom of the file. This ensures the settings are restored after a restart.
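
For reference, the bottom of boot.local would then look something like this (a minimal sketch; boot.local is the SUSE-style local boot script, so the file name may differ on other distributions):

# /etc/init.d/boot.local
# Re-apply the kernel parameters from /etc/sysctl.conf on every boot
/sbin/sysctl -p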


Starting with Load Balancing



The only change required on the application nodes is setting the default gateway to that of the load balancer; we used YaST for this. We also needed to change the Contact header on our SIP messages to that of the load balancer, so that our non-transactional SIP messages go through the load balancer instead of being 'sticky'. We can now send HTTP and SIP requests to the load balancer instead of to the nodes themselves and have them answered. As we're also using shared state, we can take a node out of the cluster and the application as a whole will still continue to run.
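
For reference, the equivalent change from the command line looks roughly like the following (a sketch only; we actually did this in YaST, and the gateway must be whichever address your load balancer, or later the virtual IP, holds on the application nodes' subnet - 10.0.0.150 is purely illustrative here):

# on each application node, point the default route at the load balancer
route add default gw 10.0.0.150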


Configuring Linux Director Daemon


Both the single load balancer and the multi-node set-up rely on ldirectord to do its stuff. ldirectord is essentially a wrapper around the Linux Virtual Server (LVS) utilities and manages the IP routing for us. It can be run as a resource (we'll use this later through Heartbeat) or from the command line itself; for a single load balancer environment, we'll use the command line. Our configuration is set up as follows: two application nodes, each with an HTTP and a SIP service. The HTTP check does an HTTP GET on the given URI and matches the response against a regular expression to validate the status of the node; HTTP is set up as TCP with round robin scheduling. The SIP check sends a SIP OPTIONS message to validate the status, and is also set up as round robin, this time over UDP. This configuration file is located at /etc/ha.d/ldirectord.cf:

# timeout for the real server response; if it expires the server is considered unavailable
checktimeout=1
# how often to check the real servers for availability
checkinterval=1
# set to 'yes' to reload the cf file after a change
autoreload=yes
# where to log (only used if started in non-daemon mode)
logfile="/var/log/ldirectord.log"
# set to 'yes' to set the weight of unavailable servers to 0
# (as opposed to deleting them from the LVS virtual table) - see the man page
# for the side effect on client connection persistence
quiescent=no

# http virtual service
virtual=10.0.0.150:9072
        # first real server: ip address, port, forwarding mechanism, weight. masq -> NAT
        real=10.0.0.147:9072 masq 1
        # second real server: ip address, port, forwarding mechanism, weight. masq -> NAT
        real=10.0.0.146:9072 masq 1
        # type of service
        service=http
        # where to send the http request to check availability
        request="/SpringRing/status"
        # what to match in the response to consider the server available
        receive="Sample SpringRing web application"
        # type of scheduling - rr: round robin
        scheduler=rr
        # protocol for this service
        protocol=tcp
        # type of check: negotiate means send a request and verify the response (see the man page for others)
        checktype=negotiate

# udp/sip virtual service
virtual=10.0.0.150:7072
        # first real server: ip address, port, forwarding mechanism, weight. masq -> NAT
        real=10.0.0.147:7072 masq 1
        # second real server: ip address, port, forwarding mechanism, weight. masq -> NAT
        real=10.0.0.146:7072 masq 1
        # type of service
        service=sip
        # type of scheduling
        scheduler=rr
        # protocol for this service
        protocol=udp
        # type of check
        checktype=negotiate
        # client connection persistence timeout
        persistent=1

Getting a single load balancer working now just requires starting ldirectord with the following command:


/etc/init.d/ldirectord start


Now any requests sent to the load balancer will be forwarded, round robin, to the application nodes. The health checks happen automatically, and the removal of an application node will change the forwarding table, which can be viewed with the following command:


ipvsadm -L -n

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.0.0.150:7072 rr persistent 1
  -> 10.0.0.147:7072              Masq    1      0          0
  -> 10.0.0.146:7072              Masq    1      0          0
TCP  10.0.0.150:9072 rr
  -> 10.0.0.147:9072              Masq    1      0          0
  -> 10.0.0.146:9072              Masq    1      0          0




High Availability



Moving on to high availability for the load balancers themselves, this is where Heartbeat comes in. We have to turn ldirectord off and turn Heartbeat on, on both load balancers. In the configuration, one of the load balancers is set as the primary. Each application node will need to set its default gateway to the virtual IP address, as whichever load balancer is currently master will hold that address.


To check which load balancer is currently running as master, you can type the following command:


# On the master linux-director
/etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master running


# On the stand-by linux-director
/etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master stopped


Our ha.cf file looks like this:



# Which interface to send (multicast) heartbeats over
mcast eth0 225.0.0.1 694 1 0
# auto_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails, or
# an administrator intervenes.
auto_failback off
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node AppNode1
node AppNode2


Our /etc/ha.d/haresources looks like this:



AppNode1 \
LVSSyncDaemonSwap::master \
ldirectord::ldirectord.cf \
IPaddr2::10.0.0.154/25/eth0


It is paramount that the /etc/ha.d/haresources file is identical on each load balancer. This identifies the master load balancer, but you can see in our /etc/ha.d/ha.cf file we have set auto_failback to off. This ensures that when the master load balancer goes down and later comes back, it will not re-take the master status unless overridden by an administrator. You would most likely want this behaviour so any investigation into the failure can be carried out first.


We also use the example /etc/ha.d/authkeys file from the Ultra Monkey site. The authkeys file will also need to be protected, and this can be achieved through the following command:


chmod 600 /etc/ha.d/authkeys
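
For reference, a Heartbeat authkeys file follows this general shape (a sketch only; the secret below is a placeholder, not the actual contents of the Ultra Monkey example file):

auth 2
2 sha1 put-your-own-shared-secret-here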


We now also have to change the ldirectord config to point to the virtual IP address that Heartbeat manages via the haresources file. Change the virtual= lines in ldirectord.cf to reflect this.
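
In our case that means the two virtual service lines swap from 10.0.0.150 to the 10.0.0.154 address assigned by IPaddr2 in haresources (keeping the same ports):

# http virtual service, now bound to the Heartbeat-managed virtual IP
virtual=10.0.0.154:9072
# udp/sip virtual service, same change
virtual=10.0.0.154:7072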


Having a Play



Now everything is set up. Start Heartbeat on each load balancer using the following commands, making sure ldirectord has been stopped first:



/etc/init.d/ldirectord stop

/etc/init.d/heartbeat start


After a few moments you can run the "who's the master?" command to see which load balancer is active.


/etc/ha.d/resource.d/LVSSyncDaemonSwap master status


The load balancer will also have started ldirectord, which you can see if you use the following commands:


ps -ef|grep ldirectord


The command above shows the process running, and the one below shows what the load balancer is currently forwarding:


/sbin/ipvsadm -L -n


This command displays which protocols and servers packets are being forwarded to. In our example you will see two TCP and two UDP entries, one of each for each application node.


If you now 'take down' an application node, you will see that the master load balancer updates its packet forwarding table; again, you can see this by using the following command:


/sbin/ipvsadm -L -n
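
Because we set quiescent=no in ldirectord.cf, a failed real server is removed from the LVS table rather than left at weight 0. With 10.0.0.146 down, the output would look roughly like this (illustrative, shown with the 10.0.0.154 virtual address from the HA set-up):

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.0.0.154:7072 rr persistent 1
  -> 10.0.0.147:7072              Masq    1      0          0
TCP  10.0.0.154:9072 rr
  -> 10.0.0.147:9072              Masq    1      0          0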


Failover of high availability can also be simulated by stopping heartbeat on the master load balancer.


/etc/init.d/heartbeat stop


This will result in the secondary load balancer taking ownership of the virtual IP address. This can be monitored through the following command:


ip addr sh
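
What you're looking for is the 10.0.0.154 address from haresources appearing on eth0 of the stand-by director once it has taken over, something like this (trimmed, illustrative output; your own interface addresses will differ):

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    inet 10.0.0.154/25 brd 10.0.0.255 scope global secondary eth0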


Looking back



What does this achieve exactly? A fairly easy to configure, software-based load balancing and high availability solution. I won't hold back: it took a few of us more than a few hours to get this all right, and if you try this, your environment will almost certainly be different. Hopefully this will guide you in the right direction though.

11 comments:

mark said...

Good stuff. As a note I'd add that LVS also allows for a non-NAT setup called "Direct Routing" (DR), where packet forwarding is done on the Ethernet level (the director just changes the MAC address to the realserver's without touching IP, and drops the packet into the LAN).
This allows for higher performance and better scalability.

I covered this in more detail here (clusteradmin.net), if anybody is interested.

-marek

Robbie said...

thanks for the tip, anything to increase performance.

boby said...

hi robbie, nice post. i want to ask something.
in that topology, does the director have to run the same service as the real servers, or does it just forward the sip request?
have you tried it with an asterisk voip server?

boby said...

i'm sorry i didn't mention it clearly. i mean if i want to have a load balancing service with Asterisk VoIP, of course the client must be registered with the server. thks

Robbie said...

Boby, the director is essentially a packet forwarding application and so doesn't need to be registered (not in the application we were using anyway; you may have some IP address restrictions in yours). It also doesn't need to be on the same box; it can forward requests anywhere.

Hope that helps

boby said...

thks rob, for the answer
i'm still a little confused about the client not needing to be registered with the real server, because usually when we start the VoIP communication the client must send a SIP REGISTER to the server.
i tried to use ipvsadm with an asterisk sip server. i configured the virtual service and the real service to work on port 5060 but it doesn't work. from the client (using an xlite softphone) it always fails to register with the server.

thks a lot for the attention robbie.

Robbie said...

Hi Boby,

Yes, if you're using REGISTER, it won't be the load balancer that registers; it'll be the client whose target endpoint is the load balancer, and the REGISTER request will get forwarded to a server in the cluster.

I haven't used Asterisk, so I don't know if you can cluster your registration sessions, but you might get an issue if you're registered against server1 and the LB forwards subsequent requests to server2.

Malcolm Turnbull said...

Boby,
We have several customers at www.Loadbalancer.org load balancing VoIP, Asterisk and other technologies. You just need to enable persistence (sticky) in your LVS or ldirectord configuration. If you need to group multiple ports, UDP, etc., then you can use firewall marks with persistence. Persistence simply ensures that all requests from one source IP address will always go to the same VoIP server in the cluster.

Robbie said...

From memory, you should only have to use sticky sessions for transactional requests, i.e. invite-ack-ok; once the session has been initiated, any node can take responsibility for handling the rest of the call, assuming it has access to the data source with the call leg information.

cHeY's said...

which is the best open source load balancer with high availability?

Robbie said...

I haven't used these tools for a while, so can't comment on what the best tool is right now.