Monday, January 28, 2008

Load Balancing and High Availability with Ultra Monkey

This post is heavily borrowed from the following sources:



For instructions on installing the pre-requisites: http://linux-ha.org/download

Introduction


One of the wonders of working with a server product is answering the questions: How do we load balance? How can we make the server highly available? How can we make the load balancers highly available? After some investigation, we settled on a set of Linux utility wrappers collectively called 'Ultra Monkey'. First things first: what is Ultra Monkey?


Ultra Monkey reduces the complexity of using these utilities to configuring two packages: ldirectord (Linux Director Daemon) and Heartbeat. In essence, Ultra Monkey gives a software-based way of achieving load balancing and high availability without needing to configure switches and routers, which pleases me no end.


Time for a little background: we are trying to get to a state where Spring Ring can be run with multiple nodes. Spring Ring is essentially a high-level API for SIP, so a deployment will include the application built on top of the API. The first problem to solve is to get the nodes load balanced; the first image shows an overview of how we set up our servers.


Before we start, the application nodes need to have their default gateway set to the load balancer IP or the virtual IP, depending on whether you are running a single load balancer or the high availability solution. This may hinder some setups, as the nodes essentially need to be on the same subnet as the default gateway.

Kernel Interaction


To get some fundamentals working, some kernel options need to be set. Edit the file /etc/sysctl.conf and add the following options to the file:



# Enables packet forwarding

net.ipv4.ip_forward = 1

# Stops stickiness being an issue

net.ipv4.vs.expire_nodest_conn = 1



Running the following command will apply these options:


/sbin/sysctl -p


We'll also want to make sure this is run when booting up the OS, so edit /etc/init.d/boot.local and add the '/sbin/sysctl -p' command to the bottom of the file. This will ensure this state is restored after a restart.
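If you want to double check that the file really contains both options, something like the following would do. This is only a minimal sketch: the function names parse_sysctl and missing_settings are my own invention, and it simply reads sysctl.conf-style text rather than asking the kernel what is actually in effect.

```python
# The two options this post relies on, with the values we expect.
REQUIRED = {
    "net.ipv4.ip_forward": "1",
    "net.ipv4.vs.expire_nodest_conn": "1",
}


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> value."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';').
        if not line or line.startswith(("#", ";")):
            continue
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(text, required=REQUIRED):
    """Return the required keys that are absent or set to another value."""
    settings = parse_sysctl(text)
    return sorted(k for k, v in required.items() if settings.get(k) != v)
```

Reading /etc/sysctl.conf into a string and passing it to missing_settings should give an empty list once both lines are in place.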


Starting with Load Balancing



The only change required on the application node is setting the default gateway to that of the load balancer; we used YaST for this. We also needed to change the Contact header on our SIP messages to the address of the load balancer, so that our non-transactional SIP messages go through the load balancer instead of being 'sticky'. This puts us in a position to send HTTP and SIP requests to the load balancer instead of the nodes themselves, and for the nodes to be able to respond. As we're also using shared state, we can now take a node out of the cluster and see the application as a whole continue to run.


Configuring Linux Director Daemon


Both the single load balancer and the high availability setups rely on ldirectord to do its stuff. ldirectord is essentially a wrapper for the Linux Virtual Server (LVS) utilities and allows IP routing. It can be run as a resource (we'll use this later through Heartbeat), or from the command line. For the single load balancer environment, we'll be using the command line.

Our configuration is set up as follows: two application nodes, each with an HTTP and a SIP service. For the HTTP service, ldirectord does an HTTP GET on the given URI and matches a regular expression against the response to validate the status of the node. HTTP is set up as TCP on round robin. The SIP service is sent a SIP OPTIONS message to validate the status, and is also set up as round robin, although this time using UDP. This configuration file is located in /etc/ha.d/ldirectord.cf

# timeout for real server response. if expired the server is considered not available
checktimeout=1
# how often to check the real server for availability
checkinterval=1
# set to 'yes' will reload cf file after change
autoreload=yes
# where to log files (only if started in non daemon mode)
logfile="/var/log/ldirectord.log"
# set to yes to set the virtual server weight to 0 for non available servers
# (as opposite to deleting them from the lvs virtual table) - see man
# for side effect on client connection persistency
quiescent=no
# http virtual service (the lines belonging to a virtual service must be indented)
virtual=10.0.0.150:9072
    # first real server ip address, port, forwarding mechanism, weight. masq -> NAT
    real=10.0.0.147:9072 masq 1
    # second real server ip address, port, forwarding mechanism, weight. masq -> NAT
    real=10.0.0.146:9072 masq 1
    # type of service
    service=http
    # where to send http request for checking availability
    request="/SpringRing/status"
    # what to match on the response to verify the server to be available
    receive="Sample SpringRing web application"
    # type of scheduling - rr: round robin
    scheduler=rr
    # type of protocol for this service
    protocol=tcp
    # type of check: negotiate means send request and verify response (see man for others)
    checktype=negotiate
# udp/sip virtual service
virtual=10.0.0.150:7072
    # first real server ip address, port, forwarding mechanism, weight. masq -> NAT
    real=10.0.0.147:7072 masq 1
    # second real server ip address, port, forwarding mechanism, weight. masq -> NAT
    real=10.0.0.146:7072 masq 1
    # type of service
    service=sip
    # type of scheduling
    scheduler=rr
    # type of protocol
    protocol=udp
    # type of check
    checktype=negotiate
    # client connection persistency timeout
    persistent=1
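To make the 'negotiate' check on the HTTP service a bit more concrete, here is a rough Python sketch of the same idea: fetch the request URI and look for the receive pattern in the response. This is my own illustration and not how ldirectord is implemented; the function names response_matches and check_http are made up for the example.

```python
import http.client
import re
import urllib.request


def response_matches(body, receive):
    """True if the regular expression `receive` is found anywhere in `body`."""
    return re.search(receive, body) is not None


def check_http(host, port, request, receive, timeout=1.0):
    """GET http://host:port/request and test the body against `receive`.

    Connection errors, timeouts, HTTP error statuses and malformed
    responses all count as 'node not available'.
    """
    url = "http://%s:%d%s" % (host, port, request)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        return False
    return response_matches(body, receive)
```

With the values from our configuration, a call would look like check_http("10.0.0.147", 9072, "/SpringRing/status", "Sample SpringRing web application"), with the timeout playing the role of checktimeout.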

Getting the single load balancer setup to work now requires starting ldirectord with the following command:


/etc/init.d/ldirectord start


Now any requests sent to the load balancer will be forwarded, round robin, to the application nodes. The status checks happen automatically, and removing an application node will change the routing table, which can be viewed with the following command:


ipvsadm -L -n

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.0.0.150:7072 rr persistent 1
  -> 10.0.0.147:7072              Masq    1      0          0
  -> 10.0.0.146:7072              Masq    1      0          0
TCP  10.0.0.150:9072 rr
  -> 10.0.0.147:9072              Masq    1      0          0
  -> 10.0.0.146:9072              Masq    1      0          0
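If you'd rather check this table from a script than by eye, the output is simple enough to pick apart. The following is a small sketch, assuming the output has the shape shown above; parse_ipvsadm is a name I've made up, and it only understands the TCP/UDP service lines and the '->' real server lines.

```python
def parse_ipvsadm(output):
    """Turn `ipvsadm -L -n` text into a list of virtual services.

    Each service is a dict with its protocol, address, scheduler and
    a list of real servers.
    """
    services = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] in ("TCP", "UDP"):
            # e.g. "TCP 10.0.0.150:9072 rr"
            services.append({
                "protocol": parts[0],
                "address": parts[1],
                "scheduler": parts[2],
                "real_servers": [],
            })
        elif len(parts) >= 6 and parts[0] == "->" and services:
            # e.g. "-> 10.0.0.147:9072 Masq 1 0 0"
            # (the column header line comes before any service, so it is skipped)
            services[-1]["real_servers"].append({
                "address": parts[1],
                "forward": parts[2],
                "weight": int(parts[3]),
                "active": int(parts[4]),
                "inactive": int(parts[5]),
            })
    return services
```

From there it is easy to, say, alert when a virtual service has fewer real servers than you expect.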




High Availability



Moving on to high availability for the load balancers themselves, this is where Heartbeat comes in. We have to turn ldirectord off and turn heartbeat on, on both load balancers. In the configuration, one of the load balancers will be set as the primary. Each application node will need to set its default gateway to the virtual IP address, as whichever load balancer is currently the master will hold this address.


To check which node is currently running as master you can type the following command.


# On the master linux-director
/etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master running


# On the stand-by linux-director
/etc/ha.d/resource.d/LVSSyncDaemonSwap master status
master stopped


Our ha.cf file looks like this:



# What interfaces to broadcast heartbeats over?
mcast eth0 225.0.0.1 694 1 0
# auto_failback: determines whether a resource will
# automatically fail back to its "primary" node, or remain
# on whatever node is serving it until that node fails, or
# an administrator intervenes.
auto_failback off
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node AppNode1
node AppNode2


Our /etc/ha.d/haresources looks like this:



AppNode1 \
LVSSyncDaemonSwap::master \
ldirectord::ldirectord.cf \
IPaddr2::10.0.0.154/25/eth0


It is paramount that the /etc/ha.d/haresources file is identical on each load balancer. This file identifies the master load balancer, but you can see in our /etc/ha.d/ha.cf file that we have set auto_failback to off. This ensures that when the master load balancer goes down and later comes back, it will not retake the master status unless an administrator intervenes. You would most likely want this behaviour so that the failure can be investigated first.
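To illustrate what the auto_failback setting means in practice, here is a toy model in Python. It is purely my own sketch of the behaviour described above, not Heartbeat's actual logic, and the class name DirectorPair is hypothetical.

```python
class DirectorPair:
    """Toy model of two load balancers sharing one master role."""

    def __init__(self, preferred, standby, auto_failback=False):
        self.preferred = preferred
        self.auto_failback = auto_failback
        self.up = {preferred: True, standby: True}
        # The node named in haresources starts out as master.
        self.master = preferred

    def fail(self, node):
        """Mark a node as down; the survivor (if any) takes over."""
        self.up[node] = False
        if self.master == node:
            survivors = [n for n, alive in self.up.items() if alive]
            self.master = survivors[0] if survivors else None

    def recover(self, node):
        """Bring a node back; it only retakes master if allowed to."""
        self.up[node] = True
        if self.master is None:
            self.master = node
        elif self.auto_failback and node == self.preferred:
            self.master = node
```

With auto_failback left off, failing AppNode1 hands the master role to AppNode2, and it stays there when AppNode1 comes back, which is the behaviour we wanted.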


We also use the example /etc/ha.d/authkeys file from the Ultra Monkey site. The authkeys file will also need to be protected, and this can be achieved through the following command:


chmod 600 /etc/ha.d/authkeys


We now also have to change the ldirectord config to point to the virtual IP address given in the haresources file (10.0.0.154 in our case). Change the virtual= lines in ldirectord.cf to reflect this.


Having a Play



Now everything is set up. Start heartbeat on each load balancer using the following commands, making sure ldirectord is stopped first.



/etc/init.d/ldirectord stop

/etc/init.d/heartbeat start


After a few moments you can run the "who's the master?" command to see which load balancer is active.


/etc/ha.d/resource.d/LVSSyncDaemonSwap master status


The active load balancer will also have started ldirectord. The first command below shows the process running:


ps -ef|grep ldirectord


The second shows which routes the load balancer is currently letting through:


/sbin/ipvsadm -L -n


This command displays which protocols and servers packets are being forwarded to. In our example you will see two TCP and two UDP entries, one for each application node.


If you now 'take down' an application node, you will see that the master load balancer updates its packet forwarding table. Again, you can see this by using the following command:


/sbin/ipvsadm -L -n
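What you should see is the remaining node receiving every request. As a simplified picture of round robin with a node dropping out, here is a short Python sketch; it is a toy model written for this post (the class name RoundRobinPool is mine), not the kernel's actual scheduler.

```python
class RoundRobinPool:
    """Hand out servers in turn, skipping any that have been removed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the server to hand out next

    def remove(self, server):
        """Take a failed server out of the rotation."""
        index = self.servers.index(server)
        self.servers.pop(index)
        # Keep the pointer on the same 'next' server after the list shrinks.
        if index < self._next:
            self._next -= 1
        if self.servers:
            self._next %= len(self.servers)
        else:
            self._next = 0

    def pick(self):
        """Return the next server in the rotation."""
        if not self.servers:
            raise RuntimeError("no real servers available")
        server = self.servers[self._next]
        self._next = (self._next + 1) % len(self.servers)
        return server
```

Starting with both of our nodes, pick alternates between 10.0.0.147 and 10.0.0.146; after removing one of them, every pick returns the other.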


Failover of the load balancers can also be simulated by stopping heartbeat on the master load balancer.


/etc/init.d/heartbeat stop


This will result in the secondary load balancer taking ownership of the virtual IP address. This can be monitored through the following command:


ip addr sh
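If you want to script this check, the following sketch looks for the virtual IP in the text that the command prints. It assumes IPv4 addresses appear on lines beginning with 'inet' followed by address/prefix; the function name holds_address is made up for the example.

```python
def holds_address(ip_addr_output, address):
    """Return True if `address` is listed as an inet address in the output."""
    for line in ip_addr_output.splitlines():
        parts = line.split()
        # IPv4 addresses are listed as: inet <address>/<prefix> ...
        if len(parts) >= 2 and parts[0] == "inet":
            if parts[1].split("/")[0] == address:
                return True
    return False
```

Run against the output from each load balancer with our virtual IP, 10.0.0.154, it should return True on whichever one has taken ownership and False on the other.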


Looking back



What does this achieve exactly? It's a fairly easy to configure, software-based load balancing and high availability solution. I won't hold back: it took a few of us more than a few hours to get this all right, and if you try this, your environment will almost certainly be different. Hopefully this will guide you in the right direction though.

Sunday, January 27, 2008

The Ubuntu Problem

...is that I think I've accepted it as my default OS nowadays, but it's causing me problems that I don't want to have, and they are mainly work ones.

I've been using Ubuntu at home now for several months and am very happy with it. The only exception is that I can't watch Arsenal highlights on http://tv.arsenal.com, damn content protection + WMV crap! That means my 'media' machine is Windows; it's the computer with my video and music which is also hooked into my living room TV (sweet!).

However, work is another problem. My work machine is an HP Compaq NW9440, a monster of a laptop, and these are the issues I'm facing:

1) Graphics card seems unsupported - this results in the computer freezing from time to time, resulting in the need for a restart (feels like Windows 95 again sometimes). This isn't a problem on my personal Ubuntu based laptop.

2) Microsoft Exchange, mail server - this is used at work, and let's be honest, Outlook is the best companion for Exchange, and the webmail through Firefox sucks. It requires a mixture of the Evolution mail client, webmail and Outlook in a virtual machine. All compromises. No GMail for work unfortunately.

3) No Linux VPN client - means I have to use a virtual machine to VPN into my work network, which again sucks.

4) When I plug a projector/monitor into the laptop, the whole thing barfs, needing a restart. I need to give a presentation next week and I can't do it using this machine - will I need to partition and install Windows just to do a presentation?

5) Livemeeting and Windows Conferencing - both heavily used amongst peers, neither works on Linux.

I want Ubuntu to work for me at work, but I'm struggling at the moment.

Tuesday, January 08, 2008

Using custom rake tasks in your Rails build

With Rails you can create custom Rake tasks (I recommend reading RailsEnvy for a tutorial on Rake tasks).

Once you've got your task you can use it in your Rails build by adding the following line to your Rakefile in the Rails base directory:

Rake::Task['namespace:your_rake_task'].invoke

Easy enough, but it took me long enough to find it! Now when you run 'rake' in your Rails project, your custom tasks will be executed. Cool eh? Good for copying dependencies and such.

Monday, January 07, 2008

My New Year's Resolutions

I know it's a little late, but better late than never eh?

My New Year's resolution is simple really: be more selfish. In the nicest possible way of course. Over the last few months I've found myself being online a lot, in an attached-at-the-hip sort of way.

It really came to me on Saturday when I was sitting on the sofa, enjoying a cup of tea and reading the Guardian. I paused and thought to myself, "this is nice, it's been too long since I just sat and thoroughly read the paper". I went on to think about when I used to go away for the weekend hiking, or mountain biking, or how I used to actually go to the gym regularly, play football and generally be a lot fitter than I am now.

I kept finding myself wanting to check my email, Twitter and my RSS reader to see what was going on, and thinking back, it seems I didn't miss that much after all.

Here's an interesting twittersation for you, which left me thinking my New Year's resolution is to spend less time online, and spend more time doing stuff I used to like doing. (This coming from a man who just wrote four blog posts in an hour!)

To do that, I may have to cull my RSS reader and unfollow people on Twitter, so sorry about that in advance. Please add me to your Google address book though if you use Google Reader; the shared items are a great filter of information.

http://twitter.com/robb1e/statuses/572002322
http://twitter.com/FND/statuses/572157652
http://twitter.com/jayfresh/statuses/572158492
http://twitter.com/FND/statuses/572238092
http://twitter.com/robb1e/statuses/572686122

Battle for your TV - Round 2

*ding ding*

A very interesting development in the battle for your TV: BT and Microsoft have jointly announced that BT will provide content through Xbox Live; check out the press release. It's a funny old world. Only last month I wrote up some thoughts on the battle for your TV, and BT are playing an interesting game. There is a bit of a - however - moment though...

Here it comes...

However, a resident has to be a BT Broadband customer, and I assume a BT Total Broadband customer, so that the resident is using a BT Home Hub (the snazzy Apple-looking router), as it guarantees some quality of the line. I can only assume this means that the payment still appears on the bill you get from BT. There doesn't appear to be the 'roaming' option of being signed into any Xbox using your Live account. So you couldn't be round a buddy's house who uses broadband X rather than BT and opt in for a movie if they've got an Xbox. They could however purchase movies from Xbox Live's own pay-per-view service.

All in all, this only suggests that you don't need to get the BT Vision set top box (or you could use it in another room if you've got a long enough network cable) if you have the Xbox. That would still leave you needing to buy a hard disc based DVR if you don't already have one.

It's interesting reading some of the comments about this. It's certainly an interesting move, still a bit restrictive, but it could open up to thousands more people using the BT Vision 'platform' or whatever it is at the local exchange. It does feel fairly cosmetic though.

Why is eGovernment not working?

I'm simply amazed at the quote that £2bn has been wasted on binned government IT projects. Working for a large company, I get the whole red-tape, binned projects thing, but this to me cries out of people who think that IT can solve all problems. A lack of understanding of the issues, combined with the big bang approach suggested for implementing and deploying these applications, is doomed to fail.

I find it quite a difficult pill to swallow. I want to believe in the government, and I know technology has its place to make things easier and more efficient. If the government were a technology company, it would surely have gone bust by now.

I think the answer may lie in a series of changes, namely: stop the big budget projects, go for small wins, build up with lots of small decoupled applications and learn from OpenID, OAuth and all the small start-ups who develop lean applications and get early adoption to have a short feedback/improvement cycle.

How can I trust the national ID card, and the database nation we're becoming, when the government can't build or buy IT systems?

Quote of the Day

"You campaign in poetry, but you govern in prose"
- Mario Cuomo, Democratic Governor, New York, 1983 - 1995

I love this quote. I read it in the weekend's Guardian in relation to Barack Obama's use of words in his speeches leading up to the recent Iowa caucuses.

For me, this is saying there is a time for poetry, inspiration, and a time for leadership and pragmatism. Or: get people excited, and then deliver!

Wednesday, January 02, 2008

Spamming on Facebook

I thought I'd do my good deed for the day and post on the 'Arsenal Football Club' group on Facebook about the Twitter app/user I created to update according to the latest scores and final result.

I started a discussion, trying to share the app, explaining about Twitter and how everyone could benefit from free SMS updates. That was my good deed.

The first response was telling me that I shouldn't be spamming. I was somewhat shocked at this and of course, sent a tweet. I thought I was sharing, not spamming, and a little later the discussion was deleted.

I looked at the definition of Forum Spam on Wikipedia, certainly I may be guilty of trying to raise awareness of something away from the group/discussion, but I'm certainly no spambot and I thought I'd be doing some people a favour to highlight a really useful free product.

It made me think about Facebook, advertising and spamming in general. There are 20-odd thousand members on this Arsenal group and I presume most of them don't mind that Facebook puts ads on every page, or that their friends send them loads of crappy vampire bites, but they mind if a stranger posts something.

I just find it disappointing that when someone offers something for free, paying for it out of their own pocket (hosting isn't free, people), they get tainted with the word 'spam' for sharing something they think people may find useful.