So at my new job, I get to use AWS stuff a lot. We have many, many servers, usually sitting behind a load-balancer of some sort. Amazon’s documentation on these things isn’t very clear, so I’m trying to figure out what the damned things are doing.
First off – a good thing. These Load Balancers are really easy to use. Adding a new instance is a few clicks away using the Console.
Another good thing – you can use multiple availability zones as a way to avoid trouble when an entire Availability Zone goes down (as happened less than a year ago). And here’s where it gets ugly.
It seems to evenly split traffic across zones – even if you don’t have a balanced number of instances in each zone. And it seems to determine which zone to hit based on some hash of the source address. So, if you have 2 instances in an Autoscale group, in two availability zones, and you hit your array from one IP really hard – bad things ensue. The one AZ you’re hitting will max out the CPU, and the AZ you’re not hitting will be nearly 100% idle. That adds up to 50% utilization on average – not enough to cause a scaling event (with my thresholds, anyway).
So, in short, if you have fojillions of people from all over hitting your services indiscriminately, I’m sure it’ll be fine. But, if like me, you have 1000’s of people (or so) from all over, some really hard from one particular IP – it may not be a good idea to try to spin up more than one AZ. And spinning up 4 AZ’s seems silly – you’ll definitely have more of a chance of at least one of them going bad, and until your load-balancer figures that out, you’ll have one out of ‘n’ requests failing.
Another thing I’ve noticed – the ELB’s seem to have a strict timeout on accessing the back-end service. If it doesn’t get a response within 30 seconds, it’s going to drop the connection and hit the service again. I had a service that was nearly getting DOS’ed by the load balancer that way. Make sure you have sane timeouts.
So the next thing I was curious about was whether or not the ELB would do any batching-together of any of the returned data – would it act like some kind of reverse-proxy? I wrote a tiny Node server which spat out a little data, waited 10 seconds, spat out some more, waited another 10 seconds, then finished. Here it is:
#!/usr/local/bin/node
"use strict";
var http=require('http');
http.createServer(function (req,res) {
console.warn("CONNECTION RECEVIED - waiting 10 seconds: "+req.url);
res.writeHead(200, {'Content-Type': 'text/plain'});
res.write("initial output...n");
setTimeout(function () {
console.warn("first stuff written for "+req.url);
res.write("Here is some stuff before an additional delayn");
setTimeout(function () {
console.warn("second stuff written for "+req.url);
res.end('Hello Worldn');
},10000);
},10000);
}).listen(80);