On HTTP load testing

A lot of people seem to be talking about and performing load tests on HTTP servers, perhaps because there’s a lot more choice of servers these days.

That’s great, but I see a lot of the same mistakes being made, making the conclusions doubtful at best. Having spent a fair amount of time benchmarking high-performance proxy caches and origin servers for my day job, here are a few things that I think are important to keep in mind.

It’s not the final word, but hopefully it’ll help start a discussion.

0. Consistency.
The most important thing to get right is to test the same way, every time. Any changes in the system – whether it's an OS upgrade or another app running and stealing bandwidth or CPU – can affect your test results, so you need to be aggressive about nailing down the test environment.

While it’s tempting to say that the way to achieve this is to run everything on VMs, I’m not convinced that adding another layer of abstraction (as well as more processes running on the host OS) is going to lead to more consistent results. Because of this, dedicated hardware is best. Failing that, just run all of the tests you can in one session, and make it clear that comparisons between different sessions don’t work.

1. One Machine, One Job.
The most common mistake I see people making is benchmarking a server on the same box where the load is generated. This doesn’t just put your results out a little bit; it makes them completely unreliable, because the load generator’s “steal” of resources varies depending on how the server handles the load, which in turn depends on resource availability.

The best way to maintain consistency is to have dedicated, separate hardware for the test subject and load generator, and to test on a closed network. This isn’t very expensive; you don’t need the latest-and-greatest to compare apples to apples, it just has to be consistent.

So, if you see someone saying that they benchmarked on localhost, or if they fail to say how many boxes they used to generate and serve the load, ignore the results; at best they’ll only be a basic indication, and at worst they’ll be very misleading.

2. Check the Network.
Before each test, you need to understand how much capacity your network has, so that you’ll know when it’s limiting your test, rather than the server you’re testing.

One way to do this is with iperf:

qa1:~> iperf -c qa2
------------------------------------------------------------
Client connecting to qa2, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.106 port 56014 connected with 192.168.1.107 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec

… which shows that I have about 943 Mbits a second available on my Gigabit network (it’s not 1,000 because of TCP overheads).
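If you want to sanity-check that figure yourself, the arithmetic is simple. A small sketch (assuming iperf’s usual conventions: binary units for bytes transferred, decimal units for bits per second):

```python
# Sanity-check iperf's reported bandwidth from its transfer figure.
# Assumes iperf reports transfer in binary GBytes (2**30 bytes) and
# bandwidth in decimal Mbits (10**6 bits), which is its usual convention.
transferred_bytes = 1.10 * 2**30   # "1.10 GBytes" from the run above
duration_s = 10.0                  # "0.0-10.0 sec"
mbits_per_sec = transferred_bytes * 8 / duration_s / 10**6
print(f"{mbits_per_sec:.0f} Mbits/sec")  # ~945; iperf shows 943 because 1.10 is rounded
```

The small discrepancy against the reported 943 is just rounding in the two-decimal transfer figure.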

Once you know the bandwidth available, you need to make sure that it isn’t a limiting factor. There are a number of ways to do this, but the easiest is to use a tool that keeps track of the traffic in use. For example, httperf shows bandwidth use like this:

Net I/O: 23399.7 KB/s (191.7*10^6 bps)

… which tells me that I’m only using about 192 Mbits of my Gigabit in this test.
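The check is just a ratio: reported throughput over measured capacity. A minimal sketch, using the numbers from the two outputs above:

```python
# Is the network the bottleneck? Compare the load tool's reported
# throughput against the link capacity measured with iperf.
net_io_bps = 191.7 * 10**6   # httperf's "Net I/O" line above
link_bps = 943 * 10**6       # capacity measured with iperf above
utilisation = net_io_bps / link_bps
print(f"{utilisation:.0%} of available bandwidth")  # ~20%
```

As a rough rule, if that ratio creeps toward 1.0, you’re benchmarking your network, not your server.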


