Redis Benchmarking on Amazon EC2, Flexiscale, and Slicehost
Much attention has been garnered by key/value databases in recent months, often stemming from the potential increase in throughput over traditional RDBMSs. Redis is one such database, and its support of data structures has especially captured my attention.
However, I have been concerned that some users have found in-memory databases (such as Redis) perform poorly on cloud hosting providers like Amazon EC2. Contention for memory bandwidth would seem a likely cause.
So I set out to compare cloud hosting providers and find some (reasonably) solid numbers. The full results can be seen in the Google Docs Spreadsheet, but I have extracted the key points below. I am not experienced at performing benchmarking, so any suggestions for improvement are very welcome!
Benchmarking information:
All benchmarks below were performed using the Redis benchmarking command as follows:
./redis-benchmark -n 10000 -d 200
I ran Ubuntu 8.04 LTS on every server. The 64-bit version was used for all tests except ‘small-remote’, ‘small’, and ‘high-cpu-extra-large-32bit-os’, which used the 32-bit version.
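If you want to repeat the runs and collect the numbers automatically, something along these lines may help. It is only a rough sketch: the `-n` and `-d` flags are the ones used above, but the function names are my own and the parser assumes the report prints a `====== NAME ======` header followed by an `X requests per second` line, which may differ between Redis versions.

```python
import re
import subprocess


def run_benchmark(binary="./redis-benchmark", requests=10000, payload_bytes=200):
    """Run redis-benchmark with the flags used in this post and return its stdout."""
    result = subprocess.run(
        [binary, "-n", str(requests), "-d", str(payload_bytes)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_requests_per_second(output):
    """Map each test name to its reported requests per second.

    Assumes '====== NAME ======' headers followed by a line containing
    'X requests per second'; adjust the patterns if your version differs.
    """
    results = {}
    current = None
    for line in output.splitlines():
        header = re.match(r"^=+\s*(.+?)\s*=+$", line.strip())
        if header:
            current = header.group(1)
            continue
        rate = re.search(r"([\d.]+)\s+requests per second", line)
        if rate and current is not None:
            results[current] = float(rate.group(1))
    return results
```

Calling `parse_requests_per_second(run_benchmark())` on a machine with a running Redis server should then give one figure per test, ready to paste into a spreadsheet.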
Raw throughput for one Redis instance

This first chart shows that, by and large, the performance of the Redis instance is dependent upon the speed of the core on which it runs. The ‘large’ EC2 server is about twice the speed of the ‘small’ EC2 instance (which Amazon states), and the double/quadruple-extra-large servers are a little faster again.
Based on this theory, one may expect the difference in performance to be greater in some cases, but I suspect that there is some variability depending on the physical server the instance is actually deployed to, so take this as a rough guide only.
It is also interesting to note that the 32-bit high-cpu instance outperformed the 64-bit high-cpu instance in benchmarks using both 2-byte and 200-byte requests, potentially due to the extra resources needed to process and transfer 64-bit values between memory and the CPU. However, running a 32-bit OS on 64-bit hardware showed appalling performance, so don’t try it.
It should also be noted that the ‘small’ and ‘small-remote’ servers are the same, but in the case of ‘small-remote’ the benchmarking utility was run on a separate ‘high-cpu-extra-large instance’ (as ‘small’ instances have only one core, the benchmarking utility was a significant drain on system resources, as shown by the differing performance).
Both Flexiscale servers performed equally well. Slicehost also performed well for a single-core server (which also ran the benchmarking tool during the test). However, the result is of little significance, as this was testing an in-memory database on a server with 256MB of memory. We essentially discount the Slicehost server later, but I would welcome any benchmarking data for larger slices.
Considering multiple Redis instances
Of course, many of these servers have multiple cores. In these cases, the intention is to run one Redis instance per core and use some form of load balancing to distribute requests. This will inevitably cause a reduction in performance depending on how you choose to do this (replication, sharding, consistent hashing, vertical partitioning, etc.), but this analysis does not take this into account. I expect the performance impact will vary greatly depending on your application’s architecture.
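To give a feel for the simplest of these options, here is a minimal sketch of key-based sharding across one instance per core. The function name, port layout, and key are hypothetical, and this is plain modulo hashing rather than consistent hashing, so adding or removing an instance would remap most keys.

```python
import zlib


def instance_for_key(key, instance_count):
    """Pick a Redis instance index for a key using a stable CRC32 hash."""
    return zlib.crc32(key.encode("utf-8")) % instance_count


# Hypothetical layout: one Redis instance per core, on consecutive ports.
ports = [6379 + i for i in range(4)]
port = ports[instance_for_key("user:42", len(ports))]
```

The same key always maps to the same port, so a client can route reads and writes without any shared state.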
With that said and done, here is the chart of projected requests per second when running one Redis instance on each available core.

There are very few surprises here considering what we have learnt so far. If you have a ‘quadruple-extra-large’ server with 8 cores at about 3.5GHz each, you get massive throughput. The faster and more plentiful your cores, the more you can get done.
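The projection behind this chart is just a multiplication, sketched below. It assumes perfect scaling across cores, which (as noted above) ignores any load-balancing overhead, and the figures in the example are illustrative rather than taken from the spreadsheet.

```python
def projected_requests_per_second(single_instance_rps, cores):
    """Project total throughput for one Redis instance per core, assuming perfect scaling."""
    return single_instance_rps * cores


# Illustrative figures only, not measurements from the spreadsheet.
total = projected_requests_per_second(50_000, 8)
```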
Bang for your buck – responses per second
Now here is the actual useful bit! Which server will give the best performance (in terms of requests per second) per dollar spent? To find this, we scale the data we have for each server so that it becomes a fictional server that costs $1/hour.

‘slicehost-256’ is very cheap and performs OK, so clearly does well here (don’t worry, it will fall down in the next test!). Behind this, the EC2 High CPU instances give good value, as do the Flexiscale servers. Any memory heavy server performs poorly here, as you are paying for the memory, not the core speed/quantity.
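The scaling to a fictional $1/hour server can be sketched as a single division. The function name and the sample price are my own placeholders, not figures from the spreadsheet.

```python
def requests_per_second_per_dollar(total_rps, price_per_hour):
    """Scale a server's throughput to a fictional server costing $1/hour."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return total_rps / price_per_hour


# Illustrative figures only: 100k requests/s on a server costing $0.50/hour.
value = requests_per_second_per_dollar(100_000, 0.5)
```

A server that costs half as much for the same throughput scores twice as high, which is why cheap single-core servers can top this chart.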
Bang for your buck – database size
Of course, responses per second is only half of it. Redis stores the entire database in memory, which limits how much data our servers can store. Firstly, let’s look at the available memory in each server (irrespective of cost):
The ‘quadruple-extra-large’ server provides a whopping 63GB of memory (we subtract 1GB from each machine as the OS will need some memory), so that comes out on top. The ‘slicehost-256’ comes out as zero here as it is not a realistic choice for a Redis server.
Depending on how you intend to split your dataset across Redis instances, you will want to pay more attention to either ‘Memory available’ or ‘Memory per Redis instance’.
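Both figures follow from the 1GB-for-the-OS rule described above, roughly as sketched here. The function names are mine, the inputs are illustrative, and clamping at zero is my reading of why the 256MB slice shows up as zero.

```python
OS_RESERVED_GB = 1.0  # the post subtracts 1GB per machine for the OS


def memory_available_gb(total_gb):
    """Memory left for Redis after the OS allowance, never below zero."""
    return max(0.0, total_gb - OS_RESERVED_GB)


def memory_per_instance_gb(total_gb, cores):
    """Memory per Redis instance when running one instance per core."""
    return memory_available_gb(total_gb) / cores


# Illustrative inputs: a 64GB machine, a 256MB slice, and a 2GB 4-core server.
large = memory_available_gb(64)
tiny = memory_available_gb(0.25)
per_instance = memory_per_instance_gb(2, 4)
```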
We can also consider this in terms of dollars per GB of database storage:

Most of the standard large servers do well here. ‘flexiscale-2gb-4core’ fares badly, but this simply highlights that splitting 2GB across 4 Redis instances has skewed the server towards requests per second rather than storage.
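For anyone wanting to reproduce the dollars-per-GB figure for their own server, a rough sketch follows. I am assuming an hourly price and the same 1GB OS allowance used earlier; the function name and sample numbers are hypothetical.

```python
def dollars_per_gb_hour(price_per_hour, total_gb, os_reserved_gb=1.0):
    """Hourly cost per GB of memory usable by Redis (infinite if none is usable)."""
    usable_gb = total_gb - os_reserved_gb
    if usable_gb <= 0:
        return float("inf")
    return price_per_hour / usable_gb


# Illustrative figures only: a 4GB server at $0.75/hour.
cost = dollars_per_gb_hour(0.75, 4)
```

Lower is better here, so memory-heavy servers that looked poor on requests per dollar tend to look much better on this measure.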
Conclusion
It is easy to see how Redis can perform poorly on the cloud. Comparisons between a (locally benchmarked) EC2 small instance and a bare metal server just won’t hold up, but who runs a single core 1.1GHz server anyway? There will be a performance penalty for virtualisation, but that isn’t news. As always, the trick is to work out what is best for the situation.
For example, a ‘high-cpu-medium’ EC2 instance will be around $130/month and will serve 60k-70k requests per second for each of its two cores. Yes, your database will be limited to 1GB, but there is a very easy upgrade path. However, if you do need to store many gigabytes of data, then I expect that a dedicated server will be much more cost effective (considering the price of memory).
In terms of ‘bang for your buck’, it is nice to see parity between small and massive servers alike. In both price and performance, one EC2 ‘double-extra-large’ server is equivalent to about 15 small EC2 instances. So the small guys can take comfort that they are not losing out to economies of scale, but the big guys should probably be looking at dedicated bare-metal.
Request for Benchmarks
I would be especially interested in additional benchmarks for bare-metal servers, larger Slicehost slices, and any other IaaS providers not covered here.
The end of the “get it out there” culture?
A change seemed to be in the air at last week’s Future Of Web Apps (FOWA). I remember FOWA 07, the conference which convinced me to give up my 9-to-5 working week and embark on a life of freelancing. That conference featured high spirits, and a heady attitude of “just get your app out there – don’t sweat the details.”
This attitude is still present in 2009, but has now gained a hint of sobriety and learning. Yes, speakers still talk of pushing apps out quickly, but not at the expense of preparedness and understanding.
Speakers have encouraged startups to ensure detailed traffic analytics are in place, to realise the difficulties of attracting (paying) customers and, most importantly, to make sure they are solving an actual problem. And coupled with this year’s downsized venue, the web startup world suddenly feels relatively sombre.
But why? Well, to name but a few potential reasons: the global recession, a maturing industry, crowding of the web startup world, choice of speakers, or maybe I am just getting old. Whatever the cause, I feel change is in the air.
Why Google Wave Sucks, and why Wave Rocks
I have just posted an article onto the new Wavetastic blog:
Why Google Wave Sucks, and why Wave Rocks
I hope you find it interesting, there will be a lot more to come on the Wavetastic blog.
Batch conversion of PNG32 images to PNG8

In an effort to better support IE6, I have converted my copy of the famfamfam icon library from PNG32 images into PNG8 images. Here is how…
Adding currency conversion to Zend_Currency
I recently needed to do currency conversion for a Zend Framework project, so I naturally turned to Zend_Currency. Sadly, Zend_Currency doesn’t feature currency conversion; rather, it focuses on the localization aspects of currency. This is perfectly understandable, as offering conversion would make this component dependent on potentially unreliable third parties that Zend would not be able to support.
Installing CouchDB on CentOS 5
I have been meaning to have a play with CouchDB for a while now, so this afternoon I finally had a go at installing it on my (32 bit) CentOS 5 box. Here is what I learnt along the way…
The Applicator Design Pattern?
I have been learning more and more about Zend Framework over the past few months. The more I learn, the more I appreciate its loosely coupled nature. For example, most ZF classes will work with minimal configuration out of the box, and any configuration that is necessary can be done either by passing in a configuration array (or Zend_Config object) or by using the object’s setter/getter methods.
PHP for Google App Engine in the works?
Ever since the launch of Google App Engine a little over two months ago there has been a lot of mumbling and grumbling over the lack of PHP support. I am sure that there are good reasons why Google chose to launch with Python but, even so, I expect that this raised several eyebrows (mine included) when App Engine was first released.
Feedback time! (plus some cool links)
I have been writing this blog for a bit over a month now and I thought that it would be good to get some feedback from the site’s readers – i.e. you folks!
Effective In-Function Caching With PHP5
At one stage or another most programmers have written some simple in-function caching. If you don’t know what I mean by in-function caching, here is a simple example:
Migrating Your Feeds to FeedBurner
One of the things that has been skulking around my todo list for some time now has been to start using FeedBurner to track subscriber statistics. I only have a very vague concept of how many people subscribe to this blog and it would be great to get a more accurate idea.
9 PHP Debugging Techniques You Should Be Using
Isn’t writing new code great? Wouldn’t the world be a better place if all we ever had to do was write software from scratch, not having to worry about methods of classes past? Unfortunately, we all know that this is not the case. In fact, estimates say that we spend around 80% of our programming time maintaining old code. So in this blog post I will be trying to tackle that 80%, and I will see what I can do to make it less painful.
Future of Web Design 08 (and other things)
FOWD 2008 is now over, the contacts have been contacted and the meetings made (well, planned, but it sounded better my way). I am now sitting back after eight hours of emailing and phoning current clients, potential clients and other freelancers. Oh, and I bought a really sweet new mic, but more on that later.
Google App Engine
As you probably know, Google has just launched App Engine. App Engine is essentially a hosted application development platform that sits upon Google’s vast infrastructure. I could rattle on about this, but I think you should just watch the intro video from Google (at the end of this post).
The Hitchhikers Guide to PHP Load Balancing
There was once a time when running a big (or popular) web application meant running a big web server. As your application attracted more users you would add more memory and processors to your server.
Today, the ‘one huge server’ paradigm has been replaced with the idea of having a large number of smaller servers which employ one or more methods of balancing the load (known as ‘load balancing’) across the entire group (known as a ‘farm’ or ‘cluster’). This is partly down to the fall in hardware prices which made this approach more viable.
The Truth About PHP Variables
I wanted to write this post to clear up what seems to be a common misunderstanding in PHP: that using references when passing around large variables is a good way to save memory. To fully explain this I will need to explain how PHP handles variables internally. I hope that you will find this interesting and useful, and that it helps dispel some myths around references and memory management in PHP. First off, let’s cover the basics…
Working With UK Postcodes
I was recently working on a project which needed to match a user with a county based on their (UK) postcode. In the end I searched Wikipedia and made extensive use of Google to come up with a small table which matches the postcode prefix with the postal town, county, and region.
How to Avoid Freelance Cabin Fever
I have been working as a freelancer for a few months now and I have to say it is wonderful. It is great being able to work where, when and (to some extent) on what I want. When I first went freelance it was more the ‘when’ and ‘on what’ aspects that I found appealing, and I wasn’t overly worried about the ‘where’. After all, I (like, I suspect, many other web freelancers) found my interest in the web by tinkering away in my room until some ungodly hour of the night.
Fuzzy Searching in PHP: Part 1
For a while now I have been working on developing a wiki application, both for personal projects and for use by clients. As part of this I needed to implement a search feature, allowing users to search the content on the wiki. At this point there were a few options open to me. All the content was stored in a MySQL database, so I could simply use a ‘LIKE’ statement against all the stored content, but there are several problems with this method:
Fuzzy Searching in PHP: Part 2
In part one we looked at how we spider our content and how that content can be stored in a way which allows it to be searched. In part 2 I will show you how to actually perform searches on this indexed data.