Throughout the summer the people at TechWorld have been busy testing some of the most widely used IaaS providers and their services. It was a substantial test, heavily focused on the virtual machine and the services around virtual machines.
The test was extensive, and when it came to performance TechWorld focused on disk performance (IOPS, transfer rates) and network transfer rates (downloading to and uploading from a single virtual machine). The infrastructure services of the following companies were tested:
- City Cloud (Sweden)
- Amazon EC2 (USA)
- Microsoft Azure (USA)
- Profitbricks (Germany)
- Ipeer (Sweden)
- Glesys (Sweden)
- Rackspace (USA)
The number of IOPS and the transfer rate from the disks to your VM are among the most important aspects of general performance for servers in the cloud. There are other aspects of course, like how your CPU and host server are performing. However, in our experience, poor disk performance is almost exclusively the cause when someone cannot run a service in the cloud (apart from legal or licensing aspects). For instance, if you have a very heavy database that handles large volumes and especially a large number of requests, disk IO is key. In some cases, a huge database works best with specific SSD drives that are attached directly to the compute node. However, these cases are becoming fewer and fewer as all providers of IaaS are continuously working to improve performance.
The basics of the performance test were:
- 16 simultaneous IO-operations
- 50% read and 50% write operations
- 50% random and 50% sequential data
- 15 seconds ramp up time (for disks to gain highest performance)
- 32 KB blocks
- 4 GB test file
Below we have graphed the values that TechWorld got in their test. We start with IOPS, which was one of the values that had the greatest spread among the services. EC2 from Amazon performed poorly, as it does in many tests. I have many times pondered what makes it so slow. The only thing I can think of is that they are the leader and do not need to be the best. I also have a feeling that they keep hardware longer than most providers. This may be true about the architecture as well.
With “only” 1530 IOPS one would think EC2 would be dead last. But not so fast! Azure (Microsoft) managed to come in even lower, with its servers reaching only 1100 IOPS, making it the worst performer on IOPS. In our world this is downright awful. There are a bunch of applications that will not work well at these low levels. If you build for the cloud you will most often spread applications over many machines, since that improves performance in most cases. However, there are many companies out there with legacy apps that simply need a traditional VM to deliver decent performance.
The two other Swedish services, together with Rackspace, deliver between 3000 and 4000 IOPS, which allows a broad set of applications to work well. Profitbricks performs better than AWS but, as you can see, that does not say much.
City Cloud delivers 8000 IOPS on average in the test, more than twice that of the second best performing service. Compared with the two big providers, City Cloud outperforms Microsoft Azure by more than seven times and Amazon EC2 by more than five times.
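The ratios quoted above follow directly from the published IOPS figures. As a quick check, here is a minimal Python sketch; the dictionary and the `speedup` helper are names made up for this illustration and only contain numbers already stated in the text.

```python
# IOPS figures quoted in the text (the City Cloud value is an average).
iops = {"City Cloud": 8000, "Amazon EC2": 1530, "Microsoft Azure": 1100}

def speedup(fast: str, slow: str) -> float:
    """Return how many times more IOPS `fast` delivered than `slow`."""
    return iops[fast] / iops[slow]

for rival in ("Microsoft Azure", "Amazon EC2"):
    print(f"City Cloud vs {rival}: {speedup('City Cloud', rival):.1f}x")
```

This gives roughly 7.3x against Azure and 5.2x against EC2.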
Disk transfer rates
Disk transfer rates are of course also of interest and show a similar pattern to IOPS. The graph below shows the number of MB per second each VM can receive from the disks.
At 34 and 48 MB/s respectively, Azure and AWS also performed well below average in disk transfer rates. Azure consistently performed worst among the tested services, closely followed by AWS. City Cloud (at 240 MB/s) once again outperformed the other services. In fact, City Cloud had transfer rates that were double those of the second best performer.
Depending on what type of workload you have, latency is either important or extremely important. If you fetch large blocks of data, latency matters less, but the smaller the blocks become, the heavier the traffic back and forth to fetch them. Latency between host nodes and storage nodes therefore becomes another crucial factor for general disk performance.
The graph below shows a similar pattern as before. AWS and Azure perform poorly, followed by Profitbricks, Rackspace, Glesys and Ipeer. Again City Cloud stands out, with only 2 ms average latency compared to between 4 and 14 ms for the competition.
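The transfer rates track the IOPS numbers because every operation in this test moves one 32 KB block, so throughput is roughly IOPS times block size. The sketch below illustrates that relationship with the figures from the text. It assumes 1 MB = 1024 KB, which is our assumption and not something TechWorld states, and `throughput_mb_s` is a helper name invented for this example.

```python
BLOCK_KB = 32  # block size from the test setup

def throughput_mb_s(iops: float, block_kb: int = BLOCK_KB) -> float:
    """Throughput implied by an IOPS figure, assuming 1 MB = 1024 KB."""
    return iops * block_kb / 1024

# provider: (IOPS, MB/s), both as quoted in the text
measured = {
    "Microsoft Azure": (1100, 34),
    "Amazon EC2": (1530, 48),
    "City Cloud": (8000, 240),
}

for name, (iops, mb_s) in measured.items():
    implied = throughput_mb_s(iops)
    print(f"{name}: implied {implied:.0f} MB/s, reported {mb_s} MB/s")
```

The implied values land close to the reported ones, which suggests the two graphs are largely two views of the same measurement.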
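The latency figures are also consistent with the IOPS results. If we assume the benchmark kept all 16 IO operations outstanding for the whole run (an assumption on our part), Little's law says average latency is the number of outstanding operations divided by the IOPS. A minimal sketch, with `avg_latency_ms` as a hypothetical helper name:

```python
QUEUE_DEPTH = 16  # simultaneous IO operations from the test setup

def avg_latency_ms(iops: float, queue_depth: int = QUEUE_DEPTH) -> float:
    """Average latency in ms implied by Little's law: outstanding ops / IOPS."""
    return queue_depth * 1000 / iops

# IOPS values quoted in the text
for name, iops in [("City Cloud", 8000), ("4000 IOPS tier", 4000), ("Microsoft Azure", 1100)]:
    print(f"{name}: about {avg_latency_ms(iops):.1f} ms")
```

With these inputs the estimate is about 2 ms for City Cloud, 4 ms at 4000 IOPS and just over 14 ms for Azure, which lines up with the 2 ms and 4 to 14 ms range reported above.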
Bandwidth / Transit
When testing how much data can be pushed to and from a virtual machine, the pattern changes dramatically. This has little to do with disk performance and much more to do with the network in its various layers.
Outgoing bandwidth topped out at around 790 Mbps, where Azure was the fastest, closely followed by both Ipeer and City Cloud at 770 and 730 Mbps respectively. On the lower side Glesys stands out with only 98 Mbps, which is not a desirable speed at any level. Again AWS performed very poorly and only delivered 105 Mbps out. Rackspace delivered 190 Mbps outgoing. Comparing the worst with the best, there is a difference of about a factor of 8.
Incoming bandwidth was delivered best by Rackspace at 750 Mbps. Ipeer was close behind at 710 Mbps and was the one that delivered best looking at both in and out. City Cloud delivered 280 Mbps and AWS slightly better at 340 Mbps. Again Glesys performed below 100 Mbps, with 97 Mbps incoming.
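One way to look at "both in and out" is to rank each provider by its weaker direction. The sketch below does that with the figures quoted above. Azure is left out because the text only gives its outgoing figure, and ranking by the minimum is our own choice for this illustration, not TechWorld's method.

```python
# provider: (outgoing Mbps, incoming Mbps), as quoted in the text
mbps = {
    "Ipeer": (770, 710),
    "City Cloud": (730, 280),
    "Rackspace": (190, 750),
    "Amazon EC2": (105, 340),
    "Glesys": (98, 97),
}

# Rank by the slower of the two directions, best first.
ranked = sorted(mbps, key=lambda p: min(mbps[p]), reverse=True)

for name in ranked:
    out_mbps, in_mbps = mbps[name]
    print(f"{name}: out {out_mbps}, in {in_mbps}, weaker direction {min(out_mbps, in_mbps)} Mbps")
```

Ranked this way Ipeer comes first and Glesys last, matching the reading in the text.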
Our technical team has looked closely at these numbers and we have today made adjustments to make sure one VM can deliver closer to 900 Mbps both in and out. As we have many DoS-protecting layers, we needed to make further adjustments so that those systems do not cut off any of the higher-rate incoming and outgoing traffic.
Overall City Cloud outperforms its competitors by at least a factor of two in disk performance. As developers know, this is crucial both for being able to run apps well in the cloud and for financial reasons. What is the price difference if you only need one machine instead of three to deliver the same application? Clearly there is a lot of money to be saved.
Op5 ran some big real-world tests during the summer, where their operations team set up an environment in City Cloud (Stockholm DC) consisting of well over 200 cores and 750 GB RAM. They had tried this in AWS but could not get it to work well. City Cloud, on the other hand, provided a platform that performed very well. It also shows how well op5 Monitor can work in very large environments of more than 150,000 monitoring points.
Read more about op5's test here: https://kb.op5.com/display/~chrinils/2014/06/24/Set+up+a+large+op5+Monitor+environment
The test (Swedish): http://techworld.idg.se/2.2524/1.591983/basta-servern-i-molnet-2014
Take the chance to set up a City Cloud environment by registering here!