Black box cloud pricing
Pricing servers used to be fairly simple, and still is in many cases. A number of variables – memory, storage, CPU, data transfer – are combined to produce a price. Initially you might have to guess, and future capacity planning is usually based on a known growth percentage, but the values for each variable tend to be known quantities. You can easily work out how much your infrastructure costs now, and how much it will cost in the future.
Cloud computing has changed this from a fixed rate to one based on actual real-time usage. Instead of a 12-month contract for physical hardware you pay per hour, or per GB transferred. This reduces initial costs and makes it easier to see your actual usage – no guessing how much additional capacity you might need.
But some cloud providers throw a spanner in the works with additional usage variables which seem like a black box – you have no idea what kind of figures to expect until you start using it.
For example, Amazon EC2 is priced per hour, a very simple method of billing which can easily be calculated. However, if you use their Elastic Block Store (EBS) you pay per provisioned GB as expected, but also for I/O operations – $0.10 per 1 million operations. Amazon provide an example for this:
As an example, a medium sized website database might be 100 GB in size and expect to average 100 I/Os per second over the course of a month. This would translate to $10 per month in storage costs (100 GB x $0.10/month), and approximately $26 per month in request costs (~2.6 million seconds/month x 100 I/O per second * $0.10 per million I/O).
But there are many variables that could affect I/O per second – OS caching, database access patterns (does it use RAM or flat files, and how much of each), service access patterns (do you have daily spikes, are images being uploaded) – and more generally, what the server is used for. There could also be fluctuations – a software upgrade could change how things work. In contrast, purchasing a server with a 100GB disk would cost 100 * $n, where $n is the per GB price. The same with Amazon is (100 * $n) + some unknown I/O quantity.
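To get a feel for how much that unknown quantity matters, here is a rough sketch in Python that reruns Amazon's example for a few different average I/O rates. The prices are the ones quoted above ($0.10 per GB per month and $0.10 per million I/O requests); the function name, the 30-day month and the alternative I/O rates are my own assumptions for illustration, not anything Amazon publishes.

```python
# Rough estimate only: prices are taken from the Amazon example quoted above,
# and a 30-day month is assumed (2,592,000 seconds, i.e. "~2.6 million").
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def ebs_monthly_cost(size_gb, avg_io_per_second,
                     price_per_gb=0.10, price_per_million_io=0.10):
    """Return (storage_cost, io_cost) in dollars for one month."""
    storage_cost = size_gb * price_per_gb
    io_requests = avg_io_per_second * SECONDS_PER_MONTH
    io_cost = io_requests / 1_000_000 * price_per_million_io
    return storage_cost, io_cost


# Hypothetical I/O rates: the storage cost stays fixed, the I/O cost does not.
for rate in (25, 100, 400):
    storage, io = ebs_monthly_cost(100, rate)
    print(f"{rate:>4} I/O per second: storage ${storage:.2f}, "
          f"I/O ${io:.2f}, total ${storage + io:.2f}")
```

The storage line is the same $10 every time, while the I/O charge scales directly with a number you can't know until the service is running.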
It’s not just Amazon. The Rackspace Cloud Sites service gives you 100,000 “compute cycles” per month, with additional cycles billed at $0.01 each. Of course this isn’t the same as EC2 and EBS, as it is more like a Google App Engine service (which also has CPU time billing) – the Rackspace Cloud Servers product uses “normal” pricing.
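The overage arithmetic itself is trivial, as this small Python sketch shows using the allowance and price quoted above (the function name is hypothetical). The hard part is the input: you only learn how many cycles your site burns by running it.

```python
# Figures as quoted in this post; the function name is made up for illustration.
INCLUDED_CYCLES = 100_000
PRICE_PER_EXTRA_CYCLE = 0.01  # dollars


def cycle_overage_cost(cycles_used):
    """Return the extra monthly charge in dollars for cycles beyond the allowance."""
    extra_cycles = max(0, cycles_used - INCLUDED_CYCLES)
    return extra_cycles * PRICE_PER_EXTRA_CYCLE


# Hypothetical usage levels for one month.
for used in (80_000, 100_000, 150_000):
    print(f"{used} cycles used: ${cycle_overage_cost(used):.2f} overage")
```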
Billing by compute cycles might make sense for services where you don’t get access to the underlying machines, such as Amazon SimpleDB, which charges for storage plus utilisation, but even these charges are fairly obscure. For example:
Amazon SimpleDB measures the machine utilization of each request and charges based on the amount of machine capacity used to complete the particular request (SELECT, GET, PUT, etc.), normalized to the hourly capacity of a circa 2007 1.7 GHz Xeon processor.
How do you go about calculating this other than by using the service and testing queries?
So why might Amazon charge for EBS using a per I/O billing model? One theory is that it’s related to “wear” on the disks, particularly because EBS is run across multiple physical devices for durability. However, that could just be factored into the per GB cost.
Our hosting infrastructure for our server monitoring service, Server Density, is billed based on resource allocation and doesn’t include these black-box billing calculations. This makes it very easy for us to work out how much our users cost us. This would be difficult with an I/O billing method – can you separate I/O requests on a user level? This can be important for calculating user value for marketing purposes.
The success of EC2 indicates that whilst this might be interesting, it isn’t really a major problem. However, I’d be interested to know how costing is worked out for anyone that does use EC2 or similar services. Do you see sudden changes in these charges and can the unknown factor be mitigated in other ways?