The Server Density storage backend – utility storage from Rackspace
Written by David Mytton
We just completed a move of our server monitoring service, Server Density, from “virtual/cloud” servers to a physical server with Rackspace (we were previously with Slicehost/The Rackspace Cloud and are still in the same US Rackspace data centre). I plan to blog about the move itself in the future, but one of the key reasons for moving was the storage backend that we are now using.
For every server that we monitor, we store a relatively large quantity of data. Every 60 seconds, our monitoring agent reports back statistics such as CPU load, memory usage and a list of all running processes. We store the most recently received data so we can do “real time” alerting against, and display, the very latest statistics, and we keep data at 5 minute sample intervals. This means that per server we store 12 data points per hour, 288 per day and 8640 per 30 days. That’s quite a lot of data.
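As a quick sanity check on those figures, here is a minimal sketch of the arithmetic. The function and constant names are made up for illustration and are not part of the Server Density codebase; the only input taken from above is the 5 minute sample interval.

```python
# Data points stored per server at one sample every 5 minutes.
SAMPLE_INTERVAL_MINUTES = 5  # sample interval quoted in the post


def samples_per_server(days: int, interval_minutes: int = SAMPLE_INTERVAL_MINUTES) -> int:
    """Return the number of data points kept for one server over `days` days."""
    minutes = days * 24 * 60
    return minutes // interval_minutes


per_hour = 60 // SAMPLE_INTERVAL_MINUTES
per_day = samples_per_server(1)
per_30_days = samples_per_server(30)

print(per_hour, per_day, per_30_days)  # 12 288 8640
```

Multiply the 30 day figure by the number of monitored servers to get a feel for how quickly the total grows.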
As blogged about previously, we recently moved to MongoDB, a non-relational, document-based storage system, for our backend database. This allows us to have extremely fast queries and make the most efficient use of our storage space.
MongoDB uses a flat file structure and we store all our data within a single “database”, which is currently about 60GB in size. The problem we were continually facing with our previous server setup was quickly using up all available disk space. This was partly due to limited space available on the server but also due to inefficient data storage. Both of these problems have now been solved.
Our server with Rackspace has internal RAID1 storage which we use for the OS, the application code and backups (as part of our backup strategy). However, the database itself is run off a network attached storage device. This is provided by Rackspace’s Utility Network Attached Storage (uNAS) product, which gives us “unlimited” disk space that we pay for on a per GB basis. It is mounted as a filesystem in Linux and so can be compared to the Amazon Elastic Block Store (EBS) service, except that an EBS volume is limited to 1TB of data.
david@pan ~: df -ah
Filesystem Size Used Avail Use% Mounted on
1.9T 59G 1.8T 4% /mnt/unas
The uNAS is priced at $0.70 per GB. As our usage increases, we expect to be able to negotiate this price down as we can commit to a specific minimum amount each month. Amazon’s EBS is priced at $0.10 per GB of provisioned storage. The key here is “provisioned”. With the uNAS you have “unlimited” available disk space and pay for what you actually use. With EBS you have to set up a pre-defined amount, e.g. 100GB, and you pay for that regardless of whether you’re using 1GB or 100GB.
You would need to provision more than you are using with EBS because re-provisioning the volume requires taking a snapshot of the data and starting up a new volume from that snapshot. This takes time and involves downtime whilst the volume is built.
EBS also has other charges – $0.10 per 1 million I/O requests. When we tested EBS over a 24 hour period, we used 100 million I/O requests. There are also backup (snapshot) fees because you are backing up to S3. These are priced at $0.15 per GB-month of data stored and there are fees for PUT and GET requests. Whilst the “at a glance” pricing of EBS is cheaper than uNAS, there are other factors at play. You have multiple variables, not just disk usage.
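To show how those variables add up, here is a rough back-of-the-envelope sketch using only the prices quoted above. It makes several assumptions that are not in the figures themselves: the uNAS price is treated as a monthly charge, the month is 30 days, our 24 hour I/O test is taken as representative of every day, one full copy of the data is held as a snapshot, and the S3 PUT and GET fees are ignored. The function names are invented for illustration.

```python
# Rough monthly cost comparison from the prices quoted in the post.
# Assumptions (not from the post): uNAS price is per month, 30 day month,
# constant daily I/O, one full snapshot copy, PUT/GET fees ignored.
UNAS_PER_GB = 0.70            # uNAS, per GB actually used
EBS_PER_GB = 0.10             # EBS, per GB provisioned
EBS_PER_MILLION_IO = 0.10     # EBS, per 1 million I/O requests
EBS_SNAPSHOT_PER_GB = 0.15    # EBS snapshots to S3, per GB-month stored


def unas_monthly(used_gb: float) -> float:
    """uNAS: pay only for the space actually used."""
    return used_gb * UNAS_PER_GB


def ebs_monthly(provisioned_gb: float, used_gb: float,
                io_millions_per_day: float, days: int = 30) -> float:
    """EBS: provisioned storage plus I/O requests plus snapshot storage."""
    storage = provisioned_gb * EBS_PER_GB
    io = io_millions_per_day * days * EBS_PER_MILLION_IO
    snapshots = used_gb * EBS_SNAPSHOT_PER_GB
    return storage + io + snapshots


# 59GB used (from the df output), a 100GB provisioned volume,
# and 100 million I/O requests per day as seen in our 24 hour test.
print(f"uNAS: ${unas_monthly(59):.2f}")           # uNAS: $41.30
print(f"EBS:  ${ebs_monthly(100, 59, 100):.2f}")  # EBS:  $318.85
```

Under these assumptions the I/O charge dominates the EBS total, which is why the headline per GB price alone is a poor basis for comparison. Different I/O patterns would change the result considerably.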
When choosing how to grow our server infrastructure, we looked at a number of solutions. A more detailed write-up of this will be published later but an important factor was development time. Whilst we’re paying more to Rackspace for the servers than we would at Amazon, the difference is actually not that much and, given the time it would have taken to build our own management systems, it would have actually been more expensive to stay in a “cloud” environment. That’s not to mention the support and uptime guarantees provided by Rackspace – uptime is important to a server monitoring service!
Being able to scale our data storage indefinitely simply by paying for more disk space is a good short & medium term solution whilst we are still a small company. Our ultimate plan is to shard our data across multiple servers, not only for redundancy but also to keep disk usage efficient. The Rackspace uNAS product allows us to support our current growth whilst longer term plans are finalised.