Removing Memcached because it’s too slow

By David Mytton,
CEO & Founder of Server Density.

Published on the 11th May, 2012.

Update: There has been a lot of discussion of this post around whether this is a problem with Memcached or something else. The post content is accurate but, in hindsight, “Memcached” should really have read “Membase + Moxi”. Membase provides additional tools on top of Memcached, so whilst Memcached itself wasn’t slow, the Moxi proxy we used to provide failover was. The argument is that the failover MongoDB provides out of the box is similar to what Membase + Moxi provides. So whilst it’s accurate that Memcached was replaced because it was faster to use MongoDB than to rely on the Moxi proxy, this should’ve been clearer.

We’ll shortly be deploying some changes to the Server Density codebase to remove Memcached as a component in the system. We currently use it for two purposes:

  1. UI caching: the initial load of your account data (e.g. server lists, alert lists, user lists) is taken directly from the MongoDB database and then cached until you make a change to the data, at which point we invalidate the cache.
  2. Throttling: the performance impact of the global lock in MongoDB 1.8 was such that we couldn’t insert our monitoring postback data directly into MongoDB. It had to be inserted into Memcached first and then throttled into MongoDB via a few processor daemons (as opposed to larger numbers of web clients).
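As a rough illustration of the two uses listed above, here is a minimal Python sketch. It is not our production code: plain dicts, lists and a deque stand in for MongoDB and Memcached, and the class names (`CachedAccountData`, `ThrottledWriter`), the `batch_size` parameter and the sample data are all hypothetical.

```python
from collections import deque


class CachedAccountData:
    """Purpose 1: read through a cache, invalidate it when the data changes."""

    def __init__(self, database):
        self.database = database  # stand-in for MongoDB: account id -> data
        self.cache = {}           # stand-in for Memcached
        self.database_reads = 0   # counts how often we fall through to the database

    def load(self, account_id):
        if account_id in self.cache:
            return self.cache[account_id]
        self.database_reads += 1
        data = self.database[account_id]
        self.cache[account_id] = data
        return data

    def update(self, account_id, data):
        self.database[account_id] = data
        self.cache.pop(account_id, None)  # invalidate so the next load is fresh


class ThrottledWriter:
    """Purpose 2: buffer postbacks, then drain them in bounded batches."""

    def __init__(self, collection, batch_size):
        self.collection = collection  # stand-in for a MongoDB collection
        self.batch_size = batch_size  # hypothetical cap on writes per drain
        self.buffer = deque()         # stand-in for the Memcached buffer

    def receive(self, postback):
        # Called by the many web clients; never touches the database.
        self.buffer.append(postback)

    def drain_once(self):
        # Called by one of the few processor daemons.
        written = 0
        while self.buffer and written < self.batch_size:
            self.collection.append(self.buffer.popleft())
            written += 1
        return written


store = CachedAccountData({"account-1": {"servers": ["web1", "web2"]}})
store.load("account-1")  # cache miss, reads the database
store.load("account-1")  # cache hit
store.update("account-1", {"servers": ["web1"]})
store.load("account-1")  # miss again because the update invalidated the cache

collection = []
writer = ThrottledWriter(collection, batch_size=2)
for n in range(5):
    writer.receive({"postback": n})
drained = [writer.drain_once() for _ in range(3)]
```

The point of the second class is that the number of writers hitting the database is set by how many daemons call `drain_once`, not by how many web clients call `receive`.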

Performance map

This has worked well for over a year now, but with the release of MongoDB 2.0 the impact of the global lock is significantly reduced because of much smarter yielding. This is only set to get better with database-level locking in 2.2 and further concurrency improvements in future releases.

We’ve already removed throttling from other aspects of our codebase but our performance metrics show that we’re now finally able to remove Memcached completely, because directly accessing MongoDB is significantly faster. Indeed, our average database query response time is 0.43ms compared to 24.2ms from Memcached.
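To put those two figures side by side, the short Python sketch below works out the ratio and the absolute difference. The `mean_latency_ms` helper is a hypothetical illustration of how an average response time could be sampled; it is not how the numbers above were collected.

```python
import time


def mean_latency_ms(operation, repetitions):
    """Average wall-clock time of one call to `operation`, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        operation()
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds * 1000.0 / repetitions


# Averages quoted in the post.
MONGODB_MS = 0.43
MEMCACHED_MS = 24.2

speedup = MEMCACHED_MS / MONGODB_MS   # roughly 56 times faster
saved_ms = MEMCACHED_MS - MONGODB_MS  # roughly 24ms saved per request

# Example of sampling a (trivial) operation with the hypothetical helper.
sample_ms = mean_latency_ms(lambda: None, 1000)
```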

Database throughput

Response time

We have a number of MongoDB clusters and these figures are for our primary data store where all application data lives (separate from our historical time series data). There are 2 shards, each made up of 4 data nodes, with 2 nodes per data centre (Washington DC and San Jose in the US). They are dedicated servers running Ubuntu 10.04 LTS with 8GB RAM, Intel Xeon Sandy Bridge quad-core 3.4GHz CPUs and 100GB SSDs for the MongoDB data files, connected to a 2Gbps internal network.


Removing Memcached as a component simplifies our system even further, so our core technology stack will consist only of Apache, PHP, Python, MongoDB and Ubuntu. This eliminates the need for Memcached itself running on a separate cluster, the Moxi proxy to handle failover, additional monitoring for another component and a different scaling profile. Getting Memcached libraries for PHP and Python is also a pain if you want to use officially supported packages (through Ubuntu LTS), especially when you want to use later releases. And we can get rid of that extra 24ms of response time.
