Many projects with Vagrant and Puppet
CEO & Founder of Server Density.
Published on the 7th May, 2013.
When we started Server Density v2, one of the main ideas was to build it as a collection of RESTful services, all talking over HTTP.
Initially, these were installed locally on a developer’s machine and set up via Apache vhosts running each component separately.
This soon became unmaintainable on a daily basis without a lot of work. We were spending too much time discussing whether a certain component was up to date and fighting bugs caused by API incompatibilities between versions. As we added more services to deal with things beyond the core of the product, this just got worse.
The answer was Vagrant.
What is Vagrant?
Essentially, Vagrant is a command-line tool for managing VirtualBox instances (other backends are available in the latest version). You use a pre-built box (or package your own) to create a fresh virtual machine with all your tools installed, accessible via SSH from the host machine.
Once a Vagrant box is configured, it’s just a case of running `vagrant up`, waiting a while, and then a system is up and running.
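In outline, a minimal Vagrantfile looks something like this. This is only a sketch using the pre-1.2 config format the post describes; the box name and URL are placeholders, not our actual box:

```ruby
# Vagrantfile: a minimal sketch. Box name and URL are placeholders.
Vagrant::Config.run do |config|
  config.vm.box     = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"

  # Forward a guest web port so services are reachable from the host
  config.vm.forward_port 80, 8080
end
```

With that in place, `vagrant up` downloads the box if needed and boots it, and `vagrant ssh` drops you into the machine.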
Our vagrant box
We went through some iterations and experimentation to find something suitable for the way we wanted to work, combined with what we could actually achieve in the box. This is the end result for now, and I’ll list some modifications that we have planned but haven’t found the time to do.
The box is split in two, a base and an environment box:
The Base Box
The first stage in the vagrant build was to build a customised base box. We based it on Ubuntu 12.04 (64-bit), as that’s what we planned to use in production. Once that was chosen, I collected a set of dependencies and development tools that were required for each service (MongoDB, Apache, Node.js, vim, screen, etc). These were then deployed into the base box using a fairly standard puppet manifest with some available modules. At this stage, it just installs the tooling and dependencies and makes the required system-level config changes (networking setup, DNS entries).
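Wiring a puppet manifest like that into the box build is done through Vagrant’s puppet provisioner. As a hedged sketch (the directory layout and manifest name here are illustrative, not our actual repository structure):

```ruby
# Sketch: running base-box puppet manifests via the Vagrant 1.0-style
# puppet provisioner. Paths and the manifest name are illustrative.
Vagrant::Config.run do |config|
  config.vm.box = "precise64"   # Ubuntu 12.04 64-bit base (placeholder name)

  # Installs tooling and dependencies, and applies system-level config
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "puppet/manifests"
    puppet.module_path    = "puppet/modules"
    puppet.manifest_file  = "base.pp"
  end
end
```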
Once this box is built, it’s uploaded to a development webserver so all the team have access to it. The Vagrantfile and the puppet manifests live in git, alongside the development box. This means that the base box can be recreated/tweaked/reviewed by anyone at any time if that’s a necessity.
Using a base box like this loses some flexibility. Every time you want to add something new at the base level, you have to rebuild and re-upload the entire box. But it saves deployment time in the next stage, which is an overall win.
You can temporarily work around this by adding a dependency into the environment box, but if you’re not strict about how you manage this it quickly becomes a mess, with no clear record of where everything is installed, and you lose the advantage of having a pre-built base box.
The development environment
This is a little less straightforward than the base box build.
Starting with a Vagrantfile with the base box imported/declared, we added puppet modules to handle installing our agent into the box so it can report to itself, and a module to install apache vhosts.
Once that was done, we created a puppet module that can handle installing all of the services from git. This includes a clone/update, a build process (buildout/composer), and finally running any test or development data scripts that we have. This was mostly a mash of already available bits and abuse of puppet’s `exec` resource.
The Vagrantfile is just a file of Ruby code, so it was easy to add something that set the box hostname based on the username of the host, and import a separate settings file for overriding the defaults for the vagrant settings (code checkout locations being the main one).
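Because the Vagrantfile is plain Ruby, those two tweaks are only a few lines each. Something along these lines, where the settings filename, default path, and share name are illustrative rather than our actual conventions:

```ruby
# Sketch: hostname derived from the host user, plus an optional local
# settings file. Filenames and defaults here are illustrative.
Vagrant::Config.run do |config|
  # Name the VM after whoever is running it on the host machine
  config.vm.host_name = "dev-#{ENV['USER']}"

  # Default settings, overridable by an optional per-developer file
  # that evaluates to a Ruby hash, e.g. { :code_path => "~/src/code" }
  settings = { :code_path => "../code" }
  local = File.expand_path("../settings.local.rb", __FILE__)
  settings.merge!(eval(File.read(local))) if File.exist?(local)

  # Mount the (possibly overridden) code checkout location into the guest
  config.vm.share_folder "code", "/srv/code", settings[:code_path]
end
```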
The final stage for this was to add a script that will kill and then start all the code that runs as a service (tornado/celery mainly). This grew out of an ugly hack involving starting lots of things in screen, and hasn’t really been updated to anything else. It does have a convenient advantage that `screen -list` will tell you exactly what is running, and a total number at the bottom for quick verification that everything started okay.
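The script itself isn’t reproduced in this post, but the idea is that each service is relaunched in a detached, named screen session, which is why `screen -list` doubles as a status check. A hedged sketch, wired in as a shell provisioner; the service names and paths are purely illustrative:

```ruby
# Sketch: a screen-based restart script run as a Vagrant shell provisioner.
# Service names and paths are illustrative, not the real ones.
restart_script = <<-'SCRIPT'
  # Kill any previous sessions, then start each service detached
  screen -ls | awk '/Detached|Attached/ {print $1}' | while read s; do
    screen -S "$s" -X quit
  done
  screen -dmS api    python /srv/code/api/run.py
  screen -dmS worker celery worker -A tasks
  screen -list   # one line per running service, with a total at the bottom
SCRIPT

Vagrant::Config.run do |config|
  config.vm.provision :shell, :inline => restart_script
end
```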
The code is checked out into a directory shared between the host and the guest, ultimately living on the host. This uses NFS for performance, which means we can edit the code using the host’s editors and tools while the code still runs inside the vagrant box.
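In the Vagrant 1.0 config format, an NFS shared folder looks roughly like this (paths are illustrative); note that `:nfs => true` requires the box to have a host-only network:

```ruby
# Sketch: NFS-backed shared folder. Paths and the IP are illustrative.
Vagrant::Config.run do |config|
  # NFS shared folders need a static host-only network under VirtualBox
  config.vm.network :hostonly, "192.168.50.4"

  # Code lives on the host; the guest mounts it over NFS for speed
  config.vm.share_folder "code", "/srv/code", "../code", :nfs => true
end
```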
Debugging is taken care of by Xdebug, configured to point at the host IP for the PHP services, and some work with WingIDE remote debugging for the Python services, again with Wing configured to connect into the vagrant box.
Once the box was up and running, we added settings, configs and a custom domain (using vagrant-dns) to enable decent separation and ensure we don’t accidentally hardcode production/development URLs (or at least, that these are easier to catch if it does happen).
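The vagrant-dns configuration also lives in the Vagrantfile. A sketch along these lines; the plugin requires a newer Vagrant with plugin support, and the TLD and hostname here are examples rather than our real ones:

```ruby
# Requires the vagrant-dns plugin: vagrant plugin install vagrant-dns
# Hostname and TLD below are examples only.
Vagrant.configure("2") do |config|
  config.vm.hostname  = "sdapp"              # example machine name
  config.dns.tld      = "dev"                # serves e.g. sdapp.dev
  config.dns.patterns = [/^.*sdapp\.dev$/]   # hostnames to resolve
end
```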
The main feature of this box is that the puppet manifests run with each provision. These update and redeploy the code each time, simplifying the update process to a single command, across every service and repository that we have deployed.
Advantages
- Reproducible environment for everyone involved.
- The puppet manifests mean that just a `vagrant provision` and some waiting is enough to bring everything up to date with the latest master.
- Self-contained stack: you can see what is running at any point.
- Closer to production. We mostly develop on OS X but deploy to Linux; this gives us both.
- Shared URLs for testing. We can pop a URL from our vagrant machines into Hipchat, and other members of the team can use it locally, without having to change it. vagrant-dns is a big win for us there.
- Easy to install: install VirtualBox and Vagrant, get the Vagrantfiles, run `vagrant up`.
- While there’s nothing in the vagrant configuration that couldn’t be done just as easily with some scripting on the host machine, removing the need for virtualisation, it’s handy that when things go wrong a fresh rebuild is just a `vagrant up` away. Extremely useful for testing system-wide settings.
Disadvantages
- A from-nothing `vagrant up` takes 25 minutes and downloads about 2GB (1.2GB for the base box, the rest for code + dependencies + extras).
- A `vagrant provision` to update to the latest can take up to 10 minutes depending on the speed of the connection and the host machine, so it’s not that easy to ‘just test’ something. You can update the individual services manually, but then you have the problem that we started with: making sure that everyone has the same code.
- It’s hard work for the host machine. The box we have configured has 2 cores and 2GB of RAM allocated. On a 4GB MacBook Air, that can start getting a little close to resource starvation, particularly with an IDE and a debugger running.
- Debugging isn’t as easy as I’d like: setting up the debugger involves a fair amount of configuration, and you can only really debug one thing at a time, which can be awkward when you’re trying to trace values across multiple services.
- The configuration we have isn’t as close to production as I’d like (no nginx, no caching, no centralised logging) but this is just a matter of spending more time.
- It doesn’t entirely solve ‘it works for me’. It just becomes ‘it works on my vagrant’. Fortunately, instances of that seem to be a lot less common.
- Random virtualbox/vagrant/host problems. We’ve had boxes crash, networks go away and all manner of strange things. At least with the code living on the host machine, we’ve not lost work when that happens.
While it seems that there are more disadvantages than advantages, overall the reproducibility, and the simplicity of reducing updates to a single command, far outweigh the drawbacks of working in a virtualised environment.
Future plans
Most of the future plans for this revolve around gradually bringing it in line with the production environment without losing the flexibility that we have gained.
- Use the production puppet manifests where possible. Our infrastructure is entirely puppet controlled, so I’d like to increase the reuse where possible.
- Create a version that uses the multi-VM capability of vagrant to simulate a cluster, with each service running in its own VM. This could be handy for looking at scaling/communication problems.
- See if we can reduce provision time even further, with possible build optimisations. This may then transfer to our deployment system.
- Move to vagrant 1.2 and test out some other backends.
- Remove the screen based development start script and move to something more production-like. (This is only used in the vagrant box, the live deployments use proper init scripts.)
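The multi-VM idea above maps directly onto Vagrant’s `config.vm.define` blocks. A hypothetical two-service sketch; the box, service names, and IPs are all illustrative:

```ruby
# Sketch: one VM per service, simulating a small cluster.
# Box name, service names, and IPs are illustrative.
Vagrant::Config.run do |config|
  config.vm.box = "precise64-base"

  config.vm.define :api do |api|
    api.vm.host_name = "api"
    api.vm.network :hostonly, "192.168.50.10"
  end

  config.vm.define :queue do |queue|
    queue.vm.host_name = "queue"
    queue.vm.network :hostonly, "192.168.50.11"
  end
end
```

Individual machines can then be managed separately, e.g. `vagrant up api` or `vagrant ssh queue`.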
Overall, the use of vagrant has been a big win for us as a company, and has reduced a lot of the problems we were having. There’s still some work to be done until we’re completely happy with it, but I’d recommend that anyone looking at building this type of project take a serious look to see if it suits them.