Lessons from using Ansible exclusively for 2 years.

By Corban Raun, of Raunco
Published on the 24th March, 2015.

Today we’re really happy to be sharing an awesome guest post written by Corban Raun. Corban has been working with Ansible for ~2 years and is responsible for developing our Ansible playbook!

He’s been trying to automate systems administration since he started learning Linux many years ago. If you’d like to learn more, say thanks, or ask any questions, you can find Corban on Twitter (@corbanraun) and on his website. So without further ado, here’s his great article about Ansible. Enjoy!


As a Linux Systems Administrator, I came to a point in my career where I desperately needed a configuration management tool. I started looking at products like Puppet, Chef and SaltStack but I felt overwhelmed by the choice and wasn’t sure which tool to choose.

I needed to find something that worked, worked well, and didn’t take a lot of time to learn. All of the existing tools seemed to have their own unique way of handling configuration management, each with its own pros and cons. During this time a friend and mentor suggested I look into a lesser-known product called Ansible.

No looking back

I have now been using Ansible exclusively for ~2 years on a wide range of projects, platforms and application stacks including Rails, Django, and Meteor web applications; MongoDB clustering; user management; CloudStack setup; and monitoring.

I also use Ansible to provision cloud providers like Amazon, Google, and DigitalOcean; and for any task or project that requires repeatable processes and a consistent environment (which is pretty much everything).

Ansible vs Puppet, Chef and SaltStack

One reason I chose Ansible was its ability to maintain a fully immutable server architecture and design. We will get to exactly what I mean later, but it’s important to note: my goal in writing this post is not to compare or contrast Ansible with other products. There are many articles available online regarding that. In fact, some of the things I love about Ansible are available in other configuration management tools.

My hope with this article is actually to be able to give you some Ansible use cases, practical applications, and best practices; with the ulterior motive of persuading you that Ansible is a product worth looking into. That way you may come to your own conclusions about whether or not Ansible is the right tool for your environment.

Immutable Server Architecture

When starting a new project with Ansible, one of the first things to think about is whether or not you want your architecture to support Immutable servers. For the purposes of this article, having an Immutable server architecture means that we have the ability to create, destroy, and replace servers at any time without causing service disruptions.

As an example, let’s say that part of your server maintenance window includes updating and patching servers. Instead of updating a currently running server, we should be able to spin up an exact server replica that contains the upgrades and security patches we want to apply. We can then replace and destroy the current running server. Why or how is this beneficial?

By creating a new server that is exactly the same as our current environment including the new upgrades, we can then proceed with confidence that the updated packages will not break or cause service disruption. If we have all of our server configuration in Ansible using proper source control, we can maintain this idea of Immutable architectures. By doing so we can keep our servers pure and unadulterated by those who might otherwise make undocumented modifications.

Ansible allows us to keep all of our changes centralized. One often unrealized benefit of this is that our Ansible configuration can be looked at as a type of documentation and disaster recovery solution. A great example of this can be found in the Server Density blog post on Puppet.

This idea of Immutable architecture also helps us to become vendor-agnostic, meaning we can write or easily modify an Ansible playbook which can be used across different providers. This includes custom datacenter layouts as well as cloud platforms such as Amazon EC2, Google Compute Engine, and Rackspace. A really good example of a multi-vendor Ansible playbook can be seen in the Streisand project.

Use Cases

Use Case #1: Security Patching

Ansible is an incredibly powerful and robust configuration management system. My favorite feature? Its simplicity. This can be seen by how easy it is to patch vulnerable servers.

Example #1: Shellshock

The following playbook was run against 100+ servers and patched the bash vulnerability in less than 10 minutes. The example below updates both Debian and Red Hat Linux variants. Because serial is set to 50%, it runs against half of the hosts defined in the inventory file first, then against the remaining half.

- hosts: all
  gather_facts: yes
  remote_user: craun
  serial: "50%"    # work through the inventory in batches of half the hosts
  become: yes      # privilege escalation; written as "sudo: yes" in older Ansible releases
  tasks:
    - name: Update Shellshock (Debian)
      apt: name=bash
           state=latest
           update_cache=yes
      when: ansible_os_family == "Debian"

    - name: Update Shellshock (RedHat)
      yum: name=bash
           state=latest
           update_cache=yes
      when: ansible_os_family == "RedHat"
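
To confirm that a patch like this took effect, a follow-up check can be run against the same hosts. The sketch below is not part of the original playbook: it assumes the widely circulated Shellshock (CVE-2014-6271) probe string, and the registered variable name shellshock_check is made up for illustration.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    # A patched bash ignores the command trailing the function definition
    # and prints only "test"; a vulnerable one also prints "vulnerable".
    - name: Probe bash for Shellshock (illustrative check)
      shell: |
        env x='() { :;}; echo vulnerable' bash -c "echo test"
      register: shellshock_check   # hypothetical variable name
      changed_when: false          # a read-only probe never changes the host
      failed_when: "'vulnerable' in shellshock_check.stdout"
```

Any host that still reports a failure here would need another look before being considered patched.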

Example #2: Heartbleed and SSH

The following playbook was run against 100+ servers to patch the Heartbleed vulnerability. At the time, I also noticed that the servers needed an updated version of OpenSSH. The example below updates both Debian and Red Hat Linux variants. It will patch and reboot 25% of the servers at a time until all of the hosts defined in the inventory file are updated.

- hosts: all
  gather_facts: yes
  remote_user: craun
  serial: "25%"    # patch and reboot a quarter of the hosts at a time
  become: yes      # privilege escalation; written as "sudo: yes" in older Ansible releases
  tasks:
    - name: Update OpenSSL and OpenSSH (Debian)
      apt: name={{ item }}
           state=latest
           update_cache=yes
      with_items:
        - openssl
        - openssh-client
        - openssh-server
      when: ansible_os_family == "Debian"

    # Note: the Red Hat client package is named openssh-clients (plural)
    - name: Update OpenSSL and OpenSSH (RedHat)
      yum: name={{ item }}
           state=latest
           update_cache=yes
      with_items:
        - openssl
        - openssh-clients
        - openssh-server
      when: ansible_os_family == "RedHat"
  post_tasks:
    - name: Reboot servers
      command: reboot
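
One caveat with a bare reboot command is that it drops the SSH connection mid-task, so the play may report the host as failed or unreachable. Ansible 2.7 and later ship a reboot module that restarts the host and waits for it to come back. A minimal sketch, assuming such a release; the timeout value is an arbitrary illustration, not a recommendation:

```yaml
- hosts: all
  serial: "25%"
  become: yes
  post_tasks:
    - name: Reboot and wait for the host to return
      reboot:
        reboot_timeout: 600   # seconds to wait for the host; illustrative value
```

Combined with serial, this should keep a batch from being marked done until its hosts are reachable again.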

Use Case #2: Monitoring

One of the first projects I used Ansible for was to simultaneously deploy and remove a monitoring solution. The project was simple: remove Zabbix and replace it with Server Density. This was incredibly easy with the help of Ansible. I ended up enjoying the project so much, I open sourced it.

One of the things I love about Ansible is how easy it is to write playbooks, and yet always have room to improve upon them. The Server Density Ansible playbook is the result of many revisions to my original code that I started a little over a year ago. I continually revisit and make updates using newfound knowledge and additional features that have been released in the latest versions of Ansible.

Everything Else

Ansible has many more use cases than I have mentioned in this article so far, like provisioning cloud infrastructure, deploying application code, managing SSH keys, configuring databases, and setting up web servers. One of my favorite open source projects that uses Ansible is called Streisand. The Streisand project is a great example of how Ansible can be used with multiple cloud platforms and data center infrastructures. It shows how easy it is to take something difficult like setting up VPN services and turn it into a painless and repeatable process.

Already using a product like Puppet or SaltStack? You can still find benefits to using Ansible alongside other configuration management tools. Have an agent that needs to be restarted? Great! Ansible is agentless, so you could restart your agents by running something like this from the command line (older Ansible releases used --sudo in place of --become):

ansible -i inventories/servers all -m service -a "name=salt-minion state=restarted" -u craun -K --become

You can even use Ansible to install the agents required by other configuration management tools.
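
If the restart is something you do regularly, the same ad hoc command can be kept in source control as a small playbook. This is a sketch of one way to write it, reusing the module and arguments from the command above:

```yaml
- hosts: all
  remote_user: craun
  become: yes
  tasks:
    - name: Restart the Salt minion agent
      service:
        name: salt-minion
        state: restarted
```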

Best practices

In the last few years using Ansible I have learned a few things that may be useful should you choose to give it a try.

Use Ansible Modules where you can

When I first started using Ansible, I used the command and shell modules fairly regularly. I was so used to automating things with Bash that it was easy for me to fall into old habits. Ansible has many extremely useful modules. If you find yourself using the command and shell modules often in a playbook, there is probably a better way to do it. Start off by getting familiar with the modules Ansible has to offer.
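
As a hypothetical illustration of the difference (the package and service are chosen arbitrarily, and Debian-family hosts are assumed), compare a shell-based task with its module-based equivalents. The module versions are idempotent: they report a change only when something on the host was actually modified.

```yaml
- hosts: all
  become: yes
  tasks:
    # Old habit: runs on every play and always reports "changed".
    # - name: Install and start Nginx (shell)
    #   shell: apt-get install -y nginx && service nginx start

    # Module equivalents: each task describes the desired end state.
    - name: Install Nginx
      apt:
        name: nginx
        state: present

    - name: Ensure Nginx is running
      service:
        name: nginx
        state: started
```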

Make your roles modular (i.e. reusable)

I used to maintain a separate Ansible project folder for every new application stack or project. I found myself copying the exact same roles from one project to another and making minor changes to them (such as Nginx configuration or vhost files). I found this to be inefficient and annoying as I was essentially repeating steps. It wasn’t until I changed employers that I learned from my teammates that there is a much better way to set up projects. As an example, one thing Ansible lets you do is create templates using Jinja2. Let’s say we have an Nginx role with the following Nginx vhost template:

server {
  listen 80;

  location / {
    return 302 https://$host$request_uri;
  }
}

server {
  listen 443 ssl spdy;
  ssl_certificate    /etc/ssl/certs/mysite.crt;
  ssl_certificate_key    /etc/ssl/private/mysite.key;
  server_name www.mysite.com 192.168.1.1;

  location / {
    root   /var/www/public;
    index  index.html index.htm;
  }
}

While the above example is perfectly valid, we can make it modular by adding some variables:

server {
  listen 80;

  location / {
    return 302 https://$host$request_uri;
  }
}

server {
  listen 443 ssl spdy;
  ssl_certificate    {{ ssl_certificate_path }};
  ssl_certificate_key    {{ ssl_key_path }};
  server_name {{ server_name }} {{ ansible_eth0.ipv4.address }};
  location / {
    root   {{ web_root }};
    index  index.html index.htm;
  }
}

We can then alter these variables within many different playbooks while reusing the same Nginx role:

- hosts: website
  gather_facts: yes
  remote_user: craun
  become: yes      # privilege escalation; written as "sudo: yes" in older Ansible releases
  vars:
    ssl_certificate_path: "/etc/ssl/certs/mysite.crt"
    ssl_key_path: "/etc/ssl/private/mysite.key"
    server_name: "www.mysite.com"
    web_root: "/var/www/public"
  roles:
    - nginx
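
For completeness, the role side of this arrangement might look like the sketch below. The file layout, the template name (vhost.conf.j2), the destination path, and the handler name are all assumptions for illustration and are not taken from the author's role; Debian-family hosts are assumed for the apt task.

```yaml
# roles/nginx/tasks/main.yml (hypothetical layout)
---
- name: Install Nginx
  apt:
    name: nginx
    state: present

- name: Render the vhost from the Jinja2 template
  template:
    src: vhost.conf.j2            # hypothetical template file name
    dest: "/etc/nginx/sites-enabled/{{ server_name }}.conf"
  notify: Reload Nginx

# roles/nginx/handlers/main.yml (hypothetical layout)
---
- name: Reload Nginx
  service:
    name: nginx
    state: reloaded
```

Because the template only refers to variables, each playbook that includes the role can supply its own values without touching the role itself.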

Test, Rinse, Repeat

Test your changes, and test them often. The practice and idea of testing out changes is not a new one. It can, however, become difficult to test modifications when both sysadmins and developers are making changes to different parts of the same architecture. One of the reasons I chose Ansible is its ability to be used and understood by both traditional systems administrators and developers. It is a true DevOps tool.

For example, it’s incredibly simple to integrate Ansible with tools like HashiCorp’s Vagrant. By combining the tools, you and your developers will be more confident that what is in production can be repeated and tested in a local environment. This is crucial when troubleshooting configuration and application changes. Once you have verified and tested your changes with these tools, you should have relatively high confidence that they will not break anything (remember what immutable means?).

What now?

As mentioned previously, my goal was not to compare Ansible to other products. After all, you can find uses for it in environments where other configuration management tools are already in place, and some of the features I have talked about are available in other products too.

Hopefully this article gave you an idea as to why Ansible may be useful in your server architecture. If you only take one thing from this article, let it be this: Ansible can help you maintain and manage any server architecture you can imagine, and it’s a great place to get started in the world of automation.

  • Michael Bubb

    Interesting piece. I have been working with Ansible (also exclusively) for the past 6 months. Overall – I like it.

    I am curious – do you have a system for keeping the inventory in sync? Do you use ansible for user management? How do you ensure state across servers at any one time? ( we are adopting Tower as a partial answer to the last question )

    thanks

    Michael

  • corbanraun

    Hi Michael,

    Those are great questions.

    Much of our infrastructure is on AWS, so we use dynamic inventories with tagging. We can then limit Ansible to specific servers. For example, all of our bastion hosts would be tagged with something like `Role: Bastion`. We can then limit by `ansible-playbook -i ec2.py blah.yml --limit 'tag_role_bastion'`. Tower also has the ability to do inventory syncing.

    We do in fact use Ansible for user management. We have a users-common role, which contains users’ public SSH keys and common tasks. This role can then be called from other roles via meta. For example, we could have a role called dba_users with a meta that calls users-common with info (such as sudo, group, etc.) that is specific to DBAs, which will create the specified users. We also have a remove-users role, which we update when a user leaves the company. That can be run on all servers, effectively removing access.

    As far as maintaining state, Tower or Ansible pull could be used. Having an Immutable Server Architecture should, in theory, force state :).

    • Joe

      Interesting article. How would you handle the case where you have two environments (VPCs), staging and production, and would like to limit your playbook to the staging environment only?

    • Govindaraj Venkatesan

      corbanraun How does Ansible pull work? Does this work against dynamic inventory? I tried using the below but for some reason couldn’t get the host list. Any thoughts?

      ansible-pull -d /home/gvenka008c/caps-sps --inventory=/home/gvenka008c/caps-sps/jenkins_mesos/ansible/inventory/ --key-file=/home/gvenka008c/.ssh/gvenka008c -U git@github.com:XPlat/caps-sps.git /home/gvenka008c/caps-sps/jenkins_mesos/ansible/jenkins_mesos_icinga.yml

      PLAY ***************************************************************************

      skipping: no hosts matched

  • I’m about to try Ansible – thanks for the write up!

  • steve

    Thank you for this article!
    We currently use Ansible for Linux systems, both onprem and AWS and I really would like to leverage it for use with Windows systems. Having the 1 tool across the team I think would be a great benefit and keep things simple.
    Seeing it uses SSH, I believe to get it working on Windows I would need Windows Management Framework / Powershell.
    Where does Chocolatey come in? Does Ansible have to use this package manager in order for me to use it on Windows systems or not? I’m a bit confused.
    Most of our Windows estate is Wk2 R2. This means I have a lot of upgrades to do to get it to work, as Win 2k12 R2 is already at the right Framework/ Powershell versions.
    Any easy way of upgrading 700 Vms?!?! We don’t have any other management tool unfortunately.
    Thank you
    Steve

  • Robert Smith

    How did you deal with other departments/teams at your company that may have already been using Chef/Puppet as their CM tool?

    • corbanraun

      Hi Robert,
      At a previous company I worked at, they were using Chef quite extensively and did not really like it. I was lucky enough to be one of two admins, and the other admin had already decided to start using Ansible. Since the company was mainly a Ruby shop, we had some pushback from developers, but once they saw how easy using Ansible was, it was easy to get everyone on board with the transition.

      At another company I work for, things were slightly more difficult. We were (and to a much lesser extent still are) using an old, outdated version of Puppet. Using this particular version of Puppet has been quite painful (although some old-school admins disagree with me). One of my coworkers at this company had been using Ansible quite extensively and was advocating for it. The rest of the team wasn’t really sure which tool to use, and were not yet convinced Ansible was the right choice. This coworker did a compare and contrast between Chef, Salt, Puppet and Ansible. By the end of the meeting people were on board with at least trying Ansible. Eventually it was decided to use Ansible exclusively.

      One thing I will say is that migrating from something like Chef/Puppet to Ansible can be difficult. Chef/Puppet are pull models and Ansible is a push model. While they have some similarities, you may get some curmudgeonly admins who dislike Ansible due to their lack of experience with it. They may not understand, or agree with, the benefits of a push model. It may even require a stakeholder with decision-making power (such as a manager) to explicitly say you need to move to Ansible.

      We handled the transition by ensuring that any new services or applications exclusively used Ansible. As we have had time, we have been migrating old systems from Puppet to Ansible. This has actually been quite beneficial when dealing with old systems. It has forced us to update packages that we had previously been afraid to update, due to Puppet dependencies.

      While a bit outdated (as many things have changed), this is still a fairly good article listing some of the benefits of using Ansible: https://missingm.co/2013/06/ansible-and-salt-a-detailed-comparison/

      IMHO at this time, having dealt with Puppet, Chef, Salt, and Ansible, I would choose Ansible followed by Salt.

      • Robert Smith

        Did you have buy in from the executive level? We tried the “meeting and discussing CM Tools approach”. In the end, everyone agreed to try Ansible. However, not soon after certain developer leads went to the CTO and cried. We are still using Chef now.

        • corbanraun

          In one of the cases I mentioned, yes. An email from a manager was sent out essentially saying, “We made the decision to use Ansible. Unless something goes horribly wrong, we’re sticking with Ansible.”

          It’s hard when you have a team that has been using a configuration management tool for a very long time. They are familiar with its benefits as well as its problems. Familiarity and fear of the unknown can cause resistance to change, regardless of whether the change is good or bad.

          For me, Ansible has been a breath of fresh air. It’s easy, it makes sense, and it’s fast. It’s that way for many people I know. I also know people who will live and die by Puppet, Chef, or Salt. It really comes down to what is best for your organization. Will Ansible improve your workflow? My employer was going to have to make a large change either way. We were going to have to upgrade Puppet (many updates were required, as we had lots of technical debt) or choose something new. We chose to go with something new.

          I hope that helps, and I wish you luck. Please let me know if you have any more questions.

  • TurboChargedDad

    I keep hearing all this talk about how “easy” Ansible is. I have been battling Ansible from a formatting perspective for hours. Apparently it’s extremely picky about formatting. Which is ridiculous. Going back to Puppet. :./

    • kartone

      @turbochargeddad:disqus it’s because of his python’s nature…

  • lakshman

    +1 for mentioning about vendor-agnostic

  • Sivaram Kannan

    Great write-up. Thanks. I have been using Ansible for about a year now. Although I am mostly happy with the orchestration in general, how do you get the current running state of applications across the cluster? Let me give an example: say I am upgrading one particular Docker container to a newer version, one of the nodes fails for some reason, and you fail to fix it at that instant. You then have two versions of the same software running in the cluster. Although this seems trivial, without a way of knowing the current state of a node (as with Puppet or Salt), this is a problematic thing. I know Tower can solve the problem; is that the only solution?

    • Mario Rivera

      Kubernetes.

  • Mario Rivera

    You need an ssh connection, so I suppose the answer is no.
