Managing servers with Ansible

We've undergone some pretty significant changes to our infrastructure at Report URI, and one of the things that's made those changes a lot easier to handle is that we use Ansible to manage our entire fleet. This really came in handy for a particular, recent event.

An introduction to Ansible

Ansible is a really simple way to manage any number of servers you like. You write scripts, called playbooks, and they're run against a list of hosts that you provide. All of our servers at Report URI are managed with Ansible and this makes things so easy. We can create a brand new server on DigitalOcean, from a bare Ubuntu 16.04.3 image, and have it up and running in a matter of minutes. It will be fully updated and patched, git configured and app deployed, added to DNS and accepting traffic, all after running a single playbook. Let's dive in and take a look at how we do it.

Defining hosts

The first thing to do in Ansible is define the list of hosts you want to manage. In our hosts.list we can take care of identifying all of our servers.

[www]
www1    ansible_ssh_host=
www2    ansible_ssh_host=
www3    ansible_ssh_host=
www4    ansible_ssh_host=

[api]
api1     ansible_ssh_host=
api2     ansible_ssh_host=
api3     ansible_ssh_host=
api4     ansible_ssh_host=

Here we have 2 groups of servers, our www and api servers, each with its own name and IP address. These are the servers we will be managing with Ansible.

The Ansible Config

To tell Ansible where our list of hosts is defined we create ansible.cfg and add the following.

[defaults]
inventory = ./hosts.list
private_key_file = ./private.key
remote_user = root

I've added a couple of other config options there too: the SSH key I use to connect to the servers (using key-based authentication is highly recommended) and the user that Ansible should log in as on the remote hosts. With this in place we're ready to start configuring our servers.
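Before running anything that changes a server, it can be worth confirming that Ansible can actually reach every host in the inventory. The play below is a minimal sketch of such a check, not part of our real playbook; the file name check.yml is an assumption made for illustration.

```yaml
# check.yml (hypothetical file name, used only for this example)
- hosts: all            # every host defined in the inventory
  gather_facts: false   # skip fact gathering, we only want a connectivity test
  tasks:
    - name: Confirm Ansible can log in and run a module on each host
      ping:
```

Running ansible-playbook check.yml from the directory containing ansible.cfg should report each host as ok if the key and user are correct. Note that the ping module needs Python on the remote host, so a failure here may point to a missing interpreter rather than an SSH problem.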

Writing a Playbook

The next thing to do is write your playbook, which contains all of the tasks you'd like Ansible to run on the remote hosts. I call mine deploy.yml and here's a snippet of what a playbook might look like.

#!/usr/bin/env ansible-playbook
- hosts: www:api
  tasks:

  # Refresh the package index; never reported as a change.
  - name: Update cache.
    apt: update_cache=yes
    changed_when: false

  - name: Upgrade all packages
    apt: upgrade=yes

  - name: Run autoremove
    apt: autoremove=yes

  - name: Add ondrej/php repo
    apt_repository: repo='ppa:ondrej/php' update_cache=yes

  # state=present replaces the older state=installed alias.
  - name: Install software
    apt: name={{item}} state=present
    with_items:
        - nginx
        - php7.2
        - php7.2-cli
        - php7.2-common
        - php7.2-curl
        - php7.2-fpm
        - php7.2-gd
        - php7.2-intl
        - php7.2-json
        - php7.2-mbstring
        - php7.2-soap
        - php7.2-xml
        - php7.2-zip
        - php-redis

The playbook will run against 2 groups of hosts, www and api, which includes all of the servers we defined earlier. You can run the playbook against single hosts too by specifying api2:api3, or even a combination of groups and single hosts, www:api1:api4 as an example. This particular playbook will fully update the server before installing Nginx and PHP 7.2 on it, followed by the rest of the script which configures git, PaperTrail, stunnel and various other things. Whatever you need to do to get a server ready for production use, you can do it with Ansible.
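To give a flavour of a mixed host pattern and of the kind of task that follows the package installs, here is a rough sketch. It is not taken from our real playbook: the repository URL, the destination path and the branch name are placeholders invented for the example.

```yaml
# Sketch only: repo URL, dest path and branch are hypothetical placeholders.
- hosts: www:api1:api4   # the whole www group plus two individual api hosts
  tasks:
    - name: Check out the application code
      git:
        repo: https://example.com/your-org/your-app.git
        dest: /var/www/app
        version: master
      notify: Reload nginx   # only fires if the checkout changed something

  handlers:
    - name: Reload nginx
      service:
        name: nginx
        state: reloaded
```

If you'd rather leave the hosts line in the playbook alone, the same kind of pattern can be passed at run time with the --limit option, for example ansible-playbook deploy.yml --limit api2:api3.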

The Ansible Documentation

I could give a full example of our playbook but it's a pretty big file now that it's grown in complexity and I don't think it'd help that much. If there's a task that you need to do with Ansible you'll be able to find it in the documentation. I don't think it's worth going through exactly what we do, because everyone will have different needs, but if you need to do it, Ansible can do it and the documentation will help you get there!

Recent Changes with DigitalOcean

To start off 2018 strong, DigitalOcean announced new pricing plans for all of their servers. For the servers that we were using before the announcement we could get servers under the new pricing structure with 100% extra RAM and 25% extra SSD for the same price. Given that our Redis caches and report consumers are very RAM hungry, this was a pretty significant opportunity for us to upgrade. If you wanted to shift a server from the old pricing plan to the new pricing plan you had to power it down, switch the plan and wait for the SSD resize and RAM allocation to happen, boot it and then you were good to go again. That's a rather painful process as it means we have to drop them from DNS to stop the flow of traffic and then do the power down, reconfigure and reboot before we add them back to DNS. You'd probably have to do 1 or 2 at a time and we have over 40 servers in the public fleet and then private/managed instances for enterprise customers too. I'm so glad we had a much better way to handle this.

Rebuild the entire infrastructure

At first glance that sounds like a pretty dramatic solution to the problem but our servers are cattle, not pets. It's considerably easier and faster for me to spin up a whole new infrastructure under the new plans and then rip down the old servers than it is to even think about upgrading them one by one. I simply bring up the new servers, remove the old servers from DNS and once the changes have propagated, they can be destroyed. This is exactly what we did! It took me the better part of half an hour to do this and we were reaping the benefits of the new DigitalOcean pricing plans. This isn't the first time we've made significant changes with no impact to service either. We recently updated our whole fleet from PHP 7.0 to PHP 7.2 without a single hiccup!

Scaling made easy

One of the biggest things that I love about using Ansible is how fast we can make changes, and one change that comes along a lot is increasing our capacity. Going from spinning up a new instance in DigitalOcean to having it online and accepting traffic can take anything from 1-3 minutes, that's it! The other great thing about Ansible is that, of course, it doesn't actually care where the servers are. It needs an IP address to talk to the server and that's it. I don't just mean different DigitalOcean regions either, I mean different hosting providers. The servers that we point Ansible at can be hosted in DigitalOcean, AWS, Azure or anywhere else that you like and this means we can bring capacity online with another host really quickly should we ever need to. It's a nice backup plan to have.
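As an illustration of how little Ansible cares about where a server lives, here is a sketch of an inventory mixing providers. It uses Ansible's YAML inventory format rather than the INI-style hosts.list shown earlier, and everything in it is made up for the example: the file name, the host names and the addresses, which come from the reserved documentation ranges.

```yaml
# hosts.yml (hypothetical; names and addresses are placeholders)
all:
  children:
    www:
      hosts:
        www-do1:
          ansible_host: 203.0.113.10    # e.g. a DigitalOcean droplet
        www-other1:
          ansible_host: 198.51.100.20   # e.g. an instance at another provider
    api:
      hosts:
        api-do1:
          ansible_host: 203.0.113.30
```

A playbook targeting www would treat both hosts in that group identically; as far as Ansible is concerned the provider is just a different IP address.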

I have spent a lot of time looking at various options for hosting Report URI including multiple varieties of 'serverless' hosting where we just upload our code and let the provider take care of everything else. There are some great offerings out there and it takes away a lot of the effort involved, but there's always an associated price increase. For now we have the best setup to balance between how much work we have to put in and keeping the price down. Managing our whole fleet with Ansible is a breeze and removing the admin overhead of that isn't worth the increase in price right now. If you're managing Linux based servers and not using a tool like Ansible, I'd strongly recommend checking it out!