Ansible Quick Start

Ansible is a command line tool for installing and configuring software on remote servers.  It is open source, originally developed by Michael DeHaan around 2012, and owned by Red Hat since 2015.  Ansible is a gem of software design: powerful, but very easy to learn.  With a minimum of overhead, it can set up entire software stacks on servers for you using scripts.

There are many benefits to using Ansible over manual project admin.  Ansible scripts enable you to automate the sometimes tedious process of downloading files, uncompressing them, updating the config files, and so forth.  All this can be done quickly and accurately by scripts.  Ansible scripts are meant to be human-readable, hence you can use them as self-documenting procedures. And scripts are reusable.  You can carry them forward to new projects in many cases with little modification.

The most important reason for using Ansible is productivity.  It takes less than a week to become proficient with it.  Once you make it part of your tool set you can save hours on any given project by letting it take on the grunt work.  And that leaves you more time to do the interesting things.

I.  Installing Ansible

The computer where you install Ansible is called the control machine.  It must run a Unix variant OS; Windows is not currently supported for this purpose.  The install command is:

yum install ansible       # for Fedora, CentOS
apt-get install ansible   # for Ubuntu
pip install ansible       # for Mac OSX

The package manager you use will add the dependencies automatically.  You can check the install with the which command.

which ansible
/usr/bin/ansible
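
A slightly more informative check is sketched below. It assumes only a POSIX shell; check_tool is a helper name made up for this example, and the path you see may differ from /usr/bin/ansible (pip, for instance, often installs somewhere else).

```shell
#!/bin/sh
# check_tool is a hypothetical helper: report where a command lives, if anywhere.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

check_tool ansible
check_tool ansible-playbook

# Print the version banner only when ansible is actually installed.
if command -v ansible >/dev/null 2>&1; then
    ansible --version
fi
```

Both executables should be reported as found; if only one is, the install is probably incomplete.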

II.  Allocate and Start a Remote Server

There are many ways you could provision a remote server for learning how Ansible works. For our purposes, the Lightsail service under Amazon Web Services is a good choice. Lightsail is very easy to use. It seems to be a streamlined version of the EC2 service, which is more powerful, but somewhat harder to learn.

You’ll need an AWS account and an SSH key pair to use Lightsail. Instructions for opening an AWS account and generating SSH keys can be found in Sections A & B of “A Project Template for AWS.” The SSH key pair consists of a private key and a public one; you download the private key to your local computer so that it can be used to authenticate you in an SSH session.

To launch a new server with Lightsail:

  1. Log on to AWS and go to Services => Compute => Lightsail.
  2. Leave the location as is, select Linux, OS only, Amazon Linux.
  3. Go with the default SSH key pair, select the $10/month machine size.
  4. Click [Create] button. Write down the public IP address of the new server instance.

Note: Amazon will charge your account an hourly rate for the server instance (about 1 cent/hr). You can go to the Lightsail home page at any time to see the instance and put it into the stopped state. Do not assume that a stopped instance is free: the Lightsail FAQ states that instances are billed in both the running and the stopped state, so delete the instance when you are finished with it to end the charges. To stop, start, or delete an instance, go to the menu in the upper right corner of the instance icon and select the corresponding option.

Let’s test our connectivity to the instance. At the command line of your local computer, type:

ssh -i <path-to-keypair-file/keypair-file> ec2-user@<instance-public-ip>

The keypair parameter is your previously downloaded keypair.pem file; the public IP is the IP address allocated by Lightsail for your server instance. Note: If you stop and restart an instance, it loses its old IP and gets a newly allocated one.  You will probably see a warning message about the authenticity of the host; just type ‘yes’.  The message only appears the first time you connect to the instance at a new IP address.

If everything is OK at this point, you should be logged on to your new remote instance and looking at the Amazon AMI welcome banner.  Use the logout command to exit the SSH session.
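
If the connection is refused with a complaint about an unprotected private key file, the usual cause is that the key is readable by other users, which ssh does not accept. A minimal sketch of the fix follows. The KEY variable is a placeholder: a throwaway temp file stands in for your downloaded .pem file here, so point it at your own key.

```shell
#!/bin/sh
# KEY is a placeholder; set it to the path of your downloaded private key.
KEY="${KEY:-$(mktemp)}"

# Restrict the key to owner read-only so that ssh will accept it.
chmod 400 "$KEY"

# Show the resulting mode; the permission field should read -r--------
ls -l "$KEY"
```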

Let’s update the default /etc/ansible/hosts file with the information we now have about our remote managed node:

my-vm ansible_host=<public-ip-of-instance> ansible_user=ec2-user ansible_ssh_private_key_file=<path-to-private-key/private-key.pem>

Replace <public-ip-of-instance> with the IP of the instance; replace <path-to-private-key/private-key.pem> with the location and name of the private SSH key.  The first parameter, my-vm, is the name that Ansible will use to refer to the instance.  And ec2-user is a pre-defined user on Amazon Linux with sudo privileges.

III.  How Ansible Works

The above diagram shows the important components of Ansible.

    1. There are two ways to execute commands: the ad hoc way for simple one-liners, and playbooks for arbitrarily complex command sequences.
    2. Information about managed nodes is maintained in an inventory file. The default inventory file is /etc/ansible/hosts.
    3. Command tasks are shipped to managed nodes using the SSH protocol and executed remotely.

An interesting thing about Ansible is that it does not require a helper program on the remote host to carry out a task. In this sense, Ansible is agentless.

IV. The ansible Command

In the preceding sections, we did some necessary groundwork.  We installed Ansible, generated an SSH key pair, downloaded the private SSH key, launched an instance in the cloud with Lightsail, and updated the default inventory file to include our new instance. Now we’re ready for the fun part.

The basic unit of Ansible activity is the task.  A task can be any number of things, such as installing a package, issuing a shell command, and so forth.  The ansible command is one of two ways to execute tasks; ansible-playbook is the other.  The former is for ad hoc tasks that you occasionally need to run, whereas the latter is for sequences of tasks that accomplish some larger goal. Here is a very useful ad hoc command:

ansible -m ping my-vm
my-vm | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}

The ping module simply tests your connectivity to one or more managed nodes.  The expected (good) response is pong.  Note that this result comes back to you as a JSON document preceded by the node name and the result status.  The last parameter in the above command is my-vm. This tells Ansible which node we want to run the task on.  As mentioned previously, all nodes are defined in the inventory file.  If we used a group name for this parameter, the task would be executed on every managed node in the group and the individual results would be sent back to your terminal.  A group is a way to organize your managed nodes by function, business unit, or geographical location.  If you are interested in how groups are defined, see the commented-out examples in the default /etc/ansible/hosts config file.
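
To make the idea of a group concrete, here is a rough sketch that writes a small inventory file with one group and asks Ansible which hosts the group name resolves to. Everything in it is hypothetical: the file name example-inventory.ini, the group webservers, the host names, and the addresses (203.0.113.x is a range reserved for documentation). The -i option, covered at the end of this section, points Ansible at the file.

```shell
#!/bin/sh
# Write a small project-local inventory; the group name, host names and
# addresses are made up for illustration.
cat > example-inventory.ini <<'EOF'
[webservers]
web-1 ansible_host=203.0.113.10 ansible_user=ec2-user
web-2 ansible_host=203.0.113.11 ansible_user=ec2-user
EOF

# List the hosts the group name resolves to, without contacting any of them.
if command -v ansible >/dev/null 2>&1; then
    ansible -i example-inventory.ini --list-hosts webservers
fi
```

Replacing --list-hosts with -m ping would run the ping task against every host in the group.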

Suppose we want to find out if the my-vm machine has Python.

ansible -m command -a 'python --version' my-vm
my-vm | SUCCESS | rc=0 >>
Python 2.7.12

The -m parameter specifies a module–in this case command.  A module in Ansible is a sub-command for a particular function.  The job that command does is execute a string instruction (the -a parameter) remotely.  In the previous example we used the ping module.  We can omit the -m parameter when running command, as it is the default module.

A closely related module to command is shell, which executes a string within a shell on the target machine.  The difference between the two is that shell can work with pipes, environment variables, and other things accessible in a shell, whereas command cannot.

ansible -m shell -a 'echo $JAVA_HOME' my-vm
my-vm | SUCCESS | rc=0 >>
/usr/lib/jvm/jre

Notice that the -a string is enclosed in single quotes.  Double quotes would tell the local shell to interpret all the $ variables before completing the command, thus garbling the command before it was sent to the target machine.
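
You can see the difference without involving Ansible at all. The sketch below runs entirely on the local machine; /local/value is just a recognisable stand-in for whatever JAVA_HOME holds on your control machine.

```shell
#!/bin/sh
# Give the variable a recognisable local value for the demonstration.
JAVA_HOME=/local/value

# Single quotes: the text is passed along untouched.
single=$(printf '%s' 'echo $JAVA_HOME')

# Double quotes: the local shell substitutes the variable first.
double=$(printf '%s' "echo $JAVA_HOME")

echo "single-quoted argument: $single"
echo "double-quoted argument: $double"
```

The first line printed still contains the literal text $JAVA_HOME, which is what we want the remote shell to receive; the second already has the local value baked in.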

Let’s see if we can read one of the log files on my-vm.

ansible -m command -a 'cat /var/log/yum.log' my-vm
my-vm | FAILED | rc=1 >>
cat: /var/log/yum.log: Permission denied

The inventory file entry for my-vm specifies the remote user as ec2-user.  Apparently ec2-user doesn’t have the required read access.  But ec2-user has sudo privileges, and we can tell Ansible to act as the root user with the --become option (as in ‘become root’).

ansible -m command -a 'cat /var/log/yum.log' --become my-vm
my-vm | SUCCESS | rc=0 >>
Dec 29 00:31:06 Updated: nss-util-3.28.4-3.53.amzn1.x86_64
Dec 29 00:31:06 Updated: nss-softokn-freebl-3.28.3-8.41.amzn1.x86_64
Dec 29 00:31:06 Updated: nss-softokn-3.28.3-8.41.amzn1.x86_64
Dec 29 00:31:06 Updated: nss-sysinit-3.28.4-12.80.amzn1.x86_64
Dec 29 00:31:06 Installed: nss-pem-1.0.3-4.3.amzn1.x86_64
Dec 29 00:31:06 Updated: nss-3.28.4-12.80.amzn1.x86_64
Dec 29 00:31:08 Updated: nss-tools-3.28.4-12.80.amzn1.x86_64
Dec 29 00:31:08 Updated: wget-1.18-3.28.amzn1.x86_64
Dec 29 00:31:08 Updated: kernel-tools-4.9.70-22.55.amzn1.x86_64

As you can see, sometimes the result of an ad hoc command is simple text, and sometimes JSON. It seems that Ansible gives you the latter when you might need to parse the result.  One final ad hoc example:

ansible -m ping -i other-inventory-file my-vm
my-vm | SUCCESS => {
    "changed": false, 
    "ping": "pong"
}

The -i option allows you to point to some inventory file other than the default /etc/ansible/hosts.  You might want to do this if you are working on several projects and you want to keep a separate inventory file for each.
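
If typing -i every time gets tedious, a per-project ansible.cfg file can name the inventory instead. The sketch below is one way to set that up; the directory name example-project and the inventory file name hosts are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical project directory for the example.
mkdir -p example-project

# Ansible reads ansible.cfg from the current directory when one is present,
# so commands run from inside example-project would pick this up.
cat > example-project/ansible.cfg <<'EOF'
[defaults]
# Assumed file name: an inventory kept next to this config file.
inventory = ./hosts
EOF

cat example-project/ansible.cfg
```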

V.  The ansible-playbook Command

The real power of Ansible is in playbooks.  A playbook is a script that specifies one or more tasks to be carried out on a target node, or multiple targets.  Whereas the ansible command can do a few types of tasks, ansible-playbook can do hundreds. Playbooks are programmable; they can have variables, loops, and conditionals.  They are compact and easy to code and maintain.

Let’s begin by creating a project directory on our local computer with the following structure:

ansible-tutorial
    group_vars
    templates

Now create a new YAML script named playbook-1.yml with a text editor. It should go in the top level directory, ansible-tutorial. Copy and paste the code below into it.

---
# Ansible Playbook 1
#
- hosts: my-vm
  vars:
    http_port: 80
    max_clients: 200
  remote_user: ec2-user
  become: yes
  tasks:
  - name: ensure apache is at the latest version
    yum: name=httpd state=latest         # shorthand parameter list
  - name: make Apache an auto-started service
    service:                             # regular parameter list
      name: httpd
      enabled: yes
  - name: start httpd now
    service: name=httpd state=started
  - name: copy over template welcome page
    template:
      src: templates/index.html
      dest: /var/www/html

What is YAML?  My definition is: YAML is JSON without the curly braces.  Both are text-based, small languages for passing data between programs and designed to be readily understood by humans.  Here are the major syntax rules for YAML:

  1. Three hyphens “---” are an optional marker for the beginning of a YAML document. Note the first line of playbook-1.yml, above.
  2. The # character is for comments.
  3. Simple key / value parameters are defined with colons.  Example -> http_port: 80
  4. Indentation with spaces is used to group elements, just like in Python.  Hosts, vars, and remote_user are part of a group in the above YAML file.
  5. Lines beginning with a hyphen are list elements.  There are two lists above: one at the top level, and one under tasks.

As you can see, YAML is very simple in itself.  Ansible builds on this syntax with its own rules, and this is where we actually get a playbook to do something.  A few notes on playbook-1.yml:

  1. Ansible expects a playbook to be a list of one or more plays.  A play begins with a hyphen + space in column 1.  Our first playbook has only one play.
  2. The hosts keyword is required.  It identifies the target node or a group of target nodes.  When there are more than one, Ansible will execute the playbook on each in turn.
  3. The vars keyword is optional; it can be used to define variables for the current play.  That said, the Ansible docs recommend placing such variable definitions in a separate file (we’ll get to that in a minute).
  4. The remote_user and become settings tell Ansible to run as a specific user on the remote node and use sudo to obtain root privileges.  This may seem obscure, but the idea is to ensure that your tasks will have the right permissions to do their job.
  5. Now to tasks.  This is a list; note that each element task begins with a hyphen. The name parameter is an optional tag that gets logged to stdout when the task runs.  It’s a good way to monitor a playbook.
  6. The first task uses the yum module to install the latest version of the httpd package on the target.  Tasks are declarative; they specify what you want, but not how it’s done.  Yum checks to see if the package is already installed, and in this case, is the latest version.  If necessary, yum will install or upgrade the package.  But if the package is already up to date, yum does nothing.
  7. The next task uses the service module to make httpd a service that comes up at system startup.
  8. The next operation is to use service again to start httpd.
  9. Finally, we copy over an HTML welcome page.  We use the template module, which populates a text document with variables before sending it to the target.

Ansible has certain conventions for the directory structure of a project.  One of these is a folder, group_vars, which can contain text files for variable definitions.  Let’s add a file named all under group_vars as follows:

---
welcome_message: Welcome to Apache

The templates sub-directory is another convention. It holds text files with variables to be filled in when a playbook executes. Add a file index.html to templates with the following contents:

<html>
<head></head>
<h2>{{welcome_message}}</h2>
<h3>Some facts about host {{inventory_hostname}}</h3>
<table border="1">
  <tr><th>Variable</th><th>Value</th></tr>
  <tr><td>Linux Distro</td><td>{{ansible_distribution}}</td></tr>
  <tr><td>Total Memory</td><td>{{ansible_memtotal_mb}} MB</td></tr>
</table>
</html>

This is a template.  A variable reference is indicated by double curly braces, thus {{welcome_message}} will be replaced by our setting in the group_vars/all file, “Welcome to Apache.”  Notice that the template includes three other variables, namely {{inventory_hostname}}, {{ansible_distribution}}, and {{ansible_memtotal_mb}}.  The first is a pre-defined variable that contains the current target name. The other two are facts gathered by Ansible about the target machine before your tasks are executed.  You can use the following command to see what facts are available:

ansible -m setup my-vm
my-vm | SUCCESS => {
    "ansible_facts": {
        "ansible_all_ipv4_addresses": [
            "172.26.7.239"
        ], 
        "ansible_all_ipv6_addresses": [
            "fe80::78:71ff:feac:5fca"
        ], 
        "ansible_architecture": "x86_64", 
        "ansible_bios_date": "08/24/2006", 
        "ansible_bios_version": "4.2.amazon", 
        "ansible_cmdline": {
            "console": "ttyS0", 
            "root": "LABEL=/", 
            "selinux": "0"
        }, 
        "ansible_date_time": {
       - - -   and so on   - - -
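
The full list is long. The setup module accepts a filter argument that takes a shell-style wildcard, so you can narrow the output to the facts you care about. A sketch, using the memory facts as an arbitrary example; the HOST and FILTER variable names are just for this illustration:

```shell
#!/bin/sh
# HOST is the inventory name used throughout this tutorial.
HOST=my-vm
# Shell-style wildcard matched against fact names by the setup module.
FILTER='ansible_mem*'

if command -v ansible >/dev/null 2>&1; then
    ansible -m setup -a "filter=$FILTER" "$HOST" || echo "could not gather facts from $HOST"
else
    echo "ansible not installed; would run: ansible -m setup -a 'filter=$FILTER' $HOST"
fi
```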

Now we have a playbook and its associated variables file and template.  Will it run?

ansible-playbook playbook-1.yml

PLAY [my-vm] **********************************************************

TASK [setup] **********************************************************
ok: [my-vm]

TASK [ensure apache is at the latest version] *************************
changed: [my-vm]

TASK [make Apache an auto-started service] ****************************
changed: [my-vm]

TASK [start httpd now] ************************************************
changed: [my-vm]

TASK [copy over template welcome page] ********************************
changed: [my-vm]

PLAY RECAP ************************************************************
my-vm                      : ok=5    changed=4    unreachable=0    failed=0   

Success!
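
If you would rather rehearse before touching a server, ansible-playbook has two options for that: --syntax-check parses the playbook without running it, and --check does a dry run that reports what would change. The sketch below wraps both. Keep in mind that a dry run can report errors for tasks that depend on an earlier change (starting httpd before it has really been installed, for example), so treat its output as a hint rather than a verdict.

```shell
#!/bin/sh
PLAYBOOK=playbook-1.yml

if command -v ansible-playbook >/dev/null 2>&1 && [ -f "$PLAYBOOK" ]; then
    # Parse the playbook and report syntax errors without running anything.
    ansible-playbook --syntax-check "$PLAYBOOK" || echo "syntax check failed"
    # Dry run: report what would change without changing it.
    ansible-playbook --check "$PLAYBOOK" || echo "dry run reported a problem"
else
    echo "skipping: ansible-playbook or $PLAYBOOK not available here"
fi
```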

Now let’s go to the public IP address of the target server in a browser to verify that our welcome page is working.  Here’s a screenshot:

Notice that the variables have been filled in with the correct information.  This may seem like a waste of time for a welcome page, but consider how useful it would be for setting up config files.

VI.  Roles – a Powerful Way to Organize Playbooks

Our first playbook was small enough to understand at a glance. But what happens when you have to install a number of packages? Your playbook could quickly become hard to maintain. Ansible’s answer to this predicament is the role. A role is a special playbook that is meant to be run as a sort of subroutine by a higher level master playbook.  Let’s clone our ansible-tutorial project and recast the httpd install as a role.  Start by copying the original project and rearranging the directory structure.

cp -r ansible-tutorial ansible-tutorial-with-role
cd ansible-tutorial-with-role
mkdir -p roles/httpd/tasks
mkdir -p roles/httpd/handlers
mv templates roles/httpd
mv playbook-1.yml playbook-2.yml
cp playbook-2.yml roles/httpd/tasks/main.yml
touch roles/httpd/handlers/main.yml
# new directory structure:
# ansible-tutorial-with-role
#     file: playbook-2.yml      - master level playbook
#     group_vars
#         file: all             - global variables
#     roles
#         httpd                 - top-level directory for httpd role
#             handlers          - sub-command to start service httpd
#             tasks             - steps for installing httpd
#             templates         - welcome page 

The standard practice when working with roles is to make a handler for any task that starts or stops a service.  It’s a cleaner way to organize your Ansible code than having such tasks inline. Handlers also have an interesting behavior: other tasks notify them that they should run, and a notification is sent only when the notifying task reports a change. However many notifications a handler receives, it fires only once, at the end of the play. Let’s set up a handler for the httpd service. Edit roles/httpd/handlers/main.yml and copy in the task that starts httpd:

---
- name: start httpd now                   # note: task now begins in col 1
  service: name=httpd state=started

Now rework playbook-2.yml to use a role:

---
# Ansible Playbook 2
#
- hosts: my-vm                           
  vars:
    http_port: 80
    max_clients: 200
  remote_user: ec2-user
  become: yes
  roles:
    - httpd

The roles keyword is followed by a list of each role you want to process.  Here we have only one, but in a typical case there would be perhaps a half-dozen.  Note how clean the playbook looks now that we’ve abstracted out the low level tasks.

Edit roles/httpd/tasks/main.yml to obtain the following:

---
# note: tasks now begin in col 1
- name: ensure apache is at the latest version
  yum: name=httpd state=latest    
  notify:
    - start httpd now
- name: make Apache an auto-started service
  service:
    name: httpd
    enabled: yes
  notify:
    - start httpd now
- name: copy over template welcome page
  template:
    src: templates/index.html
    dest: /var/www/html

We have two places in the above script where we notify the httpd start handler.
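
Two notifications, one run: the toy model below shows the bookkeeping involved. It is plain shell, not Ansible code, and the notify and flush_handlers functions are invented for the illustration. It also leaves out two things the real feature does: a real notification is sent only when the task reports a change, and Ansible runs handlers in the order they are defined rather than the order they were notified.

```shell
#!/bin/sh
# Toy model only: plain shell, not Ansible code.
pending=""          # names of handlers notified so far, one per line

notify() {
    # Record the handler name unless it is already pending.
    if ! printf '%s\n' "$pending" | grep -qxF "$1"; then
        pending="${pending}${1}
"
    fi
}

flush_handlers() {
    # Run each pending handler exactly once, then clear the list.
    printf '%s' "$pending" | while IFS= read -r handler; do
        echo "RUNNING HANDLER: $handler"
    done
    pending=""
}

notify "start httpd now"    # as the yum task would
notify "start httpd now"    # as the service task would
flush_handlers
```

The script prints a single RUNNING HANDLER line, even though the handler was notified twice.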

Will our new playbook work?  Before trying it, let’s return our managed node to its original state.

ansible -m command -a 'service httpd stop' --become my-vm
my-vm | SUCCESS | rc=0 >>
Stopping httpd: [  OK  ]

ansible -m command -a 'yum remove -y httpd' --become my-vm
my-vm | SUCCESS | rc=0 >>
Loaded plugins: priorities, update-motd, upgrade-helper
Resolving Dependencies
--> Running transaction check
---> Package httpd.x86_64 0:2.2.34-1.16.amzn1 will be erased
--> Finished Dependency Resolution
   - - -   some lines omitted   - - -  
Complete!

ansible -m command -a 'rm /var/www/html/index.html' --become my-vm
my-vm | SUCCESS | rc=0 >>

Now to see if the role logic works.

ansible-playbook playbook-2.yml
PLAY [my-vm] *******************************************************************

TASK [setup] *******************************************************************
ok: [my-vm]

TASK [httpd : ensure apache is at the latest version] **************************
changed: [my-vm]

TASK [httpd : make Apache an auto-started service] *****************************
changed: [my-vm]

TASK [httpd : copy over template welcome page] *********************************
changed: [my-vm]

RUNNING HANDLER [httpd : start httpd now] **************************************
changed: [my-vm]

PLAY RECAP *********************************************************************
my-vm                      : ok=5    changed=4    unreachable=0    failed=0   


Success!

Once again, let’s check the welcome page in our browser.

Everything looks good.

VII. Summary

The purpose of this tutorial was to give you an introduction to the remarkable capabilities of the Ansible software management tool.  We started off by installing Ansible locally and spinning up a remote server using Amazon AWS’s Lightsail service.  We then proceeded to use Ansible in three ways:

  1. The ad hoc command ansible, which makes an easy starting point for seeing what the product can do.
  2. The ansible-playbook command, which is much more powerful, and lets you automate the bulk of your admin tasks.
  3. The roles feature of playbooks, which makes playbooks even more flexible and maintainable.

We only briefly touched on roles here.  One of their important advantages is that you can get a stunning variety of community-written roles on the Internet.  The Galaxy website is a giant repository for this purpose, maintained by the Ansible group at Red Hat.

If you want to become proficient with Ansible, start using it to do routine admin tasks.  I think you’ll find that it will become a valuable tool and save you a great deal of time and frustration.  Here are two helpful reference links:

  1. The Ansible home page.
  2. Ansible Modules Index

If you have any comments or questions about this tutorial, send me an email at datasciex@gmail.com