Testing infrastructure with serverspec

Vincent Bernat January 1, 2014

Checking if your servers are configured correctly can be done with IT automation tools like Puppet, Chef, Ansible or Salt. They allow an administrator to specify a target configuration and ensure it is applied. They can also run in a dry-run mode and report servers not matching the expected configuration.

On the other hand, serverspec is a tool to bring RSpec, a testing tool for the Ruby programming language frequently used for test-driven development, to the infrastructure world. It can be used to remotely test server state through an SSH connection.

Why would one use such an additional tool? Many things are easier to express with a test than with a configuration change, for example checking that a service is correctly installed by verifying that it listens on some port.

Getting started
Advanced use

Getting started

Good knowledge of Ruby may help but is not a prerequisite to the use of serverspec. Writing tests feels like writing what we expect in plain English. If you think you need to know more about Ruby, here are two short resources to get started:

serverspec’s homepage contains a short and concise tutorial on how to get started. Please read it. As a first illustration, here is a test checking that a service is correctly listening on port 80:

describe port(80) do
  it { should be_listening }
end

The following test will spot servers still running with Debian Squeeze instead of Debian Wheezy:

describe command("lsb_release -d") do
  it { should return_stdout /wheezy/ }
end

Conditional tests are also possible. For example, we want to check the miimon parameter of bond0, but only when the interface is present:

has_bond0 = file('/sys/class/net/bond0').directory?

# miimon should be set to something other than 0, otherwise, no checks
# are performed.
describe file("/sys/class/net/bond0/bonding/miimon"), :if => has_bond0 do
  it { should be_file }
  its(:content) { should_not eq "0\n" }
end

serverspec comes with complete documentation of the available resource types (like port and command) that can be used after the describe keyword.

When a test is too complex to be expressed with simple expectations, it can be specified with arbitrary commands. In the example below, we check if memcached is configured to use almost all the available system memory:

# We want memcached to use almost all memory. With a 2GB margin.
describe "memcached" do
  it "should use almost all memory" do
    total = command("vmstat -s | head -1").stdout # ❶
    total = /\d+/.match(total)[0].to_i
    total /= 1024
    args = process("memcached").args # ❷
    memcached = /-m (\d+)/.match(args)[1].to_i
    (total - memcached).should be > 0
    (total - memcached).should be < 2000
  end
end

A bit more arcane, but still understandable: we combine arbitrary shell commands (in ❶) and use of other serverspec resource types (in ❷).
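The arithmetic in this test can be tried without any server. The following is only a sketch in plain Ruby: memcached_margin_mb is a hypothetical helper, and the sample vmstat line and memcached arguments are made up for illustration.

```ruby
# Hypothetical helper: compute the memory left to the system (in MB)
# from the first line of `vmstat -s` and the memcached command line.
def memcached_margin_mb(vmstat_first_line, memcached_args)
  total_kb = /\d+/.match(vmstat_first_line)[0].to_i  # total memory, in KB
  total_mb = total_kb / 1024                         # integer division
  limit_mb = /-m (\d+)/.match(memcached_args)[1].to_i
  total_mb - limit_mb
end

# Made-up sample values, not taken from a real server.
sample_vmstat = "     16436028 K total memory"
sample_args   = "-m 15000 -p 11211 -u memcache"

margin = memcached_margin_mb(sample_vmstat, sample_args)
puts margin  # => 1050
puts (margin > 0 && margin < 2000) ? "within margin" : "outside margin"
```

Keeping the computation in a small function like this makes it easy to check the regular expressions before running the test against real hosts.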

Advanced use

Out of the box, serverspec provides a strong foundation to build a compliance tool to be run on all systems. It comes with some useful advanced tips, like sharing tests among similar hosts or executing several tests in parallel.

I have set up a GitHub repository to be used as a template to get the following features:

assign roles to servers and tests to roles;
parallel execution;
report generation & viewer.

Host classification

By default, serverspec-init generates a template where each host has its own directory with its unique set of tests. serverspec only handles test execution on remote hosts: the test execution flow (which tests are executed on which servers) is delegated to some Rakefile.¹ Instead of extracting the list of hosts to test from a directory hierarchy, we can extract it from a file (or from an LDAP server or from any source) and attach a set of roles to each of them:

hosts = File.foreach("hosts")
  .map { |line| line.strip }
  .map do |host|
    {
      :name => host,
      :roles => roles(host),
    }
  end

The roles() function should return a list of roles for a given hostname. It could be something as simple as this:

def roles(host)
  roles = [ "all" ]
  case host
  when /^web-/
    roles << "web"
  when /^memc-/
    roles << "memcache"
  when /^lb-/
    roles << "lb"
  when /^proxy-/
    roles << "proxy"
  end
  roles
end
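A case statement stops at the first matching branch. If a host may need several roles, one possible variant, sketched here under that assumption, keeps the patterns in a table. ROLE_PATTERNS and roles_from_table are hypothetical names, not part of serverspec or of the template repository.

```ruby
# Hypothetical table mapping hostname patterns to roles.
ROLE_PATTERNS = {
  /^web-/   => "web",
  /^memc-/  => "memcache",
  /^lb-/    => "lb",
  /^proxy-/ => "proxy",
}.freeze

# Return "all" plus every role whose pattern matches the hostname.
def roles_from_table(host)
  matched = ROLE_PATTERNS.select { |pattern, _role| pattern =~ host }.values
  ["all"] + matched
end

p roles_from_table("web-10")  # => ["all", "web"]
p roles_from_table("db-1")    # => ["all"]
```

With such a table, adding a role is a one-line change, and a hostname matching two patterns would receive both roles.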

In the snippet below, we create a task for each server as well as a server:all task that will execute the tests for all hosts (in ❶). Note, in ❷, how we attach the roles to each server.

namespace :server do
  desc "Run serverspec to all hosts"
  task :all => hosts.map { |h| h[:name] } # ❶

  hosts.each do |host|
    desc "Run serverspec to host #{host[:name]}"
    ServerspecTask.new(host[:name].to_sym) do |t|
      t.target = host[:name]
      # ❷: Build the list of tests to execute from server roles
      t.pattern = './spec/{' + host[:roles].join(",") + '}/*_spec.rb'
    end
  end
end

You can check the list of tasks created (in this output, the server namespace is itself nested in a check namespace):

$ rake -T
rake check:server:all      # Run serverspec to all hosts
rake check:server:web-10   # Run serverspec to host web-10
rake check:server:web-11   # Run serverspec to host web-11
rake check:server:web-12   # Run serverspec to host web-12
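The pattern built in ❷ relies on brace expansion to select one directory per role. To see which files such a pattern picks up, here is a small sketch using Dir.glob on a throwaway directory. spec_pattern is a hypothetical helper, and haproxy_spec.rb is a made-up file name used only to show a role that is not selected.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: same string concatenation as in the Rakefile snippet.
def spec_pattern(roles)
  './spec/{' + roles.join(",") + '}/*_spec.rb'
end

matched = Dir.mktmpdir do |dir|
  # Create an empty spec tree with one directory per role.
  %w[all/debian_spec.rb web/apache2_spec.rb lb/haproxy_spec.rb].each do |path|
    full = File.join(dir, "spec", path)
    FileUtils.mkdir_p(File.dirname(full))
    FileUtils.touch(full)
  end
  # Expand the pattern for a host with the "all" and "web" roles.
  Dir.chdir(dir) { Dir.glob(spec_pattern(["all", "web"])).sort }
end

puts matched
```

Only the files under spec/all and spec/web are returned; the lb directory is ignored because the host does not have that role.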

Then, you need to modify spec/spec_helper.rb to tell serverspec to fetch the host to test from the environment variable TARGET_HOST instead of extracting it from the spec file name.
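The exact change depends on the spec_helper.rb generated by your version of serverspec, so it is not reproduced here. As a minimal sketch of the idea only, a hypothetical helper named target_host could read the variable and fail early when it is missing:

```ruby
# Hypothetical helper: return the host to test from the environment.
# The env argument defaults to ENV and can be replaced by a plain hash.
def target_host(env = ENV)
  host = env["TARGET_HOST"].to_s.strip
  raise ArgumentError, "TARGET_HOST is not set" if host.empty?
  host
end

# Usage with an explicit hash instead of the real environment.
puts target_host("TARGET_HOST" => "web-10")  # => web-10
```

In the helper file, the returned value would be used wherever the host name was previously derived from the spec file location.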

Parallel execution

By default, each task is executed when the previous one has finished. With many hosts, this can take some time. rake provides the -j flag to specify the number of tasks to be executed in parallel and the -m flag to apply parallelism to all tasks:

$ rake -j 10 -m check:server:all

Reports

rspec is invoked for each host. Therefore, the output is something like this:

$ rake spec
env TARGET_HOST=web-10 /usr/bin/ruby -S rspec spec/web/apache2_spec.rb spec/all/debian_spec.rb
......

Finished in 0.99715 seconds
6 examples, 0 failures

env TARGET_HOST=web-11 /usr/bin/ruby -S rspec spec/web/apache2_spec.rb spec/all/debian_spec.rb
......

Finished in 1.45411 seconds
6 examples, 0 failures

This does not scale well if you have dozens or hundreds of hosts to test. Moreover, the output is mangled with parallel execution. Fortunately, rspec comes with the ability to save results in JSON format. These per-host results can then be consolidated into a single JSON file. All this can be done in the Rakefile:

For each task, set rspec_opts to --format json --out ./reports/current/#{target}.json. This is done automatically by the subclass ServerspecTask which also handles passing the hostname in an environment variable and a more concise and colored output.
Add a task to collect the generated JSON files into a single report. The test source code is also embedded in the report to make it self-sufficient. Moreover, this task is executed automatically by adding it as a dependency of the last serverspec-related task.
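The consolidation step can be sketched in a few lines of plain Ruby. This is not the code from the Rakefile: consolidate_reports is a hypothetical function, the layout of the merged report (one entry per hostname) is an assumption, and the per-host data below is made up and heavily simplified compared to what rspec writes.

```ruby
require "json"
require "tmpdir"

# Hypothetical consolidation: merge every <host>.json file found in a
# directory into a single hash keyed by hostname.
def consolidate_reports(dir)
  Dir.glob(File.join(dir, "*.json")).sort.each_with_object({}) do |path, all|
    host = File.basename(path, ".json")
    all[host] = JSON.parse(File.read(path))
  end
end

report = Dir.mktmpdir do |dir|
  # Made-up per-host results: number of failures for each host.
  samples = { "web-10" => 0, "web-11" => 1 }
  samples.each do |host, failures|
    data = { "summary" => { "example_count" => 6, "failure_count" => failures } }
    File.write(File.join(dir, "#{host}.json"), JSON.generate(data))
  end
  consolidate_reports(dir)
end

# List the hosts with at least one failed example.
failing = report.select { |_host, r| r["summary"]["failure_count"] > 0 }.keys
puts JSON.pretty_generate(report)
p failing  # => ["web-11"]
```

A single file like this is easier to feed to a viewer than one file per host, and it is not affected by interleaved output from parallel runs.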

Have a look at the complete Rakefile for more details on how this is done.

A very simple web-based viewer can handle these reports.² It shows the test results as a matrix with failed tests in red:

Figure: Short report example as displayed by the viewer.

Clicking on any test will display the necessary information to troubleshoot errors, including the test short description, the complete test code, the expectation message and the backtrace:

Figure: A failed test displayed in the viewer.

I hope this additional layer will help make serverspec another feather in the “IT” cap, between an automation tool and a supervision tool.

¹ A Rakefile is a Makefile where tasks and their dependencies are described in plain Ruby. rake will execute them in the appropriate order.
² The viewer is available in the Git repository in the viewer/ directory.