Navigating complex Puppet setups - part 3

February 1, 2015

In the previous parts of this series I have discussed various aspects of complex Puppet setups, how to make sense of the huge amounts of code, and how to set up the basics for your development environment.

If you haven’t done so already, check out part 1 and part 2 first.

This part will be all about Vagrant and using it to run your own Puppet dev/test setup.

What is Vagrant and why should I care?

Vagrant provides an easy way to create portable development setups. In our case, Vagrant allows us to launch a Puppetmaster and a few client VMs that are identical to our production machines, with just a single command:

$ vagrant up

After running the command above, Vagrant will create the VMs using base images, or boxes, that we specified in our Vagrantfile, boot them up, wire them together, mount our code and do whatever provisioning we asked for. In other words: Vagrant does all the boring stuff while we get ourselves a cup of coffee.

But not only do you get to run your own identical-to-production Puppetmaster, you also get to break everything without getting fired ;-) And when you do, you can easily destroy the evidence, fix your code, and start over:

$ vagrant destroy -f client2
==> client2: Forcing shutdown of VM...
==> client2: Destroying VM and associated drives...
==> client2: Running cleanup tasks for 'hosts' provisioner...
$ vagrant up client2

The development setup

My main development setup consists of 1 Puppetmaster and 4 client nodes. Most of our production setup runs on Debian, but we need to support CentOS 5 and 6 as well.

When I first started using Vagrant for Puppet development my Vagrantfile looked somewhat like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :

domain = 'vagrant.lan'

puppet_nodes = [

  {
    :hostname        => 'puppet',
    :master          => true,
    :ip              => '172.16.32.10',
    :fwdhost         => 8140,
    :fwdguest        => 8140,
    :box             => 'debian7',
    :boxurl          => 'http://puppet-vagrant-boxes.puppetlabs.com/debian-73-x64-virtualbox-puppet.box',
    :deployscript    => 'vagrantenv/scripts/install_puppet_master.sh',
    :memory          => 1024,
  },

  {
    :hostname        => 'client1',
    :ip              => '172.16.32.11',
    :box             => 'debian7',
    :boxurl          => 'http://puppet-vagrant-boxes.puppetlabs.com/debian-73-x64-virtualbox-puppet.box',
    :deployscript    => 'vagrantenv/scripts/install_puppet_client.sh',
  },

  {
    :hostname        => 'client2',
    :ip              => '172.16.32.12',
    :box             => 'debian7',
    :boxurl          => 'http://puppet-vagrant-boxes.puppetlabs.com/debian-73-x64-virtualbox-puppet.box',
    :deployscript    => 'vagrantenv/scripts/install_puppet_client.sh',
  },

  {
    :hostname        => 'client3',
    :ip              => '172.16.32.13',
    :box             => 'centos6',
    :boxurl          => 'http://puppet-vagrant-boxes.puppetlabs.com/centos-65-x64-virtualbox-puppet.box',
    :deployscript    => 'vagrantenv/scripts/install_puppet_client.sh',
  },
  {
    :hostname        => 'client4',
    :ip              => '172.16.32.14',
    :box             => 'centos5',
    :boxurl          => 'http://puppet-vagrant-boxes.puppetlabs.com/centos-510-x64-virtualbox-puppet.box',
    :deployscript    => 'vagrantenv/scripts/install_puppet_client.sh',
  },
]

# Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  # Use a local package cache, don't waste bandwidth
  # vagrant plugin install vagrant-cachier
  # https://github.com/fgrehm/vagrant-cachier
  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.auto_detect = true
  end

  puppet_nodes.each do |node|

    config.vm.define node[:hostname] do |node_config|
      node_config.vm.box = node[:box]
      node_config.vm.box_url = node[:boxurl]
      node_config.vm.hostname = node[:hostname] + '.' + domain
      node_config.vm.network :private_network, ip: node[:ip]

      # vagrant-hosts keeps /etc/hosts on the VMs in sync
      # vagrant plugin install vagrant-hosts
      if Vagrant.has_plugin?("vagrant-hosts")
        node_config.vm.provision :hosts
      end

      node_config.vm.provider :virtualbox do |vb|
        vb.customize ["modifyvm", :id, "--memory", node[:memory] || 512]
        vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
      end

      node_config.vm.synced_folder "./", "/upstream", :nfs => false

      # forward a guest port to the host
      if node[:fwdhost]
        node_config.vm.network :forwarded_port, guest: node[:fwdguest], host: node[:fwdhost]
      end

      if node[:deployscript]
        node_config.vm.provision :shell, :path => node[:deployscript]
      end

    end # do node_config
  end # puppet_nodes.each
end # Vagrant.configure

It did everything I wanted: it created the VMs, ran the operating systems I needed, applied the right memory configuration, and handled provisioning. All was well. Or was it?

Go out and learn things

My Vagrant setup wasn’t created for just me. It was meant to be used by everyone writing Puppet code at Avisi. At first, there were only 4 of us, but that number grew considerably as we opened up our codebase to all developers at Avisi. It has been an interesting process of learning new things, improving our setup, and adapting it to the needs of the development teams, even when those needs sometimes seemed unorthodox.

Some of the things we learned and/or changed:

  1. Even though they run the same OS, the Puppetlabs boxes were different from our production servers. It mattered more than we had expected.

  2. Don’t rely on ‘upstream’ boxes. The Debian box we were using got pulled, and we didn’t notice until one of the developers was unable to run vagrant up because the box could not be downloaded. We decided to build our own Vagrant boxes and host them on our own infrastructure.

  3. We got rid of the shell scripts we used for provisioning, as they proved to be unreliable and slow. Instead, we switched to the Puppet provisioner, so we could provision our boxes with the very code we were developing. This way, even our Puppetmaster VM is provisioned by our own Puppet code.

  4. Building your Puppetmaster using Puppet is as awesome as it is slow. Rebuilding our Puppetmaster usually took 10 minutes, even on a laptop with plenty of horsepower and fast solid-state storage. We ultimately decided to turn the provisioned Puppetmaster VM into a pre-built box, as it rarely changes anyway. This way a new Puppetmaster VM is ready to use in less than a minute.

  5. Teams wanted to use their own host definitions. For instance: they wanted to be able to use different names (e.g. projecta-web instead of client1). Also, some teams needed CentOS boxes, rather than Debian. More flexibility was needed.

  6. Some developers wanted the option to disable provisioning at boot (a sketch of how this could work follows the list).
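
To give an idea of how that last point could work, here is a minimal sketch against the Vagrantfile shown above. The noprov key is the same one you will see in the YAML files later in this post; the NO_PROVISION environment variable is a name I made up for illustration:

# Skip provisioning when the node definition sets :noprov, or when the
# developer exports NO_PROVISION=1 before booting.
skip_provisioning = node[:noprov] || ENV['NO_PROVISION'] == '1'

if node[:deployscript] && !skip_provisioning
  node_config.vm.provision :shell, :path => node[:deployscript]
end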

Provisioning using Puppet

After a while, the shell scripts we used to provision our boxes proved to be unreliable and slow. I also considered them to be somewhat of a paradox: we favor Puppet over scripts for deploying and managing our production infrastructure, so why use scripts for deploying our Puppet development infrastructure? It seemed wrong, and it clearly called for some Puppet-ception: use the code to deploy our development setup for running the code, using the code.

To deploy your Vagrant setup using Puppet, you don’t need a Puppetmaster. You just need your code, and boxes that ship with Puppet installed. Once you have those, you can enable Puppet provisioning in your Vagrantfile (note that the hiera.yaml path below lives under /upstream, the synced folder we mounted earlier):

...
node_config.vm.provision :puppet do |puppet|
  puppet.options = "--hiera_config /upstream/puppet-data/config/vagrant.lan/hiera.yaml"
  puppet.module_path = ['modules','../puppet-shared/modules']
  puppet.manifests_path = 'manifests'
  puppet.manifest_file = 'site.pp'
end
...
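
A nice bonus of provisioning with Puppet is that re-applying your code to a running VM is a single command:

$ vagrant provision client1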

Vagrantfile-yoga: flexible node definitions

The Vagrantfile I started with, as shown earlier in this post, had static node definitions. You could of course change those definitions, but you would have to discard your changes before committing your code. Or not, and then you would clobber everyone else’s node definitions. Or you could fork the Vagrantfile, but then every change to the original would have to be ported over by hand. Which, of course, never happens.

So obviously, a better solution was needed. We needed Vagrantfile-yoga. Flexible in all the right places, but still balanced, stable and reliable. So I got to work. My goals:

  • move all node definitions outside of the Vagrantfile. They should be stored externally, preferably in YAML files.
  • enforce that everyone is using the same Puppetmaster.
  • enforce that everyone is using the correct boxes for their operating systems.
  • make it easy to write a custom host definition file, preferably using YAML.
  • make it easy to tell Vagrant which custom host definitions to use.

The resulting setup

The resulting setup was no longer a single Vagrantfile, but rather a Vagrantfile, supported by additional files stored in a vagrantenv directory:

├── Vagrantfile
└── vagrantenv
    ├── boxes.yaml
    ├── custom_hosts_example.yaml
    ├── default_hosts.yaml
    ├── projectA
    │   └── custom_hosts.yaml
    ├── projectB
    │   └── custom_hosts.yaml
    └── puppet_host.yaml

The Vagrantfile still contains all the ‘logic’. All the other files are YAML files containing various pieces of configuration data.
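
To give an idea of that logic, loading the node definitions could look somewhat like this (a simplified sketch; the real Vagrantfile contains more error handling):

require 'yaml'

vagrantenv = File.expand_path('vagrantenv', File.dirname(__FILE__))

# The Puppetmaster definition is always loaded; it must be identical
# for everyone.
puppet_nodes = YAML.load_file(File.join(vagrantenv, 'puppet_host.yaml'))

# Load a project's custom host definitions when VAGRANT_PROJECT is set,
# otherwise fall back to the sane defaults.
hosts_file = if ENV['VAGRANT_PROJECT']
  File.join(vagrantenv, ENV['VAGRANT_PROJECT'], 'custom_hosts.yaml')
else
  File.join(vagrantenv, 'default_hosts.yaml')
end

puppet_nodes += YAML.load_file(hosts_file)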

boxes.yaml

The boxes.yaml file contains the names of boxes we can use, and the URLs from which they can be downloaded. It looks somewhat like this:

---
puppetmaster:
  boxurl: 'http://internal.box.hosting/puppetmaster.box' 
debian7:
  boxurl: 'http://internal.box.hosting/debian7.box' 
centos6:
  boxurl: 'http://internal.box.hosting/centos6.box' 
centos5:
  boxurl: 'http://internal.box.hosting/centos5.box' 

Since you can add a version number to your box’s metadata.json file, you could also pin boxes to specific versions, but if you just want everyone to use the latest version of each box, the above will do.

The boxes.yaml file is used for two things: checking whether a node’s box setting refers to a valid box, and looking up the download URL for that box.
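
Inside the Vagrant.configure block, that boils down to a few lines per node. Another sketch, reusing the vagrantenv path and puppet_nodes list from the previous snippet (note that nodes loaded from YAML have string keys, unlike the symbol keys in my original Vagrantfile):

boxes = YAML.load_file(File.join(vagrantenv, 'boxes.yaml'))

puppet_nodes.each do |node|
  # Fail early when a node definition asks for a box we don't know about.
  unless boxes.key?(node['box'])
    raise "Unknown box '#{node['box']}' for host '#{node['hostname']}'"
  end

  config.vm.define node['hostname'] do |node_config|
    node_config.vm.box     = node['box']
    node_config.vm.box_url = boxes[node['box']]['boxurl']
    # ...
  end
end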

puppet_host.yaml

This file contains the node definition for the Puppetmaster VM. This VM should be exactly the same for everyone, so we are managing its configuration in a separate file. It looks like this:

---
- hostname: 'puppet'
  ip: '172.16.32.10'
  fwdhost: 8140
  fwdguest: 8140
  box: 'puppetmaster'
  memory: 1024
  noprov: true

The configuration above will create a VM named ‘puppet’ using our ‘puppetmaster’ box, which contains a prebuilt Puppetmaster/PuppetDB/Puppetboard machine. It gets a secondary network interface connected to a ‘host-only’ network for communication between our VMs. We forward port 8140 to our local machine, and give the machine 1GB of memory (instead of the default 512MB). Lastly, we disable provisioning for this VM, as it is prebuilt and does not need it.

default_hosts.yaml

Not everyone needs custom node definitions. Some people just want one VM for each supported OS to test their code. Also, we want our setup to work out of the box, without setting up custom definitions. The default_hosts.yaml file contains the ‘sane default node definitions’ that are ready to use.

---
- hostname: 'client1'
  ip: '172.16.32.11'
  box: 'debian7'
- hostname: 'client2'
  ip: '172.16.32.12'
  box: 'centos6'
- hostname: 'client3'
  ip: '172.16.32.13'
  box: 'centos5'
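
With these defaults in place the setup works out of the box: anyone can clone the repository and boot, say, just the CentOS 6 client:

$ vagrant up client2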

Using custom node definitions

Some of the project teams at Avisi use their own node definitions. For instance, there is a team that uses CentOS exclusively (so no need for Debian). Also, their VMs need more memory for running their Java applications.

To use custom node definitions, teams no longer have to fork the Vagrantfile. They simply create a subdirectory inside the vagrantenv directory, and place their custom_hosts.yaml file inside that directory. For example, let’s look at the custom definitions for ‘projectA’, stored in ./vagrantenv/projectA/custom_hosts.yaml:

---
- hostname: 'webserver'
  ip: '172.16.32.21'
  box: 'debian7'
- hostname: 'appserver'
  ip: '172.16.32.22'
  box: 'debian7'
  cpus: 2
  memory: 2048
  noprov: true
- hostname: 'dbserver'
  ip: '172.16.32.23'
  box: 'debian7'
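
Note the cpus key, which my original Vagrantfile did not support. Handling it takes one extra line in the provider block, somewhat like this (a sketch):

node_config.vm.provider :virtualbox do |vb|
  vb.customize ["modifyvm", :id, "--memory", node['memory'] || 512]
  vb.customize ["modifyvm", :id, "--cpus",   node['cpus']   || 1]
end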

To use their custom nodes, all they need to do is set the environment variable VAGRANT_PROJECT and use Vagrant the way they always do. So, without the environment variable set:

$ vagrant status 
Current machine states:

puppet                    not created (virtualbox)
client1                   not created (virtualbox)
client2                   not created (virtualbox)
client3                   not created (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

Now, set the environment variable:

$ export VAGRANT_PROJECT=projectA
$ vagrant status
Current machine states:

puppet                    not created (virtualbox)
webserver                 not created (virtualbox)
appserver                 not created (virtualbox)
dbserver                  not created (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

Fin

This wraps up this series of blogs about complex Puppet setups. However, in the ever-changing world of IT automation there’s something new to be learned every day, so I expect to revisit this subject regularly.

As for the Vagrantfile-yoga setup: I am currently rewriting parts of it so it can be used for other things as well. I plan on releasing it later on.