The State of Open Source System Automation

The days of DIY system administration are rapidly coming to a close. Why? Because the open source tools available are just too good not to use. Presenting Bcfg2, Cfengine, Chef and Puppet.

Puppet

Michael DeHaan of Puppet Labs, formerly of Red Hat engineering, presented.

Key Principles of Puppet’s Design:

  • Puppet is centralized.
  • Puppet’s internal logic is graph-based. It uses decision trees and reports what it was able to do, what failed, and everything after the failure that was skipped as a result. Manual ordering is very important, as the decision trees are based on it. Ordering is very fine-grained.
  • The Puppet language is a data center modeling language that represents the desired state. It is designed to be very simple and human-readable. This prevents you from inserting (Ruby) code, but it also makes the language safer (it keeps you from shooting yourself in the foot). You can still call external (shell) scripts, however. An upcoming version (2.6) will also support programming in a Ruby DSL.
  • Portability. Works anywhere Ruby works.
  • Pluggability. Puppet does not allow arbitrary code in the language; however, it is pluggable: server-side functions can interact with an external data source (e.g. query a database or read a text file). A feature called “external_nodes”, which you can enable on the Puppet server (the puppetmaster), kicks in whenever a Puppet client (puppetd) connects. Instead of storing the node name, its class membership and its attributes in your Puppet config, you can store them in an external database, and “external_nodes” will fetch that information.
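
To make the ordering principle above concrete, here is a minimal sketch using the require metaparameter. The package, file and service names are illustrative assumptions, not taken from the talk.

```puppet
# Illustrative names only: an NTP package, its config file and its service.
package { 'ntp':
  ensure => installed,
}

file { '/etc/ntp.conf':
  ensure  => present,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  require => Package['ntp'],        # manage the file only after the package
}

service { 'ntpd':
  ensure  => running,
  require => File['/etc/ntp.conf'], # start the service only after the file
}
```

With these relationships declared, a failure to install the package should cause the file and the service to be skipped and reported, rather than applied against a half-configured system.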

How does Puppet Work?

Puppet only performs actions that are necessary. The basic formula for Puppet’s operation is: the server polls information from the client, decides what needs to be done, and tells the client what to do. In detail:

The client tells the server about itself; in Puppet, these pieces of information are called facts. The configuration policies are called manifests.

The server compares the facts (what is) to the manifests (what should be), and, if necessary, creates instructions to the clients on the managed nodes for moving from what is to what should be. These instructions are encoded as a JSON catalog.

Manifests + Facts → JSON catalog → Nodes

The JSON catalog contains a declarative description about desired state, and the client then runs that catalog to achieve the desired state.
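
As a rough illustration of how facts feed into a manifest, the sketch below picks a service name based on the operating system the client reports. It assumes the operatingsystem fact as named in Puppet releases of this era (newer releases expose the same value through a structured facts hash), and the variable name is made up for the example.

```puppet
# $::operatingsystem is a fact reported by the client.
# $ssh_service is a hypothetical variable name used only in this sketch.
case $::operatingsystem {
  'Debian', 'Ubuntu': { $ssh_service = 'ssh' }
  default:            { $ssh_service = 'sshd' }
}

# The compiled catalog for each node contains only the resolved service name.
service { $ssh_service:
  ensure => running,
}
```

The same manifest therefore compiles to a different catalog for each node, depending on the facts that node sends.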

Puppet is pre-installed on Ubuntu (cloud and main editions).

If a service subscribes to a file, and the file changes, the service will know it automatically needs to restart. For example:

service { 'sshd':
  ensure => running,
  subscribe => File['/etc/ssh/sshd_config'],
}

file { '/etc/ssh/sshd_config':
  ensure => present,
  # The source URL must be a quoted string.
  source => 'puppet:///sshd/sshd_config',
  owner  => 'root',
  group  => 'root',
}

Puppet Language

Resource types are the building blocks of Puppet configuration. Here is a simple example:

file { "/etc/passwd":
    owner => root,
    group => root,
    mode => '0644',  # quoted, with a leading zero, so it is read as an octal mode
}

This is the “file” resource type. It controls ownership and access permissions to the named file.

Providers are what make a resource type a reality: they are the part of Puppet that actually executes the configuration, the interface between the resource description and the OS; the “doer”.

There can be multiple providers for a resource. For example, you might specify that the mod-php package be installed, and it could be installed by package providers for dpkg, rpm, yum, openbsd, and so on. The most appropriate provider is picked automatically; alternatively, you can specify certain features in the resource type, and the providers will then be probed for the features they support.
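
A small sketch of the two cases just described: letting Puppet choose the provider, and pinning one explicitly. The package names are placeholders (mod-php is the name used in the text and may not match a real package on your platform).

```puppet
# No provider given: Puppet picks the default package provider for the platform.
package { 'openssh-server':
  ensure => installed,
}

# Provider pinned explicitly; 'yum' here is just one possible choice.
package { 'mod-php':
  ensure   => installed,
  provider => 'yum',
}
```

Pinning a provider is usually only worth doing when the automatic choice is wrong for a particular host, since it makes the manifest less portable.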

There is an advanced and experimental feature, “exported resources”, that allows one host to configure another host (in Puppet terms, it allows resources to move between hosts); this enables inter-node orchestration.
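
As a hedged sketch of what exported resources can look like, the example below has every node publish its SSH host key and collect the keys published by all other nodes. It assumes that stored configurations are enabled on the puppetmaster, that the sshkey resource type is available, and that the fqdn and sshrsakey facts exist under those names.

```puppet
# Export this host's public key; the @@ prefix marks the resource as exported,
# so it is stored on the server instead of being applied locally.
@@sshkey { $::fqdn:
  type => 'ssh-rsa',
  key  => $::sshrsakey,
}

# Collect every exported sshkey resource, from all hosts, onto this node.
Sshkey <<| |>>
```

Applied on all nodes, this would keep each machine's known-hosts file populated with entries for every other managed machine.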

Puppet, of course, can export reports.

What Lies Ahead? What Are the Challenges in Configuration Management?

Narayan: “Configuration meta-programming” or “multi-node orchestration”. For example: “NTP clients should talk to our NTP servers”, or “the ssh_known_hosts file should contain entries for all machines”, or “the load-balancer should direct traffic to all production Web servers”.

Mark Burgess: Including network devices in configuration management; manipulating mechanical devices (such as controlling satellite position in Earth orbit); most importantly, knowledge management (tracking state, understanding intentions, aligning with business goals). Mark is working on tying Cfengine to ISO 13250 Topic Maps.


Comments on "The State of Open Source System Automation"

redmumba

We run Bcfg2 pretty extensively at our offices, and it certainly has its pluses and minuses. However, one of the things that is a real thorn in the side is TGenshi, the Bcfg2 templating system. One of the great things about TGenshi is, well, it allows you to add logic to your file, so you can generate files from the Properties plugin, dynamically encrypt passwords, etc. Great feature, right?

Debugging is AWFUL. The errors TGenshi throws are, by default, largely generic; for example, if you have a 100-line Python file being run in the template, and an error occurs anywhere, you’ll just get a message saying “Could not generate this file.” No line number, no raising of the original Python exception, nothing. If you want to do any serious work, you’ll have to write your own wrapper to catch errors, or at least to get a line number for what failed.

Bcfg2’s strongest feature is keeping everything the same on every server, so I would consider combining this for day-to-day maintenance, and maybe Puppet or cfengine for deployment.

Andrew

jblaine

We’ve been using cfengine 1+2 for 11 years.

vicente

We use cfengine2 with some logic of our own to control around 130 computers, and it is very nice and powerful once you get to understand it.

Now we are thinking of moving to Puppet. I’d like to post soon to tell you how it went.


    We just ran into the SCCM client performance issue. We had a script blow out the SCCM client to the domain, and the support team did this on Sunday at about 1am. Every Sunday at 1am since then our VMware farm grinds to a halt. CPU spikes, storage spikes. We found that SCCM was launching a dir /s inventory on all of its clients on the 7-day anniversary. About 200 VMs. Still not sure how we will fix this, but ideas would be appreciated.


