The State of Open Source System Automation

The days of DIY system administration are rapidly coming to a close. Why? Because the open source tools available are just too good not to use. Presenting Bcfg2, Cfengine, Chef and Puppet.

Chef

Aaron Peterson, a seasoned systems engineer and Technical Evangelist at Opscode, Inc., presented. Chef is primarily a configuration management library and a system integration platform (it helps integrate new systems into existing platforms).

Chef grew out of power user dissatisfaction with aspects of Puppet. Made available in 2009, Chef is in beta (version 0.9.x.x) and is settling down now.

Key Principles of Chef’s Design:

  • The cloud is the future, so Chef must be able to operate in the cloud (for example, automated provisioning of new server instances).
  • Fully automated infrastructure is hard. Make it easier through configuration sharing and re-use. Chef is a library for CM (or a CM system built on that library).
  • Facilitate integration of new servers into an existing platform.
  • Describe the end state, and Chef will get you there and keep you there. Chef only takes action when the current state differs from the described end state.
  • Reasonability (easy to think about).
  • Sane defaults (yet easily changed).
  • Hackability (easy to extend).
  • TMTOWTDI (There Is More Than One Way To Do It).
  • Pragmatism. Chef’s language is Ruby with a thin DSL on top, so ordinary programming constructs are available inside recipes.
  • Enable infrastructure as code to benefit from software engineering practices such as agile methodologies, code sharing through GitHub, release management, etc.
  • Enable you to solve your problem.
  • Data-driven. Configuration is just data.
  • Cultivate a culture of quality and reusability through Chef cookbooks. Example freely available cookbooks: Postgresql, tomcat6, Apache2, Kickstart, OpenSSL, etc.
  • Chef exposes data and behavior over HTTP to enable integration with external tools.
  • Recipes are run in order. Nodes have a run list: what roles or recipes to apply, in order.

Example: Run List

"run_list": [
  "role[webserver]",
  "role[database_master]",
  "role[development]"
]
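As a rough illustration of how an ordered run list could resolve into an ordered list of recipes, here is a minimal plain-Ruby sketch. The role contents in ROLES, the recipe names, and the expand_run_list helper are all hypothetical, and skipping already-scheduled recipes is a choice made for this sketch; it is not Chef's actual expansion code.

```ruby
# Hypothetical role definitions: each role maps to its own ordered run list.
ROLES = {
  "webserver"       => ["recipe[apache2]", "recipe[php5]"],
  "database_master" => ["recipe[postgresql]"],
  "development"     => ["role[webserver]", "recipe[git]"]
}.freeze

# Expand "role[...]" entries recursively, keep "recipe[...]" entries,
# preserve order, and skip recipes that were already scheduled.
def expand_run_list(run_list, roles = ROLES, seen = [])
  run_list.each do |item|
    match = item.match(/\A(role|recipe)\[(.+)\]\z/)
    raise ArgumentError, "unrecognized run list item: #{item}" unless match

    kind, name = match.captures
    if kind == "role"
      expand_run_list(roles.fetch(name), roles, seen)
    else
      seen << name unless seen.include?(name)
    end
  end
  seen
end

recipes = expand_run_list(
  ["role[webserver]", "role[database_master]", "role[development]"]
)
puts recipes.inspect
```

The point of the sketch is only that order is carried through from the run list to the final recipe sequence, which is what makes a run predictable.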

Chef: Infrastructure is Code

Chef Recipe: Install the Postfix Package

package "postfix" do
  action :install
end

Chef Recipe: Install Multiple Packages

package "apache2" do
  action :install
end

package "apache2-mod_php5" do
  action :install
end

package "php5" do
  action :install
end
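Because recipes are Ruby, the three near-identical blocks above could also be generated by a loop. The sketch below defines a small stand-in for the package resource (PackageStub, DECLARED, and the stub package method are hypothetical names, not part of Chef) so that it runs as plain Ruby; inside a real recipe only the loop at the end would be needed.

```ruby
# Stand-in for Chef's recipe DSL so this sketch runs outside Chef.
# DECLARED and PackageStub are hypothetical, not Chef classes.
DECLARED = []

class PackageStub
  attr_reader :name, :requested_action

  def initialize(name)
    @name = name
  end

  # Mimics the `action :install` line inside a resource block.
  def action(value)
    @requested_action = value
  end
end

# Records a package declaration instead of installing anything.
def package(name, &block)
  resource = PackageStub.new(name)
  resource.instance_eval(&block) if block
  DECLARED << resource
  resource
end

# The part that would live in a recipe: one loop instead of three blocks.
%w[apache2 apache2-mod_php5 php5].each do |pkg|
  package pkg do
    action :install
  end
end

DECLARED.each { |r| puts "#{r.name}: #{r.requested_action}" }
```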

Chef Recipe: Upgrade sudo package and configure /etc/sudoers using a template

package "sudo" do
  action :upgrade
end

template "/etc/sudoers" do
  source "sudoers.erb"
  mode 0440
  owner "root"
  group "root"
  variables(
    :sudoers_groups => node[:authorization][:sudo][:groups],
    :sudoers_users => node[:authorization][:sudo][:users]
  )
end

  • Manage configuration as resources.
  • Put them together in recipes.
  • Track it like source code.
  • Configure your servers.

The above example has two parts: a package resource and an accompanying template resource that renders /etc/sudoers from sudoers.erb.
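To make the template half of the example more concrete, here is a minimal sketch of rendering an ERB template with Ruby's standard erb library. The template body, the TemplateContext class, and the user and group values are invented for illustration, and a real sudoers.erb may look quite different. The sketch assumes, as Chef templates do to the best of my knowledge, that the entries passed to variables are visible in the template as instance variables.

```ruby
require "erb"

# Hypothetical template body; a real sudoers.erb may look different.
SUDOERS_ERB = <<~TEMPLATE
  root ALL=(ALL) ALL
  <% @sudoers_users.each do |user| -%>
  <%= user %> ALL=(ALL) ALL
  <% end -%>
  <% @sudoers_groups.each do |group| -%>
  %<%= group %> ALL=(ALL) ALL
  <% end -%>
TEMPLATE

# Holds the template variables as instance variables and renders against them.
class TemplateContext
  def initialize(variables)
    variables.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  def render(text)
    ERB.new(text, trim_mode: "-").result(binding)
  end
end

# Stand-in for node[:authorization][:sudo]; the values are made up.
sudo_attrs = { :groups => ["admin"], :users => ["alice", "bob"] }

rendered = TemplateContext.new(
  :sudoers_groups => sudo_attrs[:groups],
  :sudoers_users  => sudo_attrs[:users]
).render(SUDOERS_ERB)

puts rendered
```

Keeping the file contents in a template and the values in node data is what lets the same cookbook serve machines with different users and groups.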

Under the hood, a Chef Provider will handle the required action to bring the resource into the described state. Example provider for the above: Chef::Provider::Package::Yum.
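The idea of a provider bringing a resource into its described state, and doing nothing when it is already there, can be sketched in plain Ruby. FakeSystem and ToyPackageProvider below are hypothetical stand-ins written for this illustration; they are not how Chef::Provider::Package::Yum is implemented.

```ruby
# FakeSystem stands in for a real host's package database (hypothetical).
class FakeSystem
  attr_reader :installed, :commands_run

  def initialize(installed = [])
    @installed = installed.dup
    @commands_run = []
  end

  def install(name)
    @commands_run << "install #{name}"
    @installed << name
  end
end

# A toy provider: compare current state to desired state,
# and act only when they differ.
class ToyPackageProvider
  def initialize(system)
    @system = system
  end

  # Returns true when it changed something, false when already converged.
  def converge(package_name)
    return false if @system.installed.include?(package_name)

    @system.install(package_name)
    true
  end
end

host = FakeSystem.new(["openssl"])
provider = ToyPackageProvider.new(host)

first_run  = provider.converge("postfix") # package missing, so it is installed
second_run = provider.converge("postfix") # already present, so nothing happens

puts "first run changed something:  #{first_run}"
puts "second run changed something: #{second_run}"
puts "commands run: #{host.commands_run.inspect}"
```

Running the same description twice issues the install command only once, which is the property that makes repeated Chef runs safe.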

Recipes are lists of resources (files, packages, processes, filesystems, users, etc.)

Cookbooks are packages of recipes.

How Does Chef Work?

  • Install Chef. Install all the Ruby gems/dependencies.
  • Create your Chef repository

Beginner’s Cookbook:

git clone git://github.com/opscode/chef-repo.git

Advanced Version:

git clone git://github.com/opscode/chef.git

Chef assumes you start from a base OS. This is particularly true in cloud environments, where it’s financially feasible for servers to have brief lifetimes.

  • Fire up Chef Server and Chef Client. Managed clients do most of the work. (Alternatively, use Chef Solo, which can run recipes without a server.)

Chef prefers failure over non-deterministic “success” when something goes wrong. If it cannot complete the recipe in full and in sequence, that is a failure. This is one of the primary things differentiating it from Puppet, where ordering is non-deterministic and a policy may be only partially fulfilled, with little way to predict which parts were applied.
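The "in full and in sequence, or it is a failure" behavior can be sketched as a simple ordered runner. Step, run_in_order, and the step names below are hypothetical and exist only for this illustration; the sketch shows the fail-fast idea, not Chef's actual run loop.

```ruby
# Each step is a name plus a callable; a step signals failure by raising.
Step = Struct.new(:name, :work)

# Run steps strictly in order, recording each one that completes.
# Any exception propagates to the caller, so later steps never run.
def run_in_order(steps, completed = [])
  steps.each do |step|
    step.work.call
    completed << step.name
  end
  completed
end

steps = [
  Step.new("install package", -> { :ok }),
  Step.new("write config",    -> { raise "template source missing" }),
  Step.new("start service",   -> { :ok })
]

completed = []
begin
  run_in_order(steps, completed)
  outcome = :success
rescue RuntimeError => e
  outcome = :failure
  puts "run failed after #{completed.inspect}: #{e.message}"
end
```

Because the runner stops at the first error, the set of completed steps is always a prefix of the list, so it is clear exactly how far a failed run got.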

Example of Node Attributes

default[:nginx][:gzip] = "on"
default[:nginx][:gzip_http_version] = "1.0"
default[:nginx][:gzip_comp_level] = "2"
default[:nginx][:gzip_proxied] = "any"

default[:nginx][:keepalive] = "on"
default[:nginx][:keepalive_timeout] = 65

default[:nginx][:worker_processes] = cpu[:total]
default[:nginx][:worker_connections] = 4096
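The "sane defaults, yet easily changed" principle behind attributes like these can be illustrated with a nested hash of defaults and a deep merge. The deep_merge helper and the node_overrides values are assumptions made for this sketch; Chef's real attribute precedence has more levels than the two shown here.

```ruby
# Cookbook-style defaults, mirroring a few of the nginx attributes above.
DEFAULTS = {
  :nginx => {
    :gzip => "on",
    :gzip_comp_level => "2",
    :keepalive_timeout => 65,
    :worker_connections => 4096
  }
}.freeze

# Recursively merge two nested hashes; values in `overrides` win.
def deep_merge(base, overrides)
  base.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical per-node overrides.
node_overrides = { :nginx => { :keepalive_timeout => 30 } }

attributes = deep_merge(DEFAULTS, node_overrides)
puts attributes[:nginx][:keepalive_timeout]
puts attributes[:nginx][:gzip]
```

Only the overridden key changes; every other default is carried through untouched, and the defaults themselves are left unmodified.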

Limitations (or Strengths, depending how you look at it)

Chef is written primarily for Ubuntu (handling Debian packages); there is also support for Red Hat but it lags.

There are three cloud providers supported by “knife”, the Chef command line tool: Rackspace, Amazon EC2, Terremark.

Example: instantiate a server; configure it as a Ruby-on-Rails Web app server:

knife rackspace server create 'role[rails]'
knife ec2 server create 'role[rails]'
knife terremark server create 'role[rails]'

Configuration Management Tip from Aaron Peterson

Never depend on a single sysadmin’s knowledge. Put all the knowledge into a Chef cookbook.

Web UI

Because Chef exposes an HTTP API, there is a Web UI. It is entirely optional; anything that can be done through the Web UI (such as adding users or roles) can be done on the CLI with “knife”. The commercial platform has some additional Web UI features, but again, the primary use is expected to be via libraries, following the “infrastructure as code” paradigm.

Comments on "The State of Open Source System Automation"

redmumba

We run Bcfg2 pretty extensively at our offices, and it certainly has its pluses and minuses. However, one of the things that is a real thorn in the side is TGenshi, the Bcfg2 templating system. One of the great things about TGenshi is that it allows you to add logic to your file, so you can generate files from the Properties plugin, dynamically encrypt passwords, etc. Great feature, right?

Debugging is AWFUL. The errors TGenshi throws by default are largely generic; for example, if you have a 100-line Python file being run in the template, and an error occurs anywhere, you’ll just get a message saying “Could not generate this file.” No line number, no raising of the original Python exception, nothing. If you want to do any serious work, you’ll have to write your own wrapper to catch errors, or at least to report a line number for what failed.

Bcfg2’s strongest feature is keeping everything the same on every server, so I would consider combining tools: Bcfg2 for day-to-day maintenance, and maybe Puppet or cfengine for deployment.

Andrew

jblaine

We’ve been using cfengine 1+2 for 11 years.

vicente

We use cfengine2 with some logic of our own to control around 130 computers, and it is very nice and powerful once you get to understand it.

Now we are thinking of moving to Puppet. I’d like to post soon to tell you how it went.

    We just ran into the SCCM client performance issue. We had a script roll out the SCCM client to the domain, and the support team did this on a Sunday at about 1am. Every Sunday at 1am since then our VMware farm grinds to a halt, with CPU spikes and storage spikes. We found that SCCM was launching a dir /s inventory on all of its clients on the 7-day anniversary, across about 200 VMs. We are still not sure how we will fix this, but ideas would be appreciated.
