While implementing and managing a powerful and complex cluster can seem like a daunting task, you can make your life much easier and your users more productive by sticking to a few simple rules.
Help Your Users Help Themselves
Educated cluster users are happy users, and happy users make for happy administrators. Moreover, educated cluster users are more productive, which frees administrators to focus on higher-level problems.
Administrators often notice that shortly after users begin working with a cluster, similar questions start to pop up. Although it requires an initial investment of your time, it’s a good idea to offer a series of short classes and tutorials to familiarize your users with their cluster. It’s also a good idea to provide all notes, examples, and references online, where the material can readily be accessed on demand.
A user reference web site should include a series of walkthroughs, demonstrating common scenarios or workflows that are likely to be performed on a regular basis. For example, the site should include examples for using special compilers, submitting jobs, monitoring jobs, and even a short tutorial on debugging. Make your walkthroughs as complete as possible. The site should be updated frequently, especially when any software is changed or added. The general rule is to post a notice when any change affects the way your users interact with the cluster.
Inevitably, your users will require some hand-holding, but a well-crafted, internal “support” site makes things easier on everyone. To address as many questions as possible — from users and other administrators alike — create a clear set of “frequently asked questions” (a FAQ). Post the FAQ to your reference site, and like the site, keep the FAQ up-to-date.
The web site should also provide links to external documentation so that users can learn more about various topics and find detailed information on specific concepts, if they so desire. One of the benefits of external links is that they encourage users to take examples they see in the tutorials (or even right from the web) and extend them.
A well-written web site can even become a project “notebook” of sorts to help staff answer questions such as “What was I thinking when I built that library?” or “What is that compiler switch to turn on SSE optimizations?”
Your internal site is a place where users can teach themselves. Teach your users to look at the web site first before coming to you with issues or questions. As the proverb says, rather than giving users a fish, teach them to catch fish.
Make It Easy, and Stay Consistent and Logical
Typical clustering environments are diverse and complex. It’s not uncommon to find three or four different compilers and countless other auxiliary libraries on a shared cluster system.
One way to simplify configuration issues is to install packages in their default location. However, that approach only works until one package conflicts with another. For instance, a simple upgrade of a library might cause one user’s code not to function anymore, if that code depends on a specific version of a particular library. (Dependencies on compiler releases are also quite commonplace.)
Even a handful of compiler versions and library revisions leads to a huge number of possible combinations. With multiple revisions of software packages available, users can quickly become confused, scripts can become ungainly, and even system administrators may not be able to keep up with the variations.
A new system for naming and utilizing tools can resolve this problem. A special naming convention should be formed around the type of tools installed, such as compilers, libraries, and debuggers. Install each kind of tool into its appropriate root directory, such as /opt/compilers, /opt/libraries, and /opt/debuggers, then break each category down further into vendors, architectures, and versions.
Here’s an example: NetCDF is a data storage library typically used in scientific applications. Unfortunately, a unique version must be built for each compiler, and in both 32- and 64-bit versions. A reasonable naming convention is /opt/libraries/netcdf/<compiler>/<bitwise>/<netcdf version number>
A pattern similar to this will, at the very least, allow a user to make an educated guess as to the location and version of the library he or she needs. Of course, you should post all of this information online for your users to reference.
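As a minimal sketch of how such a convention might be put to work, the shell function below assembles a library path from its parts. The function name lib_path, the LIB_ROOT variable, and the sample values (gnu, 64, 3.6.2) are illustrative assumptions rather than part of any package; adapt the root directory to your own layout.

```shell
#!/bin/sh
# Hypothetical helper: build a library path that follows the convention
#   /opt/libraries/<library>/<compiler>/<bitwise>/<version>
# LIB_ROOT may be overridden to match a site's own layout.
LIB_ROOT="${LIB_ROOT:-/opt/libraries}"

lib_path() {
    # $1 = library, $2 = compiler, $3 = 32 or 64, $4 = version
    if [ "$#" -ne 4 ]; then
        echo "usage: lib_path <library> <compiler> <bitwise> <version>" >&2
        return 1
    fi
    case "$3" in
        32|64) ;;
        *) echo "lib_path: bitwise must be 32 or 64" >&2; return 1 ;;
    esac
    printf '%s/%s/%s/%s/%s\n' "$LIB_ROOT" "$1" "$2" "$3" "$4"
}

# Example: a 64-bit NetCDF build for the GNU compilers.
# The version number here is made up for illustration.
lib_path netcdf gnu 64 3.6.2
```

A build script or a user’s makefile could call a helper like this instead of hard-coding paths, so that a change to the layout only has to be made in one place.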
You can concoct your own naming scheme, but the principle is one of least surprise: make the locations of tools consistent and logical.
Reduce Complexity and Create Seamless Transitions
Organizing and grouping your libraries into well-advertised locations is convenient. But what about advanced users who require the ability to easily switch between applications, libraries, and compilers?
In many cases, commercial and freely-available applications need to have custom environment variables set to function, so the problem then becomes educating these users about setting up their environments appropriately.
Environment modules can solve this problem. Users can load and unload modules into their environments so environment variables are set up correctly and consistently. (Cray Inc. developed this idea several years ago, and so the concept of environmental modules was dubbed “Cray Modules.” This approach was so useful that innovators in the open source community came up with a modules package for Linux.) The Linux version is called env-switcher and is available at http://env-switcher.sourceforge.net/.
The env-switcher package was originally written to be part of the Oscar Clustering Project, but can be adapted to any clustering system. It allows the administrator to write new modules as short Tcl scripts that users can load and unload into their environment to provide a seamless transition between compilers, libraries, and even applications.
Take, for example, the common situation in which a cluster has both MPICH and LAM-MPI installed. In this case, MPICH is the library of choice (the default), so no changes to the default environment are needed to use MPICH. However, a user who wants to use LAM, and specifically LAM version 7.1.4, may have to make a few changes, as follows:
$ export PATH=/usr/lam/7.1.4/bin:$PATH
$ export LAMHOME=/usr/lam/7.1.4
$ export LD_LIBRARY_PATH=/usr/lib64/lam/7.1.4/gnu
The first line simply sets PATH to use LAM binaries instead of the system default. Second, the LAMHOME required by LAM is set. The runtime loader is then configured to use the GNU LAM libraries, as opposed to the system default.
The env-switcher suite allows an administrator to write a module that makes it easier for the user to switch between MPICH and LAM. Using env-switcher, the previous example collapses into a single line:
$ module load lam/7.1.4
When a user needs to revert to an older version of LAM, the value of env-switcher is even clearer. Without env-switcher, a user would have to remember and redo each change to the environment by hand, which is prone to error. With env-switcher the task is as simple as:
$ module switch lam lam/7.0.4
If a user wants to revert back to the system default, instead of resetting each of the variables, env-switcher allows the user to just unload the module:
$ module unload lam
While this is just a small example, it should demonstrate how env-switcher allows users to seamlessly load, unload, and switch between different environments and tool chains. As your cluster environment grows into more complicated configurations, this type of tool proves to be an invaluable resource for administrators and users.
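To see why a single module command is less error-prone, it may help to look at the bookkeeping it replaces. The sketch below emulates by hand what loading and unloading a LAM 7.1.4 environment involves, using the same paths as the export commands shown earlier. It is not how env-switcher is implemented; the function names lam_load, lam_unload, and path_remove are hypothetical, and unlike the earlier commands this version prepends to LD_LIBRARY_PATH rather than replacing it.

```shell
#!/bin/sh
# Sketch only: hand-rolled load/unload for one tool. Names are hypothetical.
LAM_PREFIX=/usr/lam/7.1.4
LAM_LIBDIR=/usr/lib64/lam/7.1.4/gnu

# path_remove <colon-separated list> <dir>
# Prints the list with every entry equal to <dir> removed.
# The body runs in a subshell so the IFS and globbing changes do not leak.
path_remove() (
    IFS=:
    set -f                      # no filename expansion while splitting
    result=
    for dir in $1; do
        [ "$dir" = "$2" ] && continue
        result="${result:+$result:}$dir"
    done
    printf '%s\n' "$result"
)

lam_load() {
    PATH="$LAM_PREFIX/bin:$PATH"
    LAMHOME="$LAM_PREFIX"
    LD_LIBRARY_PATH="$LAM_LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LAMHOME LD_LIBRARY_PATH
}

lam_unload() {
    PATH=$(path_remove "$PATH" "$LAM_PREFIX/bin")
    LD_LIBRARY_PATH=$(path_remove "${LD_LIBRARY_PATH:-}" "$LAM_LIBDIR")
    export PATH LD_LIBRARY_PATH
    # Do not leave an empty variable behind if nothing else was on the list.
    [ -n "$LD_LIBRARY_PATH" ] || unset LD_LIBRARY_PATH
    unset LAMHOME
}

lam_load
echo "loaded:   LAMHOME=$LAMHOME"
lam_unload
echo "unloaded: LAMHOME=${LAMHOME:-<unset>}"
```

Every tool and every version would need its own pair of functions like these, kept in sync by hand, which is exactly the maintenance burden a modules package takes off the administrator.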
Learn to Swim in a Sea of Statistics
A large amount of data can be gathered from a running cluster. In addition to standard statistics, such as CPU, memory, and disk usage, you can also gather data on CPU fan speeds, chassis fan speeds, voltages, and temperatures. Many new cluster administrators want to monitor every statistic their new system offers. However, they soon realize that many of these statistics are not important for day-to-day maintenance and operation.
Many monitoring packages, both open source and commercial, try to display and graph every possible piece of data. While the ability to probe certain values from time to time can be useful when investigating failures, too much information can become overwhelming. For example, while a screen that displays CPU usage for each compute node in the cluster is helpful in easily recognizing under-performing nodes, a similar screen displaying CPU voltages is much less useful and provides little value other than telling the viewer the CPUs have power.
The difference between these kinds of metrics can be compared to the difference between analog and digital signals. CPU usage is an example of an analog value, one that you would expect to change over a short period of time. It might be useful to know when a CPU is performing at sixty percent versus one hundred percent. Digital values such as CPU voltages, fan speeds, and even temperatures, however, should only be a concern when they cross a particular threshold. So when choosing a monitoring package, it’s important to evaluate which statistics are important to your cluster implementation and which are less valuable.
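One way to picture the “digital” treatment is a check that stays silent while a reading is within limits and speaks up only when a limit is crossed. The sketch below assumes integer readings; the function name check_limit, the metric names, the limits, and the sample values are all hypothetical, and a real monitoring package would supply its own readings and alert mechanism.

```shell
#!/bin/sh
# Sketch: report a threshold-style metric only when it leaves its allowed range.
# Metric names, limits, and sample readings are hypothetical.

# check_limit <name> <value> <min> <max>
# Prints nothing and returns 0 while the integer value is inside [min, max];
# prints one alert line and returns 1 otherwise.
check_limit() {
    if [ "$2" -lt "$3" ] || [ "$2" -gt "$4" ]; then
        echo "ALERT: $1=$2 outside $3..$4"
        return 1
    fi
    return 0
}

check_limit cpu_temp_c 48 10 70                 # in range: silent
check_limit fan_rpm 900 2000 12000 || true      # too slow: prints an alert
```

Handled this way, voltages, fan speeds, and temperatures never need to be graphed at all; they only generate output on the rare occasion that something is wrong.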
Focus on the information that matters most. Generally, all that needs to be presented are metrics for CPU, memory, disk, and network utilization, as these are the only metrics that regularly change. Regularly gathering, charting, and displaying any other information will only add unnecessary overhead to your cluster and distract you and your users from what really matters.
Consider the Costs/Benefits
Making some statistics available to end-users reassures them that the cluster is performing their requested jobs. For that reason, the monitoring package should also be capable of presenting the data in a form that is easy to understand. The popular, open source Nagios and several other monitoring packages can be configured to send alerts when values exceed designated parameters. However, Nagios does not provide a graphing display as powerful as that of Ganglia, another popular open source monitoring tool (http://ganglia.sourceforge.net/).
You may be considering using both of these tools on your cluster to provide a total solution, but mind the overhead associated with running additional tools. Each additional tool can add significant overhead to your network, sending many large packets each time an update is triggered. Monitoring tools such as Ganglia send XML data across the network for each node tracked. These XML packets are comparatively large and result in a significant increase in network traffic. This extra overhead quickly adds up and can decrease an application’s performance, especially once your cluster grows to 32 nodes or more.
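A quick back-of-envelope calculation can show how this overhead scales with node count. The sketch below multiplies the number of nodes by an assumed update size and divides by an assumed update interval; the function name monitor_kbps and every input figure are hypothetical, so measure the actual traffic of your own tools before drawing conclusions.

```shell
#!/bin/sh
# Back-of-envelope estimate of monitoring traffic arriving at a collector.
# All input figures are hypothetical placeholders, not measurements.

# monitor_kbps <nodes> <bytes_per_update> <seconds_between_updates>
# Prints the aggregate rate in kilobits per second (integer, rounded down).
monitor_kbps() {
    echo $(( $1 * $2 * 8 / $3 / 1000 ))
}

monitor_kbps 32 4000 10     # 32 nodes, 4,000 bytes per update, every 10 seconds
monitor_kbps 512 4000 10    # the same settings on a 512-node cluster
```

With these assumed inputs the estimate grows linearly with the node count, and running a second monitoring tool alongside the first adds its own traffic on top.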
Since Linux clusters rely on commercial off-the-shelf hardware, cluster vendors focus on software and related areas for differentiation. To do this, some cluster vendors merely re-package open source tools such as Nagios and Ganglia without adding a unique implementation of their own. This doesn’t add any real value to the tools, which are often just branded with the company’s name and given a few minor features. Beware of these sorts of tools. The work done in these areas is important and these are valuable technologies. However, similar, if not identical, features are available in open source equivalents.
If you want the technical support and pre-integration benefits of a commercially supported cluster, look for a vendor with a second-generation cluster, or at least one whose cluster has a strong software implementation of open source tools. A value-added implementation of Ganglia, for example, might be enhanced to take advantage of the cluster software’s own libraries to allow additional monitoring packages to be added to the system without incurring any additional network overhead. Or, it might record statistical data for the compute nodes in a shared memory region on the master node, so that any application that wants to access node information can simply read the data out of the memory region without creating any extra network traffic.
The bottom line: make sure that any monitoring package you consider gives back an administrative benefit commensurate with its impact on application performance.
Some Simple Rules
While implementing and managing a powerful and complex cluster environment can seem like a daunting task, you can make your life much easier by sticking to some simple rules: empower your users; be consistent in your documentation and schemes; make transitions seamless; monitor just the data that matters; and consider the costs and benefits of monitoring packages.
Joshua Bernstein is a Software Engineer for Scyld Software currently living in San Francisco. When he isn’t contributing to open source projects such as Samba, OpenLDAP, and MythTV, he enjoys tinkering with remote control cars and mountain biking with his girlfriend, Shiela. Formerly, Josh was a system administrator at the University of Arizona Lunar and Planetary Lab. Josh can be reached at