A good job queuing and scheduling system is required whenever more than a couple of researchers share a Beowulf cluster. Coordinating with other users about when and where to run jobs on a shared cluster isn’t impossible, but cluster administrators quickly realize the importance of having a robust batch system once users begin competing for resources.
One popular batch system, OpenPBS (Portable Batch System), was discussed in the October 2002 issue of this column (http://www.linux-mag.com/2002-10/extreme_01.html), and the Maui scheduler was covered in November 2002 (http://www.linux-mag.com/2002-11/extreme_01.html). OpenPBS consists of a job server, a job executor, and a job scheduler. Maui is an advanced batch scheduler that may be used in place of the default job scheduler provided in OpenPBS. Maui decides where, when, and how to run jobs, based on specified policies, priorities, and resource limitations.
Another batch system, named Condor (http://www.cs.wisc.edu/condor), is increasingly being used in research environments for managing compute-intensive jobs on both Beowulf clusters and disparate collections of desktops and workstations. Like OpenPBS and other batch systems, Condor provides a queuing mechanism, scheduling policy, job priority scheme, and resource classification.
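To make those ingredients concrete, here is a minimal sketch of a queue, a priority scheme, and a resource-matching step. It is a toy model only and does not reflect how Condor, OpenPBS, or Maui actually schedule jobs. The class `ToyBatchQueue`, its methods, and the job and node names are all hypothetical, chosen for illustration.

```python
import heapq
from itertools import count


class ToyBatchQueue:
    """Hypothetical toy batch queue: priority ordering plus memory matching."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker: equal priorities run first-come, first-served

    def submit(self, name, priority, min_memory_mb):
        # heapq is a min-heap, so negate priority to pop the highest priority first.
        heapq.heappush(self._heap, (-priority, next(self._order), name, min_memory_mb))

    def schedule(self, idle_machines):
        """Place queued jobs on idle machines.

        idle_machines maps a machine name to its memory in MB.
        Returns a list of (job_name, machine_name) pairs; jobs that
        fit nowhere stay in the queue.
        """
        free = dict(idle_machines)
        placed, waiting = [], []
        while self._heap:
            entry = heapq.heappop(self._heap)
            _, _, name, need = entry
            candidates = [m for m, mem in free.items() if mem >= need]
            if candidates:
                # Take the smallest machine that fits, leaving larger ones for larger jobs.
                best = min(candidates, key=lambda m: free[m])
                del free[best]
                placed.append((name, best))
            else:
                waiting.append(entry)
        for entry in waiting:
            heapq.heappush(self._heap, entry)
        return placed

    def pending(self):
        # Job names still queued, highest priority first.
        return [name for _, _, name, _ in sorted(self._heap)]


queue = ToyBatchQueue()
queue.submit("small", priority=1, min_memory_mb=256)
queue.submit("big", priority=5, min_memory_mb=2048)
queue.submit("medium", priority=3, min_memory_mb=512)

placed = queue.schedule({"node1": 512, "node2": 1024})
print(placed)
print(queue.pending())
```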
However, unlike most other batch systems, Condor doesn’t require dedicated compute servers. It can harness otherwise idle machines by checkpointing and migrating jobs to those computers (when migrated and restarted, the job continues precisely where it left off). In addition, Condor can order job execution as specified by the user, and it enables grid computing by executing jobs on participating computers or…