The last two Extreme Linux columns introduced the Condor workload management system, gave detailed installation and configuration instructions for Beowulf clusters, and showed how to manage and run MPI jobs (parallel programs that use the Message Passing Interface) with Condor. This month, let’s continue looking at Condor, explore some of its advanced features, and check out its powerful queuing capabilities for large numbers of serial tasks.
First, let’s quickly review Condor’s basic characteristics. Condor was designed to make effective use of a pool of computing resources, whether they’re dedicated nodes in a cluster or disparate workstations distributed across a network. A Condor pool consists of a central manager and a number of other machines that join the pool as participating resources. Condor provides a job queuing mechanism, scheduling policy, job and user priority schemes, and resource classification mechanisms. It matches job requirements (job ClassAds) with advertised resource attributes (machine ClassAds) to decide where and how jobs should be executed.
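The matchmaking idea can be illustrated with a small Python sketch. This is only a toy model, not Condor's actual implementation or ClassAd syntax: ads are plain dictionaries, requirements are Python functions, and every attribute name and value (`node1`, `memory`, `owner`, and so on) is hypothetical. It shows the two-way nature of a match, where the job must accept the machine and the machine must accept the job.

```python
# Toy illustration of ClassAd-style matchmaking; NOT Condor's real implementation.
# All ad attributes and values below are hypothetical.

def matches(job_ad, machine_ad):
    """Return True when each ad's requirements accept the other ad."""
    return job_ad["requirements"](machine_ad) and machine_ad["requirements"](job_ad)

def match_jobs(job_ads, machine_ads):
    """Greedily pair each job with the first free machine that matches."""
    free = list(machine_ads)
    assignments = {}
    for job in job_ads:
        for machine in free:
            if matches(job, machine):
                assignments[job["name"]] = machine["name"]
                free.remove(machine)  # a machine runs one job in this sketch
                break
    return assignments

# Machine ads: advertised attributes plus the machine's own policy.
machines = [
    {"name": "node1", "opsys": "LINUX", "memory": 256,
     "requirements": lambda job: True},
    {"name": "node2", "opsys": "LINUX", "memory": 1024,
     "requirements": lambda job: job["owner"] != "guest"},
]

# Job ads: what each job needs from a machine.
jobs = [
    {"name": "job0", "owner": "alice",
     "requirements": lambda m: m["opsys"] == "LINUX" and m["memory"] >= 512},
    {"name": "job1", "owner": "guest",
     "requirements": lambda m: m["opsys"] == "LINUX"},
]

assignments = match_jobs(jobs, machines)
print(assignments)
```

Here `job0` needs at least 512 MB and so lands on `node2`, while `job1` from the guest user is refused by `node2`'s policy and falls to `node1`. Real Condor matchmaking also weighs priorities and rank expressions, which this sketch leaves out.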
Condor jobs run in one of several universes, each with different characteristics. The vanilla universe may be used to run any serial task. The standard universe also supports serial tasks, but adds automatic process migration among nodes and remote system calls that are forwarded back to the originating host. However, programs run in the standard universe must be linked with Condor’s own libraries, and may not perform some kinds of operations, such as creating new processes with fork() or exec().
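To show how a universe is selected in practice, here is a short Python sketch that writes out a submit description for a batch of serial jobs. The helper `make_submit_description`, the executable name `my_sim`, and the file names are hypothetical. The keywords it emits (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`) and the `$(Process)` macro are the usual `condor_submit` commands as I understand them, so check them against the manual for your installed version.

```python
# Sketch: build the text of a Condor submit description for many serial jobs.
# The helper name, executable name, and file names are hypothetical.

def make_submit_description(executable, n_jobs, universe="vanilla"):
    """Return submit-description text that queues n_jobs copies of executable."""
    lines = [
        f"universe   = {universe}",
        f"executable = {executable}",
        "arguments  = $(Process)",       # each job receives its own index
        "output     = out.$(Process)",   # per-job stdout file
        "error      = err.$(Process)",   # per-job stderr file
        "log        = sweep.log",        # shared event log
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"

text = make_submit_description("my_sim", 10)
print(text)
```

The resulting text would be saved to a file and handed to `condor_submit`. Passing `universe="standard"` instead assumes the program has already been relinked with Condor's libraries.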
The PVM and MPI universes provide support for parallel programs, while the globus…