
Cluster Management with Condor, Part 2

Last month’s column introduced Condor and presented a sample installation of the software package in a cluster environment. Condor is a system that creates a “high-throughput computing” environment by effectively utilizing computing resources from a pool of cluster nodes and disparate workstations distributed around a network. Like many batch queuing systems, Condor provides a queuing mechanism, scheduling policy, job priority scheme, and resource classification. Unlike most other batch systems, Condor doesn’t require dedicated compute servers.

Condor continuously matches job requirements, expressed as job ClassAds (akin to classified advertisements), against the resource attributes that machines advertise as machine ClassAds. Every Condor job runs in one of several universes: standard, vanilla, PVM, MPI, globus, java, or scheduler. The standard universe supports automatic process migration among nodes for serial jobs and forwards remote system calls to the originating host, but it restricts what the running program can do. The vanilla universe provides fewer services, but imposes very few restrictions.

The PVM and MPI universes support parallel programs written with PVM (Parallel Virtual Machine) and MPI (Message Passing Interface, specifically the MPICH implementation), respectively. The globus universe allows users to submit Globus (http://www.globus.org) jobs through Condor, and the java universe supports jobs written for the Java Virtual Machine (JVM). The scheduler universe is used internally to execute a job immediately.
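
The matching of job ClassAds against machine ClassAds described above can be pictured with a small Python sketch. This is only a toy model, not Condor's ClassAd language or API: the attribute names (`opsys`, `memory_mb`, `universe`), the machine and job names, and the greedy first-fit policy are all assumptions made for illustration. It also assumes that both sides carry a requirements predicate, so a pairing is made only when the job accepts the machine and the machine accepts the job.

```python
# Toy illustration of ClassAd-style matchmaking.
# NOT Condor's real ClassAd language or API; all names below are hypothetical.

def matches(job_ad, machine_ad):
    """A job and a machine match only if each side's requirements accept the other."""
    return job_ad["requirements"](machine_ad) and machine_ad["requirements"](job_ad)

def matchmake(job_ads, machine_ads):
    """Greedily pair each job with the first free machine that matches it."""
    free = list(machine_ads)
    pairs = []
    for job in job_ads:
        for machine in free:
            if matches(job, machine):
                pairs.append((job["name"], machine["name"]))
                free.remove(machine)  # safe: we break out of the loop right away
                break
    return pairs

# Machine ads: advertised attributes plus the kinds of jobs the owner will accept.
machines = [
    {"name": "node01", "opsys": "LINUX", "memory_mb": 512,
     "requirements": lambda job: job["universe"] in ("vanilla", "standard")},
    {"name": "desk07", "opsys": "LINUX", "memory_mb": 128,
     "requirements": lambda job: job["universe"] == "vanilla"},
]

# Job ads: the universe the job runs in plus what it needs from a machine.
jobs = [
    {"name": "sim-a", "universe": "standard",
     "requirements": lambda m: m["opsys"] == "LINUX" and m["memory_mb"] >= 256},
    {"name": "sim-b", "universe": "vanilla",
     "requirements": lambda m: m["opsys"] == "LINUX"},
    {"name": "sim-c", "universe": "vanilla",
     "requirements": lambda m: m["memory_mb"] >= 1024},
]

pairs = matchmake(jobs, machines)
print(pairs)
```

In this sketch `sim-c` stays unmatched because no machine advertises enough memory; in a real pool such a job would simply wait in the queue until a suitable resource appears.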

A Condor pool consists of a single machine, the central manager, and a number of other machines that join the pool as participating resources. For a pool consisting of…
