This month’s column introduces the Global Arrays Toolkit (GA, http://www.emsl.pnl.gov/docs/global/), a suite of application programming interfaces (API’s) for handling distributed data structures.
While implementing and managing a powerful and complex cluster environment can seem like a daunting task, you can make your life much easier and your users more productive by sticking to a few simple rules. Here’s a guide.
Designed for high-performance computing on large-scale parallel machines, including Beowulf-style clusters, Unified Parallel C provides a uniform programming model for both shared and distributed memory hardware.
A new, federally-funded prograim called the High Productivity Computing Systems Initiative aims to to make high-performance computing more afforadable and more accessible to more scientists. Here’s a look at the goals of the project, including an example of a better way to program in parallel today.
Supercomputing extremists converged on Seattle, Washington, last fall to share their experiences, exhibit their research, flaunt their wares, and award their pioneers.
To date, the Message Passing Interface has been instrumental in simplifying application development for clusters. But as clusters change to embrace multiple cores, multiple platforms, and multiple advanced interconnects, MPI is no longer adequate. What can replace it? Donald Becker asks, “How about Unix?
If you need to monitor and manage such a configuration, try syslog-ng (syslog, next generation), a drop-in replacement for syslogd. syslog-ng provides more sophisticated log management capabilities and enables log transfers over the Internet.
With advances in many facets of networking, diskless clusters are now quite practical. Better yet, nodes without local storage are cheaper to build and cheaper to maintain. Heres a survey of the relevant technologies and techniques.
Our coverage of Linux-based cluster distributions continues this month with OSCAR, the Open Source Cluster Application Resource software bundle available free from the Open Cluster Group.
We continue our coverage of Linux-based cluster distributions by delving into Rocks, a free and customizable distribution for commodity platforms funded by the National Science Foundation and distributed by the San Diego Supercomputing Center.
The Clustermatic Linux distribution, produced by the Cluster Research Lab at Los Alamos National Laboratory, is a collection of software packages that provides an infrastructure for small- to large-scale cluster computing.
This month’s column focuses on building and using Beowulf Distributed Process Space (BProc) software used by the commercial Scyld Beowulf and the Clustermatic Linux distributions for high performance computing (HPC) clusters.
Last month's "Extreme Linux" introduced MPI-2, the latest Message Passing Interface (MPI) standard. MPI has become the preferred programming interface for data exchange -- called message passing -- for parallel, scientific programs. MPI has evolved since the MPI-1.0 standard was released in May 1994. The MPI-1.1 standard, produced in 1995, was a significant advance, and the MPI-2 standard clarifies and corrects the MPI-1.1 standard while preserving forward compatibility with MPI-1.1. A valid MPI-1.1 program is a valid MPI-2 program.
The Message Passing Interface (MPI) has become the application programming interface (API) of choice for data exchange among processes in parallel scientific programs. While Parallel Virtual Machine (PVM) is still a viable message passing system offering features not available in MPI, it's often not the first choice for developers seeking vendor-supported APIs based on open standards. Of course, standards evolve, and the MPI standard is no different.
A little over a year ago, Silicon Graphics, Inc. (SGI, http://www.sgi.com) announced a new 64-bit supercomputing platform called the Altix 3000. In a break from its tradition of building large machines with MIPS processors running the IRIX operating system, the Altix uses Intel's Itanium 2 processor and runs -- you guessed it -- Linux. Unlike Beowulf-style Linux clusters, SGI's cache-coherent, shared-memory, multi-processor system is based on NUMAflex, SGI's third-generation, non-uniform memory access (NUMA) architecture, which has proven to be a highly-scalable, global shared memory architecture based on SGI's Origin 3000 systems.
The last few "Extreme Linux" columns have focused on multiprocessing using OpenMP. While often used in scientific models for shared memory parallelism on symmetric multi-processor (SMP) machines, OpenMP can also be used in conjunction with the Message Passing Interface (MPI) to provide a second level of parallelism for improved performance on Linux clusters having SMP compute nodes. Programs that mix OpenMP and MPI are often referred to as hybrid codes.
This is the third and final column in a series on shared memory parallelization using OpenMP. Often used to improve performance of scientific models on symmetric multi-processor (SMP) machines or SMP nodes in a Linux cluster, OpenMP consists of a portable set of compiler directives, library calls, and environment variables. It's supported by a wide range of FORTRAN and C/C++ compilers for Linux and commercial supercomputers.
This month, we continue our focus on shared-memory parallelism using OpenMP. As a quick review, remember that OpenMP consists of a set of compiler directives, a handful of library calls, and a set of environment variables that can be used to specify run-time parameters. Available for both FORTRAN and C/C++ languages, OpenMP can often be used to improve performance on symmetric multi-processor (SMP) machines or SMP nodes in a cluster by simply (and carefully) adding a few compiler directives to the code. Most commercial compilers for Linux provide support for OpenMP, as do compilers for commercial supercomputers.
In this column's previous discussions of parallel programming, the focus has been on distributed memory parallelism, since most Linux clusters are best suited to this programming model. Nevertheless, today's clusters often contain two or four (or more) processors per node. While one could simply start multiple MPI processes on such nodes to use these processors, taking best advantage of the hardware requires a different approach. Processors within a node typically share all the memory within that node, and they can communicate much more quickly with each other than with processors on other nodes.
The last two Extreme Linux columns provided an introduction to the Condor workload management system, gave detailed installation and configuration instructions for Beowulf clusters, and showed the details of managing and running MPI jobs (parallel programs that use the Message Passing Interface) with Condor. This month, let's continue looking at Condor, explore some of its advanced features, and check out its powerful queuing capabilities for lots of serial tasks.
Last month's column introduced Condor and presented a sample installation of the software package in a cluster environment. Condor is a system that creates a "high-throughput computing" environment by effectively utilizing computing resources from a pool of cluster nodes and disparate workstations distributed around a network. Like many batch queuing systems, Condor provides a queuing mechanism, scheduling policy, job priority scheme, and resource classification. Unlike most other batch systems, Condor doesn't require dedicated compute servers.
A good job queuing and scheduling system is required whenever more than a couple of researchers share a Beowulf cluster. Coordinating with other users about when and where to run jobs on a shared cluster isn't impossible, but cluster administrators quickly realize the importance of having a robust batch system once users begin competing for resources.
Every cluster builder wants to know how fast his or her computer is. After all, speed is the primary reason to build a Linux cluster -- aside from the gains in data capacity and resource redundancy. But how do you measure speed? CPU clock speed is one thing, but using those cycles to actually get work done is quite another. And in a cluster, the network connections between nodes can quickly become a crippling bottleneck.
Monitoring the status of a Beowulf-style cluster can be a daunting task for any system administrator, especially if the cluster consists of more than a dozen nodes. While Linux is extremely stable, hardware problems can cause nodes to crash or become inaccessible, and chasing down problem nodes in a 500-node cluster is painful. Luckily, some sort of statistical resource monitoring can often yield early warnings of impending hardware failures.