The Global Arrays Toolkit
This month’s column introduces the Global Arrays Toolkit (GA, http://www.emsl.pnl.gov/docs/global/), a suite of application programming interfaces (APIs) for handling distributed data structures.
Saturday, July 15th, 2006
Continuing with the theme of simplifying parallel programming, this month’s column introduces the Global Arrays Toolkit (GA, http://www.emsl.pnl.gov/docs/global/), a suite of application programming interfaces (APIs) for handling distributed data structures.
The previous three columns discussed Unified Parallel C (UPC), an extension of C99 that supports explicit parallel execution and a shared address space. UPC provides automatic memory management across distributed (and shared) memory hardware without requiring explicit message passing.
Like UPC, GA provides a mechanism for shared-memory-style programming in a distributed-memory computing environment. Unlike UPC, GA works in conjunction with traditional message-passing APIs (in particular, the Message Passing Interface, or MPI) to offer both the shared-memory and message-passing paradigms in the same program. In both UPC and GA, data distribution information (that is, the affinity of data to processes) is available to the application, so that data locality can be exploited to maximize performance.
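To make the idea of data affinity concrete, here is a minimal sketch in plain C of how a one-dimensional array might be block-distributed over a set of processes, and how an application could ask which process owns a given element. This is not GA's API: the names `block_range`, `block_owned_range()`, `block_owner()`, and `block_is_local()` are hypothetical, and the even block layout is an assumption made for illustration. GA's actual distribution may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [lo, hi) of a 1-D global array held by one process.
 * Hypothetical type for illustration; not part of GA. */
typedef struct {
    size_t lo;
    size_t hi;
} block_range;

/* Block-distribute n elements over nprocs processes. The first (n % nprocs)
 * processes each hold one extra element, so block sizes differ by at most 1. */
block_range block_owned_range(size_t n, int nprocs, int rank)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    size_t p = (size_t)nprocs;
    size_t r = (size_t)rank;
    size_t base = n / p;    /* minimum block size */
    size_t extra = n % p;   /* number of processes holding base + 1 elements */
    block_range out;
    out.lo = r * base + (r < extra ? r : extra);
    out.hi = out.lo + base + (r < extra ? 1 : 0);
    return out;
}

/* Return the rank of the process that owns global index i. */
int block_owner(size_t n, int nprocs, size_t i)
{
    assert(nprocs > 0 && i < n);
    size_t p = (size_t)nprocs;
    size_t base = n / p;
    size_t extra = n % p;
    size_t split = extra * (base + 1); /* first index held by a smaller block */
    if (i < split)
        return (int)(i / (base + 1));
    /* base > 0 here: if base were 0, split would equal n and i < split. */
    return (int)(extra + (i - split) / base);
}

/* Nonzero if global index i has affinity to the calling process's rank. */
int block_is_local(size_t n, int nprocs, int rank, size_t i)
{
    return block_owner(n, nprocs, i) == rank;
}
```

With 10 elements on 4 processes, this layout gives rank 0 the indices 0 through 2 and rank 3 the indices 8 and 9. A program that knows these ranges can arrange for each process to work mostly on the elements it already holds, which is the kind of locality the paragraph above describes.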
The GA toolkit was developed at the U.S. Department of Energy’s Pacific Northwest National Laboratory (PNNL), where it is still undergoing significant development and enhancement. Version 4.0 was released in April 2006. GA has been in the public domain since 1994 and is used in a number of high-performance computing (HPC) simulation codes. In particular, GA is used extensively by PNNL’s NWChem computational quantum chemistry package.