Global Arrays Toolkit, Part Two
Tackle a more complex and realistic Global Arrays Toolkit program, one that performs matrix-matrix multiplication.
Tuesday, August 15th, 2006
This is the second column in the series about the Global Arrays (GA) Toolkit. The GA Toolkit, introduced in last month’s column, is an application programming interface (API) for handling shared data structures in a distributed computing environment like a Linux cluster. GA essentially provides one-sided communications for array data without requiring you to write explicit message passing code.
Like Unified Parallel C (UPC), described here in earlier columns, GA provides data distribution information to the application so that data locality can be exploited to maximize performance. While UPC offers a more familiar programming style (being implicitly parallel), the GA Toolkit works in conjunction with a traditional message passing API, such as the Message Passing Interface (MPI), to provide both shared-memory and message-passing paradigms in the same program.
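To make "data distribution information" concrete, here is a minimal sketch in plain C of the kind of question such information answers: which slice of a one-dimensional array does a given process hold, and which process holds a given element? The helpers `block_range()` and `owner_of()` are hypothetical and are not part of the GA API. They assume a simple contiguous block layout, which is not necessarily the layout GA chooses. In a real program, you would ask GA for the actual distribution through its own query routines.

```c
#include <assert.h>

/* Hypothetical helper, not part of the GA API.
 * Computes the inclusive index range [*lo, *hi] held by process `rank`
 * when `n` elements are split into contiguous blocks over `nprocs`
 * processes. The first (n % nprocs) processes each get one extra
 * element. A process that holds nothing ends up with *hi < *lo. */
void block_range(int n, int nprocs, int rank, int *lo, int *hi)
{
    assert(n >= 0 && nprocs > 0 && rank >= 0 && rank < nprocs);

    int base  = n / nprocs;   /* minimum block size */
    int extra = n % nprocs;   /* number of blocks that get one more */
    int count = base + (rank < extra ? 1 : 0);

    *lo = rank * base + (rank < extra ? rank : extra);
    *hi = *lo + count - 1;
}

/* Hypothetical helper, not part of the GA API.
 * Returns the rank that holds global index `i` under the same layout. */
int owner_of(int n, int nprocs, int i)
{
    assert(nprocs > 0 && i >= 0 && i < n);

    int base     = n / nprocs;
    int extra    = n % nprocs;
    int boundary = extra * (base + 1);  /* first index in a smaller block */

    if (i < boundary)
        return i / (base + 1);
    return extra + (i - boundary) / base;
}
```

A process that knows its own range can loop over just those indices and touch only local memory, and `owner_of()` tells it when an element lives elsewhere and would require a remote access. That is the locality information an application exploits for performance.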
The GA Toolkit, developed at the U.S. Department of Energy’s Pacific Northwest National Laboratory (PNNL), uses the Aggregate Remote Memory Copy Interface (ARMCI) library, which provides general-purpose, efficient, and portable remote memory access (RMA) operations through one-sided communications. ARMCI utilizes network interfaces on clusters and supercomputers, including low-latency, high-bandwidth interfaces. GA also works in conjunction with Memory Allocator (MA), a collection of library routines that perform dynamic memory allocation for C, Fortran, and mixed-language applications. GA uses MA to provide all of its dynamically allocated local memory.