Supercomputing extremists converged on Seattle, Washington, last fall to share their experiences, exhibit their research, flaunt their wares, and award their pioneers. Supercomputing 2005 (SC05), co-sponsored by the Association for Computing Machinery (ACM) and the IEEE Computer Society, was held November 12-18 at the Washington State Convention and Trade Center. On the home turf of both Cray and Microsoft, SC05’s “Gateway to Discovery” theme focused on the potential and future of supercomputing.
SC05’s conference program again included wide-ranging technical paper presentations, tutorials, “Birds-of-a-Feather” (BOF) sessions, panel discussions, a poster reception, and awards presentations, in addition to an ever-growing exhibition hall, which featured burgeoning booths from research laboratories, universities, software companies, hardware manufacturers, and processor and component fabricators. [For more coverage of SC05, see “Cluster Rant” beginning on page XX.]
Papers, tutorials, and BOFs concentrated on everything from low-level algorithms to application performance on the world’s fastest supercomputers. Meanwhile, on the exhibition floor, attendees took in detailed information about research results and the latest hardware and software, in between refreshments and iPod and Xbox 360 drawings.
The Oak Ridge National Laboratory (ORNL) booth at Supercomputing 2005 was just one of the many exhibits presented by research labs, universities, and hardware and software vendors. (Photo courtesy of Oak Ridge National Laboratory)
Microsoft and High Performance Computing?
Despite the meeting’s proximity to Redmond, it seemed unusual to receive a keynote speech from Bill Gates, the Chairman and Chief Software Architect of Microsoft. However, given Microsoft’s deployed numbers (a reported 800 million “nodes”) and their continuing investment in software research, it felt right to give Gates equal time. And he got a couple of free shirts for his trouble.
Chairman Gates gave one of those visionary speeches that is sweeping in nature and relies heavily on the work of others. He described three classes of computing — business computing, consumer computing, and technical computing — and drew an analogy between the growth of parallel computing and the move from a few expensive machines in business computing to lots of very inexpensive machines “working together.”
Gates acknowledged the requirement for (and challenges of) parallelism in software as the clock speeds of microprocessors increase more slowly than they have in the past, and he rightly recognized that many in the audience had been working on these techniques for a long time. He espoused a vision of scientific workflow in which the processes of data acquisition, modeling, and sorting through previous work are automated, so as to advance the state of the art in a science area.
His example came from a project called NEPTUNE, which would require lots of low-cost sensors from which large amounts of data (stored in XML) would be generated. These data would be analyzed by a variety of networked compute clusters and compared with similar results available via the Web. Exactly how it might all work wasn’t clear, but Gates suggested that strong collaboration with researchers was needed.
Gates mentioned a new multithreaded server version of Excel and a product called OneNote for laboratory notebooks before publicly introducing Microsoft’s commercial foray into Beowulf-style parallel computing. Called Windows Compute Cluster Server 2003, the new product is designed, according to Microsoft’s website, to “bring the supercomputing power of high-performance computing to the personal and workgroup level.”
Likely because of past experience, Gates left the live demonstration of this beta product to a presumably expendable underling (like one of those new, unnamed, and doomed ensigns who appear at the beginning of a Star Trek episode). This risky task was performed by Kyril Faenov, Microsoft’s Director of High Performance Computing. The demo used MATLAB with the Distributed Computing Toolbox (from The MathWorks) to run a genetic algorithm that looks for genetic markers in cancer and control patients. The job was scheduled on an on-site Linux cluster and run on a remote Windows Compute Cluster at an Intel remote access location. Fortunately, it all appeared to work, albeit a little slowly.
Gates also announced an investment in ten Institutes for High-Performance Computing. This multiyear, multimillion-dollar investment in joint research projects at the chosen, subsidized institutes is designed to help guide ongoing software research and product innovation at Microsoft to address challenging technical computing problems. The institutes chosen to participate are Cornell University (U.S.); Nizhni Novgorod State University (Russia); Shanghai Jiao Tong University (China); Tokyo Institute of Technology (Japan); University of Southampton (England); University of Stuttgart (Germany); University of Tennessee (U.S.); University of Texas at Austin (U.S.); University of Utah (U.S.); and University of Virginia (U.S.).
Only days after the conference, it was announced that Burton Smith had resigned from Cray to accept a position at Microsoft. Smith co-founded Tera Computer in 1987, and was working to deploy its streaming technologies into future Cray systems. What this means for future Microsoft products (or even Cray products) is not yet clear.
Throughout his speech at SC05, Chairman Gates reiterated Microsoft’s commitment to working with the high-performance computing (HPC) community to enhance scientific discovery. Only time will tell how the company’s efforts evolve and what future products may result. For now, the Beta 2 version of Windows Compute Cluster Server 2003 is available for download at http://www.microsoft.com/windowsserver2003/ccs/default.mspx
While most scientific organizations are likely to stay with Linux for their commodity clusters, Windows Compute Cluster Server may be what is needed to deliver the benefits of Beowulf-style computing to IT organizations or those who cannot seem to embrace Open Source software. But we know the origins of this technology.
The Top 500 List
The 26th edition of the “Top 500” list of the world’s fastest supercomputers was released during SC05. Four of the top ten systems from the June 2005 list were displaced by newly-installed systems, and the bottom 221 systems from the June 2005 list are now too small to be included.
Topping the list is the IBM BlueGene/L System installed at the United States Department of Energy’s Lawrence Livermore National Laboratory (LLNL) in Livermore, California. While it has occupied the top spot in the last two “Top 500” lists, the system has doubled in size over the last six months, reaching a new record Linpack benchmark performance of 280.6 teraflops (trillions of floating-point calculations per second). This BlueGene/L contains a total of 131,072 processors.
The second-place winner is a similar but smaller BlueGene system, containing (a mere) 40,960 processors and installed at IBM’s Thomas Watson Research Center. It achieved a benchmark performance of 91.3 teraflops. Coming in at number three is the ASCI Purple System, also installed at LLNL, consisting of 10,240 IBM Power5 processors, with a Linpack performance of 63.4 teraflops.
The Columbia system at NASA’s Ames Research Center slipped to number four. The large SGI Altix system contains 10,160 Intel Itanium 2 processors, giving the system a benchmark performance of almost 51.9 teraflops.
Fifth and sixth positions are held by two very different systems at the U.S. Department of Energy’s Sandia National Laboratories. Coming in at number five is a new Dell PowerEdge-based system with 8,000 3.6 GHz Intel EM64T Xeon processors and an InfiniBand interconnect. With a benchmark performance of 38.27 teraflops, this system edged out the newly-expanded “Red Storm” Cray XT3 (also at Sandia), which contains 10,880 2.0 GHz Opteron processors and boasts a Linpack performance of 36.19 teraflops. The Red Storm system was described in the recent HPC issue of Linux Magazine in November 2005 (see http://www.linux-mag.com/2005-11/cray.html).
The Earth Simulator, built by NEC in Japan, has now slipped to number seven. It previously held the number one slot for five lists. Containing 5,120 custom vector processors, the system achieved almost 35.9 teraflops.
Number eight is a system called “MareNostrum” located at the Barcelona Supercomputing Center in Spain. MareNostrum is an IBM BladeCenter JS20 cluster containing 4,800 2.2 GHz IBM PowerPC 970FX processors and a Myrinet interconnect. The system demonstrated a performance of 27.9 teraflops on the Linpack benchmark.
Number nine is a 12,288-processor IBM BlueGene/L called “Stella” at the University of Groningen in the Netherlands. This system achieved a performance of 27.45 teraflops. And the number ten system is the Cray XT3 called “Jaguar” at Oak Ridge National Laboratory. Containing 5,200 2.4 GHz Opteron processors, the system reached over 20.5 teraflops on the Linpack benchmark. This system sets a new entry point for the top ten, up from just under 10 teraflops just one year ago.
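For readers who want to compare these machines on something other than raw size, the short Python sketch below divides the quoted Linpack results by the quoted processor counts for a handful of the systems above. It is only a back-of-the-envelope illustration: the names `SYSTEMS`, `gflops_per_processor`, and `ranked_by_per_processor` are made up for this example, the figures are the rounded ones given in the text, and Linpack throughput per processor says nothing about cost, power, or real application performance.

```python
# Processor counts and Linpack results (in teraflops) as quoted in the text.
# Several figures are rounded in the article ("almost", "over"), so the
# per-processor numbers computed here are approximate.
SYSTEMS = {
    "BlueGene/L (LLNL)": (131_072, 280.6),
    "ASCI Purple (LLNL)": (10_240, 63.4),
    "Columbia (NASA Ames)": (10_160, 51.9),
    "Earth Simulator (NEC)": (5_120, 35.9),
    "Jaguar (ORNL)": (5_200, 20.5),
}


def gflops_per_processor(processors, teraflops):
    """Return Linpack gigaflops per processor (1 teraflop = 1,000 gigaflops)."""
    return teraflops * 1000.0 / processors


def ranked_by_per_processor(systems):
    """Return (name, gigaflops per processor) pairs, highest first."""
    rows = [
        (name, gflops_per_processor(procs, tflops))
        for name, (procs, tflops) in systems.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, value in ranked_by_per_processor(SYSTEMS):
        print(f"{name:<24} {value:5.2f} gigaflops per processor")
```

On these rounded figures, the vector-based Earth Simulator comes out ahead per processor and BlueGene/L comes out last, which illustrates how BlueGene/L reaches the top of the list through sheer processor count rather than fast individual processors.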
Not surprisingly, the number of clusters has risen from 58 percent of the systems on the list in November 2004 to 72 percent of the systems, and over 48 percent of the total performance, on the November 2005 list. Cluster architecture is clearly winning out over other supercomputer designs and is a strong contender for performance in many application areas.
Even more astounding is the number of supercomputers running some form of the Linux operating system. Eight of the systems in the TOP10 are running Linux; “ASCI Purple” runs IBM AIX, and the Earth Simulator uses Super-UX, both variants of Unix. In the TOP500, Linux is used on 74.4% of the systems and represents 51.8% of the total performance of the list. This is a testament to the flexibility and extensibility of the Linux operating system.
The new Top 500 list, as well as the previous twenty-five lists from past years, can be found on the Web at http://www.top500.org/
And the Winners Are…
Two of the most significant awards presented at SC05 were the Sidney Fernbach Award and the Seymour Cray Science & Engineering Award. This year’s winners were John Bell, a senior mathematician at the U.S. Department of Energy’s Lawrence Berkeley National Laboratory, and Steven Scott, the chief architect of the Cray X1 supercomputer, respectively.
Bell was named for his outstanding contributions to the development of numerical algorithms and mathematical and computational tools, and the application of those methods to conducting leading-edge scientific investigations in combustion, fluid dynamics, and condensed matter.
Scott was named for developing a highly-scalable, distributed shared memory multiprocessor, employing custom vector processors. The Cray X1 system is based on a shared memory multi-processor capable of scaling to thousands of processors and employing custom vector units and efficient synchronization between scalar/vector pipelines and single-stream processors (SSPs) within multi-stream processors (MSPs).
See You Next Year!
If you’ve never been to a Supercomputing Conference, you should consider attending one in the future. It is a great place to learn about the latest supercomputing tools and tricks, to exchange ideas, and to share in the excitement of supercomputing. Watch as Linux clusters continue to offer the best price-to-performance ratio for many computational science applications.
Forrest Hoffman is a computer modeling and simulation researcher at Oak Ridge National Laboratory. He can be reached at