Long Live the Top 500

Those who say that the “Top 500” list (http://www.top500.org/) is nothing more than “a high-tech pissing contest” are wrong. The naysayers are skeptics in a skeptical age who believe that the Top 500 is only a marketing tool, a list to maintain bragging rights, and a measurement of mano-a-mano raw processing power without a foundation in practical computing. In fact, as the history of the Top 500 has shown, “The List” has an unfailing ability to predict the future of and inexorably impact plebeian, everyday computing.

The History of “The List”

The Top 500 list tracks the 500 most powerful computers in the world, as reported to the Top 500 project. The List, which debuted in 1993, was originally developed to help scientists involved in high-performance computing (HPC) understand long-term trends in computing. Since 1993, The List has been published twice a year, usually to great fanfare, at one of the worldwide supercomputing conferences.
Of course, there’s a considerable amount of pride in making The List, in moving up in rank, and in capturing the coveted number one spot. Clearly, vendors love The List, because it validates products and strengthens claims. What vendor wouldn’t thrill at manufacturing the world’s fastest (or most rapidly evolving) computer?
But how do you measure fastest? From the beginning, it was clear that an impartial method of measuring performance was needed. The LINPACK benchmark (http://www.netlib.org/linpack/), first developed in 1979, was chosen due to its wide availability and suitability for the purpose. LINPACK measures how quickly a computer can solve a dense system of linear equations using the floating point compute power of the system.
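You can get a feel for what LINPACK actually measures with a few lines of NumPy. The sketch below is emphatically not the real benchmark code, which is heavily tuned and runs across many nodes; it simply times a dense solve on a single machine and converts the standard operation count into gigaflops.

    # A toy, single-node illustration of the LINPACK idea: time the solution of
    # a dense N x N linear system and turn the operation count into gigaflops.
    # This is NOT the real benchmark, just a sketch of the measurement.
    import time
    import numpy as np

    N = 2000                                 # problem size; real runs use far larger N
    A = np.random.rand(N, N)                 # dense coefficient matrix
    b = np.random.rand(N)                    # right-hand side

    start = time.perf_counter()
    x = np.linalg.solve(A, b)                # LU factorization plus triangular solves
    elapsed = time.perf_counter() - start

    flops = (2.0 / 3.0) * N**3 + 2.0 * N**2  # standard operation count for a dense solve
    print(f"{flops / elapsed / 1e9:.2f} gigaflops on a {N} x {N} system")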
The List is built around two numbers: the High Performance LINPACK (HPL) benchmark result (see the sidebar “The LINPACK Benchmark” for more information) and the theoretical peak. While HPL is a true benchmark, the theoretical peak (or Rpeak) is not a number achieved by running tests on an actual computer. Instead, Rpeak is a pencil-and-paper “speed of light” figure arrived at by multiplying the manufacturer’s rated peak floating point rate by the number of nodes; no other element of the system is considered.
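To see just how rough Rpeak is, here is the pencil-and-paper arithmetic in code form. Every value below is hypothetical, invented only to show the shape of the calculation; none of it describes any system on The List.

    # "Speed of light" Rpeak arithmetic; nothing is benchmarked here.
    # All of the values are made up for illustration.
    clock_ghz = 3.0          # processor clock rate, in GHz
    flops_per_cycle = 2      # peak floating point operations per clock cycle
    cpus_per_node = 2        # processors per node
    nodes = 512              # nodes in the machine

    rpeak_gflops = clock_ghz * flops_per_cycle * cpus_per_node * nodes
    print(f"Rpeak = {rpeak_gflops:,.0f} gigaflops")          # 6,144 for these numbers

    # A measured HPL result (Rmax) always lands below Rpeak; the ratio of the
    # two is often quoted as the machine's "efficiency."
    rmax_gflops = 4200.0     # hypothetical measured HPL result
    print(f"Efficiency = {rmax_gflops / rpeak_gflops:.0%}")  # about 68%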


But not everyone is happy using the HPL and Rpeak numbers as the final arbiters of “fastest computer.” Opponents argue that while HPL and Rpeak are interesting numbers, neither truly reflects the performance of a production system. Moreover, critics contend that so many elements affect attainable system performance (think network speed, memory, algorithm, compilers, profiling, and tuning) that measuring floating point performance alone is kind of like the Yankees buying a pitcher with a 100 MPH fastball who can’t find home plate: raw speed by itself is irrelevant. Indeed, some pundits go so far as to say that judging an HPC system’s capabilities by a single number is simply disingenuous.
There are several efforts underway to come up with more meaningful, contextual measures of HPC performance. Perhaps the most interesting of these is the High Performance Computing Challenge (HPCC) suite. HPCC, led by the original creator of LINPACK, Jack Dongarra, is charged with developing a wide range of tests that measure a system’s performance under a series of different conditions. As Dongarra says:
For a long time it’s been clear to all of us [that] we need to have more than just LINPACK. No single number can reflect the overall performance of a machine.
The biggest obstacles to widespread adoption of these relatively new performance measurement suites are inertia (everyone already knows and understands the current Top 500 procedure) and ubiquity, or getting the measurement software running on every system so that valid side-by-side comparisons can be made.

The Mysteries of The List, Revealed

Whether the measurement is LINPACK or a successor suite such as HPCC, great truths have been revealed in The List over the twelve years since its inception. While valid arguments can be made about the position of any single system on the list in any single year, the original intention of The List, to reveal significant HPC trends over time, is clearly satisfied by comparing the composition of the list over a number of years.
1. It’s not your father’s Intel chip anymore…
Some of us are old enough to remember early computing prior to the x86 family of processors. Back in 1978, Intel introduced the 8086 processor. The original 8086 was a 16-bit microprocessor with four 16-bit general purpose registers and four 16-bit index registers. It had no floating point registers, and the 8088 variant at the heart of the original IBM PC ran at a scintillating clock speed of 4.77 MHz. Back in 1978, who could have guessed that the “lowly” x86 family would come to dominate the list of the fastest supercomputers in the world?
Two powerful forces combined to catapult the x86 into the forefront of supercomputing: Moore’s Law and commoditization. Moore’s Law, based on an observation made by Gordon Moore (cofounder of Intel) in 1965, claims that transistors become twice as dense every year, a trend that, in 1965, seemed to be guaranteed far into the future. While Moore himself has conceded that the doubling now takes more on the order of 18 months, the premise remains that density is doubling at an alarming rate, and with it, computer speeds.
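To put a number on “alarming,” here is what each doubling period implies after a decade. This is nothing but compounding arithmetic; no data from The List is involved.

    # Pure compounding: what each doubling period implies after ten years.
    years = 10
    for period in (1.0, 1.5):    # doubling every year vs. every 18 months
        factor = 2 ** (years / period)
        print(f"doubling every {period:g} years -> {factor:,.0f}x after {years} years")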
While Moore’s Law affected the speed of the processor, commoditization made it cheap and reliable. Recall that the original PC introduced by IBM in 1981 used the 8086 family of processors, and soon, x86s were everywhere. As Greg Pfister points out in his book In Search of Clusters, it was the ability to string together a bunch of cheap, reliable, and increasingly fast commodity processors that created a bourgeois yet novel computing model that could directly compete with the most exotic, most expensive architectures.
If you examine the growth of Intel processors on the Top 500 over the past five years, from June 2000 to June 2005 (November 2005 was not yet available as of this writing), you’ll see an increase from four systems in 2000 to 333 systems in 2005, a compound annual growth rate (CAGR) of 142 percent over the five-year period. (See Figure One.) Further, the once lowly Intel processor is now the dominant processor family on the list, comprising two-thirds of the Top 500. The Intel family now includes the direct x86 descendants, like the Pentium 4 Xeon and its 64-bit version, the EM64T, which still supports the 32-bit instruction set. The Intel processor family also includes the IA-64 (Itanium), which is a completely different 64-bit animal. Even if you subtract the 79 Itanium 2 systems from the 333 “traditional” Intel systems, you still arrive at 254 x86-vintage systems on the Top 500 List, or just over half. In just five short years, the trend is apparent: Intel has expanded from the commodity PC market to the highest levels of computing.
FIGURE ONE: The constituents of the Top 500 list, by processor
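That 142 percent figure is easy to verify from the two endpoint counts. The helper below is nothing more than the generic compound-annual-growth-rate formula (it is not part of any Top 500 tooling), applied to the Intel system counts cited above.

    # Compound annual growth rate: the steady yearly growth that carries a
    # starting value to an ending value over a given number of years.
    def cagr(start: float, end: float, years: float) -> float:
        return (end / start) ** (1.0 / years) - 1.0

    # Intel-based systems on the Top 500: 4 in June 2000, 333 in June 2005.
    print(f"{cagr(4, 333, 5):.0%}")   # prints 142%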

2. The rise of clusters
Now, what good is a ubiquitous processor without a way to harness the power? Enter clustering, today’s preferred model for capitalizing on the power of commodity chipsets.
The first commodity cluster was created in 1994 by Donald Becker and Thomas Sterling. Their “Beowulf” consisted of sixteen DX4 processors and channel-bonded Ethernet. Today, clusters pervade many segments of computing, from web serving to high availability to database serving to the original job of crunching numbers.
Indeed, clusters have become the most popular architecture type on the Top 500, encompassing both commodity and non-commodity processor types. (See Figure Two.) Looking at the same five-year period, June 2000 to June 2005, clusters went from just eleven members of the Top 500 to 304 today. Just as telling, SMPs went from 139 to 0 over the same period; MPPs declined from 257 to 117, and constellations held nearly steady, dropping slightly from 93 to 79.
FIGURE TWO: The constituents of the Top 500 list, by architecture

While the Top 500 doesn’t maintain statistics for operating systems, it’s no coincidence that the rise of commodity hardware clusters parallels the explosion in popularity of open source over the past five years. Many universities and national labs, long the bleeding edge of HPC, have adopted an “open from the ground up” philosophy to avoid vendor lock-in. This trend, while not officially measured, is clearly reflected in the makeup of the Top 500.

Other Trends

Besides the rise of Intel and commodity processing and the rise of the cluster model, there are many other great truths lurking in the Top 500 List.
* Consider the emergence of IBM’s BlueGene/L, nonexistent in June 2003 but now holding the top two spots and a total of sixteen entries in the Top 500 just two years later. Does this shift portend a trend in the Top 500 or just a significant niche?
* The number of scalar machines went from 131 at the list’s inception to 482 today. During the same period, vector machines fell from a lofty 334 to just 18.
* If we look at raw processing power based on the best LINPACK numbers achieved over the twelve years since the Top 500 began, the top HPL result has risen from 59.7 gigaflops (billions of floating point operations per second) in 1993 to 136,800 gigaflops today. This growth represents a lofty 90 percent CAGR, where a compound annual growth rate of 100 percent represents a doubling of speed every year. (A quick spot-check of the arithmetic follows this list.)
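Here is that spot-check, using nothing but the two endpoint numbers and the twelve-year span:

    # Spot-check the growth of the best HPL result on The List:
    # 59.7 gigaflops in 1993 versus 136,800 gigaflops twelve years later.
    start_gflops, end_gflops, years = 59.7, 136_800.0, 12
    rate = (end_gflops / start_gflops) ** (1 / years) - 1
    print(f"CAGR of the top HPL result: {rate:.0%}")   # about 91%, in line with 90 percent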
Similar nuggets of truth about computing trends emerge from careful analysis of the Top 500.

The Future

So what can the Top 500 tell us about the future of HPC?
As we enter the era of massively parallel computing, the number of lower-speed, lower-power processors in a single system will increase by an order of magnitude or more, as in IBM’s BlueGene/L.
Massive numbers of nodes demonstrate the flexibility and capability of both the open source and clustering models to harness previously unmanageable numbers of processors.
Interestingly enough, this new model of massive numbers of nodes (MNON) tends to hold steady to the expected CAGR of overall system performance. The net effect is that the MNON model may simply be the latest vehicle used to fulfill the overall annual doubling of system speed.
Whether the MNON model becomes the vehicle of the future remains to be seen. However, open source, clustering, and an annual doubling of system speed are constants for the foreseeable future.

Final Thoughts

No Top 500? Ludicrous! Thankfully, The List is alive and well and living up to its original intention of showing long-term computing trends.
Through simple trend analysis, one can observe the rise of the once lowly, commodity Intel processor to the position of dominant processor in the Top 500. The cluster model, the leading architecture type in the Top 500, was designed to harness the power of commodity processors, and now touches virtually every segment of computing. SMPs are out of vogue for HPC, and MPPs are in decline.
May The List continue to thrive and yield its truths 1,000 computer generations and 10,000 processors from now.

Richard Ferri is a senior programmer in IBM’s Linux Technology Center, where he helps customers convert to Linux. He has worked on proprietary and open source clustering solutions since 1994, and was one of the original contributors to the popular open source clustering solution, OSCAR. He lives in upstate New York with his wife Pat, his three sons, and three dogs who think they’re people.
