
Smashing (and) the HPL Benchmark

As SC10 fast approaches, you may have little incentive to read this column, which is why I include my latest personal drama.

Once a year I write a column the week before the big HPC show. This year SC10 (a.k.a. the show) will be in New Orleans. I assume very few people will actually read this installment because they are either in the Big Easy, on their way, or running around frantically trying to get ready for their week on Bourbon Street. I fall into the last category. I have been preparing my Limulus Machine, writing a white paper, helping a few clients, and smashing my thumb with a step ladder.

That last item may take some explaining. Earlier this week we had a wind storm here in the Northeast. My daughter came home from school and said, “Dad, you may want to come out and look at this.” It turns out a bunch of siding blew off the house. I grabbed a couple of ladders and went to investigate. The end of the story is that while on my garage roof, I managed to collapse a heavy step ladder with my thumb in the middle. Ouch. So in addition to my large pile of things to get done this week, I now have a smashed, swollen, and throbbing thumb that makes everything I do a bit more interesting. I will persevere, however; SC is upon us.

Moving on to things HPC: Starting this week, there will be a flood of announcements from virtually all the vendors. I mean, what is a trade show without a press release, right? My challenge has always been to look for the interesting stuff that is off the radar and not just an extension of the current trends. Of course there will be the Top500 list with the anticipated win by Tianhe-1. I included this particular link because it had a picture that has nothing to do with the fastest machine in the world. The picture was from the flash mob cluster experiment held at the University of San Francisco in 2004. Conceived at the beginning of the “dumb things you can do with the Internet” era, it was a classic example of a really trendy idea that has no practical use, if it worked at all.

The goal was simple, like a flash mob: “Let’s have everyone bring their laptop to the gym and make a big cluster. If we get enough laptops maybe we can get on the Top500! What can go wrong?” Plenty. There is the issue of the network infrastructure. It is a safe assumption that almost everyone has a laptop they can donate to the cause, but who has an extra 256-port Ethernet switch and several miles of cable lying around? All my personal 256-port switches and spools of cable were busy running my home network. So the flash (i.e. instantaneous) interconnect aspects of the cluster were set up days in advance with support from vendors. Once all the network infrastructure was prepared, the spontaneous mob could bring their laptops to the gym and become famous. Wait, don’t laptops have different kinds of processors with hibernation modes? And how reliable is a random laptop at running full out for a few hours? Good questions.

The thing about the HPL benchmark is it assumes all the computing elements provide the same performance and will run for the entire benchmark. If one processor (notebook) is running slow, then the rest must wait. If one notebook dies (hibernates) then the benchmark dies. These are hard and obvious lessons to be learned with piles of other people’s hardware on a Saturday afternoon. Having witnessed the whole event, I can report that the only HPL result they managed to get, an unspectacular one, came from the large number of similar desktop machines donated by a local hardware vendor. So much for the big idea. In a way, it was a great demonstration of why an HPC cluster is much more than a pile of hardware. Furthermore, it was a testament as to why extending obvious trends is not always successful. As an aside, I often wondered why such an experiment was even necessary. In 2004 there were already “Internet flash mobs” developing that spawned things like Folding@home, which tackles a practical problem that, unlike the synthetic HPL benchmark, can be broken into small parts and run asynchronously.
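To put rough numbers on the "everyone waits for the slowest notebook" problem, here is a minimal back-of-the-envelope sketch in Python. It is not how HPL itself works internally. It is a toy model that assumes the work is split evenly, ignores communication entirely, and uses made-up per-node rates. The function names `lockstep_rate` and `independent_rate` are hypothetical, chosen only for this illustration.

```python
def lockstep_rate(node_gflops):
    """Toy model of a synchronous, evenly split job (HPL-like).

    Every node gets the same share of work and all must finish together,
    so each node is effectively throttled to the slowest one. A node
    that stops (rate 0, e.g. hibernation) stalls the whole run.
    """
    if not node_gflops or min(node_gflops) <= 0:
        return 0.0
    return len(node_gflops) * min(node_gflops)


def independent_rate(node_gflops):
    """Toy model of asynchronous work units (Folding@home-like).

    Each node contributes whatever it can; dead nodes add nothing
    but do not hold anyone else back.
    """
    return sum(rate for rate in node_gflops if rate > 0)


# Hypothetical per-laptop rates in GFLOPS (illustrative values only).
mob = [4.0, 4.0, 4.0, 1.0]
print(lockstep_rate(mob))     # 4 nodes * 1.0 = 4.0
print(independent_rate(mob))  # 4 + 4 + 4 + 1 = 13.0

# One laptop hibernates mid-run.
mob_with_sleeper = [4.0, 4.0, 4.0, 0.0]
print(lockstep_rate(mob_with_sleeper))     # 0.0
print(independent_rate(mob_with_sleeper))  # 12.0
```

Under these assumptions, one slow laptop drags four machines down to the throughput of roughly one fast one, and one sleeping laptop takes the result to zero, while the asynchronous model barely notices either.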

I should add that I like unconventional ideas. In the case of the flash cluster, I thought a little small-scale batting practice would have helped before they decided to swing (and miss) at the first pitch while the world looked on. It does make for a great photograph in any case.

Flash HPC experiments notwithstanding, cluster HPC is all about disruptive technology, which is why the SC10 Keynote Address by Clayton M. Christensen (author of The Innovator’s Dilemma) should be very interesting.

Last, but not least, a final mention of the SC10 Beowulf Bash on Monday, November 15th. The free collector’s edition invitation is also available. Hope to see you there. Directions and times are on the invitation. Spread the word, we are on the boat. Another shout-out to our sponsors who made it all happen: Penguin Computing, AMD, Adaptive Computing, Aeon Computing, ClusterMonkey.net, Kove (previously Econnectix), insideHPC, Intersect360 Research, Numascale, QLogic, SICORP, SuperMicro, Terascala, Versant, and Xand Marketing.

I’ll close with my new “rule of thumb,” ice and Tylenol. See you next week!
