Cluster Geeks, Unite!

Systems with multiple cores are just like clusters. But there’s good news, too…
Welcome to “Cluster Rant,” your monthly dose of all things cluster. Many of you may recognize me from ClusterWorld Magazine. For those who don’t, ClusterWorld was devoted to high-performance computing (HPC) clusters. ClusterWorld provided guidance and hands-on instruction on many exciting topics, some of which have been mentioned in Forrest Hoffman’s “Extreme Linux” column in Linux Magazine.
Before diving into the rest of the column, I should address two questions that must be on readers’ minds: “What happened to ClusterWorld?” and “What will be the focus of this column?”
ClusterWorld has been merged into Linux Magazine because it made economic sense. ClusterWorld had a great run in its nineteen issues, but market realities made it difficult to support a standalone magazine. And, since Linux is an integral part of clusters, it made sense to merge the two into a single magazine. Of course, you can use other operating systems for clusters, but in reality, the three people who do so will have to fend for themselves. Make no mistake, Linux clustering is alive and well.
Which, by the way, is a great segue to the second question, “What will be the focus of this column?”
Of course, the column will be about HPC Linux clusters and all the wonderful things you can do with them. Furthermore, as the name of the column implies, I’ll rant a bit about different aspects of HPC clusters with the hope that after several good browbeatings, you may get just as excited as I am about the possibilities of using more than one CPU for your computing needs. Sure, many of you don’t yet have the need to determine the shape that your newly-discovered genome protein took just a few seconds after the Big Bang, but “clustering methods” are the way things will be done in the future. Period.

Clustering Your Way to Greatness

Before I try to convince you that clustering is the wave of the future, a little history is in order.
The cluster approach to HPC emerged largely due to the economics of building CPUs. Designing and fabricating a CPU is quite expensive, which means you need to sell a pile of them to stay in business. Supercomputers are unfortunately not sold in piles, so the cost to build specialized CPUs has become prohibitive. On the other hand, the components used in clusters are sold in 7- and 8-digit quantities.
The first Beowulf clusters took advantage of this price-to-performance capability and used a large number of low-cost, commodity processors. These systems were remarkably faster, better, and cheaper than many supercomputing systems of the day.
In a similar vein, modern CPUs are getting prohibitively difficult to manufacture at high clock speeds. Rather than just race to beat the clock, manufacturers have introduced dual- and quad-cores. As in the HPC world, using multiple CPUs is more economical than trying to make a great, colossal CPU.
And, if you bear with me for a moment, you’ll understand that multi-core CPUs are almost just like clusters. At first blush, this statement sounds ridiculous, as there are lots of hardware differences, but to programmers and users, clusters and multi-core processors share exactly the same issues.
As with a cluster, programming a multi-core CPU is a real pain in the asymptote. There is no magic bullet to make your program run in a distributed fashion on a multi-core CPU. The entire computing industry is going to have to start thinking like a card-carrying cluster geek. There are no shortcuts, there are no compiler options, and there are no easy solutions on the horizon. The “how do we economically program multiple CPUs” question has been dodged in the HPC world for over twenty years. The time has come to take a hard look at how we do a number of things with computers. Fortunately, cluster geeks have been thinking about these issues and have some things to offer.
(Please don’t get all preachy about OpenMP, threads and the like, because I’ve been told by reliable sources that message passing codes work better on some dual-core systems than do OpenMP versions of the same code. Don’t worry, we will be taking a look at this issue in the future.)
The bottom line is that moving and sharing data now becomes an issue beyond the spec sheet. A single processor with a single hard drive, network interface, and memory subsystem can be hard enough to optimize. Sharing these resources between CPUs gives us a lot more knobs to turn.
Yet all is not doom and gloom. As I mentioned, the cluster and HPC communities have built a foundation to solve many of the issues facing multi-core approaches. Indeed, the perfect storm of Linux, Open Source, the Internet, and commodity hardware has made for some great advancements thus far.

Benchmarking Your Way to Success

Those who know me can tell you that in addition to my enterprise channel value enhancement approach to clustering, I have a very pragmatic side. Basically, I’m a numbers kind of guy. Words like “faster,” “hyper,” and even the proverbial phrase “blows the doors off” have little meaning to me. I like to run tests to see if the latest and greatest is really all that and worth the money. Of course, this all depends on what you really want to do with your cluster.
Any good cluster jock will tell you that benchmarks are an essential way of life. If you’ve ever lurked on the Beowulf Mailing List, you’ll know that the answer to almost all cluster questions is “It all depends on your application.” While this answer seems rather obvious, it holds a great truth about cluster technology: like Linux, clusters are about choice. If I need a lot of memory on my cluster nodes, but no hard drives, by golly, I can buy exactly that hardware and not spend a dime more than I have to. Such is the way in the cluster universe.
That’s the good news.
The bad news is that it may not be totally obvious exactly what you do need for your given set of applications. What works for the guy down the hall may not work for you. How then do you know what you need? Benchmarking is the time-tested method of actually running programs and seeing if the new hyper-turbo-quad-pumped mode actually makes a difference. In the process you may actually find what you need (or don’t need) and at a minimum be able to make a highly-educated guess.
Similar issues will come up with multi-core systems. Careful benchmarking will maximize that all-important “bang for your buck.”
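For the numbers-minded, here is a minimal sketch of what "actually running programs" can look like in practice. It is written in Python purely for illustration, and every name in it (benchmark, summarize, and the two toy sum_of_squares workloads) is a made-up stand-in rather than anything from a real benchmark suite. The idea is simply to time each candidate several times and compare the best and median runs, instead of trusting a single measurement.

```python
import statistics
import time


def benchmark(func, *args, repeats=5):
    """Run func(*args) `repeats` times; return wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to best, median, and worst."""
    return {
        "best": min(timings),
        "median": statistics.median(timings),
        "worst": max(timings),
    }


# Two hypothetical candidates that compute the same result in different ways.
def sum_of_squares_loop(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_builtin(n):
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for candidate in (sum_of_squares_loop, sum_of_squares_builtin):
        stats = summarize(benchmark(candidate, 200_000))
        print(f"{candidate.__name__}: "
              f"best {stats['best']:.4f}s, median {stats['median']:.4f}s")
```

Swap in your own application for the toy workloads and the answer stops being "it all depends" and starts being a number. The timings will vary from machine to machine and from run to run, which is exactly why the sketch repeats each measurement.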

Forward Looking Statements

Among all the possible cluster topics to discuss, there are two projects to which I plan to devote some attention. Both are efforts that started while I was developing ClusterWorld Magazine, and I have a strong desire to continue them. The first is the ClusterWorld Benchmarking Project (CWBP) and the second is the Cluster Agenda.
I’ll have more on these issues next time. Until then, you can check the sidebar “Resources” for more information.

Douglas Eadline can be reached at [email address omitted].
