So you think you're a cluster vendor? You might want to watch how you toss that term around Mr. Eadline.
This week I decided to take a break from my planned discussion of declarative languages. Mind you, this is an unplanned break, as I’m at the point in my discussions where I’m about to reveal the answer to all your problems. I can’t do that this week because I’ve been overcome with a rant, much like that wave you see out in the ocean, slowly building as it moves closer. There is simply no stopping it.
Usually my rants have a trigger. I’ll read something or hear a comment about an important issue. Sometimes I agree with the opinion, sometimes not; often it is just that bump on the water’s surface. The wave has begun.
Recently on the Beowulf mailing list there was a discussion about vendors and cluster procurement. (The thread may still be active.) The upshot of the initial comments is that large vendors sometimes deliver servers in a state that is inconvenient for clustering. And they seem clueless about the customer’s needs. From my experience, issues such as BIOS configuration, PXE booting, and disk-full, disk-empty, or disk-less systems are often a problem for cluster customers.
Of course the big vendors want to sell to a majority of the market. Good economics, they say. Fine, then let’s be clear about what it is you are selling: racks of servers. A rack of servers is not the same thing as an HPC cluster. If you are new to the HPC thing, go back and read that again. There is a big difference between buying an “HPC Cluster” and “racks of servers.” Many customers and vendors think they are the same thing because they look the same. A perfectly understandable conclusion. But now you know they are different. Consider yourself enlightened. In general, the creation of an HPC cluster works in one of these three ways:
- Buy a rack of servers, invest the time (cost) to turn it into a cluster using open software packages or cluster distributions. Possibly the least expensive way to go.
- Buy a rack of servers and buy a cluster software stack and set up the system yourself. Usually not as cheap as option 1, but usually less expensive than option 3 below and there should be someone to help you get things working.
- Buy a turn-key system and start working right away. This option is usually the most expensive in terms of initial cost, but you are guaranteed a working system.
Options 2 and 3 usually have some kind of software support option. The thing to realize is that there is an additional cost above the hardware (racks of servers). Not until the racks of servers are working together as a system can you really call it a cluster. Also note that option 1 presents the largest possible variation in cost (a riskier bet, as it were). Should you buy that truckload of hardware only to find out that there is an “issue” with configuration, you may be performing a lot of extra work (even at graduate student wage rates). In addition, you have assumed the responsibility for maintaining and upgrading the software. For the seasoned cluster administrator, option 2 or 3 may be the lowest cost when you figure out the total cost of ownership over 3-4 years (and the number of headaches).
The above analysis is a bit simplified, as your situation may vary. It still makes the point that there is a “cluster cost” that must be paid if racks of servers are to be called a cluster. Understanding this distinction is what has carried my ranting wave thus far. The market should get the difference by now. In the past I recall certain vendors bragging about how many “clusters” they sell a month. I would listen, then sigh, and explain that a rack of servers is not a cluster; the cake still needs icing. Viewed only in terms of hardware, the ticket to the HPC ball got extremely cheap. Even the rackem-stackem-fly-by-night-want-to-be-HPC vendor could get in the door. And yet, most of these systems are not operational clusters.
The fact that anybody with a screwdriver can call themselves an HPC vendor is mildly disconcerting. What really brings my rant wave crashing down are the vendors, big and small, that have no clue as to what opened the door for them in the first place.
If you are a customer, next time you are thinking of making a hardware purchase, put this simple question on your RFP: “What has your company done to support the open HPC cluster community/market?” I invite you to take the response very seriously because your purchasing power is stronger than you think. Throwing business at those companies who have a seat on the HPC cluster clue-train will help ensure that much of the excellent “free” stuff that is used to build your cluster stays free. And the requisite phrase, “free as in speech,” is particularly important here. Of course there is free (openly available) software, and some vendors do contribute in a variety of ways. The other “free” things are the discussions (free as in speech) taking place over the Internet, at conferences, and most importantly over a glass of “free beer” at those hospitality parties. Vendors that support the community deserve your dollars. And there is usually a bonus: these types of vendors deliver HPC clusters.
If you are a vendor who has not joined the community, take heed, I’m not in a particularly nice mood today. So you want your slice of the ever-growing HPC cluster market. Fine, how do you think you got to the feeding trough? Here’s a hint: it is not so much what’s in your boxes, the color of your cables, or your keen business sense. In my opinion, you got here on the back of all those who built a market/community of open and freely available software, information, and conversations. You would do well to embrace the whole co-operation thing. Sharing helps make the pie bigger. The ways to get involved are numerous. Support (with time or money) a project that helps your customers, contribute to a mailing list, share your experiences and best practices. Think of it as a focus group and listen. You will get information about the HPC market that you cannot buy. You will learn more about this market than you ever thought possible. And then, maybe then, you will learn how to sell clusters instead of racks of servers. One word of advice: don’t try to turn every encounter with the HPC community into a sales opportunity. It will absolutely not work.
In closing, when I started this rant I told myself I was not going to single out any vendors, as I don’t like kicking down doors and taking names. You should be aware, however, that there are Linux Penguin companies that are highly skilled and often help streamline the understanding of things like scalable informatics. Those kinds of vendors know the difference between a rack of servers and a cluster because they live by the open source credo: give a little and get a lot. In my opinion, they give a lot and get very little, but that is another wave forming in the ocean.
Douglas Eadline is the Senior HPC Editor for Linux Magazine.