Clusters are getting larger, multi-core adoption plods along, and other findings from our recent HPC micro-polls.
While looking over the recent micro-polls from Todays HPC Clusters, I did not see any big surprises. Perhaps I should qualify that statement. When I get the chance I like to ask people about their cluster usage. I also read many of the IDC reports and do my own small-scale polling when I can. The results of the recent polls are what I expected, but I think times are a-changing. (See The Future of High Performance Computing.)
Whenever I talk about small Web-based polls, I always provide a disclaimer: they are not carried out in a rigorous fashion and should be considered a “ballpark” estimate of the HPC market. Fortunately, most polls had more than 100 respondents, which helps my comfort level. Still, simple Web polls can be gamed to push one’s own agenda. Also, keep in mind that the number of respondents differed from poll to poll, and that the numbers I am using were collected on August 27 of this year.
Looking at the first question, 72% of the 117 respondents said they have an HPC cluster at their location. What I find interesting is that the remainder (28%) do not, but are visiting the site anyway. Clusters, it seems, are on people’s minds.
The next question asked about quad-core migration. The majority (84% of 92 respondents) said 10% or less of their servers have moved to quad-core. This is not so surprising, as quad-core CPUs were released less than a year ago and upgrade cycles are usually measured in terms of years. (See Multi-core Malaise and Forty Cores: Hands-on with the Tyan Personal Supercomputer.) Interestingly, 10% of the respondents said they have migrated 75% of their servers to quad-core.
Moving on, we also asked about processor type. AMD led this category, capturing 56% of the 104 respondents. Coming in second was Intel with 36%, while IBM’s Power (4%) and Sun’s UltraSPARC (5%) rounded out the other choices. This result is not a huge surprise and reflects two things.
First, commodity x86 processors are the choice of most HPC cluster users. Second, the Intel/AMD competition is where most organizations are investing their money. Over the last year, Intel has upped the competition with both the Woodcrest (dual-core) and Clovertown (quad-core) platforms, and now AMD is stirring the pot with the new quad-core Barcelona. How all these new offerings shake out in terms of HPC performance is anyone’s guess. The bottom line: we get more cores. (See The Gig is Up: The Future of Commodity Processors.)
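For readers who want to check my math, here is a minimal Python sketch that turns the published processor shares back into approximate head counts. The poll reports only percentages, so the per-choice counts below are estimates reconstructed by rounding, not published figures, and the names `shares`, `respondents`, and `estimate_counts` are my own labels for illustration.

```python
# Published shares (percent) from the processor poll and the respondent total.
shares = {"AMD": 56, "Intel": 36, "IBM Power": 4, "Sun UltraSPARC": 5}
respondents = 104


def estimate_counts(shares, total):
    """Estimate respondents per choice by rounding share * total / 100.

    Assumption: each published percentage was rounded to a whole number,
    so these counts are a best guess rather than the actual tallies.
    """
    return {name: round(pct * total / 100) for name, pct in shares.items()}


counts = estimate_counts(shares, respondents)
share_sum = sum(shares.values())   # the published percentages, added up
count_sum = sum(counts.values())   # the estimated head counts, added up

for name, n in counts.items():
    print(f"{name}: about {n} respondents")
print(f"shares add up to {share_sum}%, estimated counts add up to {count_sum}")
```

Under that rounding assumption the shares add up to 101% while the estimated counts add up to exactly 104, which suggests the extra percentage point comes from rounding rather than from a tallying error.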
Turning to networking architecture, I am starting to see a decrease in Gigabit Ethernet (GigE at 54%) and a definite rise in Infiniband (IB at 25%). I suspect that this shift has much to do with the interconnect needs of multi-core, as most new dual-socket nodes have at least four cores (two dual-core processors). The more cores on a node, the more communication a single interconnect must handle. GigE has always been a low-cost favorite of cluster users, but multi-core needs a bigger pipe and IB seems to be the choice among users. Many people I have talked to are considering a move to 10-GigE, but cost seems to be an issue for now (as it was when GigE first came on the market). (See The Network IS the Cluster: Infiniband and Ethernet Network Fabric Solutions for HPC and The Wide Area Cluster.)
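To put a rough number on the "bigger pipe" argument, the Python sketch below divides one link's nominal rate evenly among the cores on a node. This is a back-of-the-envelope illustration, not a benchmark: the link rates are nominal figures I am assuming (with 4x SDR standing in for Infiniband), they ignore protocol overhead and latency, and real applications rarely have every core communicating at once.

```python
# Nominal link rates in Gbit/s. These are assumptions for illustration:
# raw signaling rates, before any encoding or protocol overhead.
LINK_GBPS = {
    "GigE": 1.0,
    "10-GigE": 10.0,
    "Infiniband 4x SDR": 10.0,
}


def per_core_share(link_gbps, cores_per_node):
    """Return the even share of one link's nominal rate per core."""
    return link_gbps / cores_per_node


# Four cores = two dual-core sockets; eight cores = two quad-core sockets.
for cores in (4, 8):
    for name, rate in LINK_GBPS.items():
        share = per_core_share(rate, cores)
        print(f"{cores} cores on {name}: {share:.3f} Gbit/s per core")
```

With these assumed rates, each core on a four-core GigE node gets a quarter of a gigabit per second if all cores communicate at once, and moving to quad-core sockets halves that again.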
Finally, we come to the most interesting result. In the past, whenever I or anyone else asked about cluster size, the results were always very similar: a high number of clusters have 64 nodes or fewer, almost no clusters have between 64 and 256 nodes, and above 256 the number increases again.
This result has always intrigued me. I suspect it has to do with two issues. First, both IDC and I have noted in the past that most applications do not scale beyond approximately 32 processors. Therefore, most users don’t need a large cluster unless they require a lot of capacity (i.e., they keep the cluster busy running many variations of their code). In addition, what I have seen first-hand is that clusters are often sized by the maximum number of processors needed for a specific job, but a large portion of the jobs running on the cluster are small (four to eight processors) or even serial jobs requiring a single processor. It will be interesting to see what effect multi-core has on this trend. Will users cut down on the number of nodes, or will they buy about the same number of nodes and increase capacity?
The second issue probably has to do with infrastructure. Most server rooms can accommodate a few racks of cluster nodes, but installing a large 256+ node system becomes a power and space issue for many organizations. Multi-core may change this as well, as more cores can now fit into the same space.
I’d like to address one more data point that’s not in the survey. I have recently seen a 1U node with two separate dual-socket Intel motherboards. Each motherboard can hold two quad-core processors and has on-board Infiniband. To put it another way: that’s 16 cores in a single 1U server chassis, or close to 256 cores in a single cabinet with on-board Infiniband!
When we ask these questions again, I’m sure the results will be different, but I haven’t the foggiest idea in which way. In the meantime, head on over to Todays HPC Clusters and look for our new polls on HPC challenges. And, by the way, everyone who registers and takes the polls gets a copy of the results. At a minimum you can check my math.
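The density arithmetic is easy to sketch in Python. The board, socket, and core counts come from the node described above; the function names and the 256-core target are simply my own framing for illustration.

```python
import math

# Configuration of the 1U node described in the text.
BOARDS_PER_CHASSIS = 2   # two separate motherboards per 1U chassis
SOCKETS_PER_BOARD = 2    # each board is dual-socket
CORES_PER_SOCKET = 4     # quad-core processors


def cores_per_chassis():
    """Total cores in one 1U chassis."""
    return BOARDS_PER_CHASSIS * SOCKETS_PER_BOARD * CORES_PER_SOCKET


def chassis_needed(target_cores):
    """Smallest number of 1U chassis that supplies at least target_cores."""
    return math.ceil(target_cores / cores_per_chassis())


print(f"{cores_per_chassis()} cores per 1U chassis")
print(f"{chassis_needed(256)} chassis needed for 256 cores")
```

In other words, 256 cores take only 16 of these 1U chassis, which leaves the rest of a cabinet free for switches, storage, and power distribution.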