How big of a cluster can you build? With a little math and the speed of light you can find out.
Last week I moderated a webinar entitled Optimizing Performance for HPC: Part 2 – Interconnect with InfiniBand. It was a great presentation with a lot of practical information and good questions. If you missed it, it will be available for a few months, so you still have a chance to check it out. As part of the webinar, Vallard Benincosa of IBM mentioned that the speed of light was becoming an issue in network design. In engineering terms, that is referred to as a hard limit.
I started to think about this limit and how it would affect the size of clusters. I did some back-of-the-envelope math to get an estimate of how c (the speed of light) will limit cluster size. I want to preface this discussion with a disclaimer that I thought about this for all of 20 minutes. I welcome variations or refinements on my ciphering.
Let’s first consider the speed of light (SOL) in a fiber cable. The number provided by the QLogic crew for the webinar was 5 ns (nanoseconds) to travel one meter in a fiber cable (it takes light 3.3 ns to travel one meter in a vacuum). How can we translate that into a cluster diameter? Latency is measured in seconds and the SOL is measured in meters per second. Here is one way. First we have to define some terms:
LT is the total end-to-end latency
Lnode is the latency of the node (getting the data on/off the wire)
Lswitch is the latency of a single switch chip
Nhop is the number of switch chips (hops) the data passes through
Lcable is the latency of the cable, which is a function of length
A formula may be written for the total latency as follows:
$$L_T = L_{node} + L_{switch} \cdot N_{hop} + L_{cable} \qquad (1)$$
If we take equation 1 and solve for Lcable, then divide the right-hand side by the 5 ns it takes light to travel each meter of fiber, we get what I call the core-diameter:
$$\text{core-diameter} = \frac{L_T - (L_{node} + L_{switch} \cdot N_{hop})}{5\ \text{ns/m}}$$
The core-diameter is the maximum diameter of a cluster in meters. Let’s use some simple numbers. Suppose I need 2 μs (microseconds) latency for my application to run well (this is LT), my nodes contribute 1 μs, and I use a total of 6 switch chips with a latency of 140 ns (nanoseconds) each. That leaves 160 ns for the cable, so I get a core-diameter of 32 meters. This diameter translates to a sphere of 17 thousand cubic meters. If we take an average 1U server and assume its volume is 0.011 cubic meters, then we could fit about 1.6 million servers in our core-diameter. In practical terms, the real amount is probably half that, allowing for human access, cooling, racks, etc. So we are at about 780 thousand servers. If we assume 8 cores per server, then we come to a grand total of 6.2 million cores. If we run the numbers with LT of 3 μs, the core-diameter grows to 232 meters and the number explodes to almost 600 million servers (before halving for access and cooling), and we can see why cable distance has not been an issue.
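For anyone who wants to check my ciphering, here is a minimal Python sketch of the same arithmetic. The function and parameter names are my own inventions for illustration. The 5 ns per meter figure, the 0.011 cubic meter server, the halving for access and cooling, and the 8 cores per server are the assumptions stated above, not measured values.

```python
import math

NS_PER_METER_FIBER = 5.0  # fiber delay quoted in the webinar (assumption)


def core_diameter_m(lt_ns, lnode_ns, lswitch_ns, nhop,
                    ns_per_meter=NS_PER_METER_FIBER):
    """Maximum cluster diameter in meters for a given latency budget."""
    # Whatever latency is left after the node and the switch hops
    # is the budget available for the cable.
    cable_budget_ns = lt_ns - (lnode_ns + lswitch_ns * nhop)
    if cable_budget_ns < 0:
        raise ValueError("node and switch latency already exceed LT")
    return cable_budget_ns / ns_per_meter


def sphere_volume_m3(diameter_m):
    """Volume of a sphere with the given diameter."""
    return math.pi * diameter_m ** 3 / 6.0


def server_count(diameter_m, server_volume_m3=0.011, usable_fraction=1.0):
    """Servers that fit in the sphere; usable_fraction allows for
    racks, cooling, and human access."""
    return sphere_volume_m3(diameter_m) * usable_fraction / server_volume_m3


# The worked example from the text: LT = 2 us, Lnode = 1 us,
# 6 switch chips at 140 ns each.
diameter = core_diameter_m(lt_ns=2000, lnode_ns=1000, lswitch_ns=140, nhop=6)
servers = server_count(diameter)
practical_servers = server_count(diameter, usable_fraction=0.5)
cores = practical_servers * 8  # 8 cores per server (assumption)

print(f"core-diameter: {diameter:.0f} m")
print(f"servers (packed solid): {servers:,.0f}")
print(f"servers (half usable):  {practical_servers:,.0f}")
print(f"cores: {cores:,.0f}")
```

Changing `lt_ns` to 3000 gives the 3 μs case, so it is easy to try other latency budgets or switch counts.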
A few things about my analysis. Obviously my numbers could be refined a bit, but as a first pass they seem to work. Scaling an application to such large numbers may be a bigger challenge than the SOL, but the SOL does put some limits on just how big a cluster can become. Actually, it is a bit more limited than my simple analysis suggests. There is a push-pull effect. Better scalability comes from lower latency, which decreases the diameter. Thus, in order to increase (push) the number of cores I can use, I need to reduce the latency, which due to the SOL reduces the diameter (pull) and with it the actual number of cores I can use. Perhaps some enterprising student could come up with a model that captures this effect. I should also mention that refining the assumptions can change the actual numbers, but the push-pull effect due to the SOL is the same.
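This is not the model I am asking the enterprising student for, but a rough sketch of the "pull" half is easy to write down. Because the core count goes with the volume of the sphere, it scales with the cube of the cable latency budget. The Python below sweeps LT using the same assumed numbers as my example (1 μs node latency, 6 switch chips at 140 ns, 5 ns per meter, 0.011 cubic meter servers, half the volume usable, 8 cores per server). The function name is made up, and holding the switch count fixed as the cluster grows is a simplification.

```python
import math


def cores_within_budget(lt_ns, lnode_ns=1000.0, lswitch_ns=140.0, nhop=6,
                        ns_per_meter=5.0, server_volume_m3=0.011,
                        usable_fraction=0.5, cores_per_server=8):
    """Cores that fit inside the core-diameter for a latency budget lt_ns.

    All defaults are the illustrative assumptions from the text.
    """
    cable_ns = lt_ns - (lnode_ns + lswitch_ns * nhop)
    if cable_ns <= 0:
        return 0.0  # no latency left for any cable at all
    diameter_m = cable_ns / ns_per_meter
    volume_m3 = math.pi * diameter_m ** 3 / 6.0
    servers = volume_m3 * usable_fraction / server_volume_m3
    return servers * cores_per_server


# Tightening the latency budget shrinks the usable core count quickly.
for lt_ns in (3000, 2500, 2160, 2000, 1900, 1840):
    print(f"LT = {lt_ns} ns -> {cores_within_budget(lt_ns):,.0f} cores")
```

Halving the cable budget (for example, going from LT = 2160 ns to LT = 2000 ns with these numbers) cuts the available cores by a factor of eight, which is the pull working against the push.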
I have run out of room on the back of my envelope, as I don’t think this analysis can be pushed much further without some refinements. I’ll leave it as an exercise for the reader to continue the analysis. I will coin the term core-diameter, however, as it sounds cool.
Moving on, I wanted to mention another type of progression in which parallel computers play a role. There are those who believe we are in a period of accelerating technological change. If you look at the Top500 as one example, in 1993 the top machine recorded 60 GFLOPS, and this past June we hit 1.1 PFLOPS. That is a factor of about 18,000, or more than four orders of magnitude, in 16 years. There are those who are interested in discussing the effects and consequences of this type of progress. The main idea is that we are rushing toward a singularity of sorts that will result in a super-intelligence. Advances in artificial intelligence, nanotechnology, and biology may be pushing us closer to a potential singularity. And behind all of these technologies lies an HPC cluster. That would be where we, the cluster geeks, fit into the picture. Each year there is a Singularity Summit where these issues are discussed. This year it is in New York City. I think I’ll head over and see what the visionaries have to say, while there is still time.