Does Cloud Computing have a play in HPC? Join us for a pragmatic view of "the cloud" and how it may expand the horizons of HPC.
Before we get started, I would be remiss if I did not remark that terrestrial clouds are made up mainly of ice crystals formed from vapor. One might posit that clouds are formed when the air temperature is sufficiently warm to allow the vapor to rise, where it rapidly cools and crystallizes. Don’t take it from me; check with NASA for the final word “on hot air and clouds.”
Don’t read anything into this though. I am just pointing out some irony here, that terrestrial clouds are formed from vapor and (at least) warm to hot air. Cough, Cough.
A Cloudy Definition
There is this buzzword term running around called “Cloud Computing,” which, like many other buzzword terms, has more than one definition. Maybe “buzzword” is a bit strong. In any case, let’s delve into the Cloud, and you decide if this is a fad or a disruptive future technology.
Getting a definition of Cloud Computing can be an interesting exercise. Ask ten technologists and you will likely get eleven answers. And that is a problem. Defining something can often help shape a discussion, though in this case the multiple definitions might confuse it.
In this case, a pragmatic approach may work best. One might then ask, what problem or set of problems does Cloud Computing solve, and how does Cloud Computing solve it?
Before we answer that question, a bit of background may be helpful. HPC users find the SSE registers in Intel architecture chips useful as a way to increase the number of computing cycles completed per unit time, effectively reducing the cost per cycle. Multi-core is an extension of this, providing many general purpose processing cores per socket. Clustered systems provide large multiplicative factors of clock cycles per unit time for certain applications. Accelerators seek to increase computing efficiency per cycle or provide local massive cycle parallelism within a single machine. All of these techniques are attempts to throw more processing power, usually locally, at the problem of computing.
For High Performance Computing (HPC) users, these are all familiar techniques. And they come with some known costs. In the case of SSE, there is the cost you pay to pack and unpack the data. In the case of multi-core, you have another hierarchy that software must address. In the case of clusters, you have to add in significant communications latency and lower bandwidth. In the case of accelerators, you have to deal with the per cycle costs as well as the development and support costs. All of these techniques are about reducing the cost per cycle by dramatically increasing the number of available cycles for applications.
In my experience, the problem Cloud Computing is seeking to solve is to reduce the time, complexity, and cost to stand up and scale up (or down) applications. That is, it is not about multiplying processing cycles in a computing intensive application, but about enabling that application to be scaled up or down in terms of utilization and demand. The goal is to reduce the marginal cost of adding (or removing) users and applications.
Not everyone will agree with this definition. Call it the 12th definition of Cloud Computing. What we find is that most of the variations in Cloud Computing come in terms of how the various aspects are delivered. What is interesting is that there is a wide variety of methods to deliver an application to users. Some of the high performance computing methods are useful in these contexts, as you can leverage the ability to deliver what the application needs instead of a fixed number of cycles per unit time. These methods allow significant flexibility in application delivery.
A Computer Is Just a Door-stop Without Applications
Think about it. The pragmatic definition I gave for Cloud Computing begs numerous questions on the implementation and usage side. Unfortunately, it is a magnet for companies who feel a need to re-brand their offerings as being “in the Cloud” in order to stay current. This is what gives Cloud Computing a fundamentally “fad-like” sense to it. This trend reminds me of what happened in HPC when clusters started to gain real market share. At first these systems were dismissed by the major vendors as not meaningful solutions (clusters now dominate HPC). Realizing that they were left behind by the upstarts, some of the major vendors started talking about “Grids”, and that term was largely co-opted by marketeers as the logical progression of clustering into their offerings. This allowed them to say “hey, we were working on it all along.” Unfortunately, co-opting the term “grid” and muddying its original intent only managed to confuse purchasers of clusters who were offered “grids”.
My belabored point is that you should expect some re-branding as a way to hitch existing products to the computing cloud. Before we talk about using clouds, let’s take a short tour of what is currently available. Please bear with me as we go through these offerings, and note that people may disagree with small or large particulars of this discussion. That’s fine, as, if you will pardon the pun, Cloud Computing is somewhat nebulous.
A Tour Of Cloud Computing
Cloud computing has a producer side and a consumer side. The producer side consists of the platforms and systems upon which you set up your software tools. The consumer side is, of course, your user base. When combined with virtualization using standard tools such as VMware, you can largely make the producer side appear pretty close to exactly the way you want, within specific limits (of the virtualization software). For example, while I am sure there would be some users out there who might like this, tying together bunches of virtualized containers will not let you create a monster 128-socket, 512-core SMP with 1 TB of memory. Or, at least today you can’t do this. In 2008, we can’t create a virtual Cray supercomputer in the Cloud.
What we can do is create our platform on someone else’s platform. This is the Platform as a Service (PaaS) model. You don’t pay to acquire the hardware; you just “rent” it. You pay only for what you use. So if you want 20 small single-core machines, each with 1 GB of RAM and a small local disk, to act as web front-ends for your system, by all means, create this environment. And this is where it gets interesting on the producer side, and curiously where things like Linux and FOSS come in.
You can create this using a variety of tools on a variety of platforms. As your needs scale up you can instantiate an additional box (virtually or physically). Remember, the cost of standing up that (N+1)th box is one of the things that Cloud Computing is seeking to reduce. Imagine that you need M (some large number) of additional boxes. Imagine you have a license cost to pay per instance; let’s call that L(i). And you have a license cost to pay per average number of connections to this box; let’s call this L(c). Your cost to add 1 additional box is B. So your cost model of adding additional capacity looks like:
M*(B + L(i) + L(c))
For a FOSS OS, L(i) is zero. Same with L(c), at least for the OS, and the FOSS tools.
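To make the cost model above concrete, here is a minimal sketch in Python. The function name, parameter names, and all dollar figures are hypothetical and chosen only for illustration; the only thing taken from the text is the formula M*(B + L(i) + L(c)) and the observation that L(i) and L(c) are zero for a FOSS OS.

```python
def capacity_cost(m, box_cost, license_per_instance=0.0, license_per_connection=0.0):
    """Cost of adding m boxes, following M * (B + L(i) + L(c)).

    m                       -- M, the number of additional boxes
    box_cost                -- B, the cost to add one box
    license_per_instance    -- L(i), per-instance license cost (0 for a FOSS OS)
    license_per_connection  -- L(c), connection-based license cost (0 for a FOSS OS)
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    return m * (box_cost + license_per_instance + license_per_connection)


# Hypothetical figures, for illustration only.
foss_total = capacity_cost(100, box_cost=50.0)
licensed_total = capacity_cost(
    100, box_cost=50.0, license_per_instance=30.0, license_per_connection=20.0
)

print("FOSS stack:    ", foss_total)
print("Licensed stack:", licensed_total)
print("Difference:    ", licensed_total - foss_total)
```

With these made-up numbers the licensing terms account for the entire difference between the two totals, and that difference grows linearly with M, which is why the license terms matter more as you scale out.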