I Have (HPC) Issues

There, I admit it. There are certain things that send me into long rants when it comes to High Performance Computing (HPC). (We’ll skip the non-HPC issues for now.) I’ll bet you have issues as well. Those things that just bug you about the state of HPC clusters. Admit it, you do. There, don’t you feel better?

In a politically correct sense, I should call my issues challenges. A more positive spin is always helpful. From a marketing perspective, issues are often called pain points. Nothing positive there. Keep telling yourself that price-to-performance has never been this good.

So what are my issues? In the past, many of my HPC issues were tolerable, and I was hopeful that with continued effort things would get better. Then multi-core arrived and threw everything into the blender. So no, I don’t just have issues, I have multi-issues.

My list starts with people. We need more people who understand this stuff. We need more domain experts (end users) who can adapt the established methods of cluster HPC to their problem areas. For instance, more cores in a single box may be a big win for some, while others running legacy applications will find the applications seemingly blind to the extra cores. To the domain expert, more cores should mean faster applications. In some cases it may actually mean slower applications.

In addition to domain experts, we need people who can design and manage clusters effectively. The choice to obtain a cluster should be no more difficult than the choice to obtain any other computer system. It is no longer enough to stack up nodes with some Ethernet, as a slow interconnect may leave cores starved for bandwidth.

And finally, we need people who can write programs for a multi-core cluster environment. In terms of challenges, this is the Mount Everest of issues. In the past, writing and optimizing parallel codes was not an easy task. As a matter of fact, it was, and still is, hard. A typical MPI approach to computing was based on the assumption: “I have a bunch of processors, each with its own private memory, connected together with some type of network.”
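As a concrete illustration of that traditional assumption (my sketch, not code from the column), a minimal MPI program treats every rank as a separate process with private memory, and the only way data moves between ranks is an explicit message over the network:

```c
/* trad_mpi.c - minimal sketch of the "private memory + network" MPI model.
 * Build (assuming an MPI toolchain is installed): mpicc trad_mpi.c -o trad_mpi
 * Run, e.g.: mpirun -np 4 ./trad_mpi
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* who am I? */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many of us? */

    int local = rank + 1;   /* data private to this process */
    int total = 0;

    /* Ranks only share data through explicit messages (here, a reduction). */
    MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of %d private values: %d\n", size, total);

    MPI_Finalize();
    return 0;
}
```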

The new assumption goes something like this: “I have a collection of shared memory SMP (multi-core) islands connected together with some kind of network.” In essence, there are now two communication paths: local, through memory, and distant, through the network. Programming and optimizing for this new model is much harder. There are hybrid approaches that employ threads or OpenMP on the SMP nodes and MPI between nodes (a small sketch follows below). This approach requires two different conceptual models in the same application. I consider this somewhat painful, but then I do have issues.
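Here is a minimal hybrid sketch of that two-level model (again, my illustration rather than anything prescribed by the column): OpenMP threads share memory within a node, while MPI carries results between nodes. Note how two programming models end up in the same source file, which is exactly the pain point described above.

```c
/* hybrid.c - minimal MPI + OpenMP sketch of the "SMP islands" model.
 * Build (assuming mpicc with OpenMP support): mpicc -fopenmp hybrid.c -o hybrid
 * Run with one rank per node, e.g.: mpirun -np 2 ./hybrid
 */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int provided, rank;
    /* Ask for an MPI library that tolerates threaded ranks. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local_sum = 0.0;

    /* Level 1: shared memory. Threads on this node split the loop. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000000; i++)
        local_sum += 1.0 / (i + 1.0 + rank);

    /* Level 2: the network. Ranks combine their node-local results. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Global sum: %f (up to %d threads per rank)\n",
               global_sum, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}
```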

The programming issue is complicated by the fact that no one knows the best way to program for this new multi-core cluster paradigm. The older, more traditional MPI approach was at least a known and somewhat mature, albeit arduous, methodology. With multi-core clusters we don’t have that luxury.

In summary, my big issue is people. We need people who understand how to use this new HPC Lego that the market is giving us. (For those who are chronologically gifted, recall Tinker Toys or Erector Sets.) In addition, we need to figure out how to program these things in a cost-effective way. Indeed, not just the HPC crowd, but everybody needs to grasp this, as the parallel approach to computing is here to stay.

My issues are, well, my issues. Unfortunately, I have more. For instance, concerns about power and cooling, cluster management, parallel I/O, interconnects, co-processors (GP-GPUs, FPGAs, etc.), virtualization, and grids are all on the table and somewhere on my list as well.

Now it’s your turn. What are your issues? You can tell me and everyone else about your issues by checking out the new polls on the Today’s HPC Clusters site. Your response is anonymous, and you may find you are not alone in your pain, sorry, challenges. And, while you are at it, take the Linux Magazine Data Center Infrastructure Survey.

Comments on "I Have (HPC) Issues"

clusterman

I have issues with vendors. One of the biggest challenges we face is knowing when and how to update the firmware on all the cluster components. It seems that vendors don’t understand how to do this either. For example, you call in a hardware issue, and the help desk tells you that you need to update firmware. We tell them we have 500 nodes all running the same revision that are not having the issue. The help desk says they should all be updated.

So therein lies the next problem. The help desk tells you to download this ISO and boot your server off it. Hello! We have 500 of these. We actually had to do this manually one time because the firmware on our hard drives was bad, and the only way to update them was to boot a floppy and perform a DOS-level update.

Vendors need to develop automated tools for updating clusters with one command. The grid is designed so that you can submit in one place and a job goes off and does work at all the other locations. Why don’t the vendors get this? They want to sell you a cluster, but they want to perform break/fix on a per-system level. Very frustrating. We actually had the vendor’s professional services team out here to help us build our last cluster. We could not believe our eyes when the engineer proceeded to do one system at a time. It was very painful. We’ve since home-grown a lot of our own tools, but it is just ridiculous that the vendors aren’t providing this functionality. Does anyone else have this issue? Maybe it’s just the vendor we are working with?

kavalerov

A lot of it is self-inflicted. The DIY approach is wrong. Why not pay a vendor to do the low-level work for you? Can’t afford to? That’s the problem that needs to be solved.

Example: firmware updates on a live system. This is easy to do on IBM pSeries, and always has been, at least since AIX 4.3.3. You can also do it remotely, on any number of nodes. There is no need to reboot, or even degrade your run level. It works for your service processor, your hard drives, storage adapters, anything.

nospamou

To Matt Domsch from Dell…I had the exact same issue as in the article. I had to upgrade the drive firmware on 500 servers, and they were Dell. I contacted Dell technical support, and they didn’t have any workaround for automating this. Something for Dell to work on, as everything else I’ve had to upgrade has worked like a charm. Maybe when the VMware hypervisor is built into the motherboard, system upgrades will become a lot easier and won’t require a reboot. We’ll see.

tuccillo

I am not sure what you mean when you say “Programming and optimizing for this new model is now much harder” (referring to SMP nodes in a cluster). I have been using SMP nodes in clusters for 10 years and never had any issues. Typically, I assign as many MPI tasks to a node as there are cores; this is really a function performed by a batch scheduler. I also started developing the hybrid approach (MPI/OpenMP) many years ago for those instances where combining MPI and OpenMP provided a performance advantage over pure MPI (using the same number of cores). I have seen very few examples, however, where this is true.

The message passing model doesn’t care where the processes are located from a functionality point of view. In a very few instances there was a performance impact, but arbitrary assignment of MPI tasks to arbitrary nodes is a feature of some batch schedulers. For the vast majority of MPI programs I have seen, the number of cores on a node is just not an issue.
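As a small illustration of that point (my sketch, not the commenter’s code), an MPI program can report where each rank landed; the source is identical whether the ranks share a multi-core node or sit on separate nodes, and only the launcher or batch scheduler decides the placement:

```c
/* where_am_i.c - sketch: each MPI rank reports its host; placement is a
 * scheduler/launcher concern, not a source-code concern.
 * Build: mpicc where_am_i.c -o where_am_i
 * Run, e.g.: mpirun -np 8 ./where_am_i
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    /* The program behaves the same whether ranks share a node's cores
     * or are spread across nodes; only performance may differ. */
    printf("Rank %d running on %s\n", rank, host);

    MPI_Finalize();
    return 0;
}
```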

jmcculloch

Find another vendor. We employ a PXE-bootable DOS image to apply firmware updates and CMOS configuration changes on a large scale.

Another option is to use a package like IBM Director with Remote Deployment Manager add-on. It works with most server vendors.

Furthermore, if your cluster management package supports a parallel shell, you can often copy the firmware update package to local disk on each system and execute it at the command prompt. This usually requires rebooting the cluster to apply the update.

Regards,
John McCulloch
PCPC Direct Ltd.
http://www.pcpcdirect.com
