Stocking the Toolbox: Tools No Cluster Admin Should Be Without
Setting up a cluster can be trying enough, and maintaining it can be even more difficult. The sheer number of nodes involved in a large cluster can be daunting, as can users’ expectations for quality of service. To make life easier, Troy Baer provides a tour of tools that every cluster admin should know about.
Congratulations, you have your shiny, brand-new cluster installed! You’ve got your nodes and interconnect up, your file systems mounted, your resource manager running jobs, and your third-party applications ready. Users are champing at the bit to get on the machine. Now comes the hard part: keeping the whole thing running smoothly.
Fortunately, cluster admins have a wealth of tools available to make life easier.
Remote Access
Unless you enjoy hanging out in your machine room with a keyboard and monitor on a crash cart, being able to control your cluster nodes remotely is critical to keeping your sanity as a cluster admin.
Ideally, you should be able to run commands on your nodes, connect to their consoles, and even power-cycle them, all from the comfort of your own office. A little time spent configuring these facilities early on in your cluster’s life cycle can save you considerable grief down the road.
Distributed Shell
One of the most important capabilities in administering a cluster is running commands on many or all of its nodes at once; a tool that does this is usually called a distributed shell. You’ll find a huge number of distributed shell projects out there. They tend to overlap somewhat, but each one has its own distinctive…
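To make the idea of a distributed shell concrete, here is a minimal sketch in Python. It is not the implementation of any particular project. It assumes that passwordless, key-based SSH from the admin host to every node is already configured, and the names `ssh_argv` and `run_on_nodes` are hypothetical helpers invented for this illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_argv(node, command):
    """Build the argv used to reach one node (assumes key-based SSH works)."""
    # BatchMode=yes makes ssh fail rather than prompt for a password.
    return ["ssh", "-o", "BatchMode=yes", node, command]


def run_on_nodes(nodes, command, build_argv=ssh_argv, max_workers=16, timeout=30):
    """Run `command` on every node in parallel.

    Returns a dict mapping node -> (returncode, stdout, stderr).
    The returncode is None when the command could not be started or timed out.
    """

    def run_one(node):
        try:
            proc = subprocess.run(
                build_argv(node, command),
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            return node, (proc.returncode, proc.stdout, proc.stderr)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Record unreachable or hung nodes instead of aborting the whole run.
            return node, (None, "", str(exc))

    # Threads are enough here: each worker mostly waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run_one, nodes))
```

Calling `run_on_nodes(["node01", "node02"], "uptime")` (the node names are placeholders) would return one result tuple per node. The `max_workers` cap limits how many SSH sessions open at once, and the per-node `timeout` is a guess at a reasonable default, so a single hung node cannot stall the whole run. Both values would need tuning for a real cluster.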