
HPC Masters: Tom Sterling and Beowulf in Chrysalis

Beowulf is NOT dead, but it is not the same either.

Originally Published in the December 2003 issue of ClusterWorld Magazine.

In this, the inaugural issue of ClusterWorld, I am pleased to have the opportunity to reflect on a very different era in high performance computing only a decade ago and consider, too, some ill-turned words I spoke but a few months ago: “Beowulf is dead.” Taken out of context, as is likely for any such provocative quip, it has evoked some degree of consternation from elements of our community. This statement has been assumed to convey a false and unintended meaning: that low-cost commodity clusters, built by integrating mass-market computing elements with commercial off-the-shelf network technology, are no longer a viable means of computing large workloads and exploiting parallelism for reduced time to solution. Clearly, this is not the case. One need only examine the Top500 list to discern the trends in clusters and constellations that demonstrate their dramatic impact on a field that, through their rapid growth, they now dominate. Beowulf is not dead. And yet, for those of us who recall the early challenges of bringing ensembles of PCs to the realm of scientific and technical computing, Beowulf is no longer the same, either.

Only a decade ago, commodity clusters were in their infancy. For those investigating their potential, the world of high performance computing was very different and even unwelcoming. It was a world ruled by big iron, and custom design was the game. A Gigaflops could easily cost a million dollars. Researchers at UC Berkeley, the NASA Lewis Research Center, and other institutions began to apply small farms of vendor workstations to computational workloads. When Don Becker and I created the first Beowulf PC cluster with the little-known and still inchoate Linux operating system, few supported us, and many major players were borderline hostile even to the very concept. With the goal of realizing truly inexpensive parallel computing in a form that almost anyone could exploit, the technical challenges were daunting. If we had undertaken the Beowulf project a year earlier, we would have failed. All hardware components had to be available and derived from other market products: there was no cluster market and there were no existing products for cluster systems. Software also had to be available that would support clustering of PCs and apply them to applications, or at least provide the framework for building our own software to fill in the gaps. Even then, the Beowulf project was responsible for providing the majority of Ethernet drivers for the Linux operating system. We didn’t have any choice. Beowulf was forged in the crucible of opportunity and vision. But it was not readily embraced by an industry dominated by Cold War economics. Indeed, in some minds, Beowulf was an affront to the status quo.

Ten years later, the cluster world is very different. While there are still important applications that are not well suited to commodity clusters, there are many more problem areas that have benefited from the low-cost scalable computing that commodity clusters provide. And the problem domain has extended beyond scientific and technical applications to encompass financial and commercial applications, web search engines, data mining, and many more. No longer is conventional wisdom in opposition to the exploitation of commodity clusters. Quite the contrary: commodity clusters have become the mainstream, and Linux has become one of the operating systems of choice, while some other installations employ Microsoft Windows. But even as clusters have gained preeminence, they have also evolved. The original Beowulf systems used PCs as they were originally packaged, usually directly in their tower cases. This required no changes in manufacturing, but it made large clusters difficult to assemble. As the cluster market grew, vendors emerged to support it, producing rack-mounted units to replace the awkward towers, increasing node density and improving reliability. More recently, blades have further increased node density, with hundreds of nodes per rack now feasible. Whereas local area networks at first had to be adapted to the purpose of providing cluster interconnection, there are now multiple commercial system area networks designed for the purpose, including Myrinet, Quadrics, and SCI, with InfiniBand emerging as a possible additional network technology for clusters. System software and tools have also evolved to serve Beowulfs, with both open source and commercial software packages available that provide most of the capabilities and services found on conventional systems.

Beowulf has emerged from a chrysalis of change, no longer relying solely on what was available but now benefiting from industrial investment to better support and service the commodity cluster market. These changes have not stopped. As this new publication demonstrates, the world of clusters is likely to continue to evolve in response to technology opportunities and new market requirements. Performance is approaching 1 Teraflops per rack, with cost below a dollar a Megaflops. But also of increasing importance are the reliability, maintainability, and manageability of large cluster systems. Many developments in the near future will focus on these key issues, increasing the utility of clusters in production environments. Application program construction and porting to clusters will experience significant advances in the near future, increasing the impact of clusters across a broader range of problem domains while improving programmer productivity and overall time to solution. Beowulf changed the world of computing, and its future vitality will be based on its own ability to continue to change. We may expect that commodity clusters will continue to create a cluster world.
