Feedback from the readers of Linux Magazine

One reader takes issue with Donald Becker’s assessment of MPI.

Reality Check

As I read Donald Becker’s article “Beyond MPI” [published in the November 2005 issue of Linux Magazine, available online at http://www.linux-mag.com/2005-11/beyond.html], I was horrified to see such a luminary present so many errors, inaccuracies, and misleading statements. Here’s a list of problems with the article:
1. MPI is not static. MPI has had dynamic API support since 1996; LAM/MPI started supporting it in 1998; and support in other MPI implementations started showing up at around the same time. A column of mine in ClusterWorld Magazine (http://www.clusterworld.com/) made the assertion that MPI could be used for distributed programming using its CONNECT/ACCEPT primitives. Becker’s statement equating the size of a cluster with MPI_Comm_size(MPI_COMM_WORLD) demonstrates a fundamental misunderstanding (or deliberate misrepresentation).
2. Implying that MPI cannot handle versioning is silly. In general, if you run multiple versions of network communication protocols, you’re going to run into problems, unless the applications are version aware and are able to fall back to older protocols upon demand. MPI could certainly do this, but no one has [yet] asked for the feature. Versioning (the popularly and aptly named “DLL Hell” problem) is by no means unique to MPI, and implying otherwise is a misrepresentation.
3. Claiming that socket-based programming is simple is rubbish. Try getting your average chemical engineer to write socket-based programs, handle faults in sockets, and then handle blocking and non-blocking behavior and proper buffering. How can you beat the simplicity of MPI_INIT() followed by MPI_SEND()?
4. Saying that MPI is not about communication is mind-boggling. MPI is all about communication. Sure, you don’t get streaming, at least at the API level, because the MPI layer hides the underlying implementation and simply moves data from one place to another as fast as it can. Doing simple things in MPI is easy (for example, sending point-to-point contiguous data); doing complex things in MPI is complex (for instance, using user-defined datatypes for complex data representations). But many of the complexities are quite similar in socket-based programming, not to mention that collective operations and other cases are often not suitable for sockets.
5. Fault tolerance is a major problem on all fronts. The MPI API doesn’t exclude the possibility of fault tolerance, nor does it force cascading failures when just one process (or node) dies. Fault tolerance is a hard problem, in part because computer scientists (not just the HPC community) don’t have reliable strategies to respond to failures. Moreover, MPI doesn’t have a standard for fault behavior because no one wanted it until fairly recently. MPI implementations are, believe it or not, user-driven.
6. Socket-based programming isn’t suitable for massive I/O or storage area network-distance latencies. What about Myrinet, InfiniBand, Quadrics, and the other emerging high-performance networks? Sure, you can run IP over them, but you pay a hefty performance penalty. MPI is very, very good at hiding all this from the user. Got multiple NICs? Emerging MPI implementations can automatically stripe across them. The goal is to abstract the network from the API, something that MPI does extremely well.
7. MPI ABI issues are highly contentious at best. There are some who want them, and there are some who absolutely do not. Specifically, saying that all ISVs want an MPI ABI isn’t true. If your application can suddenly run with any MPI implementation, your testing matrix and logistics multiply exponentially. Some ISVs will embrace this; others will not. Hence, saying that an MPI ABI has strong support in the developer community is a huge misrepresentation. Plus, there are also at least two projects working on making a “neutral” MPI implementation that would be a thin layer between the application and the back-end MPI. This solution has many of the qualities of an MPI ABI.
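To make point 3 concrete, here is a minimal Python sketch of what even a single reliable message takes over a raw stream socket. It is an illustration only, not code from Becker’s article or from any MPI implementation. The 4-byte length prefix and the helper names send_msg, recv_exact, and recv_msg are assumptions invented for this example. It covers only framing, partial reads, and a peer that disappears mid-message; non-blocking I/O, timeouts, and reconnection would all add more code.

```python
import socket
import struct
import threading

# Assumed wire format for this sketch: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def send_msg(sock, payload):
    """Send one length-prefixed message; sendall() retries partial writes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            # An empty read means the peer closed before the message was complete.
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed message and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    """Move 1 MiB between two connected sockets and check that it arrives intact."""
    a, b = socket.socketpair()
    payload = bytes(range(256)) * 4096
    # Send from a second thread: a blocking sender would otherwise stall
    # once the kernel buffers fill and nobody is reading.
    sender = threading.Thread(target=send_msg, args=(a, payload))
    sender.start()
    received = recv_msg(b)
    sender.join()
    a.close()
    b.close()
    return received == payload


if __name__ == "__main__":
    print(demo())
```

All of this bookkeeping sits below the level at which an MPI user works: message boundaries, buffering, and progress are the library’s job, which is the simplicity that point 3 appeals to.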
Is MPI perfect? Absolutely not. But there is a major difference between the MPI API specification and the quality of an MPI implementation. Indeed, not specifying a large number of things is one of the major accomplishments of the MPI Forum. This intentionally left huge opportunities for future development and adaptation, which MPI implementors are exploiting to push the research envelope and deliver commercial products (fault tolerance, collective algorithms, multi-network support, and so on).
Hence, if your MPI doesn’t support threads or dynamic processes, then you should use a different one. If no MPI supports what you currently want, then let your favorite MPI implementors know. Ask for features — and help pay for implementations.
Jeff Squyres, The Open MPI Project

Linux Magazine welcomes your suggestions and feedback. Please send letters to feedback@linux-mag.com.
