Another voice of concern over parallel programming. Plus a wild idea.
Recently someone sent me a link to a blog called The Perils Of Parallel. The subtitle has a nice message: Comments related to the book I’m writing: “The End of Computing (As We Knew It)”. It’s about the potential black hole of explicit parallelism into which the industry is heading. Yes, I shouted. An ally in my parallel programming malaise. When I looked at the blog author, I immediately recognized the name: Greg Pfister. He is the author of a great book called In Search of Clusters, 2/E. This book was first published in 1997, but is still relevant today. In particular, his discussion of software on clusters and SMP machines makes a very convincing argument that although both are “parallel,” there is very little overlap in program design. Essentially, they are two very different computer architectures.
The one blog entry that caught my eye was 101 Parallel Languages Part 2. In this entry, Pfister makes an observation about parallel languages in the HPC sector:
So they’ve got the motivation, they’ve got the skills, they’ve got the tools, they’ve had the time, and they’ve made it work. What programming methods do they use, after banging on the problem since the late 1960s? No parallel languages, to a very good approximation.
Before all the MPI and OpenMP fanboys jump out of their seats, let me say I agree with this quote. OK, now you can jump out of your seats. If you think about it, MPI is pretty much the predominant method used in HPC. And, MPI is an API (Application Programming Interface) not a language. For the purposes of sending messages and converting existing codes to parallel, MPI works quite well and I applaud those who have helped with these efforts. I do think, however, we need to be looking beyond MPI.
In other blog entries, Pfister also talks about “killer parallel applications,” that is, those that are “embarrassingly parallel.” Such an application has not surfaced yet, and we are all waiting for the Lotus 1-2-3 of parallel computing. I have some thoughts about this type of application, but I’m not sure it will be embarrassingly parallel.
In the past, I have mentioned that Artificial Intelligence (AI) has potential in parallel computing. AI problems are hard and some require large amounts of computing resources. Please note, although AI has gone through its ups and downs like any other overhyped technology, it has been making steady progress. And, AI means different things to different people. AI is being applied in many places and is solving fuzzy problems that do not lend themselves to strict or formal analysis. Indeed, AI systems often need to search large solution spaces or perform many repetitive tasks, which is a natural fit for parallel computation.
One interesting aspect of AI is that after an AI method gets absorbed into the mainstream, it becomes, well, “mainstream.” As an example, Bayesian statistical analysis started as an AI method, and now it is an important component in email spam filtering. Most people don’t refer to their spam filters as intelligent or possessing some kind of “AI.” They just work. In my opinion, another application that is very sophisticated and “intelligent” is a good compiler. The analysis and optimization methods used by today’s compilers would probably have been called magic thirty years ago. Today they are standard and often taken for granted. When you type make, you unleash a very sophisticated series of events that would be difficult for your average human programmer to understand. As with quantum mechanics, I'm not sure anyone really understands Makefiles in any case, and some builds are highly parallel (see the -j option).
Another area that has shown promise on clusters is Genetic Algorithms (GAs). While some may not consider GAs strictly AI, I think they are close enough. GAs are best used to solve difficult search and optimization problems. While the answers may be approximate, the ability to solve really difficult problems with brute force computing makes them attractive. And, GAs are naturally parallel.
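To make the "naturally parallel" point concrete, here is a minimal sketch of my own, not taken from Pfister or any particular GA package. It assumes a toy problem (maximize the number of 1 bits in a bit string), and the names evolve, fitness, mutate, and crossover are all hypothetical. The thing to notice is that every fitness evaluation within a generation is independent of the others, so that one step can be handed to any pool of workers.

```python
import concurrent.futures as cf
import random
from typing import List, Optional, Tuple

Genome = List[int]


def fitness(genome: Genome) -> int:
    """Toy objective (assumed for illustration): count the 1 bits."""
    return sum(genome)


def mutate(genome: Genome, rate: float, rng: random.Random) -> Genome:
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """Single-point crossover: head of `a` joined to tail of `b`."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(
    length: int = 32,
    pop_size: int = 40,
    generations: int = 30,
    rate: float = 0.02,
    seed: int = 0,
    executor: Optional[cf.Executor] = None,
) -> Tuple[Genome, int]:
    """Run a simple elitist GA and return (best genome, best score)."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    # Sequential map by default; an executor's map spreads the work out.
    mapper = executor.map if executor is not None else map
    best: Genome = population[0]
    best_score = -1

    for _ in range(generations):
        # The parallel step: each genome is scored independently.
        scores = list(mapper(fitness, population))
        ranked = sorted(
            zip(scores, population), key=lambda pair: pair[0], reverse=True
        )
        if ranked[0][0] > best_score:
            best_score, best = ranked[0]

        # Keep the top half as parents and carry the elite over unchanged.
        parents = [genome for _, genome in ranked[: pop_size // 2]]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(pop_size - 1)
        ]
        population = [ranked[0][1]] + children

    return best, best_score
```

Calling evolve() with no executor runs everything serially. Passing a concurrent.futures.ProcessPoolExecutor as executor spreads the evaluations across cores on one machine. With a fitness function this cheap the overhead would swamp any gain, but when each evaluation is a long simulation the same structure is what makes GAs a comfortable fit for a cluster.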
Let's recap. We considered Pfister's well-thought-out discussion on the difficulties of parallel programming. From there I mentioned AI, then compilers, and finally GAs. Do you get where I'm going with this?
What if a high-level programming description language were developed? Note I did not say programming language. This description language would allow you to "describe" what you needed to do and not how to do it (as discussed before). This draft description would then be presented to an AI-based clarifier, which would examine the description, look for inconsistencies or missing information, and work with the programmer to create a formal description of the problem. At that point the description is turned over to a really smart compiler that could target a particular hardware platform and produce the needed optimized binaries. Perhaps a GA could be thrown in to help optimize everything.
This process sounds like it would take a lot of computing resources. Guess what? We have that.
Why not throw a cluster at this problem? Maybe it would take a week to create a binary, but it would be cluster time and not your time. There would be no edit/make/run cycle because the description tells the compiler what the program has to do. The minutiae (or opportunities for bugs) of programming, whether serial or parallel, would be handled by the compiler. Talk about a killer application.