Tell me more, tell me more...
Last week I put my Erlang flag in the ground, as it were. I was expecting a raft of flaming comments. Did not happen. I doubt there was a collective head nod from my vast audience, so I wonder: why so quiet? Perhaps everyone is on vacation, or getting ready for vacation and has put off reading my insightful column until then. Speaking of vacations and beaches, that includes me. Weekly columns are still required, however. I have set aside some beach time to continue reading Programming Erlang: Software for a Concurrent World by Joe Armstrong. I have not yet, however, brought myself to bring my Eee web-book to the beach to play with some code. Bright sun notwithstanding, I fear the union of silicon, both refined and shoreline, will not be a happy affair. Although this Asus photo might seem to indicate otherwise.
If you have a chance to read the above-mentioned book, you will notice that programming with Erlang is different. As I mentioned previously, Erlang is a declarative language and as such should give “procedural” programmers headaches. Not to worry though, since once you get your head around the functional/declarative nature of the language it starts to make incredibly good sense. Time for a confession. In years past I had implemented and used the Prolog language. Erlang has some features found in Prolog, but leaves out some of the more esoteric aspects, e.g. backtracking. Based on my experience as a trained procedural dude (mostly C and Fortran), once I “got it” Prolog became a very powerful language. Indeed, I found I could easily craft a program in fewer lines of code and less time, all the while focusing more on my application and less on the machine aspects. Let me be clear (should I say “declare”?) about my intention: I am exploring Erlang as a multi-core/multi-node language. I am not evangelizing the language, nor am I standing on a soapbox telling you to use Erlang. I’ll let the code speak for itself as I explore Erlang in future columns. For now, I want to present some of the “big picture” features that make Erlang attractive to me.
One of the things I find interesting is that Erlang was designed for concurrent operation from the very beginning (circa 1988). That is, before multi-core, before Linux clusters, and maybe even before some of you were born. Erlang was developed by Ericsson to support telecom applications where massive concurrency (e.g. tracking phone calls) and good support for fault tolerance were a must. Since no existing language could meet these constraints, a decision was made to create a new language. Good ideas from existing languages were incorporated as needed and thus Erlang was born.
Jumping ahead to today, the concurrency feature seems like a really good idea given the whole multi-core thing. Unlike in more traditional languages, concurrency is a fundamental part of Erlang. And, most importantly, it is simple to use. A typical Erlang program consists of many separate processes. Each process owns its own memory and can only communicate with other processes through message passing. There is no shared memory in Erlang. Therefore, no locks, semaphores, etc. are required to support “shared state” between processes. This design is liberating. Each process essentially takes care of itself and basically does not give a damn about other processes (this is not strictly true, but it makes for good drama). In addition, Erlang processes are independent, lightweight, and owned by the program and not the operating system. Therefore, creating and destroying Erlang processes is very fast, as is communication between processes. Remember, the only way a process communicates with other processes is through a message, like you and me.
Working with processes is very simple. There are three basic operations: spawn, the send operator (!), and receive. For example, a message in Erlang is sent as follows (note that some language features have been omitted for clarity, and % starts a comment):
%Process One
Pid = spawn(fun some_function/0)
Pid ! some_message
%Process Two (some_function)
receive some_message -> % do something
end
Process One spawns some_function and a Pid (unique process ID) is returned. It then sends some_message to Process Two using Pid. Meanwhile, Process Two will wait for a message and act accordingly when it is received. That pretty much describes the basic Erlang communication scheme. Of course, communications can be more sophisticated than the above, but the basic flow is essentially the same.
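For readers who want something they can compile, here is a minimal sketch of the exchange described above written out as a complete module. The module name ping_demo, the got_it reply, and the one-second timeout are my own assumptions for illustration. So is the convention of sending the sender's Pid along with the message so that Process Two knows where to reply.

```erlang
-module(ping_demo).
-export([run/0]).

%% Process Two: wait for one message, act on it, and reply to the sender.
%% Tagging the message with the sender's Pid is an illustrative convention.
some_function() ->
    receive
        {From, some_message} ->
            From ! {self(), got_it}   % "do something"
    end.

%% Process One: spawn Process Two, send it a message, wait for the reply.
run() ->
    Pid = spawn(fun some_function/0),
    Pid ! {self(), some_message},
    receive
        {Pid, Reply} -> Reply
    after 1000 ->
        timeout                      % give up after one second
    end.
```

After compiling with c(ping_demo). in the Erlang shell, calling ping_demo:run() should return got_it. Note that neither process ever looks at the other's memory. Everything they know about each other arrived in a message.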
Erlang was designed for fine-grained concurrency. Programs can easily use hundreds and even thousands of processes. Think of them as a kind of subroutine thread in procedural languages. Only, they have no global variables. (Not to worry, I’ll talk about global data in the future.) Erlang processes can live on a single processor, multiple cores, or multiple machines. This feature is transparent to the user. For instance, a concurrent Erlang program could easily be written on, say, an Eee PC at the beach, then run on an eight-way server back at the office. No code changes are required and in most cases the program will run faster. The program developed on the Eee PC can also be massively concurrent, supporting many thousands of processes if necessary. Processes can be spawned on remote nodes as easily as on the same node. Of course there are some hostname and security issues that need to be addressed, but the spawn, send, and receive dynamics are exactly the same.
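To give a feel for how cheap processes are, here is a rough sketch that spawns one process per work item and collects the answers. The module name many_procs and the squaring "work" are placeholders I made up for the example. Any small independent task would do.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N workers; each squares its own number and mails the result back.
run(N) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), I * I} end)
            || I <- lists:seq(1, N)],
    %% Collect exactly one reply per worker, matching on each worker's Pid.
    Results = [receive {Pid, Sq} -> Sq end || Pid <- Pids],
    lists:sum(Results).
```

Calling many_procs:run(10) returns 385, the sum of the squares from 1 to 10, and many_procs:run(10000) runs the same code with ten thousand processes. To place a worker on another machine, spawn also accepts a node name as its first argument, as in spawn(Node, Fun), assuming that node is already up and connected. The send and receive code does not change.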
Interestingly, an Erlang program is in some ways like an MPI (Message Passing Interface) program running across cluster nodes. Memory is not shared between nodes, only messages. Indeed, MPI programs can run on a single multi-core node and maintain the same private process model. Of course, messages may be passed through shared memory, but each process can only touch its own memory. One difference between Erlang concurrency and MPI concurrency is the granularity. An MPI program uses messages to designate specific, user-selected concurrent parts of the program. In an Erlang program everything is concurrent (or as concurrent as possible). Thus, in Erlang the concurrency is always there if you need it because that is how you wrote the program.
It’s time to hit the beach, so I’ll share one final thought for this week. Erlang is often referred to as a “telecom language.” I think this is a mischaracterization. Of course, Ericsson designed Erlang for their needs, but that does not mean it can only be used as a “telecom” language, whatever that means. History is with me here. In case you forgot, UNIX and C were developed by AT&T to help manage a phone network. Now where did I put my Speedo?
Douglas Eadline is the Senior HPC Editor for Linux Magazine.