A revolution, or evolution, is taking place in the computing industry. Multiple CPUs and multiple cores aren't new to high-end markets, but this is the first time that they're being mass-produced -- and every programmer needs to understand how to take advantage of multicore systems.
A revolution (or evolution) is taking place in the computer industry. Intel, AMD, and others have introduced new chips that use multiple processing units in a single package. Instead of having a single central processor, or brain, computers will now have multiple brains with which to run programs. While this technique is not new, it is the first time these architectures have been mass-produced and sold to the commodity PC and server markets.
The revolution will affect everyone who uses a computer. Multi-core technology is reaching servers, laptops, and even game consoles. From an end user's perspective, this change should remain largely hidden.
However, the expectation of continued price-to-performance gains, similar to those experienced over the past twenty years, will remain. Programmers will find that delivering these gains on multi-core designs is a challenging task, especially since there is no silver bullet -- no automated technology that can adapt current software to multi-core systems.
The Road to Multi-core
The computer market has long enjoyed the steady growth of processor speeds. A processor's speed is largely determined by how fast a clock tells the processor to perform instructions. The faster the clock, the more instructions can be performed in a given time frame. The physics of semiconductors, however, places constraints on the rate at which processor clock speeds can be increased. This trend is shown quite clearly in Figure One, where the average clock speed and heat dissipation for Intel and AMD processors are plotted over time.
From a power consumption perspective, it is clear that something had to be done. The continued climb in power consumption (and thus heat generation) would require additional cooling and electrical service to keep the processor operating. The solution was to scale out processor cores instead of scaling up the clock rate. The drop-off in clock speed on the graph marks the delivery of the first dual-core processors from AMD and Intel. These processors are designed to run at a slower clock rate than single-core designs due to heat issues. A dual-core chip can, in theory, deliver twice the performance of a single-core chip and thus help continue the processor performance march.
Multi-core Road Maps
Both Intel and AMD are selling multi-core processors today, in dual- and quad-core versions. From publicly available documents, both companies expect eight-core processors to be introduced in the 2009-2010 time frame. A rough chronology is as follows:
- 2005 Dual Cores
- 2007 Four Cores
- 2009+ Eight Cores
For servers and workstations, which traditionally have two processor sockets available, this means the total number of cores per motherboard can easily reach sixteen by the end of the decade. In addition, both AMD and Intel processors are available in four- and even eight-socket designs (the latter using AMD HyperTransport). Extrapolating this to eight-core processors suggests that, in the near future, sixty-four-core servers are not an unreasonable expectation.
The evolution to multi-core processors provides a unique set of challenges and opportunities for the entire computing industry. On one hand, multiple cores mean more computing power within the same space and power signatures. On the other, fundamental changes in processor architecture make efficient use of these processors a bit more challenging than previous designs.
The challenge is one of software and can be summarized as follows:
Without modification, virtually all existing software will not take advantage of the extra cores available on today's processors.
In order to take advantage of multi-core, programs need to be able to do multiple things at the same time. This behavior is often called parallel computing. A parallel program, written properly, will perform faster than a traditional sequential program because it can distribute work to the extra cores available on the processor. (Think of the progress a construction crew can make vs. a single worker.) A traditional sequential program can only use one core. Indeed, it is only aware of one core!
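The contrast between a sequential program and one that distributes its work can be sketched in a few lines. The example below is illustrative only -- the work unit (summing squares of number chunks) and the chunk sizes are invented for the sketch -- but it shows the basic shape: the parallel version hands independent chunks to a pool of worker processes, which the operating system can spread across cores.

```python
# Sketch: the same computation done sequentially and in parallel.
# The work (summing squares over chunks of numbers) is made up for
# illustration; the point is the shape of the two versions.
from multiprocessing import Pool

def sum_squares(chunk):
    """One independent unit of work: sum of squares for a chunk."""
    return sum(n * n for n in chunk)

def sequential(chunks):
    # One core: process each chunk in turn.
    return sum(sum_squares(c) for c in chunks)

def parallel(chunks):
    # Many cores: the Pool distributes chunks to worker processes,
    # which the OS can schedule on separate cores.
    with Pool() as pool:
        return sum(pool.map(sum_squares, chunks))

if __name__ == "__main__":
    chunks = [range(i * 1000, (i + 1) * 1000) for i in range(8)]
    assert sequential(chunks) == parallel(chunks)
```

Both versions compute the same answer; only the parallel one can use more than one core at a time.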
The Multi-core Impact
From an end user’s perspective, multi-core can have an immediate impact on some workloads. If, for instance, your workload involves performing tasks simultaneously -- say, watching a video on the web while your computer prepares a large file for printing -- then a multi-core system will show some immediate gains. While multi-core will help in these scenarios, the individual applications will never exceed single-core speeds unless they are "parallelized".
Converting existing programs to run in parallel can range from trivial to complex. The same is true for new applications. Programmers will need to take the time to explicitly update and test codes for multi-core.
There’s no simple or automated method to create parallel programs, so the Multi-core Cookbook (MCCB) is intended as a resource to help with these efforts. In addition to the extra work required to create multi-core applications, there are also some new concepts that arise.
All multi-core systems, by definition, must share memory. Modern memory subsystems are designed to minimize contention between the cores, but there are times when specific areas of memory must service two or more cores. This can, in some cases, lead to contention and reduced performance.
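A minimal sketch of one memory location servicing several cores, using Python's multiprocessing module as a stand-in (the worker count and amounts are invented for the example): a single integer lives in shared memory, and each worker process must go through its lock to update it, which is exactly the kind of serialization that produces the contention described above.

```python
# Sketch: several processes (potentially on separate cores) all
# updating one shared memory location. The Value lives in memory
# shared by all workers; its lock serialises access.
from multiprocessing import Process, Value

def deposit(total, amount):
    # When two cores hit the same location at once, one must wait
    # for the lock -- this is memory contention in miniature.
    with total.get_lock():
        total.value += amount

def run_workers(n_workers=4, amount=100):
    total = Value('i', 0)  # one int in shared memory
    workers = [Process(target=deposit, args=(total, amount))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return total.value

if __name__ == "__main__":
    print(run_workers())  # 4 workers x 100 each = 400
```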
Unlike traditional single processor/single core designs, each core must share local resources. In addition to memory, these resources include hard drives, network connections, PCI buses, and other components of a typical PC, workstation or server.
On multi-core systems the operating system determines which program will run on which core. The operating system tries to keep all cores equally busy. It does this by moving a process from one core to another. In cases where the process is moved to a core that does not use the same cache or local memory bank, the process loses the advantage of cached data and performance may suffer.
New Types of Errors
Multi-core also introduces new types of programming errors. These conditions arise because new timing dynamics occur between cores. Situations may occur where two parts of a program each wait for the other and freeze (deadlock), or where unsynchronized updates to shared data produce incorrect results (race conditions). Programmers will need to be aware of these, and other situations, when writing or converting applications.
A final issue is that of programming models. On a multi-core system, cores share data and communicate through memory. How this behavior is expressed by the programmer depends on the programming model used to write code. No single standard method (programming language or API) can be used for all multi-core programming. For the purpose of the Multi-core Cookbook (MCCB), we break the programming methods into two simple categories:
- Mainstream Methods: These are methods that provide open/standard APIs, have a history of success in parallel/concurrent programming, and have a robust knowledge/experience base (i.e., examples) to support programmers. These approaches represent low risk in terms of portability and future support, but may not be the best method for a particular problem space.
- Up and Coming: These methods represent new approaches that show promise, but do not yet have all of the qualities that would make them a choice for large projects (i.e., they are still new and/or experimental in nature). While they often provide a better way to express parallelism, they introduce some risk as young technologies.
Of course, these are somewhat arbitrary designations, intended to provide guidance to software developers. Since only the developers know the scope and requirements of a project, these designations are guidelines with which to navigate the options. It is our hope that the Up and Coming approaches become more mainstream, as they are designed to address problems that are new now but will be seen more frequently as multi-core systems become ubiquitous.
While the Up and Coming models are too many to list here, the Mainstream Methods consist of three approaches: Threads, OpenMP, and MPI. Threads use a symmetric multiprocessing (SMP), or shared memory, model and have been in use for many years on more traditional multi-processor systems.
Another approach, developed to make the use of Threads easier, is OpenMP. OpenMP uses a set of directives that give programmers an easier (but somewhat restricted) way to express parallelism for SMP (multi-core) machines. MPI (Message Passing Interface) is the standard way of expressing parallelism in HPC (High Performance Computing) codes. MPI was designed to pass messages between processes on separate machines, but it can be used effectively on SMP systems as well.
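To give a feel for the message-passing style that MPI embodies, here is a rough sketch using a multiprocessing Pipe as a stand-in for real MPI calls (MPI itself is a C/Fortran library; the function names in the comments, like MPI_Send and MPI_Recv, are the real MPI operations this imitates, but nothing here uses an actual MPI implementation). Each worker receives its chunk of work as a message, computes, and sends the result back -- no memory is shared.

```python
# Sketch of the message-passing model: work and results travel as
# explicit messages between processes, MPI-style.
from multiprocessing import Process, Pipe

def worker(conn):
    # Receive a chunk of work, compute on it, send the result back.
    data = conn.recv()        # analogous to MPI_Recv
    conn.send(sum(data))      # analogous to MPI_Send
    conn.close()

def scatter_gather(chunks):
    """Hand each chunk to a worker process and collect the results."""
    results = []
    for chunk in chunks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child,))
        p.start()
        parent.send(chunk)    # scatter the work
        results.append(parent.recv())  # gather the answer
        p.join()
    return results

if __name__ == "__main__":
    print(scatter_gather([[1, 2], [3, 4]]))  # [3, 7]
```

Because nothing is shared, this style ports naturally from one multi-core machine to a cluster of separate machines -- the reason MPI dominates HPC.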
Multi-core programming is a vast and dynamic area of the computing world. The MCCB is a good starting place for most programmers and will continually provide resources, examples, benchmarks, and background for the multi-core revolution.