I knew this day would come, well, month actually. It is the “more cores” month and an important point in HPC history. Synchronized by competitive forces, both Intel and AMD are about to release the next generation of their respective multi-core processors. Why is this different from the past? For HPC it may have some real consequences. It is now possible to build nodes with 24 cores using AMD Opteron processors, and that number is about to double. As I will mention in a moment, Intel is bumping up its core counts as well.
Why does this matter? Consider that International Data Corporation (IDC) has reported that 57% of all HPC applications/users surveyed use 32 processors (cores) or fewer. In other words, in the age of thousand-core clusters, most applications can use only 32 (or fewer) cores before they see no further performance gain. These numbers are confirmed by a poll from ClusterMonkey.net, where 55% of those surveyed used 32 or fewer cores for their applications. At the high end, greater than 128 cores, the number of applications increases, leaving a valley of poor scalability between 32 and 128 cores. Therefore, at least 50% of the HPC market can, in theory, get away with using a single 48+ core node.
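Why applications stop gaining past some core count is classic Amdahl's law territory: any serial fraction of the work caps the achievable speedup no matter how many cores you add. A quick sketch makes the plateau visible (the 3% serial fraction below is an illustrative number I picked, not a figure from the IDC survey):

```python
def amdahl_speedup(serial_fraction, cores):
    """Amdahl's law: best-case speedup on `cores` cores when
    `serial_fraction` of the runtime cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Hypothetical code that is 97% parallel: watch the returns diminish.
for n in (8, 16, 32, 64, 128):
    print(f"{n:4d} cores -> {amdahl_speedup(0.03, n):5.1f}x speedup")
```

With even a few percent of serial work, doubling from 32 to 64 cores buys far less than doubling from 4 to 8 did, and the speedup can never exceed 1/serial_fraction (about 33x here), which is consistent with the survey's picture of applications topping out around 32 cores.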
The HPC market could be in for a big change. I mentioned this in a previous article called Small HPC. I continue to have several questions about our many-multi-core future:
Will low end HPC migrate to the desk side for those needing less than 64 cores?
Will MPI continue to be the dominant method of coding?
Does having 48+ cores on a single node make sense for HPC applications?
Should we use more cores, or fewer cores plus a GP-GPU, on each node?
I should mention that you can now get 24 AMD cores in a workstation, and I have not heard of any large migration to these kinds of platforms. Perhaps it is not as simple as core counts. There are other concerns for the HPC user, such as parallel I/O, memory contention, and GP-GPUs, plus practical concerns such as heat, power, and noise. (Forty-eight cores next to your desk running flat out for 12 hours may be your idea of a personal space heater, but not of a personal computer.)
It is possible that there is a “sweet spot” for the number of cores per node based on I/O needs. Another possibility is that, given the large number of cores, applications will be scaled up to use more of them. What is the best way to code for a combined multi-node, multi-core, GP-GPU environment? I have no idea. None.
I should also mention that there are already clusters with 16 cores per node in operation. The fact that both AMD and Intel will soon be raining down many more processors in the 6-12 core range is what gives me pause. From the desktop to the high end, we are going through another “core bump,” and how it plays out in HPC will be very interesting.
So what is coming down the pike? Of course, no one knows exactly (or if they do, they are not allowed to say), but there are some big changes in the wind. These changes include more cores per socket (6, 8, and even 12) and more sockets per motherboard. More cores are expected, but thermal dissipation limits how many cores can fit on a processor chip; i.e., more cores means more heat. Thus, Intel is releasing processors based on a 32nm fabrication process (down from 45nm), which will allow more cores per processor. AMD has similar plans for the future.
Intel is expected to announce a six-core processor based on the Gulftown architecture. It will run up to 12 threads in parallel and will inaugurate the Xeon 5600 series. As reported, it will be compatible with the LGA 1366 socket and thus, after a BIOS upgrade, compatible with most current motherboards. Some reports indicate that at equivalent clock rates it provides the expected 50% performance boost over a quad-core processor while maintaining the current power envelope.
There is also the Nehalem-EX, an 8-core Nehalem based on the Beckton architecture. It is supposed to have four or more QPI (QuickPath Interconnect) links, allowing for more processor sockets per motherboard. An interesting spin on the Nehalem-EX is the expected launch of an HPC version. Intel is promising a 6-core variant that will run at higher frequencies than the 8-core version (fewer cores can run faster because they produce less heat). From the hints in the media, it may have as many as eight QPI links, making a 1536-core system possible using a hypercube arrangement.
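The 1536-core figure falls out of simple hypercube arithmetic: a d-dimensional hypercube has 2^d nodes, and each node needs one link per dimension to reach its neighbors, so eight links per socket allow an 8-D cube of 256 sockets. A back-of-the-envelope check (the link and core counts below are the rumored figures from the article, not confirmed specifications):

```python
# Hypercube sizing: a d-dimensional hypercube has 2**d nodes,
# and each node needs d links (one per dimension) to its neighbors.
qpi_links = 8             # rumored QPI links on the HPC Nehalem-EX
cores_per_socket = 6      # the rumored higher-frequency 6-core variant
sockets = 2 ** qpi_links  # 256 sockets fit in an 8-D hypercube
total_cores = sockets * cores_per_socket
print(f"{sockets} sockets x {cores_per_socket} cores = {total_cores} cores")
```

Running this gives 256 sockets and 1536 cores, matching the number hinted at in the media reports.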
AMD has some news of its own. First, the long-awaited Magny-Cours is a 12-core processor that will operate in the 2.2 GHz range and include extra memory channels. Remarkably, it is said to run cooler at idle than the current 6-core Opteron. It is reported that AMD has used Intel’s trick of placing two multi-core dies in one package to bump the core count (e.g., Intel’s quad-core Clovertown and Harpertown were manufactured this way from pairs of dual-core dies). The key to this strategy is not starving the additional cores of memory access. AMD understands this concept, as it has always delivered leading memory bandwidth performance. There is also an expected speed bump for the current 6-core Opterons, which will be welcomed as well.
In closing, 2010 will be an interesting year for HPC. Once again, the hardware possibilities far exceed the software capabilities. Indeed, as multi-core is pushed down to the desktop, six- and eight-core single-socket systems will become the norm. How much HPC can you get done on eight cores and a GP-GPU? How about a Beowulf cluster of these …