Processor Bifurcation

The processor market is diverging between two paths, the general and the predictable. Where does HPC hitch its wagon?

There is a big word in the title. You are either reading this because you don’t know what it means and want to find out, or because you know what it means but have no idea what it has to do with microprocessors. Either way, I’m hoping you are hooked, because I think we are witnessing a second dramatic change in the computer market (the first being multi-core). As many of you know, I have had many a sleepless night when it comes to multi-core. In the fits of my obsession, however, I may have missed something.

The commodity microprocessor (i.e. x86) has been the juggernaut of the computer industry. For HPC clusters, the turning point came when the Intel Pentium Pro was released. It was fast enough for HPC (thanks to a much improved Floating Point Unit), it worked well in dual-processor servers, and it made for great workstations and desktops. All the same part at the same price. Then the marketing department got involved. Why charge the same for a “server” processor as a “desktop” processor when servers cost much more than desktops? Thus began the Xeon age, and with it the server, workstation, and desktop markets were created by adding/deleting features from a baseline processor design. Great marketing; the processors, however, were basically the same. In a way, it reminds me of the old Sunoco gas pumps where you could mix your own octane grade. The price, of course, increased with the octane. Now we have regular (desktop), midrange (workstation), and high-test (server). In case you don’t read Consumer Reports, it is basically all the same gasoline.

I currently have a small four-node (single socket) Core 2 Duo cluster that can achieve 47 GFLOPS on the Top500 HPL benchmark. I built the cluster in 2007 for a cost of $2,300 (that is less than $50 per GFLOP). How is this possible? Inside the nodes, the processors are basically the same as those used in the servers that crank out the big Top500 ratings. Of course, marketing lowered the octane on my processors, but they still come from the same refinery, as it were. The better-grade fuel may help you go faster, but not that much faster.
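The price/performance claim is simple arithmetic; here is a quick sketch using only the numbers quoted above:

```python
# Back-of-the-envelope check of the cost-per-GFLOP claim,
# using the figures quoted in the text.
cluster_cost_usd = 2300   # total build cost of the four-node cluster (2007)
hpl_gflops = 47           # measured HPL (Top500 benchmark) result

cost_per_gflop = cluster_cost_usd / hpl_gflops
print(f"${cost_per_gflop:.2f} per GFLOP")  # just under $50
```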

While I’m using automotive analogies, let’s introduce the megahertz wall. You know, the big cement wall in the middle of the road that all the fast and hot single-core CPUs crashed into. Yeah, that one. And thus the multi-core era began. This is where things get interesting. In the server market, multi-core is actually a good thing. Much of what a server does is stateless, or at least independent, so more cores means more throughput. Throw in virtualization and the whole multi-core thing does not seem so bad. Moving on down to workstations (and, in a sense, HPC clusters), multi-core is again kind of welcome. More cores, more work gets done, but there is the programming issue. I have talked at length in the past about this issue and will continue to do so in the future.

Let’s take a look at the desktop market. Multi-core really does not make much sense here. Spreadsheets, word processors, and browsers run pretty well on slow processors. Of course, there is multimedia. Now there is a core-sucking application area if I ever saw one. However, any multimedia/gaming fanboy/fangirl worth their bits has a high-end video card specifically designed to do the grunt work. The cores could do this kind of processing, but the results would be slower because cores are designed for general-purpose computing.

At this point, a small sidebar may be helpful. If we look at how computers are used, one could make a general distinction between two types of computing: predictable and general/non-predictable. Some examples will help. If I have to multiply two large matrices together, that is a highly predictable operation. If I have to service database requests, that is a very non-predictable operation. A general-purpose processor can be used for all types of computing, but it must dedicate silicon to handle all kinds of unpredictability. A GPU (Graphics Processing Unit), on the other hand, is designed to perform predictable computing. The types of operations are similar, very repetitive, and parallel. If we were to look at the desktop/laptop PC, we find a mix of uses. A word processor is non-predictable, as is running a desktop windowing system. On the other hand, multimedia presents a very predictable and parallel type of computation. Moving to the high end, a web/database server is obviously non-predictable. When a multi-core server is used in a cluster, it is doing highly predictable computing. With that in mind, let’s take a hard look at where things might be headed in the future.
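To make the sidebar concrete, here is a minimal sketch (illustrative code only, not from the article): the matrix multiply has a loop structure fixed entirely in advance, while the request handler cannot know its next operation until it inspects the data.

```python
# Predictable: the loop bounds and memory access pattern of a matrix
# multiply depend only on the matrix sizes, never on the data values.
def matmul(a, b):
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):          # every iteration does identical work,
        for j in range(p):      # which is why this maps so well onto
            for k in range(m):  # wide parallel hardware (a GPU/PCU)
                c[i][j] += a[i][k] * b[k][j]
    return c

# Non-predictable: control flow branches on the request itself, so the
# processor cannot know the next operation until the data arrives.
def handle_request(request):
    if request["type"] == "read":
        return request["db"].get(request["key"])
    elif request["type"] == "write":
        request["db"][request["key"]] = request["value"]
        return "ok"
    return "unknown request"
```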

As mentioned, most desktop/laptop systems do a mix of computing. However, it is safe to say that most desktop applications do not need all the horsepower available in the current multi-core offerings. Dumping in more cores is probably not going to make that word processor much faster. Dumping in more GPU hardware will help when it comes to multimedia, and that seems to be where desktops are headed. Because the GPU hardware is specifically designed for predictable computing, it will work much better than extra cores. What may serve this market better is an “average” processor coupled with massively parallel Predictable/Parallel Computing Units, or PCUs for short (i.e. another way to say GP-GPU). This design is similar to the IBM/Sony/Toshiba Cell Processor. Indeed, if NVidia were to drop a general computing core into one of their Tesla products, I would not be surprised. Considering that AMD has FireStream and Intel has a PowerPoint version of Larrabee, there may actually be something going on here. It should be mentioned that the world’s fastest computer (as measured by the HPL benchmark), RoadRunner at Los Alamos National Lab, uses Cell processors.

The fork, or bifurcation, in processors may soon be upon us. The traditional multi-core for non-predictable general computing versus the single core plus PCU for predictable computing may be the way we look at things in the future. In a year, there may very well be two paths in commodity processor development. The first will be the growth of multi-core (more cores per die) and the second the growth of PCU processors. Unlike my gasoline analogy, these designs will be very different and have very different performance profiles.

If my prognostications are true, there is one big question on the horizon. What is best for HPC and clusters? The startling performance of CUDA-enabled applications from NVidia has people wondering. I would guess this is just the first of many such successes (although good luck trying to pull that PS3 out of your kid’s hands so you can play with a Cell processor). HPC is predictable computing, and the desktop/game console market may turn out to be a bigger driver of HPC than the server market. Interesting times lie ahead. Maybe my anxiety about multi-core has been misplaced. Perhaps everything will be alright, but then how will we program large numbers of PCUs? In the end, it always seems to be about the software.

Comments on "Processor Bifurcation"


core-suck is my new favorite phrase.

I’m sorry, Sunoco gas pumps? I’m only 35, so you might need to update your analogies a bit. ;-)


Niche CPUs were never viable because of Moore’s Law’s effect on single-core performance. If you spent a large amount of money on research and development to make a processor that ran a specific group of programs twice as fast, it would be outdated by a general-purpose CPU in 5 months, which isn’t enough time to recover R&D costs in a niche market.

Now that single-core performance has hit the wall, nonparallel, dependency-ridden programs will start having niche hardware produced for them. You are starting to see this in places like Azul. You’re right, we are going to start seeing a lot of interesting processors, and there will eventually be even more forks down this processor road.


Azul? That is amusing. They are on the first floor of the building I work in (that should make it obvious who I work for if you know where Azul is located). They have been losing money (did they ever make money, in fact?) and little by little my company has been taking over their space and taking more of the spaces in the parking lot. Niche hardware just doesn’t make sense financially. Your first paragraph is correct, but then your second paragraph seems to contradict it.


When considering High Performance Computing, there are a few factors to keep in mind:

1.) Whether the application/software is freshly designed with the target hardware/processor in mind, or whether it is a port of a legacy code or an enhancement of an existing code. Considering how the software industry has evolved, and the amount of money poured into the maintenance of existing software products, it is not very likely that software giants will be keen to re-develop. The emphasis is mostly on re-using existing code and getting it to work on new platforms. The fact is that most of this code does not conform to the new multi-core model (it is not thread-aware) and is not able to exploit multiple processors either. Most of these codes are lightweight and do not need the performance of a multi-core system.

2.) Self-modifying code will play an important role in the future of “intelligent” products and predictive computing, and it is already a handful to manage on single-core processors. The introduction of multiple cores will further complicate the working of such code.

3.) The introduction of multiple cores (and multiple processors) on a server also introduces the problem of synchronizing different execution flows across processors/cores, and the possibility of deadlocks. The greater the number of processors/cores, the greater the time each processor wastes on synchronization. Thus, an increase in the number of processors/cores will not necessarily translate into an increase in performance. Remember the old adage: “Too many cooks spoil the broth”.
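The diminishing-returns point in (3) can be sketched with a toy model (illustrative numbers only, not from the comment): assume each added core contributes a fixed per-core synchronization cost on top of the divided work.

```python
# Toy model: runtime = (work divided among cores) + (sync overhead that
# grows with the number of cores). Purely illustrative numbers.
def runtime(cores, work=100.0, sync_cost_per_core=0.4):
    return work / cores + sync_cost_per_core * cores

# Past a certain core count the sync overhead dominates and
# total runtime gets worse, not better.
for cores in (1, 4, 16, 32, 64):
    print(f"{cores:3d} cores -> {runtime(cores):6.2f} time units")
```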

The need is to be able to segregate the utility of the server/computer. Gaming needs multiple cores and will have to take a separate path, but servers running normal applications can still work with a single core. The cost of maintaining a complex application is huge, and it is unlikely that the end user will be keen to bear it.





