
The Total Cost of Parallelization

Before tucking into that next project, developers really need to start asking themselves, "What's this going to cost me?"

I have calmed down since my previous rant. Judging by the comments, it seems I stirred up more than a few issues. While I can’t address all of them, I will make a general statement. Some readers suggested that solutions already exist, and, indeed, I can’t argue with that; you can write parallel programs in shell if you so desire. The big issue is the abstraction layer.

As a developer, I don’t want to think about the number of cores or nodes any more than I have to think about the addressing modes, registers, and floating point units in the processor. If parallel computing is going to be truly cost effective, then reasonable abstraction layers need to exist. Many developers and most vendors acknowledge the urgency of simplifying parallel programming, lest we face some dire consequences going forward. The solution I have proposed is a shared GPL approach. If anyone has a different idea, I would love to hear it.

Moving on.

What’s This Going to Cost Me?

In the not so distant future — or last week, depending on your development schedule — you may be tasked with building an application for multi-core (i.e., parallel) processors. The first question you may ask yourself is, "What language should I use to code this project?" A good question, and like any good engineer, you should ultimately be considering the cost required to meet the design goals.

Before we go much further, I want to introduce a metric that will become increasingly important as we journey through worlds parallel: “total-cost-of-parallelization” or TCOP for the time-constrained. This particular metric is similar to “total-cost-of-operation” (TCO) or “total-cost-to-solution” (TOS) and defines the overall cost of any parallel application, including hardware, software, planning, development, and deployment.

Let’s refine our goals so that we can perform a small thought experiment. We want a scalable application that will be capable of running on 4, 8, and eventually 32 cores. Not an unreasonable request given the current multi-core march.

Here is something that may not sit well with most programmers. If your application can scale (i.e., increase performance as you add cores), then single-core performance is not all that important. Gasp. Head-shaking.

To explain my point, consider the following scenario. I have two versions of the same program that we’ll call Application X. The first version of Application X is written in C (I consider C the universal assembler language). The second version is written in a scalable language that we will call SL.

The big difference here is that SL (the scalable language) allows Application X to use extra cores if they are available. Because the SL abstraction layer has moved you, as a developer, away from the nitty-gritty details of the processor cores and closer to your application, you should immediately see two benefits: 1) the program should be easier to write, and 2) the source code footprint will be smaller. And, possibly, a third: a shorter time to deployment.

Now, for the purposes of our thought experiment, let’s further assume that the SL version of Application X is five times (5x) slower than the C version on a single core and scales at about 80% efficiency. Let’s look at what happens as we add cores:

Cores   C (seconds)   SL (seconds)
1       10            50
4       10            15
8       10            8
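
To make the arithmetic behind those numbers explicit, here is a minimal C sketch; the 10-second baseline, the 5x single-core slowdown, and the 80% efficiency figure are the assumptions of the thought experiment, not measurements.

    /* Sketch of the thought-experiment arithmetic.
       Assumptions from the article: the C version takes 10 s and stays serial;
       the SL version is 5x slower on one core and scales at ~80% efficiency. */
    #include <stdio.h>

    int main(void)
    {
        const double c_time = 10.0;             /* serial C runtime, seconds */
        const double sl_single = 5.0 * c_time;  /* SL is 5x slower on one core */
        const double efficiency = 0.8;          /* assumed parallel efficiency */
        const int cores[] = {1, 4, 8, 32};

        for (int i = 0; i < 4; i++) {
            double sl_time = (cores[i] == 1)
                ? sl_single                     /* no parallel speedup on one core */
                : sl_single / (cores[i] * efficiency);
            printf("%2d cores: C = %4.0f s, SL = %5.1f s\n",
                   cores[i], c_time, sl_time);
        }
        return 0;
    }

Under the same assumptions, 32 cores would put the SL version at roughly 2 seconds, which is the point of the exercise.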

Now the question becomes which is the faster application? Assuming the SL version continues to scale, the version written in C seems to be at a disadvantage.

Of course, one solution could be to make the C code parallel using OpenMP or pthreads. Fine by us; it’s your funeral. And, while you are at it, track your time/cost to make this conversion. All of it. From when you start paging through the OpenMP documents and examples all the way through debugging and scalability testing. My claim is that this cost is going to be non-trivial.
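
For a sense of what that conversion involves, here is a minimal OpenMP sketch for a simple loop; the work() function and the array are placeholders for illustration, not anything from Application X.

    /* Minimal OpenMP sketch: parallelizing a simple loop in C.
       The work() function and array are placeholders, not Application X.
       Build with: gcc -fopenmp */
    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    static double work(int i) { return (double)i * (double)i; }

    int main(void)
    {
        static double a[N];
        double sum = 0.0;

        /* Each iteration is independent; the reduction clause combines the sums. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++) {
            a[i] = work(i);
            sum += a[i];
        }

        printf("sum = %g, threads available = %d\n", sum, omp_get_max_threads());
        return 0;
    }

The pragma is the easy part; the non-trivial cost is in verifying that the real loops have no hidden dependencies, then debugging and scalability testing the result.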

The High Cost of DIY

Now ask yourself: what gets more expensive every year? Right, people. Conversely, what gets less expensive every year? Right again, hardware. With your engineer hat on, think it over. If you write your version of Application X using an SL (scalable language) from the beginning, your overall costs may end up being much less because you have eliminated a large amount of expensive “programmer” time. On the other hand, by sticking with the standard language, you have possibly programmed yourself into an expensive corner. Additionally, focusing too much on single-core performance may also limit your options. Of course, you want the code to run as fast as possible on a single core, but we need to weigh that performance against the cost of creating an overall scalable application.

One could argue that implementing the C program in parallel from the start will provide the needed scalability. Nice thought, but you are just moving the work from the back-end to the front-end. Explicit parallel programming is expensive, as the sketch below suggests. To be fair, there is also the one-time start-up cost of learning the SL that needs to be factored in as well. It all depends on the TCOP (remember that term!) of your project. As the core count continues to grow, so will the cost of programming with your current and comfortable software tools.
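
To make “expensive” concrete, here is the same placeholder loop from the OpenMP sketch written with raw pthreads; the thread count, the partitioning scheme, and the loop body are all assumptions for illustration.

    /* Sketch of the same placeholder loop using raw pthreads, to show the
       extra front-end work compared with the single OpenMP pragma above.
       Build with: gcc -pthread */
    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    #define NTHREADS 4

    static double a[N];

    struct slice { int start, end; double partial; };

    static double work(int i) { return (double)i * (double)i; }

    static void *run(void *arg)
    {
        struct slice *s = arg;
        s->partial = 0.0;
        for (int i = s->start; i < s->end; i++) {
            a[i] = work(i);
            s->partial += a[i];
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];
        struct slice slices[NTHREADS];
        const int chunk = N / NTHREADS;
        double sum = 0.0;

        for (int t = 0; t < NTHREADS; t++) {
            slices[t].start = t * chunk;
            slices[t].end = (t == NTHREADS - 1) ? N : (t + 1) * chunk;
            pthread_create(&tid[t], NULL, run, &slices[t]);
        }
        for (int t = 0; t < NTHREADS; t++) {
            pthread_join(tid[t], NULL);
            sum += slices[t].partial;   /* combine partial sums after each join */
        }
        printf("sum = %g\n", sum);
        return 0;
    }

Partitioning the work, managing threads, and combining partial results are now your code to write, test, and debug; that is the front-end cost being described.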

Next Time

Last week there was a comment that inquired about the role of Erlang. We’ll tackle languages like Erlang and how they can impact your TCOP next. If you are well schooled in the procedural approach to programming, you will probably throw a hissy fit.

Comments on "The Total Cost of Parallelization"

ekerim

Congratulations! Within the constraints of the example, you have now managed to pay approximately $8,000 for an 8-core N GHz machine that barely beats a $1,000 single-CPU N GHz machine performing this one task. How’s that for TCOP?

Your example has no basis in reality and your reasoning/conclusions are flawed.

You cannot talk about scalability and use an example that assumes only one task will ever be performed at a time. In a real-world scenario, the C app would run eight separate instances, each bound to a separate core. Even assuming an efficiency loss of 20% for each core compared to an eight-CPU system (or eight single-CPU systems), it would complete 40 tasks in the same time it takes your hypothetical SL app to complete 8 tasks (one for each core). Yes, that’s right, the speed factor is still 5 to 1 in favor of the C app.
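
(A minimal Linux sketch of that per-core binding, with a placeholder workload and an example core id, might look like the following.)

    /* Sketch: pin the current process to a single core on Linux.
       The core id would normally come from a launcher script; 0 is an example.
       The serial workload itself is omitted. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);   /* example: bind this process to core 0 */

        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* ... run the serial C workload here ... */
        printf("pinned to core 0\n");
        return 0;
    }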

Of course the example is overly simplified; when programming in parallel there are other bottlenecks, but those are irrelevant to the example.

Using statistically non-significant tests, I found that a single-threaded app which uses 100% CPU on a single core, and whose only possible bottlenecks are memory access and the system bus, scales with an efficiency of ~99% on an 8-core machine.

I totally agree that something needs to be done to ease the task of programming in parallel, but the reason has nothing to do with the efficiency of the program’s code, as you try to make it out to be.

The real problem is the maintainability of the required complex code and the time it takes to debug threading errors. This complexity and time must, in real life, be weighed against the availability of programmers with expertise in C/C++/C# (pick your poison) versus the availability of programmers with expertise in a hypothetical SL, an unproven language until it has been around for a couple of years. Hardly enough to try to coin a new expression over, is it?

spacemonkey

re: “What gets more expensive every year? Right, people. ..”

I’m not sure this premise is quite right. Companies are always looking at ways to cut the cost of people, and off-shoring can significantly reduce the expense and generally produce the same results. One could argue that off-shoring is getting more expensive every year, but there’s still a large (5-10x) differential in people costs between on-shore and off-shore. And there are always new, cheaper off-shoring destinations to be ‘discovered’, awaiting their turn to join the race.

tphillips

To spacemonkey:

Same results with offshore developers?

You must be a bean-counter. Certainly you haven’t been on the same projects I have.

Always new, cheaper off-shoring destinations?

Not unless the planet is expanding. Even if, eventually, every nation on Earth develops the infrastructure for off-shoring, that’s still a limited pool.

spacemonkey

Hi tphillips,
just playing devil’s advocate here, but off-shoring is a reality. Re: “limited pool”? With developing nations coming ‘on-line’, the pool is getting bigger all the time…
