Dual-Core Calisthenics

Got performance? A simple test provides a peek into the AMD and Intel dual-core processor designs.

Ever since dual-core processors became widely available, I’ve wanted to do some tests. Sure, technology Web sites offer the results of a battery of standard tests on the latest and greatest chips, but I typically find these tests lacking, largely because the benchmarks have little to do with the demands of a cluster.

Recently, I had the opportunity to benchmark a Pentium D processor, thanks to Appro International (http://www.appro.com). Now, some may argue that the Pentium D isn’t a “cluster processor.” I’ll leave that argument for another day. The real problem at hand is memory contention, or how well the processors share memory. For example, does one processor or core reading or writing memory cause the other processor or core to wait? Now that we live in the multi-core age, the question is becoming very important. Intel and AMD have taken radically different approaches to multi-core, and motherboards and chipsets play an important role as well.

As with all benchmarks, your own application is the best measure of system performance. The simple test I demonstrate this month provides a small but important peek behind the CPU curtain.

Time for a Script

A good question to ask is, “If I run two identical programs at the same time on a dual core, one on each core, does performance suffer?” To try to answer this question, let’s develop a simple test: If a program takes one minute to run on a single core, how long does it take if the same application is running on the second core at the same time?

If there’s no contention between programs, both should run for one minute. If there is maximum contention, each should take two minutes. The benchmark seems simple enough and a little bash scripting is in order, but first an equation will be helpful.

After a little twiddling with numbers, this simple formula yields a speedup measure that ranges from one to two:

 Speedup = ((CTIME1+CTIME2) * STIME)/(CTIME1*CTIME2) 

CTIME1 and CTIME2 are the wall clock times for the first and second copies of the program running concurrently (at the same time), respectively. STIME is the wall clock time to run one copy of the program by itself. Using the equation, if CTIME1 = CTIME2 = STIME (perfect speedup), the result is two. If CTIME1 = CTIME2 = 2*STIME (each copy takes twice as long), the formula yields one, indicating no improvement.
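One way to read the formula is as a throughput ratio: a single copy completes 1/STIME runs per second, while the two concurrent copies together complete 1/CTIME1 + 1/CTIME2 runs per second, and dividing the second by the first gives the expression above. The sketch below checks the two boundary cases and one in-between case. The function name speedup and the sample times are illustrative assumptions, not part of the original script, and awk is used here only for the floating-point arithmetic.

```bash
#!/bin/bash
# speedup CTIME1 CTIME2 STIME
# Prints ((CTIME1+CTIME2) * STIME) / (CTIME1*CTIME2) to two decimal places.
# (Hypothetical helper for illustration; the times below are made-up values.)
speedup() {
    awk -v c1="$1" -v c2="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", ((c1 + c2) * s) / (c1 * c2) }'
}

# In each case a single copy run alone takes 60 seconds.
speedup 60 60 60     # no contention: each concurrent copy still takes 60 s
speedup 120 120 60   # maximum contention: each concurrent copy takes 120 s
speedup 80 100 60    # partial contention with unequal concurrent times
```

The three calls print 2.00, 1.00, and 1.35, so a result near two means the cores barely interfere with each other, and a result near one means the second core adds almost nothing.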

A script to carry out the tests and compute the speedup is shown in Listing One. The specific programs used are listed on the PROGS= line. To use the script, replace the entries on the PROGS= line with your own programs, place the binaries in a local bin directory, and let it run. (Hack as needed.)

Listing One: A test script to measure the performance impact of a dual-core processor

#!/bin/bash

# Add your own list of programs here, assuming they're in ./bin
PROGS="cg.A.1 bt.A.1 ep.A.1 ft.A.1 lu.A.1 is.A.1 sp.A.1 mg.A.1"

echo "SMP Memory Test" | tee smp-mem.out
date | tee -a smp-mem.out

for TEST in $PROGS
do
    # Run one copy alone, then two copies at the same time
    bin/$TEST >& temp.mem0
    bin/$TEST >& temp.mem1 &
    bin/$TEST >& temp.mem2
    wait

    # Pull the fifth field of the "Time" line from each program's output
    SINGLE=`grep Time temp.mem0 | gawk '{print $5}'`
    DOUBLE1=`grep Time temp.mem1 | gawk '{print $5}'`
    DOUBLE2=`grep Time temp.mem2 | gawk '{print $5}'`
    # ((DOUBLE1+DOUBLE2) * SINGLE)/(DOUBLE1*DOUBLE2) in dc's postfix notation
    SPEEDUP=`echo "2 k $DOUBLE1 $DOUBLE2 + $SINGLE * $DOUBLE1 $DOUBLE2 * / p" | dc`

    echo "SMP Program Speed-up for $TEST is $SPEEDUP" | tee -a smp-mem.out
done

/bin/rm temp.mem*

date | tee -a smp-mem.out
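Listing One relies on each program printing a line containing "Time" with the elapsed seconds in the fifth field. If your own programs do not report their run time that way, one option is to measure the wall clock time from the shell instead. The following is a minimal sketch under stated assumptions: the helper name wall_time and the file temp.time1 are hypothetical, GNU date is assumed (for the %N nanoseconds format), and sleep stands in for a real benchmark binary.

```bash
#!/bin/bash
# wall_time CMD [ARGS...]
# Runs the command, discards its output, and prints the elapsed
# wall clock time in seconds. Assumes GNU date, which supports %N.
wall_time() {
    local start end
    start=$(date +%s.%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.2f\n", e - s }'
}

# One copy alone ("sleep 1" is a placeholder for bin/yourprogram)
SINGLE=$(wall_time sleep 1)

# Two copies at the same time: one in the background, one in the foreground
wall_time sleep 1 > temp.time1 &
DOUBLE2=$(wall_time sleep 1)
wait
DOUBLE1=$(cat temp.time1)
rm -f temp.time1

echo "single=$SINGLE concurrent=$DOUBLE1,$DOUBLE2"
```

The resulting SINGLE, DOUBLE1, and DOUBLE2 values can be fed to the same dc expression used in Listing One. Timing from the shell includes process start-up cost, so it is best suited to programs that run for at least several seconds.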

Next: The Numbers.
