Gentoo is a source-based distribution that lets users decide how to optimize their system in many ways, including building for a specific CPU architecture. Linux Magazine benchmarks four such options (i486, i686, pentium3 and core2) and throws in Ubuntu for good measure.
The benchmarking system used is version 2.4.1 of Phoronix Test Suite (PTS), which is widely regarded as the most complete benchmarking tool for Unix systems. The tests themselves are broken down into categories, of which we use compression, cryptography, linux-system, mesa and x264. Low-level hardware tests such as memory and disk benchmarking have been excluded.
There is one main issue with using PTS for this type of benchmark: we don’t want an even playing field. PTS builds the packages for most of its tests from source, so that each system will be running the same binaries. For our benchmarking, we want an uneven test. That is to say, each binary needs to be built differently in order to reflect each GCC optimization. Not only that, but we would also like to build in support for CPU instruction sets such as SSE.
To this end, two sets of benchmarks are used in this article. The first is PTS, where each package has been built with the specific -march CPU setting. The second is, where possible, the Gentoo system binaries themselves, which have been built not only with the appropriate -march setting but also with the respective USE flags for CPU instruction sets such as MMX and SSE. The Ubuntu test system runs against the binaries from its default repositories. Hopefully, between these two methods we can gain the most accurate comparison possible.
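As a rough illustration of the first approach, the per-system compiler settings could be generated along these lines. The -march values are the four named in this article; the `-O2` and `-pipe` flags and the helper name `cflags_for` are assumptions made for the example, not the settings actually used on the test systems.

```python
# The four GCC -march targets benchmarked in this article.
MARCH_TARGETS = ["i486", "i686", "pentium3", "core2"]


def cflags_for(march, base_flags=("-O2", "-pipe")):
    """Return a make.conf-style CFLAGS line for one -march target.

    base_flags is an assumed, typical default; the article does not
    state which other flags were used.
    """
    if march not in MARCH_TARGETS:
        raise ValueError(f"not one of the benchmarked targets: {march}")
    flags = [f"-march={march}", *base_flags]
    return 'CFLAGS="' + " ".join(flags) + '"'


if __name__ == "__main__":
    # Print one candidate CFLAGS line per benchmarked system.
    for target in MARCH_TARGETS:
        print(cflags_for(target))
```

Each line printed is the kind of entry that would go into a Gentoo system's make.conf before rebuilding, so every system ends up with binaries tuned for a different target.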
Whether an application uses extended CPU instructions is not determined solely by the GCC options it was built with. An application can still use SSE if it was programmed to, even though it was not built with the -msse GCC flag. This means that an i486-compiled binary can still use SSE instructions to improve performance. As such, we can expect some of the results to be very close. For example, the LAME MP3 encoder checks which instructions a CPU supports and will use them whether built for an i486 or a Core2 system. Thanks to Gentoo’s USE flags, we are able to disable SSE support checking in some of the applications (those which allow us to). In addition, all binaries built in the Gentoo system have NVIDIA’s VDPAU disabled.
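The idea of run-time detection can be sketched in a few lines. This is only an illustration of the principle, not how LAME or any other encoder actually implements it: real applications typically query the CPU directly, whereas this sketch reads the Linux /proc/cpuinfo file as a stand-in. The function names and the code-path labels are hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_code_path(flags):
    """Choose the most capable code path the CPU supports.

    The returned labels are hypothetical names for illustration.
    """
    # Checked from most to least capable; Linux reports SSE3 as "pni".
    candidates = (("pni", "sse3"), ("sse2", "sse2"), ("sse", "sse"), ("mmx", "mmx"))
    for feature, path in candidates:
        if feature in flags:
            return path
    return "generic"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = parse_cpu_flags(f.read())
    except OSError:
        flags = set()  # not Linux, or /proc is unavailable
    print(pick_code_path(flags))
```

Because the choice is made when the program runs, the same binary picks the faster path on a capable CPU regardless of the -march value it was compiled with, which is why some i486 and Core2 results end up so close.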
Finally, we have tried to replicate the PTS tests for the Gentoo system tests; however, this was not always possible. Each set of tests should be compared only with itself.
Let the games begin!
Here we see a slight performance increase when encoding with a more optimized system; however, the margin is a lot smaller than one might have expected. Ogg encoding on the Pentium3 system yielded a 12% speed increase over the i486 system (in this test, that was 2.5 seconds). Similarly, MP3 saw around a 10% performance improvement over the i486 system.
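The relationship between the 12% figure and the 2.5 seconds can be checked with a little arithmetic. This sketch assumes that the percentage refers to a reduction in encoding time and that 2.5 seconds is the absolute saving; no raw timings are quoted here, so the baseline it derives is only an estimate, and both function names are made up for the example.

```python
def percent_faster(baseline_seconds, new_seconds):
    """Percentage reduction in run time relative to the baseline."""
    return (baseline_seconds - new_seconds) / baseline_seconds * 100.0


def implied_baseline(saved_seconds, percent):
    """Baseline run time implied by an absolute saving and a percentage."""
    return saved_seconds / (percent / 100.0)


if __name__ == "__main__":
    # Estimated i486 encode time, under the assumptions stated above.
    baseline = implied_baseline(2.5, 12.0)
    print(f"implied i486 time: {baseline:.1f} s")
    # Round trip: the saving should come back out as the same percentage.
    print(f"check: {percent_faster(baseline, baseline - 2.5):.1f}%")
```

Under those assumptions the i486 encode would have taken a little under 21 seconds, which puts the 2.5-second saving into perspective.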
A number of tests showed the largest jump from i486 to i686, with the exception of Ogg, where encoding time actually increased dramatically with the Core2 system. This test was re-compiled and run several times, which confirmed this result.
Generally, in order to reach full encoding potential the program itself must make specific use of additional instruction sets such as MMX and SSE. This is very apparent in the non-PTS FFmpeg Gentoo system test, where the i486 and i686 systems were not making use of these instructions.
This set of tests shows that Core2 comes out on top every time, but only by a small margin over the i686 system. While the remainder of the systems were very close, Gzip showed the greatest improvement, where the Core2 system had a whopping 40% performance increase over the i486 system. Once again, the most notable improvement came with the jump from i486 to i686. Ubuntu performed the worst in each test, closely matching the i486 system.
Interestingly, there was essentially no benefit at all when it came to cryptography. The non-PTS John the Ripper tests under Gentoo show how big an impact CPU instructions have on performance, where the i486 and i686 systems were not using any. The Pentium3 system showed a large jump, with Core2 jumping even higher, presumably thanks to the SSE2 and SSE3 instruction sets. The Ubuntu system, although not compiled for a Pentium3 CPU, was still able to match the Pentium3 system in performance.
Between the Gentoo systems there was really no benefit shown with the more optimized systems. This may be related in part to the use of NVIDIA’s proprietary driver, rather than one built entirely from source. If you really need those 4 extra frames per second in World of Padman, perhaps it is all worthwhile…
The following series of tests is designed to show the general performance of a Linux system. As such, the range of tests is quite varied. Some tests already covered above have been removed to avoid duplication.
The Apache test shows a 6% performance increase over i486, HMMer shows 22%, Threaded I/O shows a 20% increase and dcraw a 25% increase. Clearly, the biggest advantage is seen with the jump from an i486 to an i686 system, although i486 did win out in the Dbench test. From there on, the benefits are marginal.
These results also provide little proof that a more optimized system will provide greater performance. Sure, the Core2 system won out on the Bwfirt test, but it was by less than a second. Some of the other benchmarks are a little all over the place, with the Pentium3 system performing worse in some and better in others.