Gentoo Optimizations Benchmarked – Part 2

Gentoo is a source-based distribution that lets users decide how to optimize their system in many ways, including building for a specific CPU architecture. Linux Magazine benchmarks four such options (i486, i686, pentium3 and core2) and throws in Ubuntu for good measure.

Testing method

The benchmarking system used is version 2.4.1 of Phoronix Test Suite (PTS), which is widely regarded as the most complete benchmarking tool for Unix systems. The tests themselves are broken down into categories, of which we use compression, cryptography, linux-system, mesa and x264. Low level hardware tests such as memory and disk benchmarking have been excluded.

There is one main issue with using PTS for this type of benchmark: we don’t want a level playing field. PTS builds the packages for most of its tests from source, so that each system runs the same binaries. For our benchmarking we want an uneven test; that is, each binary needs to be built differently in order to reflect each GCC optimization. Not only that, but we would also like to build in support for CPU instruction sets such as SSE.

To this end, two sets of benchmarks are used in this article. The first is PTS, where each package has been built with the specific -march CPU setting. The second is, where possible, the Gentoo system binaries themselves, which have been built not only with the appropriate -march setting but also with the respective USE flags for CPU instruction sets such as MMX and SSE. The Ubuntu test system runs the binaries from its default repositories. Hopefully, between these two methods we can gain the most accurate comparison possible.
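As a rough illustration of the four Gentoo configurations, the sketch below prints make.conf-style settings for each target. Only the -march values come from the article. The -O2 -pipe baseline and the USE flag names are assumptions, since the article does not list the exact settings it used. The instruction sets per target follow what GCC enables for each -march value.

```python
# Hypothetical per-target settings; only the -march values are from the article.
ALL_SETS = ["mmx", "sse", "sse2", "sse3", "ssse3"]  # assumed USE flag names

# Instruction sets that GCC enables for each -march value.
ENABLED = {
    "i486": [],
    "i686": [],
    "pentium3": ["mmx", "sse"],
    "core2": ["mmx", "sse", "sse2", "sse3", "ssse3"],
}

BASE_FLAGS = "-O2 -pipe"  # assumed baseline, not stated in the article


def make_conf_lines(march):
    """Return make.conf-style lines for one target."""
    enabled = ENABLED[march]
    # Flags the target lacks are switched off explicitly with a leading "-".
    use = " ".join(f if f in enabled else "-" + f for f in ALL_SETS)
    return [
        f'CFLAGS="-march={march} {BASE_FLAGS}"',
        'CXXFLAGS="${CFLAGS}"',
        f'USE="{use}"',
    ]


for march in ENABLED:
    print(f"# {march}")
    print("\n".join(make_conf_lines(march)))
```

Switching the unsupported flags off explicitly mirrors the idea of disabling SSE support in the i486 and i686 builds, rather than leaving the choice to each package's defaults.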

Whether an application uses extended CPU instructions is not determined solely by GCC options. An application can still use SSE if it was programmed to, even if it was not built with the -msse GCC flag. This means that a binary compiled for i486 can still use SSE instructions to improve performance, so we can expect some of the results to be very close. For example, the LAME MP3 encoder checks which instructions the CPU supports and uses them whether it was built for an i486 or a Core2 system. Thanks to Gentoo’s USE flags, we are able to disable SSE support checking in those applications that allow it. In addition, all binaries built in the Gentoo system have NVIDIA’s VDPAU disabled.
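The runtime detection described above can be sketched roughly as follows. This illustrates the idea only and is not LAME's actual code: it parses a Linux /proc/cpuinfo-style "flags" line and picks a code path, regardless of the -march setting the program was built with. The function names and the sample text are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def pick_code_path(flags):
    """Choose the most capable code path the CPU supports (hypothetical names)."""
    for candidate in ("sse2", "sse", "mmx"):
        if candidate in flags:
            return candidate
    return "generic"


# Sample input; on a real Linux system, read open("/proc/cpuinfo").read() instead.
SAMPLE = "model name : Example CPU\nflags : fpu mmx sse sse2 ssse3\n"
print(pick_code_path(cpu_flags(SAMPLE)))
```

Because the choice is made when the program runs, an i486 build and a Core2 build of such a program can end up executing the same hand-written SSE routines, which is why some of the results are so close.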

Finally, we have tried to replicate the PTS tests in the Gentoo system tests, but this was not always possible. Each set of tests should be compared only with itself.

Let the games begin!

Audio Visual

Encoding Ogg Vorbis

ogg.png

Encoding MP3

mp3.png

Encoding FLAC

flac.png

Encoding WavPack

wavpack.png

Encoding FFmpeg

ffmpeg.png

Encoding H.264

x264.png

Here we see a slight performance increase when encoding on a more optimized system, but the margin is a lot smaller than one might have expected. Ogg encoding on the Pentium3 system was 12% faster than on the i486 system (2.5 seconds in this test). Similarly, MP3 encoding was around 10% faster than on the i486 system.
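For readers who want to derive this kind of percentage from their own timings, the arithmetic is sketched below. The timings are made-up placeholders chosen only to show the calculation; they are not the article's measurements.

```python
def time_saved_percent(baseline_s, optimized_s):
    """Percentage reduction in run time relative to the baseline."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


def speedup(baseline_s, optimized_s):
    """How many times faster the optimized run is than the baseline."""
    return baseline_s / optimized_s


# Placeholder timings in seconds (hypothetical, not measured values).
baseline, optimized = 20.0, 17.5
print(f"{time_saved_percent(baseline, optimized):.1f}% less time")
print(f"{speedup(baseline, optimized):.2f}x speedup")
```

The two figures are not interchangeable: a given reduction in run time corresponds to a slightly larger increase in throughput, so it helps to state which one a percentage refers to.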

A number of tests showed the largest jump from i486 to i686. The exception was Ogg, where encoding time actually increased dramatically on the Core2 system. This test was recompiled and run several times, which confirmed the result.

Generally, in order to reach full encoding potential the program itself must make specific use of additional instruction sets such as MMX and SSE. This is very apparent in the non-PTS FFmpeg Gentoo system test, where the i486 and i686 systems were not making use of these instructions.

Compression

7-Zip Compression

7zip.png

Gzip Compression

gzip.png

BZIP2 Compression

bzip2.png

LZMA Compression

lzma.png

This set of tests shows that Core2 comes out on top every time, but only by a small margin over the i686 system. While the other results were very close, Gzip showed the greatest improvement, with the Core2 system achieving a whopping 40% performance increase over the i486 system. Once again, the most notable improvement came with the jump from i486 to i686. Ubuntu performed the worst in each test, closely matching the i486 system.

Cryptography

OpenSSL

openssl.png

John The Ripper - MD5

john-md5.png

John The Ripper - DES

john-des.png

John The Ripper - Blowfish

john-blowfish.png

Interestingly, there was essentially no benefit at all when it came to cryptography. The non-PTS John The Ripper tests under Gentoo show how big an impact CPU instructions have on performance; there, the i486 and i686 systems were not using any. The Pentium3 system showed a large jump, with Core2 jumping even higher, presumably thanks to the SSE2 and SSE3 instruction sets. The Ubuntu system, although not compiled for a Pentium3 CPU, was still able to match the Pentium3 system in performance.

Gaming

OpenArena

Urban Terror

pts-padman.png

Nexuiz

Between the Gentoo systems, the more optimized builds showed no real benefit. This may be related in part to the use of the NVIDIA proprietary driver, rather than one built entirely from source. If you really need those 4 extra frames per second in World of Padman, perhaps it is all worthwhile…

Linux System

The following series of tests is designed to show the general performance of a Linux system. As such, the range of tests is quite varied. Some tests already included above have been removed to avoid duplication.

Apache

Bwfirt

pts-cray.png

pts-tachyon.png

pts-povray.png

pts-maaft.png

pts-hmmer.png

pts-threaded-read.png

pts-thread-write.png

pts-posmark.png

pts-dbench.png

Sudokut

pts-openfmm.png

pts-dcraw.png

Minion Solitaire

pts-pybench.png

The Apache test shows a 6% performance increase over i486, HMMer shows 22%, Threaded I/O shows 20% and dcraw 25%. Clearly, the biggest advantage is seen with the jump from i486 to i686, although i486 did win out in the Dbench test. From there on, the benefits are marginal.

These results also provide little proof that a more optimized system will deliver greater performance. Sure, the Core2 system won out on the Bwfirt test, but it was by less than a second. Some of the other benchmarks are a little all over the place, with the Pentium3 system performing worse in some and better in others.

Next: What’s It All Mean?

Comments on "Gentoo Optimizations Benchmarked – Part 2"

uduogah

Great article. Being a gentoo and ubuntu lover, the article certainly provides some insights worth bearing in mind. It’s also interesting to see those hardware specs are hackintosh-friendly! I’m pretty sure those components lead a double life!!

hrudy

Great article. I’ve always wondered what the performance benefits, if any, you could gain from a source based distribution like Gentoo.
Not to wish any work on anyone, it would be interesting to see what the advantages would be of using the Intel compiler. Intel has a long history with this compiler, I believe that it even predates gcc.

ewildgoose

“I’ve always wondered what the performance benefits, if any, you could gain from a source based distribution like Gentoo.”

The answer being that this is the wrong question…

In fact my guess would be that the fastest binaries would be generated by carefully benchmarking each bit of software to determine the best compile options and then distributing the compiled binaries…

…However, that gets us back to a “normal” binary distribution again (rpm/deb, etc), plus all the associated dependency hell that goes with it.

In my opinion (gentoo on a dozen servers), the point of a source based distro is good control over dependencies and the ability to stay as bleeding edge or stable as you need to. So I can have one server running an antique glibc with a cutting edge mysql and another server running a cutting edge glibc with some old version of nginx (or whatever).

Additionally it’s very easy to update a bunch of servers over long periods of time without suffering the downtime of a full “upgrade”. Show me a CentOS system installed 4 years ago and yet still running bang up to date glibc/mysql/apache, etc.

Finally, given a build template to create SomeSoftware-V1, it’s usually fairly trivial to bump the template to create SomeSoftware-V1.1 and hence usually source distros can help you stay up to date with bleeding edge more easily. Now someone will shout that this is inappropriate for servers, but even there sometimes you want to track some new and fast moving utility, eg my MySQL servers are pinned to some known version, but I track bleeding edge Maatkit mysql tools (since they are in fast development mode right now).

More related to Gentoo, and less to source distros in general, but gentoo makes it very easy to have machine “profiles” which mandate software versions, optional features (USE flags), compile options and even base packages which must/must not be installed. So for example I have a web server template which requires a modern gcc to be used, hardened compiler flags to be used, certain versions of nginx to be installed, again with various default modules that nginx should support. So in many ways something like Kickstart for Redhat… But the cool thing is I can update the template right now and then all my machines pull in the changes and rebuild whatever is required to get to the end result!

The penalty of source based distros is package generation time (essentially gentoo still needs a binary package, it just builds it on demand instead of it being pre-built). However, for my needs I mitigate this by keeping all my servers “similar” and then my package repository is automatically re-used after the first machine updates.

So my usual update procedure is to copy one of my virtual servers, test the upgrade in the copy, if all goes well then run the upgrades on the live server(s) (takes only seconds per package).

Gentoo is not likely suitable for the majority audience, but if you have strong admin skills then it’s a great fit and will allow you to very easily run lots of servers with variable configs, all very easily.

csmart

Well said, ewildgoose. Source based distros are really about flexibility and control. You can decide what the system will be, down to the very core libraries, “optimizations” and make it your own. Binary distros make a whole bunch of decisions for you, including basic things such as features, dependencies, even configuring applications and daemons.

Gentoo provides a flexible framework for you to do whatever it is you want, yourself.

-c

ewildgoose

For example I just updated one of my linux-vservers from:
- gcc 3.4 -> 4.3
- glibc 2.9 -> 2.10
- mysql jumped from some older release to 5.0.84
- nginx jumped from 0.6 to 0.7

No particular issues to note from upgrading…

On the other hand I have some CentOS box running my phone system and it drives me nuts that I remain pinned to antique versions of stuff I would like to upgrade, but I either need to ditch the CentOS packages and roll my own, or some other equally painful path.

However, Gentoo likely does NOT suit the average punter who does not need that level of control. There is quite a significant complexity overhead to get to grips with. So it comes down to the old adage of choosing the best tool for the job…

lescoke

I’m not certain what this article was trying to prove. Testing code compiled for older processors on a newer processor just shows how design changes in newer processors support using older instructions.

If I build an embedded system using a 486 class processor, I would want everything compiled for that processor. The same holds true that if I build a newer system, I would want everything compiled to take advantage of new specialized instructions.

Binary packages can only be compiled to take advantage of a common subset of features found on the oldest supported processor for a given build. Hence some distros have separate builds for x86, i386, i686, …

Building from source does take time, but with the proper compile options selected, you can make better use of CPU features. The gains may be small, but they can help. Gentoo does offer the ability to install using binaries instead of source if desired.

It may be baptism by fire, but anyone who really wants to understand how everything works in Linux needs to build at least one system from scratch using a distro like Gentoo.

robng15

I would be interested to see the effect that using “-march=native -ftree-vectorize” has on the benchmarks, as this would more properly allow gcc to make full use of the particular processor, which is the definite benefit of using Gentoo or other source based distributions.

Most certainly makes a difference on 64bit code, not so sure on the effect on 32bit code.

It is also possible to specify per-package CFLAGS/CXXFLAGS which allows tuning of individual packages if there is anything that requires maximum performance, rather than just the generic system CFLAGS/CXXFLAGS.

golding

>>It is also possible to specify per-package CFLAGS/CXXFLAGS which allows tuning of individual packages<<
Who will do this for the 15 hundred or so individual packages on an average system? I have noticed that while some ebuilds do specify CFLAGS/CXXFLAGS, most don’t.

I’ve been a Gentoo’st for around 6 to 7 years and have been best served by using the minimal flags, just “-march=(arch) -O2 -pipe”.

However, USE flags are another matter. I think these make large differences in the system build, and while the USE flags in my make.conf are very few, I do have a very large package.use file. Nearly a rule for every package on my system.

Regards, Rob

robng15

Couldn’t agree more golding, hence the words “possible to”… I personally run with “-march=native -O2 -ftree-vectorize -pipe”… or at least have done so since gcc-4.4.0 (now on gcc-4.4.3)… previously had no “-ftree-vectorize”…

If you are going to do a lot of video/audio work, then I would be tempted to look at per-package optimisations for some of the video/audio codecs, indeed I am experimenting with hugin and its dependencies, mainly just for the sake of curiosity, using some of the new graphite flags. BUT I wouldn’t use them for the whole system.

USE flags are amazing, and indeed, the main reason for using gentoo. I think I probably took the opposite approach to yourself, as I have a large number, mainly video/audio/image flags in my /etc/make.conf, but I do have some overrides in /etc/portage/package.use.

Rob.

korin43

You should try this test with an ATI video card and robng15’s suggestions. It would be very interesting.


I’d be interested to see this comparison with AMD. I suspect that AMD’s contributions to the GCC compiler may result in more substantial benefits on AMD architectures. Intel, I suspect, focused more on their Intel compiler.
