GP-GPUs: OpenCL Is Ready For The Heavy Lifting

nVidia CUDA may be the rage, but OpenCL is a standard that has some features you may need.

In a previous column, I bemoaned the state of HPC software. That column was actually a prelude to my column on nVidia CUDA computing. I was particularly impressed by how quickly CUDA has gained traction in HPC and other areas. The CUDA wave has definitely hit the beach and I’ll have more on nVidia as the Fermi GPU begins to filter into the HPC trenches. In this column I want to talk about the other GPU language: OpenCL.

Before I launch into OpenCL background, I want to make a prediction. I believe OpenCL will gain acceptance in much the same way nVidia CUDA has. Like CUDA, OpenCL has a freely available SDK (Software Development Kit), is based on the C language, and can be explored using low-cost video hardware. OpenCL brings two other features to the table, however: open standard compliance, and support for both data-parallel (GP-GPU) and task-parallel (CPU) methods. I’ll take a closer look at these below, but first some background will be helpful.

Currently the GP-GPU competition is between AMD/ATI and nVidia. The HPC group at IBM was making some inroads with Cell, but has decided to switch to an OpenCL platform and presumably use AMD/ATI hardware. In addition, the much-discussed Intel Larrabee never really made it out of the gate, and now we are left with two main contenders, both of which have a strong desktop market to support development and production costs.

Historically, AMD/ATI supported the BrookGPU model and included an enhanced version in their SDK as Brook+. While Brook+ allowed for GP-GPU computing, the entire industry realized that some form of standard was required. Thus in June 2008, The Khronos Group and a rather impressive group of companies launched the OpenCL Working Group in an effort to create a standard for GPU/CPU programming. (The Khronos Group is a member-funded consortium focused on the creation of royalty-free open standards. In addition to OpenCL, they also maintain the OpenGL graphics standard.) In part due to the contributions from Apple Computer, the OpenCL 1.0 standard was ratified in December of 2008, just 6 months after the Working Group was formed. The list of participating organizations includes 3DLABS, Activision Blizzard, AMD, Apple, ARM, Broadcom, Codeplay, Electronic Arts, Ericsson, Freescale, Fujitsu, GE, Graphic Remedy, HI, IBM, Intel, Imagination Technologies, Los Alamos National Laboratory, Motorola, Movidia, Nokia, NVIDIA, Petapath, QNX, Qualcomm, RapidMind, Samsung, Seaweed, S3, ST Microelectronics, Takumi, Texas Instruments, Toshiba and Vivante. In other words, this is a serious effort.

As a supporter of OpenCL, AMD has recently released the ATI Stream SDK v2.01 for both Linux (RHEL 5.3, Ubuntu 9.10, openSUSE 11.0) and Windows (XP, Vista, and 7). On the Linux side, the software team at AMD/ATI has made some efforts to integrate OpenCL with the current open tool chain. For example, it is now possible to use gdb to debug OpenCL kernels. (In OpenCL, a kernel is the basic unit of executable code and can be thought of as a C function that runs on the GPU or a multi-core CPU.) There is also a Stream KernelAnalyzer that is currently available only for Windows. As with CUDA, OpenCL can be used on existing AMD/ATI video cards, but it is always good to check the system requirements to be sure.
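To make the idea of a kernel a little more concrete, here is a minimal sketch. The string holds what a simple OpenCL C kernel might look like (the name vec_add and its arguments are my own assumptions, not something from the SDK), and the Python functions below it merely emulate what a device does: run the same function once per work-item, each with its own index. It needs no OpenCL runtime and runs serially, so treat it as an illustration of the model rather than real device code.

```python
# Hypothetical OpenCL C kernel source, kept as a string for illustration only.
# __kernel, __global and get_global_id() are part of OpenCL C; the name
# vec_add and its arguments are assumptions made for this sketch.
KERNEL_SOURCE = """
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *out)
{
    int i = get_global_id(0);   /* index of this work-item */
    out[i] = a[i] + b[i];
}
"""


def vec_add_work_item(gid, a, b, out):
    """Pure-Python stand-in for the kernel body, run for one work-item."""
    out[gid] = a[gid] + b[gid]


def run_kernel(kernel, global_size, *args):
    """Emulate a 1-D launch: call the kernel once per global id.

    A real device runs these work-items in parallel; this loop is serial.
    """
    for gid in range(global_size):
        kernel(gid, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
run_kernel(vec_add_work_item, len(a), a, b, out)
print(out)
```

The point to notice is that the kernel itself contains no loop. The loop lives in the launch, which is exactly the part the GPU (or a multi-core CPU) gets to parallelize.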

At this point, you may be wondering “What about CUDA and nVidia?” If you read the list of companies involved with the OpenCL specification, you will note that nVidia is part of the Working Group. nVidia has been very vocal about their support for any programming language that allows you to program their GPUs, and they offer their own version of OpenCL (for their hardware).

As stated, the fact that OpenCL is a standard weighs heavily in determining its future. Having the support of pretty much the entire computer/video hardware industry helps a bit as well. From an ISV (Independent Software Vendor) standpoint, OpenCL is the gateway to hybrid (CPU/GPU) computing. As anyone with scar tissue in the HPC industry can tell you, investing resources and time into non-standard APIs (Application Programming Interfaces) is a risky business. MPI was developed for similar reasons (i.e., programmers did not want to recode every time a new parallel computer architecture hit the server room).

One final feature of OpenCL should not be overlooked. As mentioned, OpenCL supports data-parallelism and task-parallelism. In the hybrid computing world, there is currently an implied assumption that the GPU is a slave to the CPU; that is, the GPU cannot run on its own because it must have a CPU present. Given this assumption, one should be able to write OpenCL programs that adapt to the hardware environment and run, at a minimum, on a single CPU (core). Of course the program will run slower, but it will still run. If more cores or GPUs are found in a different hardware environment, then an OpenCL program should be able to adapt to the new hardware at run-time. The rather distasteful alternative is separate binaries for various combinations of CPU and GPU resources.
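As a rough sketch of what adapting at run-time might look like, the snippet below counts the OpenCL devices it can see and prefers a GPU, falling back to a CPU device. It assumes the PyOpenCL bindings for the discovery step (get_platforms, get_devices and the device_type constants); the function names discover_devices and pick_device_type, and the "GPU first, then CPU" policy, are my own for illustration. If PyOpenCL or an OpenCL driver is not present, the discovery simply reports zero devices.

```python
def discover_devices():
    """Count OpenCL GPU and CPU devices visible at run time.

    Uses PyOpenCL if it is installed; leaves the counts at zero when the
    module or an OpenCL driver is missing, so the caller can still decide.
    """
    counts = {"GPU": 0, "CPU": 0}
    try:
        import pyopencl as cl
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                # device.type is a bitfield of device_type flags
                if device.type & cl.device_type.GPU:
                    counts["GPU"] += 1
                elif device.type & cl.device_type.CPU:
                    counts["CPU"] += 1
    except Exception:
        # No PyOpenCL, no driver, or no platform: keep whatever was counted.
        pass
    return counts


def pick_device_type(counts):
    """Prefer a GPU, fall back to a CPU device, else None (hypothetical policy)."""
    for kind in ("GPU", "CPU"):
        if counts.get(kind, 0) > 0:
            return kind
    return None


found = discover_devices()
print("devices found:", found, "-> selected:", pick_device_type(found))
```

The same binary (or script, in this case) makes the decision each time it starts, which is the alternative to shipping one build per hardware combination.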

Everyone is pretty much convinced at this point that hybrid computing is going to play big in HPC. Like all things software, development tools are constantly behind the hardware advances. While OpenCL does not address off-node computation the way MPI does, it does provide a standard method to move forward with hybrid computing. The other good thing about it is that you can always grab a cheap video card and a free OpenCL SDK to see if it works for your codes. Like any new software model, your biggest investment is time and a few hundred cups of coffee.

Comments on "GP-GPUs: OpenCL Is Ready For The Heavy Lifting"

cgorac

There is a small typo in the article, in explaining the role of the Khronos Group: “In addition to OpenCL, they also maintain the OpenCL graphics standard” – the latter should be “OpenGL”. And exactly their current record in governing the OpenGL standard is not particularly convincing, so I don’t expect them to be much more successful with OpenCL. Besides, anyone that actually tried to use OpenCL to target multiple platform (I did) could confess that all of its promise is just a myth at the moment – you just have to code for each platform specifics if you want to take the most of the performance out of it. OpenCL also, out of all these WG members, doesn’t have much support: NVIDIA is simply pushing CUDA as vastly more mature platform, ATI/AMD is actually looking still unconvinced that there is market for GPGPU work, then there exist large number of additional competing low- (like Microsoft DirectCompute, etc.) or high-level solutions (stuff like upcoming Intel Ct, etc.); so Apple seems like the only true supporter of OpenCL, but their market is very small. Thus, I don’t think OpenCL perspective is bright at all.

matador

I would like to add 4 points:

1- A simple tool for porting CUDA to OpenCL:

Swan is a small tool that aids the reversible conversion of existing CUDA codebases to OpenCL. Its main features are the translation of CUDA kernel source-code to OpenCL, and a common API that abstracts both CUDA and OpenCL runtimes. Swan preserves the convenience of the CUDA <<< grid, block >>> kernel launch syntax by generating C source-code for kernel entry-point functions. Possible uses include:

* Evaluating OpenCL performance of an existing CUDA code
* Maintaining a dual-target OpenCL and CUDA code
* Reducing dependence on NVCC when compiling host code
* Support multiple CUDA compute capabilities in a single binary

Swan is developed by the MultiscaleLab, Barcelona, and is available under the GPL2 license.
http://www.multiscalelab.org/swan

2- CLyther = Python + OpenCL:

CLyther is an under-development python tool for OpenCL similar to Cython for C. CLyther is a python language extension intended to make writing OpenCL code as easy as Python itself. CLyther currently only supports a subset of the Python language definition but adds many new features for OpenCL.
CLyther exposes both the OpenCL C library and language to Python. Its features include:
• Fast prototyping of OpenCL code.
• OpenCL kernel function creation using the Python language definition.
• Strong OOP programming in OpenCL code.
• Passing functions as arguments to kernel functions.
• Python emulation mode for OpenCL code.
• Fancy indexing of arrays.
• Dynamic compilation at runtime.
http://clyther.sourceforge.net/

3- PyOpenCL:
PyOpenCL lets you access the OpenCL parallel computation API from Python. Here’s what sets PyOpenCL apart:
• Object cleanup tied to lifetime of objects. This idiom, often called RAII in C++, makes it much easier to write correct, leak- and crash-free code.
• Completeness. PyOpenCL puts the full power of OpenCL’s API at your disposal, if you wish.
• Convenience. While PyOpenCL’s primary focus is to make all of OpenCL accessible, it tries hard to make your life less complicated as it does so–without taking any shortcuts.
• Automatic Error Checking. All OpenCL errors are automatically translated into Python exceptions.
• Speed. PyOpenCL’s base layer is written in C++, so all the niceties above are virtually free.
• Helpful, complete documentation and a wiki.
• Liberal licensing (MIT).
http://mathema.tician.de/software/pyopencl

4- LuxRender:
LuxRender is a physically based and unbiased rendering engine. Based on state of the art algorithms, LuxRender simulates the flow of light according to physical equations, thus producing realistic images of photographic quality. LuxRender is free software – both for personal and commercial use – and is licensed under the GPL.
You will find amazing steps and results done to introduce OpenCL support in Luxrender:
http://www.luxrender.net/wiki/index.php?title=Luxrender_and_OpenCL

Thanks.

znmeb

I have a workstation with an NVidia GeForce 6150SE nForce 430 and a laptop with an ATI Radeon Mobility 3200. Unfortunately, the OpenCL SDK doesn’t appear to support the 3200! That’s a *huge* fail for AMD/ATI as far as I’m concerned. The laptop runs openSUSE 11.2 just fine with the ATI drivers – why can’t it run the OpenCL SDK too?
