<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Blowing The Doors Off HPC Speed-up Numbers</title>
	<atom:link href="http://www.linux-mag.com/id/7821/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.linux-mag.com/id/7821/</link>
	<description>Open Source, Open Standards</description>
	<lastBuildDate>Sat, 05 Oct 2013 13:48:18 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
	<item>
		<title>By: chipwatson</title>
		<link>http://www.linux-mag.com/id/7821/#comment-8578</link>
		<dc:creator>chipwatson</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/7821/#comment-8578</guid>
		<description>&lt;p&gt;A critical thing to remember is that even if you speed up 50% of an application by 100x, the application as a whole runs just under 2x faster.
&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>A critical thing to remember is that even if you speed up 50% of an application by 100x, the application as a whole runs just under 2x faster.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: cgorac</title>
		<link>http://www.linux-mag.com/id/7821/#comment-8579</link>
		<dc:creator>cgorac</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/7821/#comment-8579</guid>
		<description>&lt;p&gt;There is a discussion of this paper on the CUDA forums: http://forums.nvidia.com/index.php?showtopic=172350.  I&#039;d add the following: if you look at CUDA Zone, you&#039;ll see that most of the reported 3-orders-of-magnitude speed-ups come from academic papers; such a speed-up is very rarely reported for a commercial application.  This rather confirms my belief that most papers written these days (and certainly not only in the HPC area, but in all domains of science) are utter crap.  However, NVIDIA accepted these reports without any further checks and used them as promotional material, so they more than deserve this debunking from the Intel guys.  That said, I would still choose to implement an algorithm on a GPU rather than on a CPU any day: modern CPUs, particularly those architected by Intel/AMD, are a real pain to program, and even more so to optimize and to measure performance on - these things are just darn over-complicated.  GPUs, on the other hand, have a nice, understandable programming model (not having to mess with data in multiples of four is a relief in itself, and that is just the beginning of the story), and I&#039;d say it&#039;s much easier to come up with an optimized version of some code for the GPU than for the CPU.  This is also a factor of great importance in the comparison; it&#039;s not only about speed-ups...
&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>There is a discussion of this paper on the CUDA forums: <a href="http://forums.nvidia.com/index.php?showtopic=172350" rel="nofollow">http://forums.nvidia.com/index.php?showtopic=172350</a>.  I&#8217;d add the following: if you look at CUDA Zone, you&#8217;ll see that most of the reported 3-orders-of-magnitude speed-ups come from academic papers; such a speed-up is very rarely reported for a commercial application.  This rather confirms my belief that most papers written these days (and certainly not only in the HPC area, but in all domains of science) are utter crap.  However, NVIDIA accepted these reports without any further checks and used them as promotional material, so they more than deserve this debunking from the Intel guys.  That said, I would still choose to implement an algorithm on a GPU rather than on a CPU any day: modern CPUs, particularly those architected by Intel/AMD, are a real pain to program, and even more so to optimize and to measure performance on &#8211; these things are just darn over-complicated.  GPUs, on the other hand, have a nice, understandable programming model (not having to mess with data in multiples of four is a relief in itself, and that is just the beginning of the story), and I&#8217;d say it&#8217;s much easier to come up with an optimized version of some code for the GPU than for the CPU.  This is also a factor of great importance in the comparison; it&#8217;s not only about speed-ups&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jhearns</title>
		<link>http://www.linux-mag.com/id/7821/#comment-8580</link>
		<dc:creator>jhearns</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/7821/#comment-8580</guid>
		<description>&lt;p&gt;Blow the doors off?&lt;br /&gt;
Doug, I am shocked at your lack of knowledge of classic British films.&lt;br /&gt;
In The Italian Job the robbers practice blowing open an armoured car somewhere in a quarry in England. The vehicle ends up as a smoking wreck - cue Michael Caine saying &quot;You&#039;re only supposed to blow the bloody doors off&quot;&lt;/p&gt;
&lt;p&gt;http://www.imdb.com/title/tt0064505/quotes&lt;/p&gt;
&lt;p&gt;We shall draw a quiet veil over the Hollywood remake.
&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>Blow the doors off?<br />
Doug, I am shocked at your lack of knowledge of classic British films.<br />
In The Italian Job the robbers practice blowing open an armoured car somewhere in a quarry in England. The vehicle ends up as a smoking wreck &#8211; cue Michael Caine saying &#8220;You&#8217;re only supposed to blow the bloody doors off&#8221;</p>
<p><a href="http://www.imdb.com/title/tt0064505/quotes" rel="nofollow">http://www.imdb.com/title/tt0064505/quotes</a></p>
<p>We shall draw a quiet veil over the Hollywood remake.</p>
]]></content:encoded>
	</item>
</channel>
</rss>