<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: I Have (HPC) Issues</title>
	<atom:link href="http://www.linux-mag.com/id/4110/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.linux-mag.com/id/4110/</link>
	<description>Open Source, Open Standards</description>
	<lastBuildDate>Sat, 05 Oct 2013 13:48:18 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
	<item>
		<title>By: clusterman</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4585</link>
		<dc:creator>clusterman</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4585</guid>
		<description>I have issues with vendors.  One of the biggest challenges we face is knowing when and how to update the firmware on all the cluster components.  Vendors don&#039;t seem to understand how to do this either.  For example, you call in a hardware issue and the help desk tells you that you need to update firmware.  We tell them we have 500 nodes all running the same revision and they are not having the issue.  The help desk says they should all be updated.  Therein lies the next problem.  The help desk tells you to download this ISO and boot your server off it.  Hello!  We have 500 of these.  We actually had to do this manually one time because the firmware on our hard drives was bad and the only way to update them was to boot a floppy and perform a DOS-level update.  Vendors need to develop automated tools for updating clusters with one command.  The grid is designed so that you can submit in one place and a job goes off and does work at all the other locations.  Why don&#039;t the vendors get this?  They want to sell you a cluster but they want to perform break/fix on a per-system level.  Very frustrating.  We actually had the vendor professional services team out here to help us build our last cluster.  We could not believe our eyes when the engineer proceeded to do one system at a time.  It was very painful.  We&#039;ve since home grown a lot of our own tools but it is just ridiculous that the vendors aren&#039;t providing this functionality.  Does anyone else have this issue?  Maybe it&#039;s just the vendor we are working with?</description>
		<content:encoded><![CDATA[<p>I have issues with vendors.  One of the biggest challenges we face is knowing when and how to update the firmware on all the cluster components.  Vendors don&#8217;t seem to understand how to do this either.  For example, you call in a hardware issue and the help desk tells you that you need to update firmware.  We tell them we have 500 nodes all running the same revision and they are not having the issue.  The help desk says they should all be updated.  Therein lies the next problem.  The help desk tells you to download this ISO and boot your server off it.  Hello!  We have 500 of these.  We actually had to do this manually one time because the firmware on our hard drives was bad and the only way to update them was to boot a floppy and perform a DOS-level update.  Vendors need to develop automated tools for updating clusters with one command.  The grid is designed so that you can submit in one place and a job goes off and does work at all the other locations.  Why don&#8217;t the vendors get this?  They want to sell you a cluster but they want to perform break/fix on a per-system level.  Very frustrating.  We actually had the vendor professional services team out here to help us build our last cluster.  We could not believe our eyes when the engineer proceeded to do one system at a time.  It was very painful.  We&#8217;ve since home grown a lot of our own tools but it is just ridiculous that the vendors aren&#8217;t providing this functionality.  Does anyone else have this issue?  Maybe it&#8217;s just the vendor we are working with?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mdomsch</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4586</link>
		<dc:creator>mdomsch</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4586</guid>
		<description>Dell has a set of open source tools called &#039;firmware-tools&#039; which aims to solve this generically.  http://linux.dell.com/firmware-tools/ is the home page.&lt;br /&gt;
&lt;br /&gt;
Thanks,&lt;br /&gt;
Matt Domsch&lt;br /&gt;
--&lt;br /&gt;
Linux Technology Strategist, Dell Office of the CTO&lt;br /&gt;
linux.dell.com &amp; www.dell.com/linux</description>
		<content:encoded><![CDATA[<p>Dell has a set of open source tools called &#8216;firmware-tools&#8217; which aims to solve this generically.  <a href="http://linux.dell.com/firmware-tools/" rel="nofollow">http://linux.dell.com/firmware-tools/</a> is the home page.</p>
<p>Thanks,<br />
Matt Domsch<br />
&#8211;<br />
Linux Technology Strategist, Dell Office of the CTO<br />
linux.dell.com &amp; <a href="http://www.dell.com/linux" rel="nofollow">http://www.dell.com/linux</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kavalerov</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4587</link>
		<dc:creator>kavalerov</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4587</guid>
		<description>A lot of it is self-inflicted. The DIY approach is wrong. Why not pay a vendor to do the low-level work for you? Can&#039;t afford it? That&#039;s the problem that needs to be solved.&lt;br /&gt;
&lt;br /&gt;
Example: firmware updates on a live system. Easy to do on IBM pSeries, always has been, at least since AIX 4.3.3. Can also do it remotely, on any number of nodes. No need to reboot, or even degrade your run level. It does it for your service processor, your hard drives, storage adapters, anything.</description>
		<content:encoded><![CDATA[<p>A lot of it is self-inflicted. The DIY approach is wrong. Why not pay a vendor to do the low-level work for you? Can&#8217;t afford it? That&#8217;s the problem that needs to be solved.</p>
<p>Example: firmware updates on a live system. Easy to do on IBM pSeries, always has been, at least since AIX 4.3.3. Can also do it remotely, on any number of nodes. No need to reboot, or even degrade your run level. It does it for your service processor, your hard drives, storage adapters, anything.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: nospamou</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4588</link>
		<dc:creator>nospamou</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4588</guid>
		<description>To Matt Domsch from Dell...I had the exact same issue as in the article. I had to upgrade the drive firmware on 500 servers, and they were Dell. I contacted Dell technical support and they didn&#039;t have any workaround for automating this. Something for Dell to work on, as everything else I&#039;ve had to upgrade has worked like a charm. Maybe when the VMware hypervisor is built into the motherboard, system upgrades will become a lot easier and won&#039;t require a reboot. We&#039;ll see.</description>
		<content:encoded><![CDATA[<p>To Matt Domsch from Dell&#8230;I had the exact same issue as in the article. I had to upgrade the drive firmware on 500 servers, and they were Dell. I contacted Dell technical support and they didn&#8217;t have any workaround for automating this. Something for Dell to work on, as everything else I&#8217;ve had to upgrade has worked like a charm. Maybe when the VMware hypervisor is built into the motherboard, system upgrades will become a lot easier and won&#8217;t require a reboot. We&#8217;ll see.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: tuccillo</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4589</link>
		<dc:creator>tuccillo</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4589</guid>
		<description>I am not sure what you mean when you say &quot;Programming and optimizing for this new model is now much harder.&quot; (referring to SMP nodes in a cluster). I have been using SMP nodes in clusters for 10 years and never had any issues. Typically, I assign as many MPI tasks to a node as cores. This is really a function performed by a batch scheduler. I also started developing the hybrid approach (MPI/OpenMP) many years ago for those instances where combining MPI and OpenMP provided a performance advantage over pure MPI (using the same number of cores). I have seen very few examples, however, where this is true. Most of the time, I assign as many MPI tasks to a node as cores. The message passing model doesn&#039;t care where the processes are located from a functionality point of view. In a very few instances there was a performance impact, but arbitrary assignment of MPI tasks to arbitrary nodes is a feature of some batch schedulers. For the vast majority of MPI programs I have seen, the number of cores on a node is just not an issue.</description>
		<content:encoded><![CDATA[<p>I am not sure what you mean when you say &#8220;Programming and optimizing for this new model is now much harder.&#8221; (referring to SMP nodes in a cluster). I have been using SMP nodes in clusters for 10 years and never had any issues. Typically, I assign as many MPI tasks to a node as cores. This is really a function performed by a batch scheduler. I also started developing the hybrid approach (MPI/OpenMP) many years ago for those instances where combining MPI and OpenMP provided a performance advantage over pure MPI (using the same number of cores). I have seen very few examples, however, where this is true. Most of the time, I assign as many MPI tasks to a node as cores. The message passing model doesn&#8217;t care where the processes are located from a functionality point of view. In a very few instances there was a performance impact, but arbitrary assignment of MPI tasks to arbitrary nodes is a feature of some batch schedulers. For the vast majority of MPI programs I have seen, the number of cores on a node is just not an issue.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jmcculloch</title>
		<link>http://www.linux-mag.com/id/4110/#comment-4590</link>
		<dc:creator>jmcculloch</dc:creator>
		<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
		<guid isPermaLink="false">http://www.linux-mag.com/id/4110/#comment-4590</guid>
		<description>Find another vendor.  We employ a PXE-bootable DOS image to apply firmware updates and CMOS configuration changes on a large scale.&lt;br /&gt;
&lt;br /&gt;
Another option is to use a package like IBM Director with the Remote Deployment Manager add-on.  It works with most server vendors.&lt;br /&gt;
&lt;br /&gt;
Furthermore, if your cluster management package supports a parallel shell, you can often copy the firmware update package locally to each system and execute it at the command prompt.  This usually requires rebooting the cluster to apply the update.&lt;br /&gt;
&lt;br /&gt;
Regards,&lt;br /&gt;
John McCulloch&lt;br /&gt;
PCPC Direct Ltd.&lt;br /&gt;
http://www.pcpcdirect.com</description>
		<content:encoded><![CDATA[<p>Find another vendor.  We employ a PXE-bootable DOS image to apply firmware updates and CMOS configuration changes on a large scale.</p>
<p>Another option is to use a package like IBM Director with the Remote Deployment Manager add-on.  It works with most server vendors.</p>
<p>Furthermore, if your cluster management package supports a parallel shell, you can often copy the firmware update package locally to each system and execute it at the command prompt.  This usually requires rebooting the cluster to apply the update.</p>
<p>Regards,<br />
John McCulloch<br />
PCPC Direct Ltd.<br />
<a href="http://www.pcpcdirect.com" rel="nofollow">http://www.pcpcdirect.com</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>