<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: ITIC 2009 Global Server Hardware &amp; Server OS Reliability Survey Results</title>
	<atom:link href="http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/feed/" rel="self" type="application/rss+xml" />
	<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/</link>
	<description>The Time for Business is Now</description>
	<lastBuildDate>Thu, 05 Aug 2010 13:05:53 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: Laura DiDio</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-171</link>
		<dc:creator>Laura DiDio</dc:creator>
		<pubDate>Wed, 11 Nov 2009 15:57:45 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-171</guid>
		<description>Hi, Mike:

Thanks for asking, and good timing. ITIC analysts are right now putting together our year-end surveys, which will also track 2010 trends. The mainframe survey is among them. We expect to &quot;go live&quot; with the mainframe survey in the next six weeks and publicize the results in January 2010. Please feel free to let us know if there are any particular questions you&#039;d like to see included. We&#039;ll do our best to see that ITIC addresses the most pressing and pertinent issues impacting mainframe enterprises.</description>
		<content:encoded><![CDATA[<p>Hi, Mike:</p>
<p>Thanks for asking, and good timing. ITIC analysts are right now putting together our year-end surveys, which will also track 2010 trends. The mainframe survey is among them. We expect to &#8220;go live&#8221; with the mainframe survey in the next six weeks and publicize the results in January 2010. Please feel free to let us know if there are any particular questions you&#8217;d like to see included. We&#8217;ll do our best to see that ITIC addresses the most pressing and pertinent issues impacting mainframe enterprises.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: mike ferrell</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-170</link>
		<dc:creator>mike ferrell</dc:creator>
		<pubDate>Wed, 11 Nov 2009 14:57:44 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-170</guid>
		<description>Hi Laura - this looks like a well-done and honest survey - anything new on the &quot;mainframe&quot; survey?</description>
		<content:encoded><![CDATA[<p>Hi Laura &#8211; this looks like a well-done and honest survey &#8211; anything new on the &#8220;mainframe&#8221; survey?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alvin</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-147</link>
		<dc:creator>Alvin</dc:creator>
		<pubDate>Mon, 21 Sep 2009 20:47:55 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-147</guid>
		<description>One question, please.
For example, take the Unix Sun Solaris on SPARC server category. The annual average numbers of incidents for Tier 1/2/3 are 0.59/0.49/0.1, so the share of incidents for Tier 2 and 3 should be (0.49+0.1)/(0.49+0.59+0.1) = 50%. Why does the chart say that Tier 2/3 combined account for only 25% of all incidents?</description>
		<content:encoded><![CDATA[<p>One question, please.<br />
For example, take the Unix Sun Solaris on SPARC server category. The annual average numbers of incidents for Tier 1/2/3 are 0.59/0.49/0.1, so the share of incidents for Tier 2 and 3 should be (0.49+0.1)/(0.49+0.59+0.1) = 50%. Why does the chart say that Tier 2/3 combined account for only 25% of all incidents?</p>
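<p>As a sanity check on the arithmetic above, here is a minimal Python sketch. The per-tier figures are the ones quoted in this comment; the variable and function names are made up for illustration, and this is not necessarily how the chart itself was computed.</p>

```python
# Annual average incidents per tier, as quoted in the comment above.
incidents = {"tier1": 0.59, "tier2": 0.49, "tier3": 0.10}


def share_of_incidents(counts, tiers):
    """Return the fraction of all incidents that fall in the given tiers."""
    total = sum(counts.values())
    return sum(counts[t] for t in tiers) / total


# Combined share of Tier 2 and Tier 3 incidents.
tier23_share = share_of_incidents(incidents, ["tier2", "tier3"])
print(f"Tier 2/3 share of incidents: {tier23_share:.0%}")
```

<p>With these inputs the combined Tier 2/3 share comes out at 50%, which matches the figure worked out by hand and not the 25% shown on the chart.</p>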
]]></content:encoded>
	</item>
	<item>
		<title>By: Laura DiDio</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-145</link>
		<dc:creator>Laura DiDio</dc:creator>
		<pubDate>Mon, 14 Sep 2009 16:39:10 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-145</guid>
		<description>Hello, Aaron:

I appreciate you taking the time to stop by and leave a comment on the ITIC reliability survey results. In answer to your question: yes, the HP OpenVMS platform was considered. However, in order to include a platform in the final published results, we require 100+ responses so that the results are statistically valid, and we did not receive that many responses for the HP OpenVMS platform. The responses that we did get indicated that the platform was extremely reliable and scored very favorably when compared against the other top distributions. Overall, HP OpenVMS servers had less than 30 minutes of downtime per server, per annum. If you want more information, please send me a follow-up email with specific questions and I&#039;ll be happy to address them.

Thanks again; your comments are appreciated.</description>
		<content:encoded><![CDATA[<p>Hello, Aaron:</p>
<p>I appreciate you taking the time to stop by and leave a comment on the ITIC reliability survey results. In answer to your question: yes, the HP OpenVMS platform was considered. However, in order to include a platform in the final published results, we require 100+ responses so that the results are statistically valid, and we did not receive that many responses for the HP OpenVMS platform. The responses that we did get indicated that the platform was extremely reliable and scored very favorably when compared against the other top distributions. Overall, HP OpenVMS servers had less than 30 minutes of downtime per server, per annum. If you want more information, please send me a follow-up email with specific questions and I&#8217;ll be happy to address them.</p>
<p>Thanks again; your comments are appreciated.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Laura DiDio</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-144</link>
		<dc:creator>Laura DiDio</dc:creator>
		<pubDate>Mon, 14 Sep 2009 16:33:07 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-144</guid>
		<description>Hi, Thomas:

Thanks for taking the time to stop by and leave a comment. The forthcoming ITIC survey on mainframe usage and reliability will definitely include the HP NonStop servers as well as Stratus servers. I concur with your assessment of the very high reliability of the HP platform. Everything I&#039;ve heard from corporate customers over the years echoes what you&#039;ve said: 99.999% uptime is very common among HP and Stratus servers. Stay tuned for future research.</description>
		<content:encoded><![CDATA[<p>Hi, Thomas:</p>
<p>Thanks for taking the time to stop by and leave a comment. The forthcoming ITIC survey on mainframe usage and reliability will definitely include the HP NonStop servers as well as Stratus servers. I concur with your assessment of the very high reliability of the HP platform. Everything I&#8217;ve heard from corporate customers over the years echoes what you&#8217;ve said: 99.999% uptime is very common among HP and Stratus servers. Stay tuned for future research.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Thomas Burg</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-143</link>
		<dc:creator>Thomas Burg</dc:creator>
		<pubDate>Mon, 14 Sep 2009 06:18:38 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-143</guid>
		<description>Laura,

When you start looking at IBM mainframes, you should also consider looking at HP NonStop servers (formerly known as Tandem systems). These are the most reliable systems in the world as far as I know; we are talking &quot;five nines,&quot; or 99.999% uptime.</description>
		<content:encoded><![CDATA[<p>Laura,</p>
<p>When you start looking at IBM mainframes, you should also consider looking at HP NonStop servers (formerly known as Tandem systems). These are the most reliable systems in the world as far as I know; we are talking &#8220;five nines,&#8221; or 99.999% uptime.</p>
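<p>To put the "five nines" figure in perspective, here is a minimal Python sketch that converts an availability percentage into downtime per year. The function name is made up for illustration, and a 365-day year with no separate allowance for planned outages is an assumption, not something taken from the survey.</p>

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def annual_downtime_minutes(availability_percent):
    """Convert an availability percentage into minutes of downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


# Compare a few common availability levels.
levels = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
for label, pct in levels:
    minutes = annual_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.1f} minutes per year")
```

<p>Under these assumptions, five nines works out to a little over five minutes of downtime per server per year.</p>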
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-142</link>
		<dc:creator>Aaron</dc:creator>
		<pubDate>Fri, 11 Sep 2009 21:30:39 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-142</guid>
		<description>I looked at the graphics on your site&#039;s home page, but did not see any reference to the one platform that has traditionally been at the top of this list: HP&#039;s OpenVMS and VMSclusters.

Was it considered and just didn&#039;t show up on the charts?  Or was it not even considered?</description>
		<content:encoded><![CDATA[<p>I looked at the graphics on your site&#8217;s home page, but did not see any reference to the one platform that has traditionally been at the top of this list: HP&#8217;s OpenVMS and VMSclusters.</p>
<p>Was it considered and just didn&#8217;t show up on the charts?  Or was it not even considered?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Timothy</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-141</link>
		<dc:creator>Timothy</dc:creator>
		<pubDate>Wed, 09 Sep 2009 01:49:52 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-141</guid>
		<description>Sounds good, Laura. I&#039;m looking forward to reading more.

It also occurs to me that &quot;Linux&quot; (or even &quot;Red Hat Linux&quot;) is not descriptive enough these days. There&#039;s &quot;Red Hat Enterprise Linux on X86 servers&quot; and &quot;Red Hat Linux on System z servers,&quot; to pick a couple examples. (Solaris X86 and Solaris SPARC is another likely differentiated pair that comes to mind.) Granted, you can cut the data as fine as you want, with patch levels and specific hardware models and configurations. At some point it&#039;s a judgment call. But it would be interesting in a few OS cases to see what impact hardware has (directly or indirectly). In particular, I think (if there are enough survey responses) it would be interesting to see a couple different flavors of Linux iron (X86 and z would be good) and Solaris broken out.

You raise another really good point about the economic situation possibly impairing availability reporting. I think that&#039;s quite likely. But I also see some anecdotal evidence that service qualities (including availability) become even more important in an economic downturn. I think the reason is Darwinian: &quot;survival of the fittest,&quot; basically. When producers/suppliers are more desperate for business, the few remaining customers can be more discriminating, more fickle. And there&#039;s also a general &quot;flight to quality,&quot; particularly for capital goods and high relationship services, because customers don&#039;t want to get stuck doing business with a company that&#039;s going out of business. All of that probably conspires to drive IT service quality requirements up, especially in areas that might attract unwelcome public attention and customer flight such as highly visible availability failures or security breaches.

Thanks for working on this.</description>
		<content:encoded><![CDATA[<p>Sounds good, Laura. I&#8217;m looking forward to reading more.</p>
<p>It also occurs to me that &#8220;Linux&#8221; (or even &#8220;Red Hat Linux&#8221;) is not descriptive enough these days. There&#8217;s &#8220;Red Hat Enterprise Linux on X86 servers&#8221; and &#8220;Red Hat Linux on System z servers,&#8221; to pick a couple examples. (Solaris X86 and Solaris SPARC is another likely differentiated pair that comes to mind.) Granted, you can cut the data as fine as you want, with patch levels and specific hardware models and configurations. At some point it&#8217;s a judgment call. But it would be interesting in a few OS cases to see what impact hardware has (directly or indirectly). In particular, I think (if there are enough survey responses) it would be interesting to see a couple different flavors of Linux iron (X86 and z would be good) and Solaris broken out.</p>
<p>You raise another really good point about the economic situation possibly impairing availability reporting. I think that&#8217;s quite likely. But I also see some anecdotal evidence that service qualities (including availability) become even more important in an economic downturn. I think the reason is Darwinian: &#8220;survival of the fittest,&#8221; basically. When producers/suppliers are more desperate for business, the few remaining customers can be more discriminating, more fickle. And there&#8217;s also a general &#8220;flight to quality,&#8221; particularly for capital goods and high relationship services, because customers don&#8217;t want to get stuck doing business with a company that&#8217;s going out of business. All of that probably conspires to drive IT service quality requirements up, especially in areas that might attract unwelcome public attention and customer flight such as highly visible availability failures or security breaches.</p>
<p>Thanks for working on this.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Laura DiDio</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-140</link>
		<dc:creator>Laura DiDio</dc:creator>
		<pubDate>Tue, 08 Sep 2009 22:57:40 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-140</guid>
		<description>Hi, Timothy:

Thanks for visiting the site and posing your question. You are one of several people who are interested in statistics on the IBM System z reliability. And the answer is: I&#039;m working on it and will conduct a survey on this within the next six (6) weeks. So please be patient.

Your comments on defining and differentiating downtime based on the IT point of view and the end user point of view are very insightful. I&#039;ve struggled with and confronted this definition myself. When I first began doing reliability surveys about six years ago, I quickly realized that the definition of downtime varied according to who answered the question. For example, many IT administrators define unplanned downtime as &lt;strong&gt;any&lt;/strong&gt; event that causes them to take a server offline, regardless of the underlying cause. End users, of course, are sometimes oblivious to or unconcerned with the reason for downtime; they are only concerned with the time, the frequency and the severity of an outage in terms of the network, applications and services being unavailable or inaccessible to them.

Your observations about less critical server outages being under-reported also have a lot of merit and truth. The fact is that IT departments have suffered greatly because of the ongoing economic crunch -- staff, resources and training have been cut, in some cases to the bone. So naturally reporting outages and tracking key metrics like reliability, TCO, ROI, the ability to meet SLAs etc. are all suffering. 

So yes, I definitely consider the issues that you&#039;ve raised and one way to address them is that there has to be much better communication, collaboration and cooperation among C-level executives, IT departments and the physical plant facilities managers AND they must also solicit input from their end users. IT departments must also be much more diligent about keeping track of their own performance and reliability metrics as well as performing post-mortems on what remedial actions they took following a service outage.

Thanks again for your insightful comments. And keep checking back for future survey results.</description>
		<content:encoded><![CDATA[<p>Hi, Timothy:</p>
<p>Thanks for visiting the site and posing your question. You are one of several people who are interested in statistics on the IBM System z reliability. And the answer is: I&#8217;m working on it and will conduct a survey on this within the next six (6) weeks. So please be patient.</p>
<p>Your comments on defining and differentiating downtime based on the IT point of view and the end user point of view are very insightful. I&#8217;ve struggled with and confronted this definition myself. When I first began doing reliability surveys about six years ago, I quickly realized that the definition of downtime varied according to who answered the question. For example, many IT administrators define unplanned downtime as <strong>any</strong> event that causes them to take a server offline, regardless of the underlying cause. End users, of course, are sometimes oblivious to or unconcerned with the reason for downtime; they are only concerned with the time, the frequency and the severity of an outage in terms of the network, applications and services being unavailable or inaccessible to them.</p>
<p>Your observations about less critical server outages being under-reported also have a lot of merit and truth. The fact is that IT departments have suffered greatly because of the ongoing economic crunch &#8212; staff, resources and training have been cut, in some cases to the bone. So naturally reporting outages and tracking key metrics like reliability, TCO, ROI, the ability to meet SLAs etc. are all suffering. </p>
<p>So yes, I definitely consider the issues that you&#8217;ve raised and one way to address them is that there has to be much better communication, collaboration and cooperation among C-level executives, IT departments and the physical plant facilities managers AND they must also solicit input from their end users. IT departments must also be much more diligent about keeping track of their own performance and reliability metrics as well as performing post-mortems on what remedial actions they took following a service outage.</p>
<p>Thanks again for your insightful comments. And keep checking back for future survey results.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Timothy</title>
		<link>http://itic-corp.com/blog/2009/07/itic-2009-global-server-hardware-server-os-reliability-survey-results/comment-page-1/#comment-139</link>
		<dc:creator>Timothy</dc:creator>
		<pubDate>Tue, 08 Sep 2009 22:35:09 +0000</pubDate>
		<guid isPermaLink="false">http://itic-corp.com/?p=205#comment-139</guid>
		<description>Do you have any IBM System z (mainframe) results, by operating system (z/OS, Linux, etc.)?

Also, one of the difficulties I find in understanding downtime is the difference between the IT point of view and the end user point of view. In my opinion the end user point of view is the only one that matters -- what could also be called &quot;business service delivery.&quot; (It&#039;s the CEO, not the CIO, basically.) The end user would count planned outages (as well as unplanned) and &quot;System is slow/I can&#039;t get my job done&quot; issues as downtime. The IT-centric view often (wrongly, I think) excludes both.

And then there&#039;s the &quot;if a tree falls in the woods...&quot; problem. That is, if there&#039;s an outage, how quickly would anyone notice? That doesn&#039;t matter as much in *relative* rankings if each type of server is deployed, in equal distributions, to similar critical roles. But of course we know that&#039;s not true: some types of servers are deployed more frequently to roles where outages are more visible and more immediately detected. If server types are deployed correctly, with the most reliable and available servers deployed to the more critical (and visible) roles, then this phenomenon would tend to compress the reported outage times. (Outages among less critical servers would be less quickly reported, on average, so the outage durations would be more under reported. &quot;When we came into the office in the morning, the server was down&quot; sort of thing.) There is also the likely factor that more critical server types are deployed with more reliable surrounding infrastructure: better and more redundant networks, as a major example.

I&#039;m wondering if you&#039;ve given some thought to these issues and whether you have any ideas for how to control for them.</description>
		<content:encoded><![CDATA[<p>Do you have any IBM System z (mainframe) results, by operating system (z/OS, Linux, etc.)?</p>
<p>Also, one of the difficulties I find in understanding downtime is the difference between the IT point of view and the end user point of view. In my opinion the end user point of view is the only one that matters &#8212; what could also be called &#8220;business service delivery.&#8221; (It&#8217;s the CEO, not the CIO, basically.) The end user would count planned outages (as well as unplanned) and &#8220;System is slow/I can&#8217;t get my job done&#8221; issues as downtime. The IT-centric view often (wrongly, I think) excludes both.</p>
<p>And then there&#8217;s the &#8220;if a tree falls in the woods&#8230;&#8221; problem. That is, if there&#8217;s an outage, how quickly would anyone notice? That doesn&#8217;t matter as much in *relative* rankings if each type of server is deployed, in equal distributions, to similar critical roles. But of course we know that&#8217;s not true: some types of servers are deployed more frequently to roles where outages are more visible and more immediately detected. If server types are deployed correctly, with the most reliable and available servers deployed to the more critical (and visible) roles, then this phenomenon would tend to compress the reported outage times. (Outages among less critical servers would be less quickly reported, on average, so the outage durations would be more under reported. &#8220;When we came into the office in the morning, the server was down&#8221; sort of thing.) There is also the likely factor that more critical server types are deployed with more reliable surrounding infrastructure: better and more redundant networks, as a major example.</p>
<p>I&#8217;m wondering if you&#8217;ve given some thought to these issues and whether you have any ideas for how to control for them.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
