For the third year in a row, IBM’s AIX v7.1 UNIX operating system (OS), running on the company’s Power System servers, scored the highest reliability ratings and recorded the least overall downtime from Tier 1, Tier 2 and Tier 3 outages among 18 different server OS platforms.
More than three-quarters (78%) of survey respondents indicated they experienced fewer than one of the most prevalent, minor Tier 1 incidents per server, per annum on IBM’s AIX v5.3 and AIX v7.1 distributions. An 83% majority of IBM AIX v7.1 and Novell SUSE Linux Enterprise Server 11 respondents, and 82% of Windows Server 2008 R2 respondents, indicated their organizations experienced fewer than one unplanned, severe/lengthy Tier 3 outage per server, per annum (see Exhibit 1).
Microsoft’s Windows Server 2008 R2 (which scored the biggest year-over-year reliability gains) and Novell’s SUSE Linux Enterprise Server 11 closely challenged IBM’s AIX v7.1 in server OS reliability and uptime, particularly with respect to the most severe and costly Tier 3 outages. Unplanned Tier 3 outages, whether manmade or the result of a disaster, typically cause downtime in excess of four hours. They cause widespread disruption of applications and network operations, frequently affect customers and business partners, and almost always require remediation by a significant portion of the IT staff.
Those are the results of the ITIC 2011 Global Server Hardware and OS Reliability Survey. ITIC partnered with GFI Software (formerly Sunbelt Software) to conduct this independent Web-based survey, which polled C-level executives and IT managers at over 500 corporations in 23 countries between November 2010 and April 2011. ITIC updated and revised the survey in October 2011. While the results of this latest survey were very similar, they also reflect the turmoil and controversy surrounding some of the vendors, most notably Hewlett-Packard and Oracle, that have the potential to impact reliability in the future.
On the server hardware side, IBM, Stratus Technologies, Fujitsu and Hewlett-Packard servers (in that order) were the four most reliable platforms, achieving an average 99.99% or 99.999% uptime per server, per annum. That equates to roughly 5.25 minutes (99.999%) to 52 minutes (99.99%) of unplanned annual downtime per server.
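
The conversion from an uptime percentage to annual downtime is simple arithmetic, and a minimal sketch is shown here for readers who want to check the figures or try other availability levels. The function name `annual_downtime_minutes` is hypothetical, and the calculation assumes a 365-day year (525,600 minutes); neither comes from the survey itself.

```python
# Assumption: a non-leap year of 365 days, i.e. 525,600 minutes.
MINUTES_PER_YEAR = 365 * 24 * 60


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the unplanned downtime per server, per year, in minutes,
    implied by a given uptime percentage (hypothetical helper)."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    for pct in (99.99, 99.999):
        minutes = annual_downtime_minutes(pct)
        print(f"{pct}% uptime -> {minutes:.2f} minutes/year")
```

Under these assumptions, 99.99% uptime works out to about 52.56 minutes of downtime a year and 99.999% to about 5.26 minutes, consistent with the rounded figures quoted above.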
The survey data indicated that the inherent reliability and uptime of all the major server OS and server hardware distributions have improved significantly over the past several years. In particular, IBM AIX and Novell SUSE Linux Enterprise Server have consistently maintained the high reliability scores their platforms recorded in the prior ITIC 2008 and 2009 Global Server Hardware and Server Operating System surveys. IBM AIX and Novell SUSE users also praised the manageability of those server operating systems and the technical service and support available for them.
Microsoft’s Windows Server 2008 and Windows Server 2008 R2 scored impressive reliability gains in the 2011 survey compared to the prior 2008 and 2009 polls. Survey respondents now rank Windows Server 2008 R2 among the top three most reliable mainstream server operating systems. This reliability renaissance is especially impressive since Microsoft’s Windows Server OS noticeably lagged behind the majority of the UNIX, Linux and Open Source distributions in the ITIC/Sunbelt Software (now GFI Software) 2008 and 2009 Server Reliability surveys. Among the other survey highlights:
- Security: The biggest surprise of the survey was the strong security showing by Windows Server 2008 R2.
- Highest Percentage of Severe Tier 3 Outages: Oracle’s Solaris 10 running on SPARC hardware had the highest percentage of survey respondents (16%) who said they experienced anywhere from three to more than 12 unplanned, prolonged Tier 3 outages per server, per annum, each lasting more than four hours.
- Lowest Percentage of Severe Tier 3 Outages: By contrast, just 5% of both IBM AIX v7.1 and Microsoft Windows Server 2008 R2 survey participants reported experiencing anywhere from three to more than 12 unplanned Tier 3 outages of more than four hours’ duration per server, per annum.
- Integration and Interoperability Issues: Integration and interoperability problems (e.g., incompatible drivers) and problems applying patches are the most common causes of protracted, unplanned downtime. These problems are exacerbated when IT managers are forced to spend valuable time searching for fixes, or when the vendor has not yet recognized the issue and no fix exists.
- Server Age: A 57% majority of respondents said their servers, particularly the critical main line of business servers, are between one and three years old. Keeping the server hardware updated has a major impact on reliability. One in five survey respondents (20%) said their servers were three to four years old.
- Server Refresh Rates: One-quarter (25%) of businesses refresh their main line of business server hardware “as needed,” and 10% said they upgrade a portion of their servers annually. However, 17% of survey participants said their organizations refreshed their main line of business server hardware every five to six years, and another 15% indicated they had no specific timetable.