
IBM z Systems Enterprise and IBM Power Systems Servers Most Reliable for Ninth Straight Year; Lenovo x86 Servers Deliver Highest Uptime/Availability Among All Intel x86-Based Systems

For the ninth year in a row, corporate enterprise users said IBM’s z Systems Enterprise mainframe class server achieved near flawless reliability, recording less than 10 seconds of unplanned per server downtime each month. Among mainstream servers, IBM Power Systems and the Lenovo x86 platform delivered the highest levels of reliability/uptime across 14 server hardware platforms and 11 different server virtualization platforms.

Those are the results of the ITIC 2017 Global Server Hardware and Server OS Reliability survey which polled 750 organizations worldwide during April/May 2017.

Among the top survey findings:

  • IBM z Systems Enterprise mainframe class systems had the lowest incidence – 0% – of more than four hours of per server/per annum downtime of any hardware platform. Specifically, IBM z Systems mainframe class servers exhibit true mainframe fault tolerance, experiencing just 0.96 minutes of unplanned per server annual downtime. That equates to roughly five seconds per month or, “blink and you miss it,” about one second of unplanned weekly downtime (see the conversion sketch following this list). This is an improvement over the 1.12 minutes of per server/per annum downtime the z Systems servers recorded in ITIC’s 2016 – 2017 Reliability poll nine months ago.
  • Among mainstream hardware platforms, IBM Power Systems and Lenovo System x running Linux have the least unplanned downtime – 2.5 and 2.8 minutes per server/per year, respectively – of any mainstream Linux server platforms.
  • 88% of IBM Power Systems and 87% of Lenovo System x users running RHEL, SuSE or Ubuntu Linux experience fewer than one unplanned outage per server, per year.
  • Only two percent of IBM and Lenovo servers recorded more than four hours of unplanned per server/per annum downtime, followed by six percent of HPE servers, eight percent of Dell servers and 10% of Oracle servers.
  • IBM and Lenovo hardware and the Linux operating system distributions were either first or second in every reliability category, including virtualization and security.
  • Lenovo x86 servers achieved the highest reliability ratings among all competing x86 platforms.
  • Lenovo takes top marks for technical service and support: Lenovo’s tech support was rated best, followed by Cisco and IBM.
  • Some 66% of survey respondents said aged hardware (3 ½+ years old) had a negative impact on server uptime and reliability vs. 21% who said it had not impacted reliability/uptime. This is a 22 percentage point increase from the 44% who said outmoded hardware negatively impacted uptime in 2014.
  • Reliability continues to decline for the fifth year in a row on HP ProLiant and Oracle’s SPARC & x86 hardware and the Solaris OS. Reliability on the Oracle platforms declined slightly, mainly due to aging hardware. Many Oracle hardware customers are eschewing upgrades, opting instead to migrate to rival platforms.
  • Some 16% of Oracle customers rated service & support as Poor or Unsatisfactory. Dissatisfaction with Oracle licensing and pricing policies has remained consistently high for the last three years.
  • Only 1% of Cisco, 1% of Dell, 1% of IBM and Lenovo, 3% of HP, 3% of Fujitsu and 4% of Toshiba users gave those vendors “Poor” or “Unsatisfactory” customer support ratings.
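
The downtime figures above convert between reporting periods by simple arithmetic. The minimal Python sketch below is illustrative only; the annual minutes are ITIC’s survey figures, and the sketch merely reproduces the per-month and per-week numbers cited for the leading platforms:

```python
# Minimal sketch (illustrative only): convert the survey's annual
# unplanned-downtime figures, in minutes per server per year, into
# their per-month and per-week equivalents.

def downtime_breakdown(annual_minutes: float) -> tuple[float, float]:
    """Return (seconds per month, seconds per week)."""
    annual_seconds = annual_minutes * 60
    return annual_seconds / 12, annual_seconds / 52

# The annual figures are ITIC's survey data, not computed here.
for platform, minutes in [("IBM z Systems", 0.96),
                          ("IBM Power Systems", 2.5),
                          ("Lenovo System x", 2.8)]:
    per_month, per_week = downtime_breakdown(minutes)
    print(f"{platform}: {per_month:.1f} s/month, {per_week:.1f} s/week")
# IBM z Systems: 4.8 s/month, 1.1 s/week
```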

And continuing a trend that has manifested over the past three years, human error and security, respectively, are the chief issues that negatively impact server hardware/server operating system reliability and cause downtime.

Unsurprisingly, in the 21st century digital age, the functionality and reliability of the core foundational server hardware and server operating systems are more crucial than ever. The server hardware and the server OSes are the bedrock upon which the organization’s mainstream line of business (LOB) applications rest. High reliability and near-continual system and application availability are imperative for organizations’ core on-premises, cloud-based and Network Edge/Perimeter environments. Infrastructure, irrespective of location, is essential to the overall health of business operations.

The inherent reliability and robustness of server hardware and the server operating systems are the single most critical factors that influence, impact and ultimately determine the uptime and availability of mission critical line of business applications, the virtual machines (VMs) that run on top of them and the connectivity devices that access them.

On a positive note, the inherent reliability of server hardware and server operating system software, as well as advancements in the underlying processor technology, continues to improve year over year. But the survey results also reveal that external issues, most notably human error and security breaches, have assumed greater significance in undermining system and network accessibility and performance.

The overall health of network operations, applications, management and security functions depends on the core foundation elements: server hardware, server operating systems and virtualization delivering high availability, robust management and solid security. The reliability of the server, server OS and virtualization platforms forms the foundation of the entire network infrastructure. The individual and collective reliability of these platforms has a direct, immediate and long-lasting impact on daily operations and business results.

The ITIC survey also polled customers on the minimum acceptable reliability requirements for their organizations’ main line of business servers and applications.

Reliability Trends

  • Majority of corporations need “four nines” of uptime. Some 79% of corporations now require a minimum of 99.99% uptime for mission critical hardware, operating systems and main line of business (LOB) applications. This is a 30 percentage point increase from the 49% of respondents who said their firms required a minimum “four nines” of uptime in the 2014 survey.
  • Cost of hourly downtime increases: 98% of firms say hourly downtime costs exceed $150K; 31% of respondents estimate hourly downtime costs their companies up to $400K, a seven percent increase from the 2014 survey; and 33% indicate that one hour of downtime now costs $1M to over $5M.
  • Security, BYOD and mobility pose the biggest technology threats to reliability
  • Technical service & support and fast, efficient vendor responsiveness are crucial
  • The overall top issues negatively impacting network reliability are:
  • Human error (e.g., misconfiguration, failure to right-size server workloads, etc.) – 80% vs. 49% in the 2015 poll
  • Complexity involving the provisioning, deployment and usage of new technologies, e.g. Data Analytics, IoT, Network Edge/Perimeter and mobile apps
  • Increased workloads on aging hardware

Human Error Overtakes Security as Chief Cause of Downtime

The survey also showed that the three technology issues of most concern this year are Security, Disaster Recovery and Backup, and Business Continuity. At the same time, the survey results find that 80% of respondents cited human error as the chief culprit of unplanned downtime, surpassing security issues, which were pinpointed by 59% of those polled.

Additionally, ITIC’s latest 2017 Reliability research reveals that a variety of external factors are having a more direct impact on system downtime and overall availability. These include overworked and understaffed IT departments; the rapid mainstream adoption of complex new technologies such as the aforementioned IoT, Big Data Analytics, virtualization and increasing cloud computing deployments; and the continuing proliferation of BYOD and mobility technologies.

In the context of its Reliability Surveys, ITIC broadly defines human error to encompass both the technology and business mistakes organizations make with respect to their network equipment and strategies.

Human error as it relates to technology includes but is not limited to:

  • Configuration, deployment and management mistakes
  • Failure to upgrade or right-size servers to accommodate more data- and compute-intensive workloads.
  • Failure to migrate and upgrade outmoded applications that are no longer supported by the vendor.
  • Failure to keep up to date on patches and security fixes.

Human error with respect to business issues includes:

  • Failure to allocate the appropriate Capital Expenditure and Operational Expenditure funds for equipment purchases and ongoing management and maintenance functions
  • Failure to devise, implement and upgrade the necessary computer and network infrastructure to address issues like cloud computing, mobility, remote access and Bring Your Own Device (BYOD).
  • Failure to construct and enforce strong computer and network security policies.
  • Ignorance of Total Cost of Ownership (TCO) and Return on Investment (ROI).
  • Failure to track hourly downtime costs.
  • Failure to track and assess the impact of Service Level Agreements (SLAs) and regulatory compliance issues like Sarbanes-Oxley (SOX) and the Health Insurance Portability and Accountability Act (HIPAA).

Conclusions

Reliability is and will continue to be among the most crucial metrics in the organization. Improvements or declines in reliability can either mitigate or increase technical and business risks to the organization’s end users and its external customers.  The ability to meet service-level agreements (SLAs) hinges on server reliability, uptime and manageability. These are key indicators that enable organizations to determine which server operating system platform or combination thereof is most suitable.

To ensure business continuity and increase end user productivity, it is imperative that businesses maximize the reliability and uptime of their server hardware and server operating systems. A 79% majority of corporations now require “four nines” or 99.99% minimum uptime. Organizations are advised to “right size” their server hardware to accommodate increased workloads and larger applications. Businesses should also regularly replace, retrofit and refresh their server hardware and server operating systems with the necessary patches, updates and security fixes as needed to maintain system health. At the same time, server hardware and server operating system vendors should be up front and provide their customers with realistic recommendations for system configurations to achieve optimal performance. Vendors also bear the responsibility to deliver patches, fixes and updates in a timely manner and to inform customers to the best of their ability regarding any known incompatibility issues that may potentially impact performance. Vendors should also be honest with customers in the event there is a problem or delay with delivering replacement parts.


Protecting and maintaining brand reputation is essential for any company. As a result, enterprises must proactively monitor and manage all activities, operational and experiential, that influence a consumer’s overall brand experience. Ignorance involving any aspect of business operations can have ongoing, significant consequences: it can damage a corporation’s reputation; adversely impact customers; result in operational inefficiencies, business losses and potential litigation; and even lead to criminal penalties. It also raises the corporation’s risk of non-compliance with crucial local, state, federal and international industry regulations.

This is especially true for firms in fast-paced, competitive and highly regulated industries, including but not limited to the food, hospitality, hotel, restaurant, retail and transportation vertical markets. Typically, these organizations have dozens, hundreds or even thousands of stores, restaurants and hotels located in multiple, geographically remote locations. They must collect, aggregate and analyze a veritable data deluge in real time. And they must respond proactively and take preventative measures to correct issues as they arise. Organizations that do business across multiple states and internationally face other challenges. They must synchronize and integrate processes and data across the entire enterprise. Businesses must also ensure that every restaurant, hotel or retail store in the chain achieves and maintains compliance with a long list of complex standards and health and safety laws.

ITIC’s research indicates that companies across a wide range of industries are deploying a new class of Quality Experience Management software. These solutions let businesses access the latest information on daily operations, policies, procedures and safety mechanisms in an automated fashion. They also let companies take preventative and remedial action irrespective of time, distance or physical location.

Quality Experience Management software with built-in Business Intelligence tools can deliver immediate and long-term benefits and protect the corporate brand. ITIC’s customer-based research shows that RizePoint, based in Salt Lake City, UT – with 20 years’ experience in audit compliance monitoring, reporting and correction – is the clear market leader. Its software delivers brand protection and risk mitigation with mobile and cloud capabilities, increasing efficiency and productivity.

Overview

Non-compliance is expensive, risky and unacceptable.

This is true of all companies – from the smallest organizations with fewer than 25 employees to the largest multinational global enterprises of over 10,000 employees – regardless of vertical market.

Damage to the corporation’s brand and reputation due to non-compliance can be severe and protracted. In worst-case scenarios, a company’s brand may be so irreparably damaged that the firm goes bankrupt or out of business, altogether.

Firms that run afoul of requirements by authorities like the U.S. Food and Drug Administration (FDA) and the Occupational Safety and Health Administration (OSHA), as well as international compliance and standards bodies like the European Union (EU) and the Association of Southeast Asian Nations (ASEAN), risk severe criminal, civil and legal penalties.

These include fines, inspections/re-inspections and even jail time for company executives. The penalties may cost organizations thousands, tens of thousands and, in extreme cases, hundreds of thousands of dollars for each separate incident or occurrence. Additionally, non-compliant corporations are at high risk of litigation from their business partners, vendors and customers at every juncture in the increasingly global supply chain. Corporations can also be sued by consumers.

The direct and indirect monetary penalties for regulatory non-compliance are high and growing ever more expensive as governments and their regulatory agencies enact new laws to safeguard their various global supply chains.

An organization’s reputation is one of its most valuable assets. Safeguarding the integrity of the corporate brand and the reliability of its daily operations via automated, Quality Experience Management software is imperative for every organization that expects to thrive and maintain the confidence and satisfaction of its customers – and comply with laws.

In the 21st century digital age of interconnected networks, no vertical industry is a standalone silo. Every market segment belongs to a macrocosm in which vendors and customers are linked by their interdependencies. Organizations that participate in the agri-food supply chain are also impacted by operations in a wide range of verticals, including finance; healthcare; retail; services; transportation (airlines, shipping and trucking); and weather, to name a few. In managing daily operations, companies in the food, restaurant, hospitality, hotel and retail industries must pay close attention to the goings-on in those industries. Any issues with weather or shipping, for example, could potentially disrupt business. The symbiotic and global nature of today’s business environment requires agility, scalability, innovation and automated real-time access to analyze data. Quality Experience Management software addresses the challenges of the data deluge.

Businesses that prosper in today’s complex economic environment and supply chain are increasingly turning to automated quality experience management tools that enable them to assess risks and make informed decisions. These solutions, which incorporate advanced data analytics and BI functionality, enable organizations to collect and analyze a wide variety of data. This involves monitoring ongoing daily, monthly, quarterly and annual operations, comparing and contrasting pricing and purchasing trends to gain a competitive edge over rivals, and increasing brand visibility in competitive markets. Fortunately, quality experience management tools are not only more affordable today than in the past, but they are easier to deploy, provision and use.

This emerging class of quality experience management tools obviates the need to use outmoded manual spreadsheets to perform and document the results of audits and inspections. Using spreadsheets to manually input data is complex and time-consuming. Additionally, it often fails to capture key pieces of data, and manual updates are more error-prone. By contrast, quality experience management software tools that incorporate BI and analytics capabilities deliver immediate and tangible business benefits across a wide variety of vertical markets. These include:

  • Inspection information is now quickly and efficiently collected and shared.
  • Issues and concerns are immediately shared with regional executive management and the pertinent local managers, who can take immediate corrective and preventative action.
  • Line of business managers and C-level executives can use the audit data generated – including policies, procedures, revenue, safety issues, inspection results and food and beverage quality across multiple locations throughout the enterprise – to identify and compare key trends in revenue, spending, pricing and compliance (a minimal illustration follows this list). This in turn empowers the business to make strategic business and technology decisions that can positively impact the bottom line, deliver a competitive advantage and avoid compliance issues or safety hazards before they occur.
  • Improved productivity among the corporation’s workers and increased customer satisfaction.
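
To make the compare/contrast idea concrete, the short Python sketch below aggregates audit scores by location and flags sites that fall below a passing average. It is a hypothetical illustration: the field names, scores and threshold are invented and do not represent RizePoint’s actual schema or API.

```python
# Hypothetical illustration: aggregate audit scores by location and
# flag sites below a compliance threshold. All data is invented.

from collections import defaultdict

audits = [
    {"location": "Store 12", "category": "food_safety", "score": 92},
    {"location": "Store 12", "category": "food_safety", "score": 88},
    {"location": "Store 47", "category": "food_safety", "score": 71},
    {"location": "Store 47", "category": "food_safety", "score": 69},
]

scores_by_location = defaultdict(list)
for record in audits:
    scores_by_location[record["location"]].append(record["score"])

PASSING_AVERAGE = 80  # hypothetical minimum passing average
for location, scores in sorted(scores_by_location.items()):
    average = sum(scores) / len(scores)
    status = "REVIEW" if average < PASSING_AVERAGE else "OK"
    print(f"{location}: average {average:.1f} -> {status}")
# Store 12: average 90.0 -> OK
# Store 47: average 70.0 -> REVIEW
```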

Conclusions

Maintaining and safeguarding a brand’s reputation requires consistent adherence to brand standards. RizePoint is clearly positioned as the Quality Experience Management market leader, with accelerated customer interest and adoption, and a clear and visionary product road-map that includes specific feature value for customers in the food service, retail and hospitality industries. RizePoint makes that easier by letting clients tap into the latest mobile and cloud technology.

That technology also makes it easier for corporations to ensure regulatory compliance. Clients can then use their next-generation auditing solution as the launching pad for important new business initiatives and opportunities.

Corporate enterprises across all vertical market segments should view auditing and compliance as core requirements. They are absolutely essential to guarantee that the organization effectively monitors, manages and maintains the highest policies and procedures for its ongoing daily operations. Quality Experience Management software solutions that incorporate analytics and BI capabilities also serve as tactical tools and strategic competitive assets that help organizations in the food, hospitality, hotel and restaurant verticals better meet and serve the needs of every customer, business partner and supplier in their supply chains.


The cost of downtime continues to increase, as do the business risks. An 81% majority of organizations now require a minimum of 99.99% availability. This is the equivalent of 52 minutes of unplanned downtime per year for mission critical systems and applications, or just 4.33 minutes of unplanned outage per month for servers, applications and networks.
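
The “nines” arithmetic behind these figures is straightforward, as the minimal Python sketch below shows. It assumes a 365-day year; the exact four-nines budget works out to 52.56 minutes per year, commonly rounded to the 52 minutes per year and 4.33 minutes per month quoted above.

```python
# Minimal sketch of the "nines" arithmetic, assuming a 365-day year.

def downtime_budget_minutes(availability_pct: float) -> dict[str, float]:
    """Allowed unplanned downtime, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_pct / 100
    per_year = unavailable_fraction * 365 * 24 * 60
    return {"per_year": per_year, "per_month": per_year / 12}

print(downtime_budget_minutes(99.99))   # four nines: ~52.6/year, ~4.4/month
print(downtime_budget_minutes(99.999))  # five nines: ~5.3/year
```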

Over 98% of large enterprises with more than 1,000 employees say that on average, a single hour of downtime per year costs their company over $100,000, while 81% of organizations report that the cost exceeds $300,000. Even more significantly, one in three enterprises – 33% – indicates that hourly downtime costs their firms $1 million or more (see Exhibit 1). It’s important to note that these statistics represent the “average” hourly cost of downtime. In a worst case scenario – if any device or application becomes unavailable for any reason – the monetary losses to the organization can reach millions per minute. Devices, applications and networks can become unavailable for myriad reasons. These include natural and man-made catastrophes; faulty hardware; bugs in the application; security flaws or hacks; and human error. Business-related issues, such as a regulatory compliance inspection or litigation, can also force the organization to shutter its operations. Whatever the reason, when the network and its systems are unavailable, productivity grinds to a halt and business ceases.

Highly regulated vertical industries like Banking and Finance, Food, Government, Healthcare, Hospitality, Hotels, Manufacturing, Media and Communications, Retail, Transportation and Utilities must also factor in the potential losses related to litigation, as well as civil penalties stemming from a failure to meet Service Level Agreements (SLAs) or compliance regulations. Moreover, for a select three percent of organizations whose businesses are based on high volumes of data transactions – banks and stock exchanges, online retailers or even utility firms – losses may be calculated in millions of dollars per minute.

Those are the results of ITIC’s 2017 Reliability and Hourly Cost of Downtime Trends Survey, an independent Web-based survey which polled over 800 organizations in April/May 2017. All categories of businesses were represented in the survey respondent pool: 24% were small/midsized business (SMB) firms with up to 200 users; 25% came from the small/midsized enterprise (SME) sector with 201 to 1,000 users; and 51% were large enterprises with over 1,000 users.

These statistics are not absolute. They are the respondents’ estimates of the cost of one hour of downtime due to lost revenue and lost end user productivity. Additionally, these figures do not take into account the cost of additional penalties for regulatory non-compliance or “good will” gestures made to the organization’s customers and business partners that were negatively impacted by a system or network failure. In fact, these two conditions can cause downtime costs to skyrocket even further.
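
ITIC publishes respondents’ estimates rather than a costing formula, but a back-of-the-envelope model built from the two components named above, lost revenue and lost end user productivity, shows how such figures arise. The Python sketch below is a hypothetical illustration; every input value is invented.

```python
# Hypothetical illustration only: one common way to approximate the hourly
# cost of downtime from lost revenue plus lost end-user productivity.
# Like the survey figures, it excludes penalties and "good will" gestures.

def hourly_downtime_cost(revenue_per_hour: float,
                         revenue_at_risk: float,
                         employees_affected: int,
                         loaded_hourly_wage: float,
                         productivity_loss: float) -> float:
    lost_revenue = revenue_per_hour * revenue_at_risk
    lost_productivity = employees_affected * loaded_hourly_wage * productivity_loss
    return lost_revenue + lost_productivity

# Invented mid-sized enterprise: $500K revenue/hour with 40% of it
# transaction-dependent; 1,200 affected staff at a $60 loaded wage,
# losing 75% of their productivity during the outage.
print(f"${hourly_downtime_cost(500_000, 0.40, 1_200, 60, 0.75):,.0f}/hour")
# -> $254,000/hour
```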

The overarching message is clear: downtime of even a few minutes is expensive and unwelcome. Only two percent of enterprise respondents said that downtime costs their companies less than $100,000 in a single 60-minute period. Downtime costs are similarly high for small and midsized businesses (SMBs) with one to 150 employees; some 47% of SMB survey respondents estimate that a single hour of downtime can cost their firms $100,000 in lost revenue and end user productivity. To reiterate, these figures are exclusive of penalties, remedial action by IT and any ensuing monetary awards that are the result of litigation or civil or criminal non-compliance penalties.

There is well documented evidence from a variety of sources that track the skyrocketing cost of downtime. The expenses and losses associated with downtime continue to climb in the Internet age, where business is conducted 24 x 7 across global time zones. Hourly losses of hundreds of thousands or millions of dollars – per hour, or even per minute in transaction-heavy environments – are unfortunately commonplace.

ITIC’s survey revealed that for large enterprises with over 1,000 employees, the costs associated with a single hour of downtime are much higher, with average hourly outage costs topping the $5 million (US dollars) mark for nine specific verticals. These include Banking/Finance; Government; Healthcare; Manufacturing; Media & Communications; Retail; Transportation and Utilities. The ITIC survey data revealed that although monetary losses topped users’ list of downtime concerns, they were not the only factor worrying organizations. The top six business consequences that concerned users are (in order):

  • Transaction/sales losses
  • Lost/damaged data
  • Customer dissatisfaction
  • Restarting/return to full operation
  • Damage to the company’s brand and reputation
  • Regulatory compliance exposure

The message is clear: unplanned downtime is costly and unacceptable from both a business and technology perspective. Organizations must proactively work with their infrastructure and cloud vendors to ensure the inherent reliability of their systems, applications and networks. This is imperative as the industry moves to interconnected Internet of Things (IoT) ecosystems.