130 King Street West, Suite 1800 | Toronto, ON | M5X 1E3
416.865.3392

Tri-Paragon Inc.

Preventing Data Centre Downtime


In recent years, data centre infrastructure has become significantly more reliable and management practices have improved, so it would be fair to expect that the number of reported downtime incidents is decreasing. But this is not the case.

According to a 2018 survey by Uptime Institute, 31% of respondents experienced a downtime incident or severe degradation in the last year and 48% reported at least one outage at their site or at a service provider in the last three years.

Downtime is expensive. It costs both time and money and can have grave consequences for organizations that are not sufficiently prepared.

Outage Reasons

According to Gartner, downtime costs $5,600 per minute on average, which works out to roughly $336,000 per hour. Depending on the organization, hourly costs range from about $140,000 to $540,000. Some factors that contribute to the costs associated with downtime include:

  • Lost sales. For organizations that do business online, downtime directly prevents customers from making purchases, which means lost revenue. If the business depends on network availability to deliver a service, downtime makes it impossible to reach users.


  • Brand reputation. If customers must frequently deal with outages that prevent them from making purchases or using a service, they may take their business elsewhere and share their bad experiences, scaring away potential customers.


  • Reduced productivity. Modern businesses are heavily dependent on online communications and services. Without network access, productivity often grinds to a halt: employees cannot get much of their work done, production lines shut down, and other parts of the business stall.


  • SLA compensation. Some companies include language in their service level agreements (SLAs) that defines the compensation owed to customers in the event of unplanned downtime.


  • Lost data. During outages, data can be corrupted, and attackers may find openings to damage data. Even when data is backed up, an outage can scare customers and shatter their confidence.


The number one cause of data centre failures is UPS failure, followed by human error. Other common causes are network failures, power outages, natural disasters, and cybercrime. Fortunately, there is a solution that helps prevent downtime.
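As a back-of-the-envelope check on the cost figures quoted above, the short sketch below turns the per-minute and per-hour numbers into an estimate for an outage of a given length. The function names are illustrative, and a flat per-minute rate is a simplifying assumption; real costs vary widely between organizations.

```python
# Figures quoted in this article (attributed to Gartner).
COST_PER_MINUTE_USD = 5_600
HOURLY_RANGE_USD = (140_000, 540_000)


def outage_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimated cost of an outage, assuming a flat per-minute rate."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


def outage_cost_range(minutes, hourly_range=HOURLY_RANGE_USD):
    """Low and high estimates, scaling the quoted hourly range by duration."""
    low, high = hourly_range
    return (low * minutes / 60, high * minutes / 60)


if __name__ == "__main__":
    print(outage_cost(60))        # one hour at the average rate
    print(outage_cost_range(90))  # a 90-minute outage, low and high ends
```

At the average rate, a one-hour outage comes to $336,000, which sits inside the quoted hourly range.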


Sunbird’s Data Centre Infrastructure Management (DCIM) software allows data centre managers to avoid unplanned downtime that can cost hundreds of thousands of dollars per outage and wreak havoc on the business. Some of the ways to prevent human error and maximize uptime with DCIM are:

  • Manage inlet air temperature and humidity. The temperature and humidity of the air at the cabinet inlet are important because this is the air that flows through the cabinet to carry heat away from the equipment. If the inlet air is too warm, the cabinet will not cool properly. If the air is too humid, there is a risk of corrosion and damaged equipment. And if the air is too dry, there could be a static electrical discharge. All of these can cause costly downtime. DCIM software collects data from environmental sensors in the data centre and displays the information in business intelligence dashboards and 3D floor map visualizations to help you monitor your data centre environment and identify hot spots.
  • Safely increase temperature. Raising the temperature in the data centre can improve energy efficiency, but it comes with the risk of overheating and damaging equipment, resulting in downtime. With DCIM, you can set temperature thresholds and receive alerts when temperatures are outside of your desired range. The same thresholds help you avoid overcooling, which improves efficiency and reduces energy costs.
  • Ensure power redundancy. Due to the increasing demand for computing capacity, data centre cabinets are now packed more densely with power-hungry IT equipment. And since data centre teams are often focused on fully utilizing existing resources and delaying capital expenses, they may not be aware that a cabinet is overloaded until it is too late. This makes power redundancy in the event of equipment failure a critical component of any strategy to maximize uptime. DCIM software allows you to run a failover simulation report and identify which cabinets are at risk and which equipment can continue functioning safely if a PDU goes down. Data centre managers can use this information to rebalance loads before there is a real failure.
  • Health polling. Ensuring that intelligent PDUs and other devices are operating properly and accessible via your network is important to maintaining uptime. Equipment can go down without anyone noticing. A technician or engineer may place a PDU into maintenance mode accidentally, neglect to power on new resources, or connect equipment to the wrong ports or with the wrong cables. With DCIM software, you limit the possibility of outages caused by malfunctioning equipment by polling intelligent PDUs and other equipment at user-configurable intervals to ensure that they are accessible. If a device is not reachable, the software alerts you immediately, so you are aware of the issue before there is a crisis.
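The threshold alerts and health polling described in the list above can be illustrated with a minimal sketch. This is not how Sunbird's software is implemented: the limit values are illustrative placeholders, the names `Thresholds`, `inlet_alerts`, `tcp_probe`, and `poll_once` are hypothetical, and a plain TCP connection attempt stands in for whatever protocol a real DCIM tool uses to reach a PDU.

```python
import socket
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Illustrative placeholder limits; set these to your own desired range.
    temp_min_c: float = 18.0
    temp_max_c: float = 27.0
    rh_min_pct: float = 20.0
    rh_max_pct: float = 80.0


def inlet_alerts(temp_c, rh_pct, limits=Thresholds()):
    """Return the threshold violations for one cabinet inlet reading."""
    alerts = []
    if temp_c > limits.temp_max_c:
        alerts.append("inlet too warm")
    elif temp_c < limits.temp_min_c:
        alerts.append("inlet overcooled")
    if rh_pct > limits.rh_max_pct:
        alerts.append("humidity too high")
    elif rh_pct < limits.rh_min_pct:
        alerts.append("humidity too low")
    return alerts


def tcp_probe(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def poll_once(devices, probe=tcp_probe):
    """Probe each device once; return the names of those that did not answer.

    `devices` maps a device name to a (host, port) pair.
    """
    return [name for name, (host, port) in devices.items()
            if not probe(host, port)]
```

A scheduler would call `poll_once` at the configured interval and raise an alert for every name it returns. Passing the probe in as a parameter keeps the polling logic testable without touching a real network.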

Data Centre Health

With Sunbird DCIM software, you can simulate failover and test what-if scenarios with reports that identify available capacity to ensure coverage in case of failure. You can visualize data centre and facility health status with a red-yellow-green colour-coded health map that provides an at-a-glance view of rack load levels, line currents, and environmental conditions. You can also be alerted to threshold violations by automated emails that enable the quick identification of hot spots and potential trouble spots. With these capabilities, DCIM will help protect your infrastructure in the event of a data centre disaster.
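To make the failover idea concrete, here is a minimal sketch of a what-if check with a red-yellow-green result per cabinet. It rests on stated assumptions rather than on the product's actual logic: each cabinet is fed by two identical PDUs, the whole load shifts to the survivor when one fails, and the 80% warning level is an illustrative choice. The names `Cabinet`, `failover_status`, and `failover_report` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cabinet:
    name: str
    load_a_amps: float      # current drawn on feed A
    load_b_amps: float      # current drawn on feed B
    pdu_rating_amps: float  # rating of each PDU (assumed identical)


def failover_status(cab, warn_fraction=0.8):
    """Classify a cabinet by the load one PDU would carry if the other failed."""
    failover_load = cab.load_a_amps + cab.load_b_amps
    if failover_load > cab.pdu_rating_amps:
        return "red"      # the surviving PDU would be overloaded
    if failover_load > warn_fraction * cab.pdu_rating_amps:
        return "yellow"   # it would cope, but with little headroom
    return "green"


def failover_report(cabinets):
    """Map each cabinet name to its simulated failover status."""
    return {cab.name: failover_status(cab) for cab in cabinets}


if __name__ == "__main__":
    cabinets = [
        Cabinet("R01", 6.0, 7.0, 30.0),
        Cabinet("R02", 12.0, 13.5, 30.0),
        Cabinet("R03", 18.0, 16.0, 30.0),
    ]
    for name, status in failover_report(cabinets).items():
        print(name, status)
```

Cabinets reported as red or yellow are the ones whose loads should be rebalanced before a real PDU failure occurs.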

For more information, or to schedule a one-on-one demo of the Tri-Paragon Sunbird DCIM Data Centre Software, send an email to info@triparagon.com or call Roy at (416) 865-3392.
