Data center redundancy: The basics

Downtime can cost businesses thousands, and redundancy is one way to minimize disruptions. Assess uptime requirements when building redundancy in data center facilities.

Devin Partida, ReHack.com

Published: 01 Jun 2023

Information and uptime have become increasingly valuable, which raises the importance of data center redundancy.

Outages are costly and will only become more so over time, so organizations must do all they can to prevent unplanned downtime. Data center redundancy is one of the most important factors in achieving that goal.

What is data center redundancy?

Data center redundancy refers to using duplicate components to keep operations running if some components fail and to maintain uptime during maintenance. Because power-related issues cause 43% of significant data center outages, according to a 2022 Uptime Institute study, uninterruptible power supplies (UPSes) and generators are some of the most common targets for redundancy. Cooling systems are another common component to back up, as their failure could cause critical issues.

Why data center redundancy is important

While data center redundancy means spending more on hardware, the rising cost of data center downtime justifies the higher upfront expenses. In 2019, a single hour of enterprise server downtime cost between $301,000 and $400,000 for 25% of businesses, Statista found in a 2022 study. For many organizations, the costs are even higher, and they will keep climbing as data access and cloud services take on more central roles in business.

Redundancy reduces a company's chances of incurring those costs, helps organizations recover from disruptions faster and keeps infrastructure running in case of outages. Redundancy can also help organizations ensure they meet service-level agreements.

Many businesses are increasing their data collection and analysis because it can improve decision-making, streamline operations and more. However, this trend leaves organizations with considerable amounts of sensitive information on hand, raising legal and ethical concerns in the event of a breach. Redundancy helps ensure data technologies work as they should if some components fail, leaving fewer openings for these breaches.

Data center redundancy levels

Data center redundancy comes in various levels. Businesses that want to make the most informed decisions about their data center architecture must understand these levels and their meaning.

Redundancy levels center around the concept of N, which means the minimum infrastructure necessary to run a data center at full capacity. For example, if a data center needs four UPS units to run, N would represent four units. N also applies to other components, such as cooling, networking and storage systems.

The lowest level of redundancy is N+1, which means a data center has one extra component. Similarly, N+2 architecture provides two redundant components for a given N value.

N+1 is a more common architecture than N+2 because it provides redundancy while keeping hardware costs low: organizations buy only one component beyond what they need to run.

2N represents 100% redundancy, where data centers have an identical backup for each of their required components. In a data center where N is four UPS units, 2N means having eight. Some architectures go even further and provide 2N+1, which equals a complete backup plus another component.
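The arithmetic behind these levels can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool; the function name units_required is hypothetical, and the example reuses the four-UPS case from above.

```python
def units_required(n: int, level: str) -> int:
    """Return the total units to install for a redundancy level.

    n is the minimum number of units needed to run at full capacity (N).
    """
    if n < 1:
        raise ValueError("N must be at least 1")
    totals = {
        "N": n,             # no redundancy
        "N+1": n + 1,       # one spare component
        "N+2": n + 2,       # two spare components
        "2N": 2 * n,        # a full duplicate set
        "2N+1": 2 * n + 1,  # a full duplicate set plus one spare
    }
    if level not in totals:
        raise ValueError(f"unknown redundancy level: {level}")
    return totals[level]


if __name__ == "__main__":
    # Example: a data center that needs four UPS units to run (N = 4).
    for level in ("N", "N+1", "N+2", "2N", "2N+1"):
        print(f"{level:>4}: {units_required(4, level)} UPS units")
```

With N equal to four, the levels work out to 4, 5, 6, 8 and 9 units, which shows how quickly hardware counts grow between N+1 and 2N.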

Data center tiers

The N system is a helpful way to measure redundancy, but in practice, achieving maximum uptime is about more than simply adding components. Uptime Institute created a tier system to "explain the infrastructure required for data center operations."

There are four major tiers.

Tier I data centers

Tier I data centers are the most basic. These facilities have enough infrastructure to run IT operations but no redundant components. They can withstand disruption from human error but not an unexpected outage, and they must shut down for maintenance.

Tier II data centers

A Tier II data center includes some cooling and power system redundancy, providing more uptime. Employees can remove components without shutting the data center down, but large failures still take the facility offline.

Tier III data centers

Tier III data centers provide redundancy for every component in the facility, along with redundant distribution paths. Taking any one component out of service does not affect data center operations, so shutdowns are not needed to replace or maintain equipment.

Tier IV data centers

Tier IV data centers represent maximum uptime. These facilities have several independent and isolated backup systems, which requires 2N or 2N+1 redundancy levels. Downtime is unlikely in these data centers, though maintaining them is costly.

Tips for choosing the right data center redundancy level

Any organization that relies on data center operations needs redundancy, but requirements vary between situations. Decide what level the business needs, and consider the company's IT budget. Weigh the price of extra hardware against the cost of potential downtime, which can quickly exceed it.
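One rough way to frame the budget question is to compare the yearly cost of extra hardware with the downtime cost it is expected to prevent. The sketch below is only an illustration: the function names are hypothetical, the $301,000 hourly figure is the low end of the Statista range cited earlier, and the hours avoided and the redundancy cost are made-up inputs that each organization would need to estimate for itself.

```python
def expected_downtime_cost(hours_per_year: float, cost_per_hour: float) -> float:
    """Estimated yearly downtime cost: hours of downtime times cost per hour."""
    return hours_per_year * cost_per_hour


def redundancy_pays_off(
    hours_avoided_per_year: float,
    cost_per_hour: float,
    annual_redundancy_cost: float,
) -> bool:
    """True if the downtime cost avoided exceeds the yearly cost of redundancy."""
    avoided = expected_downtime_cost(hours_avoided_per_year, cost_per_hour)
    return avoided > annual_redundancy_cost


# Hypothetical inputs: 2 hours of downtime avoided per year at $301,000 per hour,
# against an assumed $250,000 per year for the redundant hardware.
worth_it = redundancy_pays_off(2, 301_000, 250_000)
```

A real assessment would also account for factors this sketch leaves out, such as reputational damage and service-level agreement penalties.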

IT teams should also consider their risk tolerance. Businesses with little sensitive data or where cloud environments are not mission-critical can afford more risk, so N+1 architecture is likely sufficient. However, organizations might need more redundancy if they rely more heavily on the cloud or operate in highly regulated industries.
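To put rough numbers on risk tolerance, the sketch below estimates how likely it is that at least N units are working at a given moment. It rests on simplifying assumptions that are not from this article: every unit has the same availability, and units fail independently of one another. Real facilities see shared failure causes, so treat the results as optimistic. The function name system_availability is hypothetical.

```python
from math import comb


def system_availability(n_required: int, n_installed: int, unit_availability: float) -> float:
    """Probability that at least n_required of n_installed units are working.

    Assumes identical units that fail independently (a binomial model).
    """
    if not 0.0 <= unit_availability <= 1.0:
        raise ValueError("unit_availability must be between 0 and 1")
    if n_installed < n_required:
        return 0.0
    p = unit_availability
    return sum(
        comb(n_installed, k) * p**k * (1 - p) ** (n_installed - k)
        for k in range(n_required, n_installed + 1)
    )


# Example with N = 4 and an assumed 99% availability per unit.
no_spare = system_availability(4, 4, 0.99)   # N
one_spare = system_availability(4, 5, 0.99)  # N+1
full_copy = system_availability(4, 8, 0.99)  # 2N
```

Under these assumptions, four units with no spare are all working about 96.1% of the time, while a single spare raises that to about 99.9%. The first spare delivers most of the gain, which is one reason N+1 is often considered sufficient for lower-risk workloads.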

Consider legal requirements and security in these decisions. Some regulations may require higher uptime. Similarly, companies facing greater cybersecurity risks should aim for higher redundancy to mitigate cyber attacks. Regardless of an organization's redundancy level, automated monitoring tools can accelerate incident responses to help prevent downtime.
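As one example of what such monitoring might check, the sketch below flags when failures have used up the spare capacity. It is a hypothetical illustration rather than the interface of any particular monitoring product; the function name and status labels are assumptions.

```python
def redundancy_status(healthy_units: int, n_required: int) -> str:
    """Classify redundancy health from the count of working units.

    n_required is N, the minimum number of units needed at full capacity.
    """
    if healthy_units < n_required:
        return "critical"   # below N: the facility cannot run at full capacity
    if healthy_units == n_required:
        return "degraded"   # exactly N: running, but no spare remains
    return "ok"             # above N: at least one spare is available


# Example: an N+1 design with N = 4 that has lost one of its five UPS units.
status = redundancy_status(healthy_units=4, n_required=4)
```

An alert on the "degraded" state gives staff time to repair the failed unit before a second failure causes downtime.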