Calculate UPS battery backup time to prepare for power failure

Extreme heat and inadequate cooling systems can lead to power failures in data centers. Calculate the duration of your UPS battery supply to prepare for backup power needs.

Data center uninterruptible power supply (UPS) battery capacity can substitute for a generator, but miscalculating how long the equipment can actually run makes it a costly choice. Cooling doesn't run on the UPS, so the room temperature rises faster than people realize. Business continuity requires continuous cooling, and that requires generators.

Data centers have high power loads, contained aisles and densely loaded cabinets, which cause temperatures to rise quickly if cooling fails. There are ways to extend the time before failure by minutes, but without those measures, installing more than 30 minutes of uninterruptible power supply (UPS) battery is usually an unnecessary cost.
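The cost argument can be made concrete with a minimal Python sketch. It assumes the equipment stops at whichever limit comes first: the battery runs out or the room reaches the hardware's thermal limit. The function names and the sample numbers are hypothetical, chosen only for illustration.

```python
def effective_runtime_minutes(battery_minutes: float, thermal_limit_minutes: float) -> float:
    """Return the usable run time: the equipment stops at whichever limit is reached first."""
    return min(battery_minutes, thermal_limit_minutes)


def stranded_battery_minutes(battery_minutes: float, thermal_limit_minutes: float) -> float:
    """Return the battery capacity (in minutes) that can never be used because heat stops the load first."""
    return max(0.0, battery_minutes - thermal_limit_minutes)


# Hypothetical figures: 30 minutes of battery, but the room overheats after 5 minutes.
usable = effective_runtime_minutes(30.0, 5.0)
stranded = stranded_battery_minutes(30.0, 5.0)
print(f"Usable run time: {usable:.0f} min, stranded battery: {stranded:.0f} min")
```

In this hypothetical case, most of the installed battery is paid for but never used, which is why the thermal rise time is worth estimating before sizing batteries.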

When power fails, the data center design impacts temperature rise time. Three design types include the following:

  • UPS in a separate room (best practice).
  • UPS and ITE (IT equipment) in the same room (common in small data centers).
  • Hot or cold aisle containment.

A much more complex combination of factors determines how long hardware will run without cooling. When hardware overheats, performance degrades and then it fails or shuts down for self-preservation.

The following examples are for smaller facilities since larger data centers usually have generators. Run times assume the rooms are closed, have no supplemental ventilation and are well insulated, and that ITE fans are running on UPS, resulting in well-mixed air.

Metal absorbs heat for a short time before temperatures rise uniformly. Hardware thermal mass depends on cabinet surface area, equipment weight and heat load. This calculation is complex and inexact, so a factor between 0.5 and 1.25 minutes has been estimated in the illustrations.

Example measurements when UPS is isolated

The room is sized for a non-redundant UPS, batteries, bypass and distribution gear, and electrical panelboards. Air conditioners are set to 73 degrees Fahrenheit (22.8 degrees Celsius), which is the recommended temperature for batteries. The UPS is rated to 109 F (42.8 C). It may not fail at that temperature, but its performance degrades severely.

UPSes will shut down for self-preservation. Power draw increases rapidly as hardware gets hotter because both UPS and IT fan speeds increase dramatically. Batteries also discharge faster at higher temperatures. Larger rooms, such as for redundant systems, could have longer run times.

Figure 1. Temperature measurements of UPS in a separate room.

Example measurements for UPS with no aisle containment

Installing the UPS and ITE in the same room uses shared cooling and less building space. It's a common compromise in small data centers but makes calculating temperature rise more difficult.

IT hardware in cabinets will likely reach maximum temperature before the UPS. The combined ITE, cabinets, UPS and air conditioners have higher thermal mass than the UPS alone, somewhat slowing the heat rise time in the room. However, the example data center is small (1,000 ft² or 93 m² with 20 cabinets), so the temperature will rise rapidly.

The maximum air temperature measured before server failure is around 113 F (45 C). Internal fans maintain the junction temperatures of CPUs, so the maximum air temperature the equipment can take in is determined by fan capacity. Server performance degrades before actual failure. Mechanical hard drives and tape drives fail at significantly lower temperatures.

Without aisle containment, air is considered mixed. In Figure 1, the large UPS is in its own room, so it does not contribute to the thermal load.

In the following, two UPS sizes are illustrated, running at different ITE loads in two different sized computing spaces. Rooms are air conditioned to 75 F (23.9 C). This is energy efficient, but below the ASHRAE recommended maximum (80.6 F or 27 C). This ensures proper inlet temperatures to all the ITE.

Figure 2. UPS inside the data center. UPS heat added to ITE heat load with no aisle containment.

Example measurements for UPS with aisle containment

Hot aisle containment, while the most energy efficient, traps heat in a small space with virtually no exposed metal to absorb it. Therefore, the temperature rises quickly inside cabinets.

Cold aisle containment leaves most of the room volume to heat up. If the UPS is in the same room, its heat is added to the ITE heat. While the calculated temperature rise time is significantly longer than with hot aisle containment, the number may be misleading because servers are trying to pull air from the contained cold aisle, which no longer receives any supply air. Heat might recirculate within the cabinets and significantly reduce time to failure.

Figure 3. UPS inside the data center. UPS heat added to ITE heat load with aisle containment.

How to extend power failure run time

Proposals to limit rising temperatures would impact data center designs. Design changes include the following:

  • Bigger floors and higher ceilings.
  • Higher ITE ratings.
  • Lower room temperature.
  • Lower density IT cabinets.
  • No aisle containment.
  • More rows with fewer cabinets.
  • Reduced room insulation.

Equations are highly complex, and information from equipment manufacturers is scarce, so calculations are estimates. However, these estimates give reasonable indications of how long a facility might operate without air conditioning before the heat buildup exceeds thermal limits of the equipment.
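As a rough illustration of how some of these design changes move the estimate, the sketch below applies the simplified imperial equation given later in this article to a few scenarios. The baseline room (10,000 ft³, 50,000 W, 75 F supply air, 113 F limit) and every scenario value are hypothetical, and the thermal mass term is held at an assumed 1.0 minute even though it would vary with density in practice.

```python
def thermal_rise_minutes(volume_ft3: float, max_temp_f: float, room_temp_f: float,
                         watts: float, mass_minutes: float = 1.0) -> float:
    """Simplified imperial estimate: (ft3 x 0.41 x TD / Watts) + thermal mass time."""
    td = max_temp_f - room_temp_f  # temperature differential in degrees F
    return volume_ft3 * 0.41 * td / watts + mass_minutes


# Hypothetical baseline room; none of these numbers come from the article's figures.
baseline = {"volume_ft3": 10_000, "max_temp_f": 113, "room_temp_f": 75, "watts": 50_000}

# Each scenario changes one input, mirroring one of the listed design changes.
scenarios = {
    "baseline": baseline,
    "higher ceiling (+20% volume)": {**baseline, "volume_ft3": 12_000},
    "lower room temperature (70 F)": {**baseline, "room_temp_f": 70},
    "higher ITE rating (120 F)": {**baseline, "max_temp_f": 120},
    "lower density (half the load)": {**baseline, "watts": 25_000},
}

results = {name: thermal_rise_minutes(**params) for name, params in scenarios.items()}
for name, minutes in results.items():
    print(f"{name}: {minutes:.1f} min")
```

Under these assumptions, every change lengthens the estimate, but only by minutes, which is consistent with the article's point that these measures buy time rather than replace cooling.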

Cooling methods to try

Rapid temperature rise can damage ITE, so slowing the rate of rise is crucial.

Open contained aisle doors so air can circulate through the room. This does not solve temperature rise conditions as it only exposes the ends of aisles and leaves the space above cabinets closed, but it can extend run time. Portable fans can help move air, but there may be less than a minute available to do this in high-density data centers (shown in Figure 3).

If cooling is provided by chilled-water room air handlers, wire them so their fans stay on UPS after a power failure. Auxiliary pumps on UPS, as well as some chilled water storage, can extend run time by several minutes. In this case, additional battery capacity can be useful.

If available, use free cooling immediately upon power failure regardless of outside air temperature. Otherwise, install hot air exhaust fans on UPS and damper ducts to bring fresh air in from outside.

Equation key

Tmin: Thermal rise time in minutes.

ft³ or m³: Room volume (length x width x height); for aisle containment, use only the contained volumes.

TD: Temperature differential (maximum rated equipment temperature minus air delivery temperature).

Watts: Total ITE load plus in-room UPS heat load. For UPS, WattsEQUIV = BTU per hour / 3.412.

τMassmin: Thermal mass heating time estimate (0.5 to 1.5 minutes). High power densities reduce time.
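The inputs in this key can be assembled with a few helper functions. This is a minimal sketch: the function names are hypothetical, the UPS heat rejection is assumed to be given in BTU per hour, and the room dimensions and loads in the example are invented for illustration.

```python
BTU_PER_HOUR_PER_WATT = 3.412  # 1 watt is about 3.412 BTU per hour


def room_volume(length: float, width: float, height: float) -> float:
    """Room volume; for aisle containment, pass the contained dimensions only."""
    return length * width * height


def temperature_differential(max_rated_temp: float, air_delivery_temp: float) -> float:
    """TD: maximum rated equipment temperature minus air delivery temperature."""
    return max_rated_temp - air_delivery_temp


def ups_heat_watts(ups_heat_btu_per_hour: float) -> float:
    """Convert UPS heat rejection from BTU per hour to equivalent watts."""
    return ups_heat_btu_per_hour / BTU_PER_HOUR_PER_WATT


def total_heat_watts(ite_load_watts: float, ups_heat_btu_per_hour: float = 0.0) -> float:
    """Total ITE load plus in-room UPS heat; leave the UPS term at 0 if it is in a separate room."""
    return ite_load_watts + ups_heat_watts(ups_heat_btu_per_hour)


# Hypothetical room: 40 ft x 25 ft floor (1,000 ft2) with an assumed 10 ft ceiling.
volume_ft3 = room_volume(40, 25, 10)
td_f = temperature_differential(113, 75)      # 113 F limit, 75 F supply air
watts = total_heat_watts(45_000, 17_060)      # assumed ITE load and UPS heat rejection
print(volume_ft3, td_f, round(watts))
```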

Simplified method of calculation

Only four factors account for 95% of the rate of thermal rise. If most data centers have relatively full cabinets, operate close to the ASHRAE recommended maximum temperature (80.6 F or 27 C) and in the middle of the relative humidity range (20% to 80%), and are not above 5,000 ft (1,500 m) altitude, we can treat many of the terms in the complex equations as constants. This simplifies the equations to the point where anyone can calculate an indication of time to equipment failure after a power failure. Actual time to failure could be longer or shorter.

Following are these simplified equations in both the British imperial system and metric units, along with two sample calculations. Note that imperial and metric equations yield results within 2% of each other depending on unit conversion accuracies.

Tmin = (ft³ × 0.41 × TD / Watts) + τMassmin

Figure 4. Example of imperial (English) units to use to calculate an indication of time to equipment failure after a power failure.

Tmin = (m³ × 25.5 × TD / Watts) + τMassmin

Figure 5. Example metric units to use to calculate an indication of time to equipment failure after a power failure.
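Both simplified equations translate directly into code. The following Python sketch is one way to do that. The function names are hypothetical, and the sample room (10,000 ft³, 50,000 W, a 38 F differential and a 1.0 minute thermal mass estimate) is an assumption for illustration, not a value taken from the figures.

```python
FT3_PER_M3 = 35.3147  # cubic feet in one cubic meter


def thermal_rise_minutes_imperial(volume_ft3: float, td_f: float,
                                  watts: float, mass_minutes: float) -> float:
    """Tmin = (ft3 x 0.41 x TD / Watts) + thermal mass time, with TD in degrees F."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return volume_ft3 * 0.41 * td_f / watts + mass_minutes


def thermal_rise_minutes_metric(volume_m3: float, td_c: float,
                                watts: float, mass_minutes: float) -> float:
    """Tmin = (m3 x 25.5 x TD / Watts) + thermal mass time, with TD in degrees C."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return volume_m3 * 25.5 * td_c / watts + mass_minutes


# Hypothetical room, expressed in both unit systems.
volume_ft3, td_f, watts, mass_minutes = 10_000, 38.0, 50_000, 1.0
volume_m3 = volume_ft3 / FT3_PER_M3
td_c = td_f / 1.8  # a temperature difference converts by scale only, with no 32-degree offset

imperial = thermal_rise_minutes_imperial(volume_ft3, td_f, watts, mass_minutes)
metric = thermal_rise_minutes_metric(volume_m3, td_c, watts, mass_minutes)
print(f"Imperial: {imperial:.2f} min, metric: {metric:.2f} min")
```

For this hypothetical room, both versions indicate roughly four minutes to the thermal limit, and the two results differ by about 2%. Note that TD is a temperature difference, so converting it between Fahrenheit and Celsius uses only the 1.8 scale factor.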

Robert McFarlane is principal in charge of data center design for the international consulting firm Shen Milsom and Wilke LLC. McFarlane has spent more than 35 years in communications consulting and has experience in every segment of the data center industry.
