WP 40 Cooling Audit for Identifying Potential Cooling Problems in Data Centers
by Kevin Dunlap
White papers are now part of the Schneider Electric white paper library, produced by Schneider Electric's Data Center Science Center
DCSC@Schneider-Electric.com
Introduction
There are significant benefits from the compaction of IT equipment and simultaneous advances in processor power. However, these trends have also created challenges for those responsible for delivering and maintaining proper mission-critical environments. While the overall power and cooling capacity designed for a data center may be adequate, the distribution of cool air to the right areas may not be. When more compact IT equipment is housed densely within a single cabinet, or when data center managers contemplate large-scale deployments with multiple racks filled with ultra-compact blade servers, the increased power required and heat dissipated must be addressed. Blade servers, as seen in Figure 1, take up far less space than traditional rack-mounted servers and offer more processing ability while consuming less power per server. However, they dramatically increase heat density.
Figure 1
Examples of compaction
In designing the cooling system of a data center, the objective is to create an unobstructed path from the source of the cooled air to the inlets of the servers. Likewise, a clear path is needed from the rear exhausts of the servers to the return air duct of the air conditioning unit. A number of factors can adversely impact this objective.
To determine whether there is a problem or potential problem with the cooling infrastructure of a data center, certain checks and measurements must be carried out. This audit establishes the health of the data center in order to avoid temperature-related electronic equipment failure. The same checks can also be used to evaluate whether adequate cooling capacity is available for the future. Measurements from the tests described here should be recorded and analyzed using the template provided in the Appendix. The current status should be assessed and a baseline established to ensure that subsequent corrective actions result in improvements. This paper shows how to identify potential cooling problems in existing data centers that affect the total cooling capacity, the cooling density capacity, and the operating efficiency of a data center. Solutions to these problems are described in White Paper 42, Ten Cooling Solutions to Support High-Density Server Deployment.
Check capacity
Since each watt of power consumed by IT equipment becomes a watt of heat that must be removed, the first step toward providing adequate cooling is to verify that the capacity of the cooling system matches the current and planned power load.
The typical cooling system comprises a CRAC (computer room air conditioner) unit that delivers cooled air to the room, and an externally mounted unit that rejects the heat to the atmosphere. For more information on how air conditioners work and on the different types available, refer to White Paper 57, Fundamental Principles of Air Conditioners for Information Technology.
Table 1
Data center or network room heat output calculation worksheet

Item                  Data required                    Heat output subtotal
IT equipment          Total IT load power in watts     _____________ watts
UPS                   Power system rated power         _____________ watts
Power distribution    Power system rated power         _____________ watts
Lighting              Floor area                       _____________ watts
People                Maximum number of personnel      _____________ watts
Total                 Subtotals from above             _____________ watts
Procedure
Obtain the information required in the Data required column. Consult the data definitions
below in case of questions. Perform the heat output calculations and put the results in the
subtotal column. Add the subtotals to obtain the total heat output.
Data definitions
Total IT load power in watts - The sum of the power inputs of all the IT equipment.
Power system rated power - The power rating of the UPS system. If a redundant system is
used, do not include the capacity of the redundant UPS.
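Where a quick total is wanted, the worksheet arithmetic can be scripted. The sketch below is a minimal Python version; the loss coefficients for UPS, power distribution, lighting, and people are assumptions drawn from general data center sizing guidance (this paper provides only the blank worksheet), so treat them as placeholders to be replaced with your own site data.

```python
# Sketch of the Table 1 heat-output worksheet. The loss coefficients below are
# illustrative assumptions, not figures taken from this paper.

def heat_output_watts(it_load_w, ups_rating_w, floor_area_sqft, personnel):
    """Estimate total room heat output in watts from the worksheet inputs."""
    it_equipment = it_load_w                                     # IT power converts ~1:1 to heat
    ups = 0.04 * ups_rating_w + 0.05 * it_load_w                 # assumed UPS losses
    power_distribution = 0.01 * ups_rating_w + 0.02 * it_load_w  # assumed distribution losses
    lighting = 2.0 * floor_area_sqft                             # assumed ~2 W per square foot
    people = 100.0 * personnel                                   # assumed ~100 W per person
    return it_equipment + ups + power_distribution + lighting + people

# Example: 60 kW IT load on an 80 kW UPS in a 2,000 sq ft room with 4 occupants
print(f"{heat_output_watts(60_000, 80_000, 2_000, 4):,.0f} W")  # 72,600 W
```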
Check CRACs
If the CRAC units in a data center do not work together in a coordinated fashion, they are likely to fall short of their rated cooling capacity and to incur higher operating costs. CRAC units normally operate in four modes: cooling, heating, humidification, and dehumidification. While two of these conditions may occur at the same time (e.g., cooling and dehumidification), all units within a defined area (4-5 units adjacent to one another) should always be operating in the same mode. Uncoordinated CRAC units operating in opposing modes (e.g., one unit dehumidifying while a neighbor humidifies), a condition called demand fighting, waste energy and reduce cooling capacity. CRAC units should be tested to ensure that measured supply and return temperatures and humidity readings are consistent with design values.
Demand fighting can have drastic effects on the efficiency of the CRAC system. Left unaddressed, it can reduce efficiency by 20-30%, which in the best case means wasted operating costs and in the worst case means downtime due to insufficient cooling capacity.
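One way to catch demand fighting during an audit is to compare the reported operating modes of neighboring units. The sketch below assumes a simple list of (zone, unit, mode) readings with hypothetical mode names; it is illustrative only and does not describe any particular CRAC management interface.

```python
from collections import defaultdict

# Opposing mode pairs that indicate demand fighting (mode names are illustrative).
OPPOSING = {frozenset({"humidify", "dehumidify"}), frozenset({"cool", "heat"})}

def demand_fighting_zones(readings):
    """readings: list of (zone, unit_id, mode). Returns zones with opposing modes."""
    modes_by_zone = defaultdict(set)
    for zone, _unit, mode in readings:
        modes_by_zone[zone].add(mode)
    return [zone for zone, modes in modes_by_zone.items()
            if any(pair <= modes for pair in OPPOSING)]

readings = [("A", "crac1", "cool"), ("A", "crac2", "cool"),
            ("B", "crac3", "humidify"), ("B", "crac4", "dehumidify")]
print(demand_fighting_zones(readings))  # ['B'] -> units in zone B are fighting over humidity
```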
Operating the system toward the lower limit of the relative humidity design range should be considered for efficiency and cost savings. A slight change in set point toward the lower end of the range can have a dramatic effect on heat removal capacity and on humidifier run time. As seen in Table 2, changing the relative humidity set point from 50% to 45% results in significant operational cost savings.
The position of the CRAC units relative to the aisles is important for air distribution. Depending on the air distribution architecture, CRAC units should be placed perpendicular to the aisles, aligned with either a cold or a hot aisle as shown in Figure 2. When a raised floor is used for distribution, the CRAC units should be placed at the ends of the hot aisles. The hot air return path to the CRAC is then directly down the aisle, without air being pulled over the tops of the aisles where the opportunity for recirculation is increased. With less mixing of the hot air into the room, the warmer return air temperatures increase the capacity of the CRAC units. This could potentially lead to a requirement for fewer units in the room.
Figure 2
CRAC units placed perpendicular to the hot and cold aisles
When a slab floor is used, the CRAC units should be placed at the ends of the cold aisles. This distributes the supply air to the front of the cabinets. Some mixing will occur in this configuration, so it should be implemented only where per-rack power densities are low.
Table 2
Humidification cost savings example at lower set point

                                                          50% RH set point    45% RH set point
Total capacity, kW (Btu/hr)                               48.6 (166,000)      49.9 (170,000)
Sensible capacity, kW (Btu/hr)                            45.3 (155,000)      49.9 (170,000)
Moisture removed (total latent capacity), kW (Btu/hr)     3.3 (11,000)        0.0 (0)
Humidification requirement, lb/hr (kg/hr)                 10.24 (4.6)         0.0 (0.0)
Humidifier runtime                                        100.0%              0.0%
kW required for humidification                            3.2                 0.0
Annual cost of humidification
(cost per kW x 8760 x kW required)                        $2,242.56           $0.00

Note: Assumptions and specifications for the example above can be found in the Appendix.
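As a quick check of the bottom row of Table 2: annual cost is humidifier power, times hours per year, times runtime fraction, times the electricity rate. The $0.08/kWh rate below is an assumption; it happens to reproduce the $2,242.56 figure, but the actual rate used is specified in the Appendix.

```python
HOURS_PER_YEAR = 8760

def annual_humidification_cost(kw_required, runtime_fraction, dollars_per_kwh):
    """Annual cost = kW drawn x hours run per year x electricity rate."""
    return kw_required * HOURS_PER_YEAR * runtime_fraction * dollars_per_kwh

# 50% RH set point: 3.2 kW, 100% runtime, assumed $0.08/kWh
print(f"${annual_humidification_cost(3.2, 1.0, 0.08):,.2f}")  # $2,242.56
# 45% RH set point: the humidifier never runs
print(f"${annual_humidification_cost(0.0, 0.0, 0.08):,.2f}")  # $0.00
```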
Figure 3
CRAC unit monitoring points for supply and return air
In ideal conditions the supply air temperature should be set to the temperature required at the server inlets. This will be checked later by taking temperature readings at the server inlets. The measured return air temperature should be greater than or equal to the temperature readings from the racks and aisles. A return air temperature lower than the rack and aisle temperatures indicates short-cycling inefficiencies. Short cycling occurs when the cool supply air from the CRAC unit bypasses the IT equipment and flows directly into the CRAC unit's return air duct. See White Paper 49, Avoidable Mistakes that Compromise Cooling Performance in Data Centers and Network Rooms, for information on preventing short cycling. This bypass of cool air is the biggest cause of overheating and can be caused by a number of factors.
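The return-air comparison described above is easy to automate once readings are logged. A minimal sketch, assuming Fahrenheit readings and an arbitrary 1°F measurement margin:

```python
def short_cycling_suspected(crac_return_temps_f, max_rack_aisle_temp_f, margin_f=1.0):
    """Return indices of CRACs whose return air is cooler than the hottest
    rack/aisle reading. Return air should be >= rack and aisle temperatures;
    a lower reading suggests supply air is bypassing the IT equipment.
    margin_f is an assumed measurement tolerance, not a value from the paper."""
    return [i for i, t in enumerate(crac_return_temps_f)
            if t < max_rack_aisle_temp_f - margin_f]

returns = [92.0, 90.5, 79.0]                   # return readings for three CRACs, deg F
print(short_cycling_suspected(returns, 90.0))  # [2] -> third CRAC likely short cycling
```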
Also verify that the filters are clean. Impeded airflow through the CRAC will cause the system to shut down on a loss-of-airflow alarm. Filters should be changed quarterly as a preventive maintenance procedure.
Test main cooling circuit

This section requires an understanding of basic air conditioning equipment. For more information, read White Paper 59, The Different Types of Air Conditioning Equipment for IT Environments. Have your maintenance company or an independent HVAC consultant check the condition of the chillers (where applicable), pumping systems, and primary cooling loops. Ensure that all valves are operating correctly.
Table 3
Supply loop temperature tolerances

Loop type                          Maximum supply temperature
Chilled water                      _____
Condenser water (water cooled)     Max 90°F (32.2°C)
Condenser water (glycol cooled)    Max 110°F (43.3°C)
Record rack and aisle temperatures

Figure 4
ASHRAE TC9.9 hot aisle / cold aisle measurement points
Reprinted with permission, ASHRAE 2004. © American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc., www.ashrae.org.
Aisle temperature measurement points should be 5 feet (1.5 meters) above the floor. When more sophisticated means of measuring the aisle temperatures are not available, this should be considered the minimum measurement. These temperatures should be recorded and compared with the IT equipment manufacturers' recommended inlet temperatures. When the recommended inlet temperatures of the IT equipment are not available, 68-75°F (20-25°C) should be used, in accordance with the ASHRAE standard. Temperatures outside this tolerance can lead to reduced system performance, reduced equipment life, and unexpected downtime. Note: All the above checks and tests should be carried out quarterly. Temperature checks should be carried out over a 48-hour period during each test to record maximum and minimum levels.
Poor air distribution to the front of a rack can cause the hot exhaust air from the equipment to recirculate back into the intakes. This causes some equipment, typically units mounted toward the top of the rack, to overheat and shut down or fail. This step verifies that the bulk inlet temperatures in the rack are adequate for the equipment installed. Take and record temperatures at the geometric center of the rack front at the bottom, middle, and top, as illustrated in Figure 5. When the rack is not fully populated with equipment, measure inlet temperatures at the geometric center of each piece of equipment. Refer to the guidelines under Check CRACs for acceptable inlet temperatures. Temperatures outside the guidelines represent a cooling problem for that monitoring point.
Monitoring points should be 2 inches (50 mm) off the face of the rack equipment. Monitoring can be accomplished with thermocouples connected to a data collection device.¹

¹ ASHRAE Standard TC9.9 gives more details on positioning sensors for optimum testing and on recommended inlet temperatures. ASHRAE: American Society of Heating, Refrigerating and Air-Conditioning Engineers, http://www.ashrae.org
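These rack checks reduce to a simple pass/fail test per monitoring point. A sketch, assuming the 68-75°F window cited above and the 5°F (2.8°C) top-to-bottom limit that appears in the appendix checklist:

```python
INLET_RANGE_F = (68.0, 75.0)   # recommended inlet window, 68-75 F (20-25 C)
MAX_TOP_BOTTOM_DELTA_F = 5.0   # appendix tolerance: <= 5 F (2.8 C) within a rack

def audit_rack(bottom, middle, top):
    """Check bottom/middle/top inlet readings for one rack; return problem list."""
    problems = []
    for label, t in (("bottom", bottom), ("middle", middle), ("top", top)):
        if not INLET_RANGE_F[0] <= t <= INLET_RANGE_F[1]:
            problems.append(f"{label} inlet {t:.1f} F outside 68-75 F")
    if top - bottom > MAX_TOP_BOTTOM_DELTA_F:
        problems.append(f"top-to-bottom delta {top - bottom:.1f} F exceeds 5 F")
    return problems

print(audit_rack(69.0, 72.5, 78.0))
# ['top inlet 78.0 F outside 68-75 F', 'top-to-bottom delta 9.0 F exceeds 5 F']
```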
Figure 5
ASHRAE monitoring points for equipment inlet temperatures
Reprinted with permission, ASHRAE 2004. © American Society of Heating, Refrigerating and Air-Conditioning Engineers, Inc., http://www.ashrae.org.
Check airflow from floor grilles
It is important to understand that the cooling capacity at the cabinet is directly related to the volume of air delivered, stated in CFM (cubic feet per minute). IT equipment is designed to raise the temperature of the supply air by 20-30°F (11-17°C). Using the equation for heat removal, the airflow required at a given temperature rise can be quickly computed:

CFM or m³/s = the volume of airflow required to remove the heat generated by the IT equipment
Q = the amount of heat to be removed, expressed in kilowatts (kW)
ΔT (°F or °C) = the exhaust air temperature of the equipment minus the intake temperature
CFM = (3,412 x Q) / (1.085 x ΔT°F)

m³/s = Q / (1.21 x ΔT°C)
For example, to calculate the airflow required to cool a 1 kW server with a 20°F temperature rise:

CFM = (3,412 x 1 kW) / (1.085 x 20°F) = 157.23

m³/s = 1 kW / (1.21 x 11.1°C) = 0.074
Therefore, for every 1 kW of heat removal required at a design ΔT (temperature rise through the IT equipment) of 20°F (11°C), approximately 160 cubic feet per minute (0.076 m³/s or 75.5 L/s) of conditioned air must be supplied through the equipment. When calculating the necessary airflow requirement per rack, the following approximate design values can be used:

CFM / kW = 157.23
(m³/s) / kW = 0.074
(L/s) / kW = 74.2
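The two airflow equations translate directly into code; the constants are those given above. A minimal sketch:

```python
def airflow_cfm(heat_kw, delta_t_f):
    """CFM needed to remove heat_kw at a temperature rise of delta_t_f (deg F)."""
    return 3412.0 * heat_kw / (1.085 * delta_t_f)

def airflow_m3s(heat_kw, delta_t_c):
    """m^3/s needed to remove heat_kw at a temperature rise of delta_t_c (deg C)."""
    return heat_kw / (1.21 * delta_t_c)

print(round(airflow_cfm(1.0, 20.0), 1))   # 157.2 CFM per kW at a 20 F rise
print(round(airflow_m3s(1.0, 11.1), 3))   # 0.074 m^3/s per kW at an 11.1 C rise
print(round(airflow_cfm(3.0, 20.0)))      # a 3 kW rack at the same delta-T needs ~472 CFM
```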
Using the design value and the typical tile (~25% open) airflow capacity shown in Figure 6 below, the maximum power density should be 1.25 to 2.5 kW per cabinet. This applies to installations using one floor tile per cabinet. Where the cabinet-to-floor-tile ratio is greater than one, the available cooling capacity must be divided among the cabinets in the row.
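The 1.25-2.5 kW figure follows directly from dividing a tile's airflow by the ~157 CFM/kW design value. The tile airflows below (200 and 400 CFM) are assumptions chosen to bracket the stated range, consistent with Figure 6:

```python
CFM_PER_KW = 157.23  # design airflow per kW from the previous section

def max_kw_per_cabinet(tile_cfm, cabinets_per_tile=1):
    """Rack load supportable by one tile's airflow, shared across cabinets."""
    return tile_cfm / CFM_PER_KW / cabinets_per_tile

print(round(max_kw_per_cabinet(200), 2))                       # 1.27 kW
print(round(max_kw_per_cabinet(400), 2))                       # 2.54 kW
print(round(max_kw_per_cabinet(400, cabinets_per_tile=2), 2))  # 1.27 kW when two cabinets share a tile
```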
Figure 6
Available rack enclosure cooling capacity of a floor tile as a function of per-tile airflow (0-5 kW per enclosure versus 0-1,000 CFM [0-471.9 L/s] per tile; capability zones labeled Typical, With Effort, Extreme, and Impractical)
Measuring whether cooling is available at a given floor tile can be accomplished by simply laying a small piece of paper on it. If the paper gets sucked down against the floor tile, air is being drawn back under the raised floor, which indicates a problem with rack and CRAC positioning. If the paper is unaffected, little or no air may be reaching that tile. If the paper lifts off the floor tile, air is being distributed from that tile. However, depending on the power density of the equipment being cooled, the amount of air from the tile may not be enough; in this case a grate or other air distribution device may be required to deliver more air to the front of the racks.
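The paper test is qualitative, but its interpretation amounts to a small decision table; the observation labels below are hypothetical names, not standard terminology:

```python
# Interpretation of the floor-tile paper test (observation labels are hypothetical).
PAPER_TEST_DIAGNOSIS = {
    "sucked_in":  "Air drawn back under the floor: check rack and CRAC positioning.",
    "unaffected": "Little or no air reaching this tile: check for plenum obstructions.",
    "lifted":     "Tile is delivering air: verify volume against the rack's power density.",
}

print(PAPER_TEST_DIAGNOSIS["sucked_in"])
```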
Inspect enclosures
Unused vertical space within rack enclosures allows the hot exhaust air from equipment to take a short circuit back to the equipment inlets. This unrestricted recycling of hot air causes the equipment to heat up unnecessarily, which can lead to equipment damage or downtime. The use of blanking panels to combat this effect is described in more detail in White Paper 44, Improving Rack Cooling Performance Using Blanking Panels. Visually examine each rack. Are there any gaps in the U positions? Are CRT monitors being used? Are blanking panels installed in these racks? Is an excess of cabling impeding the airflow? If there are visible gaps in the U-space positions, blanking panels are not installed, or there is excessive cabling in the rear of the rack, then airflow within the rack will not be optimal, as illustrated in Figure 7.
Figure 7
Side views of rack airflow: 7A (left) without blanking panels; 7B (right) with blanking panels
Check sub-floors for cleanliness and obstructions. Any dirt and dust present below the raised floor will be blown up through the floor grilles and drawn into the IT equipment. Floor obstructions such as network and power cables obstruct airflow and have a negative effect on the cooling supply to the racks.

Subsequent addition of racks and servers results in the installation of more power and network cabling. Often, when servers and racks are moved or replaced, the old, unused cabling is left beneath the floor.
A visual inspection of the floor surface should be conducted when a raised floor is used for air distribution. Voids, gaps, and missing floor tiles have a damaging effect on the static pressure of the floor plenum. The ability to maintain airflow rates from perforated floor tiles is diminished by unsealed areas in the raised flooring.

Missing floor tiles should be replaced. The floor should consist of solid or perforated floor tiles in every section of the grid. Holes in the raised flooring tiles used for cabling access should be sealed using brush strips or other cable access products. Measurements have shown that 50-80% of available cold air escapes prematurely through unsealed cable openings.
With few exceptions, rack-mounted servers are designed to draw air in at the front and exhaust it at the back. If all the racks in a row face the same way, the hot air from row one is exhausted into the aisle, where it mixes with supply or room air and then enters the front of the racks in row two. This arrangement is shown in Figure 8. As air passes through each consecutive row, the IT equipment is subjected to hotter intake air. If all the rows are arranged so that the server inlets face the same direction, equipment malfunction is imminent.
Figure 8
Rack arrangement with no
separation of hot or cold aisles
Arranging the racks in a hot aisle / cold aisle configuration separates the exhaust air from the server inlets. This allows the cold supply air from the floor tiles to enter the cabinets with less mixing, as illustrated in Figure 9 below. For more on air distribution architectures in the data center, refer to White Paper 55, Air Distribution Architecture for Mission Critical Facilities.
Figure 9
Hot aisle / cold aisle rack
arrangement
Improper location of supply and return vents can cause CRAC air to mix with hot exhaust air before reaching the load equipment, giving rise to the cascade of performance problems and costs described previously. Poorly located delivery or return vents are very common and can erase almost all of the benefit of a hot aisle / cold aisle design.
Conclusion
Routine checks of a data center's cooling system can identify potential cooling problems early and help prevent downtime. Changes in power consumption, IT refreshes, and growth can change the amount of heat produced in the data center. Regular health checks will most likely identify the impact of these changes before they become a major issue. Achieving the proper environment for a given power density can be accomplished by addressing the problems identified through the health checks provided in this white paper. For more information on cooling solutions for higher power densities, refer to White Paper 42, Ten Cooling Solutions to Support High-Density Server Deployment.
Resources
Ten Cooling Solutions to Support High-Density Server Deployment
White Paper 42

Browse all white papers
whitepapers.apc.com

Browse all TradeOff Tools
tools.apc.com
Contact us
For feedback and comments about the content of this white paper:
Data Center Science Center
dcsc@schneider-electric.com
If you are a customer and have questions specific to your data center project:
Contact your Schneider Electric representative at
www.apc.com/support/contact/index.cfm
Appendix
Figure A1
Audit checklist

CRAC Units
Unit       Model           Total Capacity    Sensible Capacity
Unit 1     __________      __________        __________
Unit 2     __________      __________        __________
Unit 3     __________      __________        __________
Unit 4     __________      __________        __________
Unit 5     __________      __________        __________
Unit 6     __________      __________        __________
Unit 7     __________      __________        __________
Unit 8     __________      __________        __________
Unit 9     __________      __________        __________
Unit 10    __________      __________        __________

Heat Load
Item                  Quantity
Power Distribution    __________
Lighting              __________
People                __________
Total                 __________
CRAC Temperatures and Humidity
Acceptable averages: Temp. 68-75°F (20-25°C), Humidity 40-55% R.H.
Meets tolerance (check one): Yes ____  No ____
All within range ____   1-2 out of range ____   >2 out of range ____
Supply Air
Acceptable averages: Temp. 58-65°F (14-18°C)
Meets tolerance (check one): Yes ____  No ____

Cooling Circuits (check those present)
Chilled Water ____   Condenser Water - Water Cooled ____   Condenser Water - Glycol Cooled ____   Air Cooled ____
Meets tolerance (check one): Yes ____  No ____
Aisle Temperatures
Measurement points at 5 feet (1.5 meters) above the floor at every 4th rack (averaged for aisle)
Acceptable averages: Temp. 68-75°F (20-25°C)
Aisle 1 ________   Aisle 2 ________   Aisle 3 ________   Aisle 4 ________   Aisle 5 ________
Aisle 6 ________   Aisle 7 ________   Aisle 8 ________   Aisle 9 ________   Aisle 10 ________
Figure A2
Audit checklist (cont.)

Rack Temperatures
Measurement points at the bottom, middle, and top of each rack front, 2 inches (50 mm) off the face (averaged per rack)
Acceptable averages: Temp. 68-75°F (20-25°C); top-to-bottom temperatures in each rack should not differ by more than 5°F (2.8°C)
R1 ____   R2 ____   R3 ____   R4 ____   R5 ____   R6 ____
R7 ____   R8 ____   R9 ____   R10 ____  R11 ____  R12 ____
R13 ____  R14 ____  R15 ____  R16 ____  R17 ____  R18 ____
R19 ____  R20 ____  R21 ____  R22 ____  R23 ____  R24 ____
R25 ____  R26 ____  R27 ____  R28 ____  R29 ____  R30 ____
R31 ____  R32 ____  R33 ____  R34 ____  R35 ____  R36 ____
R37 ____  R38 ____  R39 ____  R40 ____  R41 ____  R42 ____
R43 ____  R44 ____  R45 ____  R46 ____  R47 ____  R48 ____
R49 ____  R50 ____  R51 ____  R52 ____  R53 ____  R54 ____
R55 ____  R56 ____  R57 ____  R58 ____  R59 ____  R60 ____
R61 ____  R62 ____  R63 ____  R64 ____  R65 ____  R66 ____
R67 ____  R68 ____  R69 ____  R70 ____  R71 ____  R72 ____
R73 ____  R74 ____  R75 ____  R76 ____  R77 ____  R78 ____
R79 ____  R80 ____  R81 ____  R82 ____  R83 ____  R84 ____
R85 ____  R86 ____  R87 ____  R88 ____  R89 ____  R90 ____
Airflow
Check all perforated tiles (where applicable) and compare to tolerances. Airflow measurement (positive airflow check) and volume tests should be carried out by a qualified HVAC contractor.
Acceptable averages: ≥ 160 CFM/kW (75.5 (L/s)/kW)
Meets tolerance (check one): Yes ____  No ____
Blanking Panels, Floor, and Layout
Meets tolerance (check one for each):
Are blanking panels installed in all rack spaces where IT equipment is not installed?   Yes ____  No ____
Are all floor tiles in place?   Yes ____  No ____
Are cable access openings adequately sealed?   Yes ____  No ____
Do the CRACs line up with the hot aisles?   Yes ____  No ____
Is there separation between hot and cold aisles (racks not facing the same direction)?   Yes ____  No ____