Preventing Outages: Managing the Exposures and Efficiencies of a Physical Data Center
Major Physical Exposures Common to a Data Center
Most operations managers do a reasonable job of keeping their data centers up and running. Many shops go for years without experiencing a major outage specifically caused by the physical environment. But the infrequent nature of these types of outages can often lull managers into a false sense of security and lead them to overlook the risks to which they may be exposed. The following list shows 15 of the most common risks:
Physical wiring diagrams out of date
Logical equipment configuration diagrams and schematics out of date
Infrequent testing of UPS
Failure to recharge UPS batteries
Failure to test generator and fuel levels
Lack of preventive maintenance on air conditioning equipment
Annunciator system ("announces" alarm conditions) not tested
Fire suppression system not recharged
Emergency power-off system not tested
Emergency power-off system not documented
Infrequent testing of backup generator system
Equipment not properly anchored to guard against earthquakes and other disasters
Evacuation procedures not clearly documented
Circumvention of physical security procedures
Lack of effective training for appropriate personnel
The older the data center, the greater these exposures become. I've had clients who collectively have experienced at least half of these exposures during the past three years. Many of their data centers were less than 10 years old.
Preventive maintenance, testing, inspections, or any combination of these measures should occur at least once a year. I've worked with some shops that had annual maintenance contracts in place for their physical facilities, including onsite inspections, but chose not to exercise them. Untested safeguards, uninspected equipment, undocumented procedures, and untrained staff are all preventable invitations to disaster.
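One way to keep the once-a-year minimum from slipping is to record when each safeguard was last tested and flag anything older than 12 months. The following is only a minimal sketch under stated assumptions: the inventory, its dates, and the function name overdue are all hypothetical and invented for illustration. A real shop would pull these dates from its own maintenance records.

```python
from datetime import date, timedelta

# Hypothetical inventory: safeguard name -> date it was last tested or inspected.
# The names echo the risk list above; the dates are invented for illustration.
last_checked = {
    "UPS": date(2023, 1, 15),
    "Backup generator and fuel": date(2024, 3, 1),
    "Fire suppression system": date(2022, 11, 30),
    "Emergency power-off system": date(2024, 5, 20),
}

# The annual minimum recommended in the text, expressed as a 365-day interval.
MAX_INTERVAL = timedelta(days=365)


def overdue(checks, today, max_interval=MAX_INTERVAL):
    """Return safeguards whose last check is older than max_interval, oldest first."""
    late = [
        (name, when)
        for name, when in checks.items()
        if today - when > max_interval
    ]
    return [name for name, _ in sorted(late, key=lambda item: item[1])]


if __name__ == "__main__":
    # A fixed "today" keeps the example reproducible; use date.today() in practice.
    for name in overdue(last_checked, today=date(2024, 6, 1)):
        print(f"OVERDUE: {name}")
```

Listing the oldest check first puts the most neglected safeguard at the top of the report. Shortening max_interval for individual items (for example, more frequent checks of UPS batteries) would be a natural extension, though the interval to use is a judgment call for each site.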
A Word About Outsourcing
Outsourcing can create another, non-physical type of exposure for a data center. Shops that outsource portions of their infrastructure services (colocation of servers is an example) often feel that the responsibility for the facilities management process is also outsourced and no longer their concern. While the outsourcer has a direct responsibility for providing a stable physical environment, the client has an indirect responsibility to ensure that this will occur. During the evaluation of bids and in contract negotiations, appropriate infrastructure personnel should ask the same types of questions about the outsourcer's physical environment that they would if it were their own computer center.