Most operations managers do a reasonable job of keeping their data centers up and running. Many shops go for years without experiencing a major outage specifically caused by the physical environment. But the infrequent nature of these outages can lull managers into a false sense of security and lead them to overlook the risks to which they may be exposed. Figure 1 lists the most common of these.

  1. Physical wiring diagrams out of date
  2. Logical equipment configuration diagrams and schematics out of date
  3. Infrequent testing of UPS
  4. Failure to recharge UPS batteries
  5. Failure to test generator and fuel levels
  6. Lack of preventive maintenance on air conditioning equipment
  7. Alarm/alert system not tested
  8. Fire suppression system not recharged
  9. Fire suppression system not inspected
  10. Emergency power-off system not tested
  11. Emergency power-off system not documented
  12. Infrequent testing of backup generator system
  13. Equipment not properly anchored
  14. Evacuation procedures not clearly documented
  15. Circumvention of physical security procedures
  16. Lack of effective training for appropriate personnel

Figure 1 Major Physical Exposures Common to a Data Center

The older the data center, the greater these exposures become. But the relative newness of a data center does not ensure a shop will not have outages. I have had clients who collectively have experienced at least half of these exposures during the past three years. Many of their data centers were less than 10 years old.

Preventive maintenance, testing, inspections, or any combination of these should occur once a year at a minimum. I have worked with some shops that have annual maintenance contracts in place for their physical facilities, including onsite inspections, but choose not to exercise them. Untested safeguards, uninspected equipment, undocumented procedures, and untrained staff are all invitations to disaster, and all are easily avoided.
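The once-a-year minimum is easy to check mechanically. The following is a minimal sketch, not a prescribed tool: the item names, the dates, and the `overdue_items` helper are all hypothetical, and a real shop would pull the last-checked dates from its own maintenance records.

```python
from datetime import date, timedelta

# Hypothetical inspection log: item -> date of last inspection or test.
# None means the item has never been checked since installation.
LAST_CHECKED = {
    "UPS batteries": date(2023, 11, 2),
    "Backup generator": date(2024, 6, 15),
    "Fire suppression system": None,
    "Emergency power-off system": date(2022, 3, 1),
}

# The once-a-year minimum described in the text.
MAX_INTERVAL = timedelta(days=365)


def overdue_items(last_checked, today, max_interval=MAX_INTERVAL):
    """Return the sorted names of items that were never checked or were
    last checked more than max_interval before today."""
    overdue = []
    for name, checked in last_checked.items():
        if checked is None or today - checked > max_interval:
            overdue.append(name)
    return sorted(overdue)


if __name__ == "__main__":
    for name in overdue_items(LAST_CHECKED, today=date(2024, 12, 1)):
        print(f"OVERDUE: {name}")
```

Treating a never-inspected item as overdue is a deliberate choice here: a missing record is exactly the kind of exposure this section describes.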

A recent experience at one of my financial services clients illustrates the importance of regularly scheduled inspections. One of the most critical departments of this firm was housed in a modern office building owned by another company. I was performing business continuity work and was on the list of people to be alerted to any unusual business disruption. One day around 2:30 p.m. I received a call saying that sprinklers had activated on the first floor of this building, over the area where the department in question was located.

I responded to the scene and found the building had been evacuated in an orderly manner per procedure and that the fire department was already onsite. They discovered that a faulty sprinkler head had been accidentally bumped and broken off. What was most surprising to me was that the water that was ejected, some 600 gallons' worth, was not of the pure, clear variety you might expect. It was filled with black, sooty particles. It turned out that the sprinkler system had not been inspected since its installation some four years prior. During this time it had accumulated huge amounts of rust, corrosion, and other particulate matter. This is what spewed down over a 500-square-foot area of open cubicles, much to the surprise of the 18 employees who were sitting under it at the time. Needless to say, regular inspections were immediately initiated.

Tips to Minimize Data Center Exposures

A number of simple actions can minimize data center exposures (shown in Figure 2). Establishing good relationships with key support groups such as the facilities department and local government inspecting agencies can help keep maintenance and expansion plans on schedule. This can also lead to a greater understanding of what the infrastructure group can do to enable both of these groups to better serve the IT department.

  1. Nurture relationships with facilities department.
  2. Establish relationships with local government inspecting agencies, especially if considering major physical upgrades to the data center.
  3. Consider using video cameras to enhance physical security.
  4. Analyze environmental monitoring reports to identify trends, patterns, and relationships.
  5. Check on effectiveness of water and fire detection and suppression systems.
  6. Remove all tripping hazards in the computer center.
  7. Check on earthquake preparedness of data center (devices anchored down, training of personnel, tie-in to disaster recovery).

Figure 2 Tips to Improve Facilities Management

Video cameras have been around for a long time to enhance and streamline physical security, but their condition is occasionally overlooked. Cameras must be checked periodically to make sure that the recording and playback mechanism is in good shape and that the tape is of sufficient quality to ensure reasonably good playback.

Environmental recording devices must also be checked periodically. Many of these devices are quite sophisticated; they collect a wealth of data about temperature, humidity, purity of air, hazardous vapors, and other environmental measurements. The data is only as valuable as the effort expended to analyze it for trends, patterns, and relationships. A reasonably thorough analysis should be done on this type of data quarterly.
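As one illustration of what a quarterly trend analysis might involve, the sketch below fits a least-squares line to a series of readings and reports the slope. The weekly temperature figures and the 0.1-degree alert threshold are invented for the example, and `slope_per_step` is a hypothetical helper rather than part of any monitoring product.

```python
from statistics import mean


def slope_per_step(readings):
    """Least-squares slope of the readings against their position in the
    series, in measurement units per sample."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(readings)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, readings))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical weekly average temperatures (degrees C) for one quarter.
    weekly_temps = [21.0, 21.2, 21.1, 21.5, 21.6, 21.9, 22.0,
                    22.3, 22.4, 22.8, 22.9, 23.1, 23.4]
    trend = slope_per_step(weekly_temps)
    print(f"Temperature trend: {trend:+.2f} degrees C per week")
    # Assumed threshold; a real shop would set this from its own baselines.
    if trend > 0.1:
        print("Sustained warming trend; review cooling capacity and maintenance.")
```

A slope is only the simplest kind of trend. Relationships between measurements, such as humidity rising along with temperature, would need the series to be compared against each other.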

In my experience, most shops do a good job of periodically testing their backup electrical systems such as UPS, batteries, generators, and power distribution units (PDUs), but not such a good job on fire detection and suppression systems. This is partly due to the huge capital investment companies make in their electrical backup systems; managers want to ensure a good return on such a sizable outlay of cash. Maintenance contracts for these systems frequently include inspection and testing, at least at the outset. However, this is seldom the case with fire detection and suppression systems. Infrastructure personnel need to be proactive in this regard by insisting on regularly scheduled inspection and maintenance of these systems, as well as up-to-date evacuation plans.

One of the simplest actions to take to improve a computer center's physical environment is to remove all tripping hazards. While this sounds simple and straightforward, it is often neglected in favor of equipment moves, hardware upgrades, network expansions, general construction, and—one of the most common of all—temporary cabling that ends up being semi-permanent. This is not only unsightly and inefficient; it can be outright dangerous as physical injuries become a real possibility. Operators and other occupants of the computer center should be trained and authorized to keep the environment efficient, orderly, and safe.

The final tip is to make sure the staff is trained and practiced on earthquake preparedness, particularly in geographic areas most prone to this type of disaster. Common practices such as anchoring equipment, latching cabinets, and properly storing materials should be verified by qualified individuals several times per year.

A Word about Efficiency and Effectiveness

In addition to ensuring a stable physical environment, the facilities management process owner has another responsibility that is sometimes overlooked. The process owner must ensure efficiencies are designed into the physical layout of the computer facility. A stable and reliable operating environment will result in an effective data center. Well-planned physical layouts will result in an efficient one. Analyzing the physical steps that operators take to load and unload printers, to relocate tapes, to monitor consoles, and to perform other routine physical tasks can result in a well-designed floor plan that minimizes time and motion and maximizes efficiency.

One other point to consider in this regard is the likelihood of expansion. Physical computer centers, not unlike IT itself, are ever-changing entities. Factoring in future expansion due to capacity upgrades, possible mergers, or departmental reorganizations can help keep current floor plans efficient in the future.

