
Analyzing Disaster Recovery Strategies

Analyze the existing disaster recovery strategy for client computers, servers, and the network.

As a network infrastructure designer, and particularly if you implement the designs you create, you will want to become intimately familiar with the organization's existing strategy for disaster recovery. The company's existing disaster recovery strategy will become an essential tool for protecting the company's systems and data as you implement your new design.

The company's overall disaster recovery strategy should include the following components:

  • Tape backup strategies

  • Hardware failure recovery strategies

  • Power and other environmental failure recovery strategies

  • Data line and cable failure recovery strategies

  • System code failure recovery strategies

  • Ultimate recovery from acts of God

Tape Backup Strategies

Tape backup strategies allow you to recover from failures that result in loss of data, such as a file deletion or a hard drive failure. These strategies include the following:

  • Creating and maintaining a tape backup strategy involving full, differential, and/or incremental backups.

  • Creating and maintaining a tape rotation strategy, such as Grandfather-Father-Son (GFS).

  • Storing backup tapes offsite. This is to allow for recoverability in the event of an ultimate act of God, such as a tornado, earthquake, or fire. In designing an offsite storage plan, it is important to strike a balance between keeping recent backup tapes offsite for maximum recoverability in the event of a natural disaster and keeping recent tapes onsite to allow for quick restoration of data.
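A GFS rotation can be thought of as a rule that maps each backup date to a tape set. The sketch below is one possible schedule, not a prescribed one. It assumes daily "son" tapes Monday through Thursday, a weekly "father" tape on Fridays, a monthly "grandfather" tape on the last Friday of each month, and no backups on weekends. The function name `gfs_tape_set` and the schedule itself are illustrative assumptions.

```python
from datetime import date, timedelta


def gfs_tape_set(day: date) -> str:
    """Return the GFS tape set that a backup on `day` would use.

    Assumed schedule (illustrative only):
    - grandfather: last Friday of the month (monthly full backup)
    - father: every other Friday (weekly full backup)
    - son: Monday to Thursday (daily incremental or differential)
    - none: Saturday and Sunday
    """
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    if weekday >= 5:
        return "none"
    if weekday == 4:
        # If the Friday one week later falls in a different month,
        # this Friday is the last one of the current month.
        next_friday = day + timedelta(days=7)
        return "grandfather" if next_friday.month != day.month else "father"
    return "son"


if __name__ == "__main__":
    start = date(2024, 5, 27)  # a Monday
    for offset in range(7):
        current = start + timedelta(days=offset)
        print(current.isoformat(), gfs_tape_set(current))
```

Under this schedule, the tapes most worth sending offsite are the father and grandfather sets, while the son tapes can stay onsite for quick restores.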

Hardware Failure Recovery Strategies

Hardware failure recovery strategies are the plans that allow the company to recover from the failure of a hardware component, such as a server system board, a hard drive, a switch, or a router. Such a strategy is essential to every company, because each piece of hardware has an estimated "life expectancy." Vendors commonly express it as the Mean Time Between Failures (MTBF), a statistical estimate of the average operating time between failures. With hardware components, it is not a matter of "Will you have a failure?" but rather "When will you have a failure?"
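Because MTBF is an average across many units, it is most useful for estimating how often failures will occur across a whole population of devices. The following sketch shows the arithmetic, assuming a constant failure rate and units that run around the clock. The function name and the sample figures (200 drives, 500,000-hour MTBF) are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def expected_failures_per_year(unit_count: int, mtbf_hours: float) -> float:
    """Estimate yearly failures for a population of identical units.

    Assumes a constant failure rate and continuous operation.
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    return unit_count * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    # Hypothetical figures: 200 drives, each rated at 500,000 hours MTBF.
    print(round(expected_failures_per_year(200, 500_000), 3))  # 3.504
```

Even with a high per-unit MTBF, a large enough population makes several failures a year the expected case, which is why a recovery plan is needed.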

The following is a list of the most common hardware-failure recovery strategies:

  • Negotiating Service-Level Agreements (SLAs) with hardware vendors, guaranteeing a certain response time in the event of a hardware failure.

  • Maintaining a spare parts inventory in-house.

  • Building fault tolerance into your hardware infrastructure, such as RAID 5 drive arrays, redundant power supplies, or redundant switch configurations.

Power and Other Environmental Failure Recovery Strategies

Power and other environmental failure recovery strategies allow you to recover from events such as power outages or extreme climate conditions. The most common of these strategies are the following:

  • Placing Uninterruptible Power Supplies (UPSs) on mission-critical servers for short-term power outages.

  • Maintaining power generators at mission-critical data centers for longer-term power outages.

  • Equipping mission-critical data centers with industrial-quality climate control systems.

Data Line and Cable Failure Recovery Strategies

Data line and cable failure recovery strategies are solutions that allow you to recover from the failure of either a LAN or a WAN connection. They include the following strategies:

  • Building multiple links to servers.

  • Deploying LAN devices (such as switches) that allow for multiple redundant links between devices.

  • Purchasing multiple WAN links between different geographic sites.

  • Purchasing backup WAN links, such as dial-on-demand links.

  • Maintaining backup remote connectivity solutions, such as RAS servers.
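The benefit of redundant links can be estimated with simple probability. The sketch below computes the availability of a connection that stays up as long as at least one of its links is up. It assumes the links fail independently, which does not hold if they share a conduit, a carrier, or a power source. The function name and the sample availabilities are hypothetical.

```python
from typing import Iterable


def combined_availability(link_availabilities: Iterable[float]) -> float:
    """Availability of a path that works while at least one link is up.

    Each value is a fraction between 0.0 and 1.0. Independence of link
    failures is an assumption of this model.
    """
    all_down = 1.0
    for availability in link_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("availability must be between 0.0 and 1.0")
        all_down *= 1.0 - availability  # probability every link so far is down
    return 1.0 - all_down


if __name__ == "__main__":
    # Hypothetical primary WAN link (99%) plus a dial-on-demand backup (95%).
    print(f"{combined_availability([0.99, 0.95]):.4f}")  # 0.9995
```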

System Code Failure Recovery Strategies

System code failure recovery strategies allow you to recover from failures such as a faulty service pack or application upgrade. These strategies include the following:

  • Backing up files prior to applying service packs or application upgrades.

  • Keeping copies of software on hand in order to downgrade to the previous service pack level or application version if necessary.
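The back-up-then-upgrade pattern can be illustrated with a small file-level sketch. It copies an application directory before a change and restores it if the change turns out to be faulty. This is only an illustration of the idea: it does not capture system state, registry settings, or files that are locked by running services, so a real service pack rollback would rely on proper backup tooling. The function names and directory layout are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def snapshot(target: Path, backup_root: Path, label: str) -> Path:
    """Copy `target` to backup_root/label before applying an upgrade."""
    destination = backup_root / label
    shutil.copytree(target, destination)  # raises if destination already exists
    return destination


def rollback(target: Path, snapshot_dir: Path) -> None:
    """Replace `target` with the contents of an earlier snapshot."""
    if target.exists():
        shutil.rmtree(target)
    shutil.copytree(snapshot_dir, target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        app_dir = Path(scratch) / "app"
        app_dir.mkdir()
        (app_dir / "version.txt").write_text("1.0")

        saved = snapshot(app_dir, Path(scratch) / "backups", "pre-upgrade")
        (app_dir / "version.txt").write_text("2.0 (faulty)")  # simulated bad upgrade
        rollback(app_dir, saved)
        print((app_dir / "version.txt").read_text())  # prints 1.0
```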

Ultimate Recovery from Acts of God

These strategies allow you to recover from natural disasters, such as floods, tornadoes, and earthquakes. They fall into one of two categories:

  • Housing systems in facilities that can withstand the rigors of a natural disaster.

  • Maintaining a disaster recovery site that can be brought online in the event that the original site is struck by a disaster. Data is either restored to systems in the recovery site or is replicated from the original site to the recovery site in real time.

No strategy can guarantee that a company will never experience a failure. Rather, the goal should be to ensure that, if a failure occurs, the amount of downtime is kept within acceptable limits.
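"Acceptable limits" are easier to discuss when they are expressed as hours. The sketch below converts an availability target into a yearly downtime budget. The function name and the sample targets are hypothetical, and the calculation assumes a service that is expected to run 24 hours a day, 365 days a year.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def downtime_budget_hours(availability_percent: float) -> float:
    """Return the yearly downtime allowed by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):  # hypothetical targets
        print(f"{target}% -> {downtime_budget_hours(target):.2f} hours per year")
```

A 99.9 percent target, for example, leaves a budget of roughly 8.76 hours of downtime per year, which must cover every failure and its recovery.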

You should gather information about the current strategies by interviewing members of the IT staff. Document all the existing disaster recovery plans that the company has in place, or obtain its existing documents if those are sufficient. Review these plans and strategies to ensure that your proposed design won't interrupt them or render them useless. You may need to modify your design, or the disaster recovery strategies and plans themselves, in order to ensure an acceptable level of disaster recovery.

The modifications that you make may include improvements to the existing disaster recovery strategies. Windows 2000 provides a number of redesigned technologies that allow you to enhance your disaster recovery strategies, including the following:

  • Two-way and four-way active clustering

  • Network load balancing

  • An enhanced backup utility

You need to know the details of the processes involved in each of the company's disaster recovery strategies in order to determine how your new network infrastructure design will affect them. Understanding these processes also lets you ensure that they remain functional while the new design is being implemented.

Case Study: Designing a Network Infrastructure for the Dewey, Winnem, & Howe Law Firm

Essence of the Case

Here are the essential elements of this case:

  • The law firm has many remote sites, each with a small number of users. It will take considerable effort to determine the scope and size of the user population.

  • The firm is concerned with the expense of connectivity between its headquarters and each remote office. An assessment will need to be made of the available methods for connectivity between all the sites.

  • The TCP/IP infrastructure will need to be examined to determine whether it is optimal. If it is not, the changes required to make it more efficient should be included in the design.

  • The company uses a number of nonstandard proprietary applications. Each of these applications needs to be examined in order to determine its specific needs and how the network infrastructure can support them.

  • Network services, such as DHCP and WINS, will need to be examined and a plan developed to extend those services throughout the enterprise.

  • The corporate headquarters is using Ethernet hubs for its fundamental network infrastructure, and users are complaining of long delays when accessing the system. The possibility of replacing those hubs with switches should be investigated.

Scenario

The law firm of Dewey, Winnem, & Howe has hired you to design a network infrastructure to upgrade its existing network. The company has headquarters in Manhattan, with several satellite locations scattered across the country. Some of the locations are large offices with many employees. Others are small, partnered law firms with one or two people. The firm would like to maximize its data throughput while minimizing costs. Currently, it runs T-1 connections to every one of its remote offices and finds that its monthly carrier bill is very high.

The firm has no centralized network management. Each site has a representative in charge of on-site computer problems, and some of the remote offices, especially the smaller ones, have outsourced their computer support to various vendors. Due to the lack of centralized administration, conflicts often arise. For instance, TCP/IP addresses are assigned statically at each location, and the company often finds that addresses are duplicated, creating significant troubleshooting issues.

The company uses standard office productivity software, such as Microsoft Office, as well as a number of proprietary database applications used to store client case information. The company provides dial-in access to its attorneys so that they can access the client and caseload databases while in courtrooms or on the road. The company also has an Internet connection to access services such as Westlaw and LexisNexis.

The existing infrastructure is built on Windows NT 4.0, offering services such as DHCP and WINS through the corporate headquarters in Manhattan. In the corporate headquarters, all the servers are located on a dedicated Ethernet segment. All the workstations are connected to hubs placed on each floor of the office building. Users complain of long delays when they try to access the system during the day.

Analysis

The law firm of Dewey, Winnem, & Howe is a nationwide company with many remote offices. Its user population is widely dispersed. The needs of those users, in terms of applications and connectivity, vary from site to site. When you're designing a new network infrastructure for this company, the existing infrastructure will need to be thoroughly examined. Applications will need to be listed and prioritized and their needs determined. The user population will need to be counted and its needs quantified. End users' work needs and usage patterns will need to be examined in order to determine the appropriate connectivity bandwidth and functionality to be built into the network design. The existing disaster recovery strategy should be evaluated as well. Any improvements or modifications to that strategy will need to support the upgrades brought about by the implementation of the new network infrastructure you design.
