Sun Microsystems 



Case Study

We will determine a reliability figure on three very basic SAN architectures. The starting point of our study is the network storage requirements.

Network Storage Requirements

We want networked storage that is accessible to one server. Later, this storage will be made accessible to other servers. The server is already in place and has been designed to sustain single-component hardware failures (for example, it has dual host bus adapters, or HBAs). Data on this storage must be mirrored, and the storage access must also stand up to hardware failures. The cost of the storage system must be reasonable, while still providing good performance.

Our first temptation might be to decide which components to use: switches, hubs, Sun StorEdge T3 arrays, Sun StorEdge A5x00 arrays, and so on. However, a more prudent approach is to determine the appropriate architecture in terms of its resistance to hardware failures, cost, and performance, leaving the selection of specific components for a later stage.

NOTE

This case study focuses on storage architecture redundancy and reliability; it does not address cost and performance issues.

Architecture 1

FIGURE 2 Architecture 1 Block Diagram

Architecture 1 provides the basic storage necessities we are looking for with the following advantages and disadvantages:

Advantages:

  • Storage is accessible if one of the links is down.

  • Storage A is mirrored onto B.

  • Other servers can be connected to the concentrator to access the storage.

Disadvantages:

If the concentrator fails, we have no more access to the storage. This concentrator is a single point of failure (SPOF).

Architecture 2

FIGURE 3 Architecture 2 Block Diagram

Architecture 2 has been improved to eliminate the previous SPOF. A second concentrator has been added, so the storage configuration is now redundant and the requirements are satisfied, with the following advantages:

  • If any links or components go down, storage is still accessible (resilient to hardware failures).

  • Data is mirrored (Disk A <-> Disk B).

  • Other servers can be connected to both concentrators to access the storage space.

Architecture 3

FIGURE 4 Architecture 3 Block Diagram

Architecture 3 seems very close to architecture 2. The main difference resides in the fact that Disk A and Disk B have only one data path. Disk A is still mirrored to Disk B, as required.

This architecture has all the advantages of the previous architectures with the following differences:

  • Disk A can only be accessed through Link C, and Disk B only through Link D.

  • There is no data multipathing software layer, which results in easier administration and easier troubleshooting.

In some sense, it seems we are losing a level of redundancy in Architecture 3. To appreciate the differences between Architectures 2 and 3, we will use block diagram analysis to determine and compare their reliability values.

Determining Redundancy

We first list an inventory of components involved in the three architectures as shown in the first column of the following table. Next, we analyze the three architectures for redundancy.

| Failing Component (first failure) | Architecture 1: Is the System OK? | Architectures 2 and 3: Is the System OK? |
|---|---|---|
| HBA 1 | Yes | Yes |
| HBA 2 | Yes | Yes |
| Link A | Yes | Yes |
| Link B | Yes | Yes |
| Concentrator 1 | No | Yes |
| Concentrator 2 | n/a | Yes |
| Link C | Yes | Yes |
| Link D | Yes | Yes |
| Disk A | Yes | Yes |
| Disk B | Yes | Yes |
| Total number of redundant components | 8 | 10 |


Consequently, we see that Architectures 2 and 3 satisfy our objectives for redundancy, while Architecture 1 does not.
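The single-failure check in the table can be sketched in code. This is a minimal illustration, not the article's method: each architecture is modelled as series stages of parallel branches, and the groupings are an assumption inferred from the reliability formulas given later in this article. The function names `system_ok` and `single_points_of_failure` are hypothetical.

```python
# Assumed model: an architecture is a list of stages in series; each stage is
# a list of parallel branches; each branch is a chain of components that must
# all work. Groupings are inferred from the article's AFR formulas.
ARCHITECTURES = {
    "Architecture 1": [
        [["HBA 1", "Link A"], ["HBA 2", "Link B"]],
        [["Concentrator 1"]],
        [["Link C"], ["Link D"]],
        [["Disk A"], ["Disk B"]],
    ],
    "Architecture 2": [
        [["HBA 1", "Link A", "Concentrator 1", "Link C"],
         ["HBA 2", "Link B", "Concentrator 2", "Link D"]],
        [["Disk A"], ["Disk B"]],
    ],
    "Architecture 3": [
        [["HBA 1", "Link A", "Concentrator 1", "Link C", "Disk A"],
         ["HBA 2", "Link B", "Concentrator 2", "Link D", "Disk B"]],
    ],
}


def system_ok(stages, failed):
    """True if every stage keeps at least one branch with no failed component."""
    return all(
        any(all(c not in failed for c in branch) for branch in stage)
        for stage in stages
    )


def single_points_of_failure(stages):
    """List the components whose failure alone brings the system down."""
    components = [c for stage in stages for branch in stage for c in branch]
    return [c for c in components if not system_ok(stages, {c})]


for name, stages in ARCHITECTURES.items():
    print(name, single_points_of_failure(stages))
```

Under this model, only Architecture 1 reports a single point of failure (Concentrator 1), which agrees with the table.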

It is possible to obtain an objective difference between Architectures 2 and 3 by studying their respective reliability. We will find that, although both Architectures 2 and 3 are fully redundant, one is more reliable than the other.

Determining Reliability

Using the reliability formulas discussed earlier, we can determine which architecture has the highest reliability value. For the purpose of this article, we will use sample MTBF values (as obtained from the manufacturer) and the corresponding AFR values shown in the table below:

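TABLE 1 Component Inventory

| Component | AFR Variable | Sample MTBF Value (hours) | AFR |
|---|---|---|---|
| HBA 1 | H | 800,000 | 0.011 |
| HBA 2 | H | 800,000 | 0.011 |
| Link A | L | 400,000 | 0.022 |
| Link B | L | 400,000 | 0.022 |
| Concentrator 1 | C | 580,000 | 0.0151 |
| Concentrator 2 | C | 580,000 | 0.0151 |
| Link C | L | 400,000 | 0.022 |
| Link D | L | 400,000 | 0.022 |
| Disk A | D | 1,000,000 | 0.0088 |
| Disk B | D | 1,000,000 | 0.0088 |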


NOTE

The example MTBF values were taken from real network storage component statistics. However, such values vary greatly, and these numbers are given here purely for illustration.
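The AFR column can be reproduced from the MTBF column. The sketch below assumes the same relation the article uses later for system values, AFR = 8760 / MTBF, where 8760 is the number of hours in a year. The dictionary and function names are illustrative only.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours

# Sample MTBF values from TABLE 1, keyed by the article's AFR variable.
SAMPLE_MTBF_HOURS = {"H": 800_000, "L": 400_000, "C": 580_000, "D": 1_000_000}


def afr_from_mtbf(mtbf_hours):
    """Annual failure rate for a component with the given MTBF in hours."""
    return HOURS_PER_YEAR / mtbf_hours


afr = {name: afr_from_mtbf(hours) for name, hours in SAMPLE_MTBF_HOURS.items()}
for name, value in afr.items():
    print(f"{name}: {value:.5f}")
```

The unrounded results (for example, 0.01095 for H and 0.0219 for L) round to the values shown in the table.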

Architecture 1

FIGURE 5 Architecture 1 Reliability Block Diagram

With the failure rate of each individual component known, we can obtain the system's annual failure rate (AFR1) and, from it, the system reliability and system MTBF values. The block diagram (FIGURE 5) makes it easy to identify which components are configured redundantly and which are not. The following formula is derived using the block diagram analysis discussed earlier. For components configured redundantly, the AFR value is raised to a power equal to the number of redundant components. For non-redundant components in series, the AFR value is multiplied by the number of those components. In this case, the concentrator (C) is the only non-redundant component (C * 1 = C). Finally, the resulting terms are summed.

The formula for this architecture:

$$\mathrm{AFR}_1 = (H + L)^2 + C + L^2 + D^2$$

Sample values applied:

$$\mathrm{AFR}_1 = (0.011 + 0.022)^2 + 0.0151 + 0.022^2 + 0.0088^2 = 0.0167$$

Using the AFR value, we determine the annual reliability R1 of the system:

$$R_1 = 1 - \mathrm{AFR}_1$$

$$R_1 = 1 - 0.0167 = 0.9833, \text{ or } 98.33\%$$

Using the AFR value, the following system MTBF value is derived:

$$\text{System MTBF} = 8760 / \mathrm{AFR}_1$$

$$\text{System MTBF} = 8760 / 0.0167 = 524{,}551 \text{ hours}$$
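The Architecture 1 calculation can be sketched as a small evaluator. It is one possible reading of the block diagram rule, not a definitive implementation: series stages are summed, and the branch AFRs within a parallel stage are multiplied. The stage layout in `ARCH1` and the function name `system_afr` are assumptions made for illustration.

```python
from math import prod

HOURS_PER_YEAR = 8760

# Component AFRs computed from the sample MTBF values in TABLE 1.
MTBF = {"H": 800_000, "L": 400_000, "C": 580_000, "D": 1_000_000}
AFR = {name: HOURS_PER_YEAR / hours for name, hours in MTBF.items()}


def system_afr(stages):
    """Sum over series stages of the product of each stage's branch AFRs."""
    return sum(
        prod(sum(AFR[c] for c in branch) for branch in stage)
        for stage in stages
    )


# Assumed layout matching AFR1 = (H + L)^2 + C + L^2 + D^2.
ARCH1 = [
    [["H", "L"], ["H", "L"]],  # two HBA-plus-link paths in parallel
    [["C"]],                   # the single concentrator
    [["L"], ["L"]],            # links C and D in parallel
    [["D"], ["D"]],            # mirrored disks
]

afr1 = system_afr(ARCH1)
r1 = 1 - afr1
mtbf1 = HOURS_PER_YEAR / afr1
print(f"AFR1 = {afr1:.4f}, R1 = {r1:.4f}, system MTBF = {mtbf1:,.0f} hours")
```

Because this sketch carries unrounded component AFRs through the calculation, its system MTBF comes out slightly lower than the 524,551 hours above, which is derived from the rounded AFR1 of 0.0167.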

Architecture 2

FIGURE 6 Architecture 2 Reliability Block Diagram

This architecture has a different configuration, and the resulting formula is derived using the block diagram analysis.

The formula for this architecture:

$$\mathrm{AFR}_2 = (H + L + C + L)^2 + D^2$$

Sample values applied:

$$\mathrm{AFR}_2 = (0.011 + 0.022 + 0.0151 + 0.022)^2 + 0.0088^2 = 0.005$$

Using the AFR, determine the annual reliability R2 of the system:

$$R_2 = 1 - \mathrm{AFR}_2$$

$$R_2 = 1 - 0.005 = 0.995, \text{ or } 99.5\%$$

Using the AFR value, the following system MTBF value is derived:

$$\text{System MTBF} = 8760 / \mathrm{AFR}_2$$

$$\text{System MTBF} = 8760 / 0.005 = 1{,}752{,}000 \text{ hours}$$

Architecture 3

FIGURE 7 Architecture 3 Reliability Block Diagram

Architecture 3 results in yet another block diagram calculation.

The formula for this architecture:

$$\mathrm{AFR}_3 = (H + L + C + L + D)^2$$

Sample values applied:

$$\mathrm{AFR}_3 = (0.011 + 0.022 + 0.0151 + 0.022 + 0.0088)^2 = 0.0062$$

Using the AFR, determine the annual reliability R3 of the system.

The formula:

$$R_3 = 1 - \mathrm{AFR}_3$$

Numbers applied:

$$R_3 = 1 - 0.0062 = 0.9938, \text{ or } 99.38\%$$

Using the AFR value, the following system MTBF value is derived:

$$\text{System MTBF} = 8760 / \mathrm{AFR}_3$$

$$\text{System MTBF} = 8760 / 0.0062 = 1{,}412{,}903 \text{ hours}$$
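One way to see why Architecture 3 comes out slightly less reliable than Architecture 2 is to subtract the two formulas. In the short derivation below, S = H + L + C + L is shorthand introduced here for convenience; it does not appear in the original formulas.

```latex
\begin{aligned}
\mathrm{AFR}_3 - \mathrm{AFR}_2 &= (S + D)^2 - \left(S^2 + D^2\right) = 2SD \\
&= 2 \times 0.0701 \times 0.0088 \approx 0.0012
\end{aligned}
```

This matches the gap between the two computed values (0.0062 versus 0.005). The extra term 2SD arises because, in Architecture 3, each disk sits in series with a single path, so a path failure on one side combined with a disk failure on the other also brings the system down.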

Conclusion

When the calculations are complete, we compare the data:

Architecture 1: reliability = 98.33%, system MTBF = 524,551 hours

Architecture 2: reliability = 99.50%, system MTBF = 1,752,000 hours

Architecture 3: reliability = 99.38%, system MTBF = 1,412,903 hours

The MTBF figures are the most revealing, and indicate that architecture 2 is statistically the most reliable of all.

In conclusion, the case study calculations provide the following points:

  • Only Architectures 2 and 3 are fully redundant; hence, they satisfy the requirement of a redundant configuration that can sustain a single hardware failure.

  • The reliability value for Architecture 1 doesn't show the non-redundant aspect of this architecture. It is therefore important to consider both characteristics: redundancy and reliability.

  • Architecture 2 has roughly 3.3 times the system MTBF of Architecture 1 (1,752,000 versus 524,551 hours), and its estimated MTBF is 339,097 hours higher than that of Architecture 3.
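The comparison figures in the last point can be checked with a few lines of arithmetic. This sketch starts from the rounded system AFR values computed in the case study; the variable names are illustrative.

```python
HOURS_PER_YEAR = 8760

# Rounded system AFR values from the case study, keyed by architecture number.
SYSTEM_AFR = {1: 0.0167, 2: 0.005, 3: 0.0062}

# System MTBF in hours for each architecture.
mtbf = {arch: HOURS_PER_YEAR / afr for arch, afr in SYSTEM_AFR.items()}

ratio_2_vs_1 = mtbf[2] / mtbf[1]  # how many times longer Architecture 2's MTBF is
gap_2_vs_3 = mtbf[2] - mtbf[3]    # MTBF advantage over Architecture 3, in hours

print(f"MTBF ratio, Architecture 2 vs 1: {ratio_2_vs_1:.2f}")
print(f"MTBF gap, Architecture 2 vs 3: {gap_2_vs_3:,.0f} hours")
```

The ratio is also simply the ratio of the two AFR values (0.0167 / 0.005), since MTBF is inversely proportional to AFR.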

Finally, when weighing the advantages of one solution over another, we must also take other parameters into account, such as:

  • Storage capacity requirements

  • Performance

  • Cost

  • Maintainability (indexed by the MTTR: mean time to repair)

  • Availability (which depends on the MTBF and MTTR)

  • Serviceability

  • Ease of deployment

  • Support

The last point, support, is a critical consideration, because quick troubleshooting and prompt part replacement are what prevent a second failure. One factor that is not obvious in the calculations is that, although Architecture 2 appears to offer more redundancy because of the dual path from server to disks, it requires additional software. That software adds a layer of complexity that might be undesirable (possibly lowering the ease of deployment and serviceability, while increasing costs).

Finally, it is worth noting that any storage area network (SAN) implementation must be carefully planned and analyzed before deployment. In addition, a simple SAN design is often preferable because it is easier to support (troubleshooting and problem resolution). However, no one parameter should be favored over the others without knowing the consequences; every aspect of the architecture decision must be considered. This is the only way to increase the reliability of a storage architecture.
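The list above notes that availability depends on both MTBF and MTTR. A minimal sketch of that dependency follows, using the common steady-state relation availability = MTBF / (MTBF + MTTR). This relation and the MTTR figure are assumptions for illustration; neither is taken from the case study.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability: fraction of time the system is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# System MTBF values from the case study.
SYSTEM_MTBF_HOURS = {
    "Architecture 1": 524_551,
    "Architecture 2": 1_752_000,
    "Architecture 3": 1_412_903,
}

# Hypothetical repair time, chosen only for illustration.
HYPOTHETICAL_MTTR_HOURS = 8.0

for name, mtbf_hours in SYSTEM_MTBF_HOURS.items():
    a = availability(mtbf_hours, HYPOTHETICAL_MTTR_HOURS)
    print(f"{name}: availability = {a:.6%}")
```

For a fixed MTBF, a shorter MTTR raises availability, which is one way to quantify the value of prompt support.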
