
Jiro Storage Networks

In this sample chapter you'll get a taste of the complexities of data storage and the variety of software and devices it involves. You'll also learn about the wide variety of connection protocols and configurations that go into assembling effective enterprise-class storage solutions.

Perhaps the most difficult part of writing this book was deciding how much information to include about storage networks and storage techniques in general. On the one hand, the Federated Management Architecture (FMA) and Jiro can be applied to virtually any management solution. On the other hand, FMA was originally built with a direct focus on storage, so many of its architectural decisions are easier to justify if that storage focus is clear from the start.

The basis for much of the content of this book is the concept of a storage network. The partitioning of storage data, management, and operations from an overall production network into a dedicated storage network is a relatively new trend and a quickly evolving field of study. There are many different reasons to separate storage traffic from a production network:

  • You avoid having users overwhelm a network and cut off storage traffic, or vice versa.

  • You allow the storage network to be optimized for particular quality of service (QoS) attributes that may differ from quality of service parameters required in a production network.

  • You prevent confusion between storage management and network management, two tasks that have largely different concerns and needs.

  • You allow the storage network to use a network protocol optimized for storage access and data movement.

There are more reasons to maintain a split between a production network and a storage network, and far more detail can be found in any of the many books available on storage networks. In addition to storage networks, FMA and Jiro must be able to manage storage that is available on a production network in two other forms:

  • Direct attach storage, which is attached directly to a host's bus. A typical example of this is the hard drive in your personal computer or server.

  • Network attach storage (NAS), which is a class of systems that provide file services to host computers. A host system that uses NAS uses a file system device driver to access data using file access protocols such as the Network File System (NFS) or the Common Internet File System (CIFS). NAS systems interpret these commands and perform the internal file and device I/O operations necessary to execute them.

When managing storage as a whole, people tend to think first of the hardware that storage management requires: routers, switches, disk devices, tape devices, and more. What people sometimes forget is the wide variety of software that goes into day-to-day storage management. Storage management of any kind could not be achieved without the software. Software components for managing storage include the following:

  • Device drivers: layers of code on hosts that translate operating system requests to device requests.

  • Management console: software that allows monitoring of particular resources.

  • Backup management tools: policy-based tools for scheduling and maintaining backups and archives of live data.

  • Volume and file manager: tools that allow hosts to access data in hierarchical formats using custom file systems with adequate security.

As an enterprise or medium-sized business grows, more storage is required. Furthermore, as businesses become distributed or embrace the Web, the amount of time that storage must remain online increases. For many businesses, it is essential for storage to remain online 24 × 7 × 365. The availability requirement alone is a primary driver for storage networks. It is difficult to replace a hard drive that is directly attached to a host without bringing the host down during installation.

According to research by IDC, production storage between 1999 and 2004 is on course to grow by 10,000 petabytes—that is, 10,000,000,000,000,000,000 bytes of information. Accompanying this increase in storage will be an increase in storage management costs, and all this is coupled with a tight worker market. This combination spells trouble for end users. Storage administrators and companies with storage issues will attempt to solve problems in a variety of ways:

  • Flexibility. The main goal of flexibility is to predict future storage network requirements early to decrease the impact and maintenance when growth is needed. An example is to outsource a large part of the storage networking needs to a company that specializes in this storage network, such as a storage service provider (SSP). The biggest single issue with an SSP is trust: Does your company trust your data to be sent offsite to another company? There are other ways to increase storage flexibility, including redesigning the existing storage network in a modular and expandable fashion.

  • Time balancing. Who is impacted, and what is the company tolerance for using and paying for time? For example, acknowledging that you cannot afford to reengineer the storage network or hire additional resources means that you will impact your employees and customers due to maintenance time as your storage needs increase. Furthermore, the company will not be able to take advantage of new storage opportunities that could create more efficient use of time. The company could also choose to substantially increase the amount of time spent dealing with storage networking. This approach acknowledges the value of the employee and customer information, but if the company lacks the ability to be flexible, time invested in the network will increase linearly (or exponentially) with the amount of storage added.

  • Resources. Adding administrators to address storage networking needs increases the total cost of ownership (TCO) but does not necessarily increase the efficiency of the storage network. Resources can be acquired in the form of onsite storage networking consultants who are dedicated to the maintenance of your systems. To some degree, the issue of trust is relieved with this option, although it does require higher capital expenditure.

No matter how a business chooses to address its ever-increasing storage needs—probably through a combination of these approaches—there is another variable that can aid in creating an effective storage management plan: storage management software. The simple facts are that stored information is increasing exponentially, and it is unlikely that the number of storage management professionals will increase exponentially during the same time. The only answer to this dilemma is to create effective storage management tools that allow storage management experts, whether they are onsite or hired hands, to more effectively manage increased storage without increasing the number of experts or their training time.

A tool that proactively monitors your storage network and asks for help only when necessary is sometimes referred to as the Holy Grail of storage management. In many cases, this level of management can be achieved if you are willing to build storage networks with products from a single vendor. By choosing a single vendor solution, however, you are tied into its pricing and support mechanisms, forcing you to trust a single vendor with your data and your budget.

The truth is that the storage industry suffers from commodity pricing. By allowing businesses to choose a quality of service level and a corresponding price point for the quality of service, the industry enables businesses to grow their network without bounds and based on their own constraints of budgeting vs. QoS. The problem today with heterogeneous storage networks is that each vendor of a component within the storage network often uses its own management techniques.

From the point of view of the storage manager, we are back to the first problem: increasing the amount of storage increases the number of storage management issues that must be dealt with. For example, by purchasing two fibre channel switches from two different companies, you require your storage management experts to understand two management consoles.

The Federated Management Architecture from Sun is meant to bring heterogeneous environments back down to a single point of control. Furthermore, the architecture dictates policy-based solutions that can grow unbounded with a storage management network.
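One way to picture a single point of control over heterogeneous devices is a common interface with one adapter per vendor. The sketch below only illustrates that idea: every class and function name is hypothetical, the two "vendors" are invented, and none of it is the FMA or Jiro API.

```python
from abc import ABC, abstractmethod


class ManagedSwitch(ABC):
    """Common view of a switch, whatever vendor built it (hypothetical)."""

    @abstractmethod
    def status(self):
        """Return 'ok' or 'fault'."""


class VendorASwitch(ManagedSwitch):
    """Adapter for an invented vendor that reports an up/down flag."""

    def __init__(self, is_up):
        self.is_up = is_up

    def status(self):
        return "ok" if self.is_up else "fault"


class VendorBSwitch(ManagedSwitch):
    """Adapter for an invented vendor that reports an error count."""

    def __init__(self, error_count):
        self.error_count = error_count

    def status(self):
        return "ok" if self.error_count == 0 else "fault"


def faulty_devices(devices):
    """One query across every vendor's device: the single point of control."""
    return [name for name, device in devices.items()
            if device.status() == "fault"]


fabric = {
    "switch-a": VendorASwitch(is_up=True),
    "switch-b": VendorBSwitch(error_count=3),
}
needs_attention = faulty_devices(fabric)
```

The administrator's code calls only `status()`; the vendor-specific details stay inside each adapter, so adding a third vendor means adding one more adapter rather than learning one more console.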

This chapter discusses the nuts and bolts of data centers, including management techniques and protocols as well as the hardware and software involved in a storage solution. After discussing storage and storage management, we explain how FMA and Jiro fit into the storage management picture.

The important thing to take from this chapter is not necessarily an understanding of heterogeneous storage networks vs. homogeneous storage networks, or one type of hardware vs. another type of hardware. The essential information is simply that all these types of hardware and software exist. They all must be managed, no matter who is doing the management for you. Your goal should be to try to understand how a device ends up being managed by software, and how software itself also requires management from a policy-based solution.

2.1 Storage Hardware

Beyond host computer systems, there are two primary categories of hardware to consider. In general, there are the physical devices that store data and the network support that helps move the data to and from the correct locations. Both categories contain many different kinds of devices. A few of the devices in each category are profiled here.

Each type of device and configuration has trade-offs. For example, the managed fibre channel switch profiled later seems like a perfect device for network management. The drawbacks of a switch versus an average low-cost hub are that switches involve propagation delay and tend to be expensive.

On the other hand, low-price hubs give no indication of trouble in a network, can be difficult to manage, and share bandwidth between all attached devices (switches can allocate all bandwidth to multiple zones). These limitations have a direct impact on the ability of a storage administrator and storage management software to detect problems in the storage network.

Again, you should devote thought to each storage network before spending the company's budget. Even within a single data center, a wide variety of hardware devices can be employed to fit the characteristics and QoS of a particular department or area.

2.1.1 Disk Devices

If you are coming from a PC-centric background, when you think of storage, you think of the drives that are attached to the bus in your system. This isn't far from the truth of implementation for many large installations. Host file servers often contain direct attach storage, which is physically contained within a host. The host then shares these disks through a network file protocol such as NFS or CIFS. To expand storage, the system administrator brings down the host, adds a drive to the server tower, configures it, and shares it.

In large data centers, storage is more partitioned than in the physical containment model used in hosts. There are many reasons for this partitioning. One is that mainframes have traditionally been very good at separating storage from the systems. Another reason is simply that large data centers have encountered problems with the old model and have already started to partition into storage networks as their solution. Physical drives fit into rack-mounted cabinets that are 19 inches wide and of variable height depending on the contents of the rack-mounted equipment.

Redundant arrays of independent disks (RAID) hardware enables high-performance data retrieval and high availability of data through the use of multiple disks. Basically, to enable high performance, data is spread over multiple disks to allow parallel reads and writes to the disks. By having more disk arms moving, you relieve a major performance bottleneck: the disk arm. To enable high availability, data is striped across disks, and then parity bits are used to enable recovery of lost data. In the basic RAID levels, parity is used to enable recovery of one lost disk in the disk array. So if four disks are being used and one crashes, the crashed disk can be replaced and its data rebuilt from the parity bits and the contents of the surviving disks.
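The single-disk recovery described above can be sketched with XOR parity. This is a minimal illustration, assuming byte-wise XOR parity over equal-sized blocks (the chapter does not say how parity is computed); the function names `xor_blocks` and `rebuild_missing` are hypothetical and not taken from any RAID product.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Rebuild the one lost data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data disks plus one parity disk: the four-disk case from the text.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild its block.
survivors = [data[0], data[2]]
recovered = rebuild_missing(survivors, parity)
```

Because XOR is its own inverse, XOR-ing the parity block with every surviving data block leaves exactly the missing block. The same property explains why this scheme tolerates only one lost disk: with two blocks missing, the XOR yields their combination rather than either one.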

RAID levels 0 to 5 give different levels of redundancy or performance. Advanced RAID techniques combine the RAID levels to try to give both performance and high availability. The basic RAID levels are:

  • Level 0: striping

  • Level 1: mirrors

  • Level 3: dedicated parity disk

  • Level 4: parallel access with parity disk

  • Level 5: parallel access with distributed parity

Combining some of the RAID levels makes implementations more expensive (in terms of hardware and possibly performance), but it creates benefits that combine the best of both techniques. For example, RAID level 0 combined with level 1 can give fast read and write access as well as good data redundancy.
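As an illustration of combining levels 0 and 1, the sketch below stripes logical blocks across mirrored pairs of in-memory "disks". It assumes one particular arrangement (a stripe of mirrors) and a one-block stripe unit; real arrays differ, and the class name `StripedMirrors` and its methods are hypothetical.

```python
class StripedMirrors:
    """Toy model: logical blocks are striped across mirrored disk pairs."""

    def __init__(self, pairs):
        if pairs < 1:
            raise ValueError("need at least one mirrored pair")
        # Each pair holds two dicts, one per disk, mapping offset -> data.
        self.pairs = [[{}, {}] for _ in range(pairs)]
        self.failed = set()  # (pair index, disk index) of failed disks

    def locate(self, block):
        """Striping: map a logical block to (pair index, offset in pair)."""
        return block % len(self.pairs), block // len(self.pairs)

    def write(self, block, data):
        pair, offset = self.locate(block)
        # Mirroring: the write goes to each healthy disk of the pair.
        for disk in (0, 1):
            if (pair, disk) not in self.failed:
                self.pairs[pair][disk][offset] = data

    def read(self, block):
        pair, offset = self.locate(block)
        # Read from the first healthy disk in the pair.
        for disk in (0, 1):
            if (pair, disk) not in self.failed:
                return self.pairs[pair][disk][offset]
        raise IOError("both disks in the pair have failed")

    def fail(self, pair, disk):
        """Simulate a disk crash by discarding that disk's contents."""
        self.failed.add((pair, disk))
        self.pairs[pair][disk].clear()


array = StripedMirrors(pairs=2)
for n in range(4):
    array.write(n, f"block-{n}")
array.fail(pair=0, disk=0)  # lose one disk; its mirror still has the data
```

The hardware cost mentioned above shows up directly in the model: four disks hold only two disks' worth of data, in exchange for striped access and the ability to keep reading after a disk is lost.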

RAID devices are put in the hardware section, but the location of the RAID implementation varies widely. RAID can be implemented in three places:

  • Onboard a physical disk array

  • In a controller card residing in a server system

  • In software, such as a logical volume manager

Where you implement RAID capabilities affects both the cost and the effectiveness of the implementation. For example, using software RAID implementations may be inexpensive, but it creates a burden on the host that implements the RAID capabilities. The software is burdened with manipulating the distribution of data across physical devices. This robs memory and valuable processor cycles from the file-serving processes. The result is that increased traffic to the host increases the demands on the file-sharing software as well as on the software RAID controller, a double hit to the server at a time when you would prefer to lighten the load on the processor to aid in processing requests. To relieve the host, RAID implementation can be moved to controller cards or onto the disk arrays themselves. Typically, this locks the RAID implementation into a single vendor, but it can create a very effective implementation. The decision of where to implement RAID in a storage network is an important one.

Just a bunch of disks, more widely known as JBOD devices, are low-cost devices that contain . . . a bunch of disks. There are many different ways to configure the disks. Typically, the JBOD is in a rack enclosure, and you hot-swap drives in and out of the JBOD. Whereas a literal RAID device has the RAID capabilities onboard the device, if you want to use some or all of the disks available in the JBOD for RAID configurations, they must be controlled by software or an external RAID controller.

Network attach storage on the low end fits into the category of disk devices. Devices fit into several price groups. On the high end of NAS price points, NAS involves a rack-mounted system that attaches to an IP network. The high-end device typically contains one or more disk drives that can be configured in various RAID configurations. In the low-end price range, you will likely find software-based RAID, limited management capabilities, and very limited backup capabilities. Furthermore, on the low end, stand-alone devices are available that can sit on desktops or even in the home. Onboard any NAS device is what could be termed a specialized operating system that is optimized for file serving. In this operating system many of the general functions of the kernel and operating system are removed, such as any graphics capabilities, extraneous port handling drivers (for USB or parallel devices), and other optimizations that can be found for the specific device. The file system, volume management, and security are all built into the operating system and services that are hosted on the NAS device. Plug in the NAS, and you have instant space available via CIFS or NFS attachable directories.

Higher-priced NAS devices contain a huge amount of functionality. They contain everything from built-in tape libraries for archiving and backup to custom file systems built for network sharing of data.

2.1.2 Tape Devices

There are essentially three types of tape storage enclosures that systems can use:

  • Single tape drive. Targeted at user data backup, single tape drives often exist on servers or single-user computers that contain important data.

  • Tape autoloader. This device loads tapes automatically and contains a single read/write head. This is really a degenerate case of a tape library (discussed next).

  • Tape library. Much larger than a tape autoloader, this device often contains multiple read/write heads.

For management purposes, the physical devices are important, but much of the data management will be accomplished through backup/archive manager or hierarchical storage manager (HSM) software, both of them covered later in this chapter.

2.1.3 Storage Networking Hardware

A variety of devices make up the category of what can be considered storage networking hardware. Later in this chapter we talk more about what it means to create a storage network, but the devices that fall into this category are similar to traditional networking hardware. Hubs, routers, and switches are combined to make up a network infrastructure. Each device has different capabilities as far as network management is concerned, and each is used in a different way.

  • Hubs. These devices provide a low-cost, easily installable way to expand a storage network. Hubs have two major drawbacks. One is that they tend to be less "manageable" than switches. The second is that bandwidth is shared between all the devices on the hub. A switch has the ability to partition devices and maintain full bandwidth to each partition of devices, even in a degenerate configuration in which each attached device is in its own zone. In this degenerate case, each attached device has full bandwidth. This configuration is not possible with hubs.

  • Switches. Like hubs, switches allow network expansion. The difference is that switches have more management capabilities, more configuration options, and typically some ability to debug and maintain performance in a fibre channel network. The switch forms the center point of what is known as a fabric. The switch can route data between ports of any two devices that are connected to the fabric. You can also create logical partitions of the fabric, known as zones, which give full throughput to all logical partitions. Finally, a switch is often able to detect a misbehaving component and eliminate it from the fabric without impacting the remaining devices. The downside of switches is that they tend to be much more expensive than hubs and can introduce a small amount of propagation delay. Expensive hubs and inexpensive switches can overlap in capabilities. Furthermore, in the future it is likely that low-end hubs will actually become low-end switches as components used in switches hit lower and lower price points.

  • Routers. Used for routing network traffic, routers let you add a variety of features to make them an integral part of a storage network. For example, some routers can convert fibre channel protocol traffic to parallel SCSI traffic, allowing you to attach legacy SCSI devices, such as tape libraries, to a fibre channel network.
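The bandwidth difference between hubs and switches described in the list above can be put into numbers. The sketch assumes an idealized model in which a hub divides its link bandwidth evenly among attached devices and a switch gives every zone the full link bandwidth, shared evenly inside the zone; the 100 MB/s figure and the function names are illustrative assumptions, not vendor specifications.

```python
def hub_bandwidth_per_device(link_mb_s, devices):
    """A hub shares one link's bandwidth among all attached devices."""
    if devices < 1:
        raise ValueError("need at least one device")
    return link_mb_s / devices


def switch_bandwidth_per_device(link_mb_s, zones):
    """A switch gives each zone full bandwidth; zones is a list of zone sizes.

    Returns the per-device bandwidth for each zone, assuming the devices in
    a zone share that zone's bandwidth evenly.
    """
    if any(size < 1 for size in zones):
        raise ValueError("every zone needs at least one device")
    return [link_mb_s / size for size in zones]


LINK_MB_S = 100.0  # assumed link speed, for illustration only

# Four devices on a hub versus the same four devices on a zoned switch.
hub = hub_bandwidth_per_device(LINK_MB_S, devices=4)
zoned = switch_bandwidth_per_device(LINK_MB_S, zones=[2, 2])
one_zone_each = switch_bandwidth_per_device(LINK_MB_S, zones=[1, 1, 1, 1])
```

Under these assumptions, `one_zone_each` is the degenerate configuration from the text: every device sits in its own zone and sees the full link bandwidth, which the hub cannot offer however it is configured.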

In some cases, switches and hubs can be used interchangeably. Switches are more manageable than hubs but incur some propagation delay depending on their zoning options. On the other hand, a switch will remove a misbehaving device from a storage network automatically and will often signal the administrator in multiple ways, perhaps through a nice red LED.

In addition to the devices that form a network infrastructure, controller cards attach devices to the physical network. These cards are sometimes termed host bus adapters, or HBAs, and each is similar to a network interface card (NIC). If you have multiple HBAs installed on a host, one HBA can fail while a storage network connection continues to be available.
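The redundancy that multiple HBAs provide can be sketched as a simple path-failover loop. This is an illustrative model only: the `Hba` class, its `send` method, and the in-order retry policy are assumptions, not the behavior of any particular driver or multipathing product.

```python
class PathFailed(Exception):
    """Raised when an HBA cannot reach the storage network."""


class Hba:
    """Hypothetical stand-in for one host bus adapter."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def send(self, request):
        if not self.healthy:
            raise PathFailed(self.name)
        return f"{request} via {self.name}"


def send_with_failover(hbas, request):
    """Try each HBA in order and return the first successful response."""
    for hba in hbas:
        try:
            return hba.send(request)
        except PathFailed:
            continue  # this path is down; try the next adapter
    raise PathFailed("no working HBA")


# The first adapter has failed, so the request travels over the second.
hbas = [Hba("hba0", healthy=False), Hba("hba1")]
result = send_with_failover(hbas, "READ block 7")
```

With a single HBA the loop has nothing to fall back on, which is the situation the text warns about: one failed card cuts the host off from the storage network.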

The hubs, switches, and routers discussed in this section come in two forms: one for fibre channel networks and one for IP networks. A quickly advancing standard known as SCSI over IP moves the most popular storage protocol, SCSI, to an IP network. With the advent of SCSI over IP, similar management tools and hardware can be used to manage both the client network and the storage network. Increasing the capabilities of the management tools for these networks and creating one set of hardware for a complete network (storage and production) will lower the total cost of ownership for storage networks.
