THE "V"WORD
The key to capacity allocation efficiency in SANs was "volume virtualization," a technique for creating large and scalable "virtual disks" or "array partitions" from multiple individual disk drives. In the late 1990s (and today), the creation of virtual volumes was mainly a function of disk array controllers.
Put simply, multiple individual disk drives within an array are aggregated into "virtual volumes" by means of firmware embedded on the disk array controller. This virtualization function was traditionally performed by the array controller for much the same reason that external software for building Redundant Arrays of Inexpensive (changed later to "Independent") Disks (RAID) found its way onto controllers: performance.
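To make the mapping concrete, the following is a minimal sketch, in Python, of how a controller might concatenate several physical drives into one larger virtual volume. The disk names, sizes, and simple concatenation scheme are illustrative assumptions only, not the logic of any vendor's actual firmware.

    # Hypothetical sketch: concatenating physical disks into a "virtual volume"
    # by mapping a virtual logical block address (LBA) to a (disk, physical LBA) pair.

    class PhysicalDisk:
        def __init__(self, name, blocks):
            self.name = name        # e.g., "disk0"
            self.blocks = blocks    # capacity in blocks

    def map_virtual_lba(disks, virtual_lba):
        """Translate a virtual-volume LBA to (disk name, physical LBA) by concatenation."""
        offset = virtual_lba
        for disk in disks:
            if offset < disk.blocks:
                return disk.name, offset
            offset -= disk.blocks
        raise ValueError("virtual LBA is beyond the end of the virtual volume")

    disks = [PhysicalDisk("disk0", 1000), PhysicalDisk("disk1", 1000), PhysicalDisk("disk2", 1000)]
    print(map_virtual_lba(disks, 2500))   # ('disk2', 500): third disk, block 500

In an array, this translation runs in controller firmware; the write penalty described next arises when the same lookup has to be performed by external software on every block written.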
Like external software-based RAID, external software-based virtualization tended to incur a "write penalty" that became obvious during tape transfers or other large-scale data movements targeted at the disk drives in the array. The cause was simple: For each block of data directed to the virtual volume, the external software that created the volume had to be interrogated to determine the appropriate physical disk target for the write operation. Write commands amassed (or queued) quickly because of the inefficiency of this process, and latency accrued; in the virtualization world, as in the RAID world, this came to be called a form of the "write penalty."
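The arithmetic behind the penalty is easy to sketch. The per-block costs below are assumed values chosen only to show how an extra mapping lookup compounds over a large transfer; they are not measured figures from any product.

    # Illustrative only: assumed per-block costs for a bulk transfer to a virtual volume.
    LOOKUP_MS = 0.5    # assumed cost of interrogating the external mapping software per block
    WRITE_MS  = 1.0    # assumed cost of the physical write itself per block

    def transfer_latency(num_blocks, mapping_in_firmware):
        # Mapping done in controller firmware/silicon is treated here as effectively free.
        per_block = WRITE_MS + (0.0 if mapping_in_firmware else LOOKUP_MS)
        return num_blocks * per_block

    blocks = 100_000   # e.g., a large tape restore
    print(transfer_latency(blocks, mapping_in_firmware=True))    # 100000.0 ms
    print(transfer_latency(blocks, mapping_in_firmware=False))   # 150000.0 ms: the write penalty

Under these assumed costs, the external lookup adds fifty seconds of pure mapping latency to a 100,000-block transfer, which is why the penalty shows up during tape jobs rather than routine I/O.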
Years earlier, in the case of RAID, engineers sought to surmount the write penalty by placing their RAID code into fast-executing silicon wedded directly to the array controller. Combined with often-complex memory caches to queue commands and data, this approach finessed the write penalty through sheer engineering muscle.
In an effort to build the best RAID, vendors of high-end arrays made considerable investments in their proprietary controller designs, not knowing that they would later be creating a central obstacle to the realization of the SAN vision. With the advent of SANs and the desire to create aggregated volumes from disks, virtual disks, and/or partitions implemented on different arrays by controllers from different manufacturers, the obstacle finally presented itself. The proprietary differences in RAID array controllers contributed to the many problems associated with the formation of SAN-based virtual volumes from heterogeneous arrays.
Early on, it was nearly impossible to create a virtual volume using disks or virtual disks/partitions, identified by Logical Unit Numbers (LUNs), from the arrays of different manufacturers. The controllers of different vendors simply did not allow mix-and-match operation, and in a few cases, vendors used "warranty locks" to prevent such configurations. That is, an IT manager would void the warranty on vendor A's hardware by including it in a SAN with vendor B's hardware, or by using the disk components of A in concert with B to create a virtual volume.
The challenge of array heterogeneity persists today. Until viable third-party virtualization software engines appear, the only way that true volume virtualization can be accomplished in an FC SAN is to purchase all the same brand of hardware from a single vendor (or vendor cadre). In other words, the storage devices connected to the SAN must be homogeneous. For most companies, accommodating this requirement would mean ripping out a lot of existing arrays and replacing them with the products of a single vendor, an option that fails to enthrall many IT managers.
Even in cases where IT decision-makers elected to "go homogeneous," the promise of automatic capacity allocation efficiency was rarely delivered. Most virtualization schemes enabled only the aggregation of LUNs and did not permit "LUN carving and splicing." In other words, you could not take the unused capacity in one LUN (carve) and simply bolt it to another LUN (splice). The technology didn't exist. (See Chapter 7 for more information about the changing capabilities of SAN virtualization technology.)
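The distinction can be shown with a small, hypothetical example; the LUN names, sizes, and utilization figures below are invented for illustration.

    # Hypothetical pair of LUNs on a homogeneous array.
    luns = {"LUN0": {"size_gb": 200, "used_gb": 180},
            "LUN1": {"size_gb": 200, "used_gb": 40}}

    # Supported by early schemes: aggregation simply concatenates whole LUNs,
    # stranded free space and all.
    aggregate_size = sum(l["size_gb"] for l in luns.values())        # 400 GB virtual volume

    # Not supported: "carving" the 160 GB of free space out of LUN1 and "splicing"
    # it onto LUN0 would require re-mapping blocks beneath data already in place.
    free_in_lun1 = luns["LUN1"]["size_gb"] - luns["LUN1"]["used_gb"]  # 160 GB stranded
    print(aggregate_size, free_in_lun1)

Aggregation alone left the free space inside each LUN where it was; it could be counted but not redeployed to the volumes that actually needed it.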
By 2002, volume virtualization came to mean techniques for aggregating LUNs that had been defined at the time that homogeneous arrays were first deployed. Unless LUNs were defined at the level of the individual physical disk drive, LUN aggregation was never capable of providing the fine levels of granularity that real capacity allocation efficiency required.
Today, the state of capacity allocation efficiency, according to Fred Moore, a respected analyst and CEO of Horison Information Strategies, is abysmal.[1] According to Moore's calculations, storage capacity is allocated to about 60 percent of optimal efficiency in mainframe shops. In UNIX and Microsoft Windows shops, capacity allocation efficiency hovers at around 40 percent of optimal, and in Linux shops it is around 30 percent.
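Applied to a hypothetical 100 terabytes of installed capacity (an assumed figure used only to make the percentages tangible), Moore's numbers translate into the following rough amounts of poorly allocated storage.

    # Rough arithmetic behind Moore's efficiency figures, for an assumed 100 TB installation.
    purchased_tb = 100

    for platform, efficiency in [("Mainframe", 0.60), ("UNIX/Windows", 0.40), ("Linux", 0.30)]:
        stranded = purchased_tb * (1 - efficiency)
        print(f"{platform}: roughly {stranded:.0f} TB of every {purchased_tb} TB poorly allocated")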
These allocation percentages show how much storage is potentially wasted in most facilities and underscore that we have quite a way to go to achieve the efficient allocation of storage capacity promised by SAN vendors in the open systems world.
Virtualization, the "V" word, has not materialized in a way that enables the creation of the heterogeneous and dynamically scalable volumes that SANs were supposed to deliver. In the final analysis, the promise of capacity allocation efficiency, articulated by nearly all SAN vendors, has never been delivered.