This chapter is from the book

2.7 Replacing Bad Devices Automatically

ZFS can replace a failed disk in a pool automatically, without intervention by the administrator. This feature, known as autoreplace, is turned off by default. When it is enabled, ZFS replaces a bad disk with a disk from the spares pool automatically, so the pool quickly returns to full redundancy and optimum performance, and the administrator can swap out the failed drive at a more convenient time. The manual disk replacement procedure is covered later in this chapter.
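
For context, a hot spare is designated when the pool is created (or added later with zpool add). The following is a minimal sketch using the device names from this chapter's example; your controller and target numbers will differ:

```shell
# Create a pool of two mirrored disks plus a designated hot spare
# (device names follow this chapter's example; adjust for your system):
pfexec zpool create mpool mirror c5t2d0 c5t3d0 spare c5t14d0

# Enable automatic replacement from the spares pool:
pfexec zpool set autoreplace=on mpool
```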

If you list the properties of the ZFS pool, you can see that the autoreplace feature is turned off using the get subcommand.

1   # zpool get all mpool
2   NAME   PROPERTY       VALUE       SOURCE
3   mpool  size           234M        -
4   mpool  used           111K        -
5   mpool  available      234M        -
6   mpool  capacity       0%          -
7   mpool  altroot        -           default
8   mpool  health         DEGRADED    -
9   mpool  guid           11612108450022771594  -
10  mpool  version        13          default
11  mpool  bootfs         -           default
12  mpool  delegation     on          default
13  mpool  autoreplace    off         default
14  mpool  cachefile      -           default
15  mpool  failmode       wait        default
16  mpool  listsnapshots  off         default

To turn on the autoreplace feature, use the following command line:

# zpool set autoreplace=on mpool
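
You can confirm the change by querying just that property. Once the property has been set explicitly, the SOURCE column typically changes from default to local:

```shell
$ zpool get autoreplace mpool
NAME   PROPERTY     VALUE  SOURCE
mpool  autoreplace  on     local
```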

To simulate a bad device, I shut down the system and removed c5t4d0 from the system. The system now cannot contact the disk and has marked the removed disk as unavailable. With the system rebooted, you can examine the output of the zpool status command:

1   $ zpool status mpool
2   pool: mpool
3   state: DEGRADED
4   status: One or more devices could not be opened.  Sufficient replicas exist for
5   the pool to continue functioning in a degraded state.
6   action: Attach the missing device and online it using 'zpool online'.
7   see: http://www.sun.com/msg/ZFS-8000-2Q
8   scrub: resilver completed after 0h0m with 0 errors on Mon Apr  6 00:52:36 2009
9   config:
10
11  NAME           STATE     READ WRITE CKSUM
12  mpool          DEGRADED     0     0     0
13    mirror       ONLINE       0     0     0
14      c5t2d0     ONLINE       0     0     0
15      c5t3d0     ONLINE       0     0     0
16    mirror       DEGRADED     0     0     0
17      spare      DEGRADED     0     0     0
18        c5t4d0   UNAVAIL      0    89     0  cannot open
19        c5t14d0  ONLINE       0     0     0  31K resilvered
20      c5t5d0     ONLINE       0     0     0  31K resilvered
21  spares
22    c5t14d0      INUSE     currently in use
23
24  errors: No known data errors

On line 3, the state of the pool is DEGRADED, and lines 4 and 5 tell you that the pool can continue operating in this state. Lines 6 and 7 tell you what action to take, and the Web site gives a more detailed message on how to correct the problem. Line 19 shows that the spare disk c5t14d0 has been resilvered from the surviving half of the mirror, c5t5d0. Line 22 gives the new status of the spare disk c5t14d0: INUSE.
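
If you would rather not consume the spare, the action message on line 6 points to the manual alternative: reattach the original device and bring it online. A sketch, using this example's device name:

```shell
# After physically reattaching the original disk:
pfexec zpool online mpool c5t4d0

# Verify that the resilver completes and the spare returns to AVAIL:
zpool status mpool
```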

The original definition of resilver is the process of restoring a glass mirror with a new silver backing. In ZFS, it is the re-creation of data by copying from one disk to another. In other volume management systems, the same process is called resynchronization. Continuing the example, you can shut down the system and attach a new disk in the same location as the missing disk. The new disk at location c5t4d0 is automatically resilvered into the mirrored vdev, and the spare disk is returned to the available state.

$ zpool status mpool
pool: mpool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Mon Apr  6 02:21:05 2009
config:

NAME        STATE     READ WRITE CKSUM
mpool       ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c5t2d0  ONLINE       0     0     0
    c5t3d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c5t4d0  ONLINE       0     0     0  23K resilvered
    c5t5d0  ONLINE       0     0     0  23K resilvered
spares
  c5t14d0   AVAIL

errors: No known data errors

If the original disk is reattached to the system, ZFS does not handle this case with the same grace. Once the system is booted, the original disk must be detached from the ZFS pool. Next, the spare disk (c5t14d0) must be replaced with the original disk. The last step is to place the spare disk back into the spares group.

$ pfexec zpool detach mpool c5t4d0
$ pfexec zpool replace mpool c5t14d0 c5t4d0
$ pfexec zpool add mpool spare c5t14d0
$ zpool status mpool
pool: mpool
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Mon Apr  6 02:50:25 2009
config:

NAME        STATE     READ WRITE CKSUM
mpool       ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c5t2d0  ONLINE       0     0     0
    c5t3d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c5t4d0  ONLINE       0     0     0  40.5K resilvered
    c5t5d0  ONLINE       0     0     0  40.5K resilvered
spares
  c5t14d0   AVAIL

errors: No known data errors