High Performance NT4 Optimization and Tuning: Implementing Redundant Systems

Understanding A Stripe Set With Parity

A stripe set with parity consists of an array of 3 through 32 physical disk drives combined into a single virtual volume. The virtual volume is segmented into discrete components called blocks. Each block is 64K in size. These blocks are combined to form another unit called a stripe. Figure 9.4 illustrates an example of a stripe set with parity.

Figure 9.4 An example of a stripe set with parity.

You cannot install Windows NT Server on a stripe set, so this example shows two volumes. The first volume (C: drive) contains the operating system. The second volume (D: drive) is the stripe set with parity and consists of three disks combined into a single virtual volume. Each stripe contains one parity block alongside its data blocks. The parity block is the exclusive OR (XOR) of the data blocks in that stripe. What this means to you is that, in the event of a single disk failure, the parity block can be combined with the remaining data blocks to recover the missing data block (the block in the stripe on the failed drive). This enables you to continue providing services to your clients until you can shut the server down for repairs. After replacing the failed drive, you can regenerate the missing information on the new disk by using the same method (combining the data blocks with the parity block).
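To make the XOR recovery concrete, here is a minimal Python sketch of one stripe on a three-disk set. The helper name xor_blocks and the tiny 8-byte blocks are my own illustration (real blocks are 64K), and this is not how Windows NT itself is implemented; it only demonstrates the arithmetic.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe on a three-disk set: two data blocks plus one parity block.
data_a = b"NT4 data"   # block stored on disk 1
data_b = b"on disk2"   # block stored on disk 2
parity = xor_blocks(data_a, data_b)  # block stored on disk 3

# Simulate losing disk 2: XOR the surviving data block with the parity block.
recovered = xor_blocks(data_a, parity)
print(recovered == data_b)
```

The same call works no matter which disk fails, because XOR-ing all but one of the blocks in a stripe always yields the missing one.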

If desired, you can mirror the boot volume (C: drive) for improved fault tolerance. This could be helpful in keeping your server up and running at a minimal cost in a heavily taxed environment.

Should a single disk failure like this occur in reality, however, I suggest you immediately replace the failed drive unless keeping the server up and running is critical to your well-being. Even then, you should avoid putting any new data on the stripe set to minimize your recovery time. You should also plan on having at least one spare disk drive on hand to replace a failed drive. This drive should have already been tested to make sure it contains no defects. Once tested, just put it back in the box and keep it handy in case of emergencies.

One of the major advantages of a stripe set with parity is the small amount of storage used to hold the redundant information (the parity blocks). Unlike a duplexed mirror set, a stripe set with parity does not halve your storage capacity. Instead, it uses a portion of each disk to store the parity blocks needed to recover data in the event of a single disk failure. The fraction of the total storage used for parity blocks can be calculated by the following formula:

Fraction of Total Storage Used for Parity = 1 / Number of physical disks used

If you have a stripe set with parity using 3 drives, as in our example, 1/3 of the capacity would be used to store the parity information. While that is better than the 1/2 lost to a mirror set, it is not much better. A stripe set with parity consisting of 5 disks would only lose 1/5 of the total storage, while a stripe set with parity consisting of 10 disks would only lose 1/10 of the total storage. And, should you be so lucky as to have one, a stripe set with parity made up of 32 drives would only lose 1/32 of the total storage capacity. As you can see, the more drives you use, the more cost effective it becomes.
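The arithmetic above is easy to check with a few lines of Python. This is only a sketch: the function names and the 4GB disk size are assumptions of mine for illustration, and every disk in the set is assumed to be the same size.

```python
from fractions import Fraction


def parity_fraction(disks: int) -> Fraction:
    """Fraction of total capacity consumed by parity blocks."""
    if not 3 <= disks <= 32:
        raise ValueError("a stripe set with parity uses 3 through 32 disks")
    return Fraction(1, disks)


def usable_capacity(disks: int, disk_size_gb: float) -> float:
    """Capacity left for data, assuming identically sized disks."""
    parity_fraction(disks)  # reuse the range check
    return (disks - 1) * disk_size_gb


DISK_SIZE_GB = 4  # hypothetical disk size, chosen only for the example

for n in (3, 5, 10, 32):
    print(n, "disks: parity uses", parity_fraction(n),
          "of the set, usable:", usable_capacity(n, DISK_SIZE_GB), "GB")
```

Put another way, the set always gives up the equivalent of one disk, so each extra disk you add is pure usable capacity.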

Having more drives is also more efficient in terms of performance. If you are looking for the maximum possible performance, consider using several SCSI adapters with multiple disk drives. An Adaptec 2940UW, for example, can transfer data at up to 40MB/sec. If you use two Adaptec 2940UW disk controllers, place multiple drives on each controller, and stripe the resultant disks, you can (in theory) transfer data at 80MB/sec. If you use a SCSI adapter with multiple channels, such as the Adaptec 3940UW, you can obtain the same results using a single adapter.

A dual channel adapter functions much the same as if there were two separate SCSI adapters on a single card. This might be more cost effective for you than two adapters. If you want even more performance, you can use two 3940UW adapters (four channels in total) with at least one disk on each channel. By striping the resultant disks, you could reach a theoretical 160MB/sec. transfer rate, which is well above the PCI maximum transfer rate. Even though I hate to burst your bubble, the real-world situation does have to be mentioned. In reality, your data transfer results would be quite a bit less due to the overhead of the drivers, associated hardware, parity block calculations, and writing the parity blocks for the stripe.

Note: Hardware-based RAID solutions, such as the Adaptec AAA-130 UltraSCSI adapter, can transfer data as fast as the PCI bus can handle (133MB/sec.) and are likely to be a more cost-effective alternative to using multiple adapters, such as the 3940UW. A hardware-based RAID device also offloads the parity block calculations from the CPU, which can improve overall server performance.
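To see where the theoretical numbers top out, the following Python sketch adds up the per-channel rates and caps the total at the PCI figure. It uses only the peak rates quoted above (40MB/sec. per channel, 133MB/sec. for PCI), assumes every adapter shares a single PCI bus, and ignores all of the real-world overhead just described, so treat it as an upper bound rather than a prediction. The function name is my own.

```python
PCI_BUS_LIMIT_MB_S = 133      # PCI maximum quoted in the note above
ULTRA_WIDE_CHANNEL_MB_S = 40  # per-channel peak quoted for the 2940UW


def theoretical_rate(channels: int,
                     per_channel: int = ULTRA_WIDE_CHANNEL_MB_S,
                     bus_limit: int = PCI_BUS_LIMIT_MB_S) -> int:
    """Peak MB/sec. across SCSI channels, capped by one shared PCI bus (assumed)."""
    if channels < 1:
        raise ValueError("need at least one SCSI channel")
    return min(channels * per_channel, bus_limit)


# 1 channel = one 2940UW, 2 = one 3940UW, 4 = two 3940UWs
for channels in (1, 2, 4):
    raw = channels * ULTRA_WIDE_CHANNEL_MB_S
    print(channels, "channel(s): SCSI total", raw,
          "MB/sec., bus-limited", theoretical_rate(channels), "MB/sec.")
```

With four channels the SCSI side could in theory supply 160MB/sec., but the bus cap holds the result at 133MB/sec.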

While I have mentioned most of the benefits of a stripe set with parity, I have not discussed the disadvantages. So, it’s time to consider a few of these. First of all, there is the loss of storage capacity, although I consider this a minor issue if you use at least five disks for the set. The major disadvantage is that a stripe set with parity is a poor performer for applications that require high write bandwidth. These are usually applications that collect data on a frequent (realtime) basis or that use a transaction-based model. For example, an SQL Server database with a high number of transactions would not perform as well as it could due to the overhead involved in calculating and writing the parity blocks. A better choice for maximum throughput would be to use a stripe set without parity, although it provides no fault tolerance.
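To illustrate where the write overhead comes from, here is a Python sketch of a common way parity arrays handle a small write: read the old data block and the old parity block, XOR both with the new data, then write the new data and the new parity. I am not claiming this is exactly what the Windows NT driver does; the function names and the I/O count are assumptions for illustration.

```python
from typing import List, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(stripe: List[bytes], parity: bytes,
                index: int, new_data: bytes) -> Tuple[bytes, int]:
    """Update one data block in a stripe; return (new_parity, disk_io_count)."""
    old_data = stripe[index]   # disk read 1: the old data block
    old_parity = parity        # disk read 2: the old parity block
    # Remove the old data from the parity, then fold in the new data.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[index] = new_data   # disk write 1: the new data block
    return new_parity, 4       # disk write 2: the new parity block


stripe = [b"\x01\x02", b"\x10\x20"]   # two data blocks (three-disk set)
parity = xor_bytes(stripe[0], stripe[1])
parity, io_count = small_write(stripe, parity, 0, b"\xff\x00")

# The updated parity matches a full recalculation from the data blocks.
print(parity == xor_bytes(stripe[0], stripe[1]), io_count)
```

Under these assumptions, one logical write costs four disk operations plus the XOR work, where a stripe set without parity would need a single write.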

Now that you understand how the redundant disk systems function, as well as some of the pros and cons, you might be interested in how to actually create them. The next section shows you how to use the Windows NT Disk Administrator to create both mirror sets and stripe sets with parity. So, let’s get started.
