High Performance NT4 Optimization and Tuning: Implementing Redundant Systems

Chapter 9
Implementing Redundant Systems

• Understanding Mirror Sets, Duplexed Mirror Sets, And Striped Sets With Parity

• Creating A Redundant Or Fault Tolerant Disk System

• Recovering A Redundant Or Fault Tolerant Disk System

As the system administrator of a network, small or large, you should have a plan to cope with the problems you might encounter. You can’t prepare for every contingency, but you can plan for worst-case scenarios. The worst thing that can happen is a complete system failure, one so bad that you have to rebuild your server from the ground up. That means a brand new computer, not one cobbled together from spare parts you have lying around. If you do have to rebuild your server, you could be offline for hours, days, or even weeks. If this is your only server, your entire network might collapse, which can place you in a very bad position with your employers.

So, at the very least, you should have two servers: one primary domain controller and one backup domain controller. If your primary domain controller fails, you can promote the backup domain controller to primary domain controller using Server Manager, and your network will continue to operate. Even your computer and user account databases will be maintained. The data stored on the failed primary controller, however, will be lost, and the only way to recover it is from your last tape backup. So, I hope you have a backup plan, one that includes full monthly backups, incremental weekly backups, and daily backups performed each night.

TIP: The daily backup method in NTBACKUP.EXE (the Windows NT Backup program) backs up only files that were modified on the day the backup runs. If you use the daily backup option, you must schedule your backup to start and finish before midnight. Otherwise, the date changes, and the backup will not include the files modified during the day you meant to back up.
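The effect of the midnight boundary can be illustrated with a small sketch. This is only a model of the selection rule described in the tip, not the actual NTBACKUP.EXE implementation; the function name `daily_backup_selection`, the file names, and the dates are all hypothetical, and the model treats the backup as happening at a single instant.

```python
from datetime import datetime

def daily_backup_selection(files, run_at):
    """Model of a 'daily' backup: keep only files whose modification
    date matches the calendar date on which the backup runs.
    (Hypothetical helper, not part of NTBACKUP.EXE.)"""
    return [name for name, modified in files
            if modified.date() == run_at.date()]

# Hypothetical files and their last-modified timestamps.
files = [
    ("report.doc", datetime(1997, 3, 10, 9, 30)),
    ("ledger.xls", datetime(1997, 3, 10, 16, 45)),
    ("old.txt", datetime(1997, 3, 8, 11, 0)),
]

# Backup runs at 11:00 P.M. on the same day: the day's work is selected.
before_midnight = daily_backup_selection(files, datetime(1997, 3, 10, 23, 0))

# Backup runs at 12:30 A.M. the next day: the date has changed,
# so none of yesterday's modified files match.
after_midnight = daily_backup_selection(files, datetime(1997, 3, 11, 0, 30))

print(before_midnight)
print(after_midnight)
```

In this model, the late-evening run picks up both files changed that day, while the run that starts after midnight picks up nothing, which is why the schedule has to fit entirely inside the day being backed up.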

The worst-case scenario, of course, rarely occurs in real life. You usually have some indication that a server component is going to fail. You might notice that a cold boot takes two or more attempts before the server boots, while a warm boot always succeeds. This is an indication that your hard disk drive’s read/write head is out of alignment or that the disk drive is simply wearing out. Or, you might hear a loud whine while the disk is in operation. This is an indication that a spindle bearing is beginning to fail. If the disk drive is just a data drive, either problem can be fixed by installing a new disk drive and copying the data from the failing drive to it. If the disk is your boot drive, you’ll have to reinstall Windows NT and use your last backup to restore your original configuration. Neither of these is an optimal situation, which leads us to the subject of this chapter: redundant disk systems.

Understanding Redundant Disk Systems

A redundant disk system can be considered a built-in safety mechanism that constantly checks on the reliability of the primary system. Should the primary system fail, it goes offline and the secondary system takes its place. This keeps the computer up and running instead of forcing you to take the entire computer offline to repair the damage immediately. Windows NT Server supports three different redundant or fault tolerant mechanisms for your disk subsystem: mirror sets, duplexed mirror sets, and striped sets with parity. Mirror sets and duplexed mirror sets are redundant methods, while the striped set with parity is a fault tolerant method; this chapter covers it as a third option. Chapter 1 introduces these concepts in the section entitled “Fault Tolerant Capabilities.” In this chapter, we’ll explore these options in much greater detail. Not only will you learn about the technology, but you will also learn how to build and manage redundant and fault tolerant disk subsystems using available hardware. So, let’s begin with the mirror set, the least complicated of the three.
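The difference between a redundant method and a parity-based method can be sketched in a few lines. The sketch below assumes the XOR parity scheme commonly used for striping with parity; this section does not describe how Windows NT computes parity, so treat it as an illustration of the general idea rather than the product's implementation. The function names and the sample data blocks are hypothetical.

```python
from functools import reduce

def mirror_write(block):
    """Mirroring: every block is written twice, once per disk."""
    return block, bytes(block)

def parity(blocks):
    """XOR equal-length blocks together, byte by byte (assumed scheme)."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity_block):
    """XOR of the surviving blocks and the parity block yields
    the one block that was lost."""
    return parity(surviving_blocks + [parity_block])

# Mirror set: if the primary copy is lost, the secondary is identical.
primary, secondary = mirror_write(b"DISK")

# Striped set with parity: three data blocks plus one parity block.
data = [b"DISK", b"SETS", b"NT40"]
parity_block = parity(data)

# Simulate losing the second disk and rebuilding its block.
rebuilt = rebuild_missing([data[0], data[2]], parity_block)
print(rebuilt == data[1])
```

A mirror set spends a full second disk on the copy, whereas the parity approach spends the equivalent of one disk across the whole set and recomputes lost data on demand.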

Table of Contents

Part IV Fault Tolerance And Data Integrity

Chapter 9 Implementing Redundant Systems

Understanding Redundant Disk Systems
