Previous Table of Contents Next


SQL Server Storage Requirements

The type of storage you purchase for your system greatly impacts your SQL Server performance. It also determines the maximum size of the databases you can support on your SQL Server installation. The first step in determining just how much storage you need is to determine the size of your database. You will need more storage than the estimated amount you calculate, however, if you want to have a workable SQL Server installation. At the very least, you should consider these points:

  Database Dumps—Plan on extra space for your database and transaction dumps. Having these available on your hard disk can provide a quick recovery if a failure occurs or your database becomes corrupted.
  Bulk Copies—You should plan on additional disk space so that you can re-create or resize your database easily, based on the data you have copied to your hard disk. A database dump can be used to restore a database, but, unlike a bulk copy, it requires the same storage allocations defined for the database. This prevents you from changing the size of the database. This bulk copied data also can be imported into other databases, such as a Microsoft Access database. This imported information could be used to support a mobile sales force, for example.
  SQL Server Schemas—A small amount of disk space should be reserved to contain your database schemas and SQL stored procedures.
  Temporary Database—Nothing eats up more disk space than your temporary database. This database starts out at a default size of 2MB, and, for small databases, this actually works. But the minute you start working with real-world databases that require extended joins or numerous sorts, you run out of temporary space rapidly. And this lack of space causes your queries to fail. I have seen instances where the temporary database needed to be between 64MB and 512MB. Remember the following basic rule when calculating the size of your temporary database: For each user performing a sort, the minimum space required in the temporary database is approximately three times the size of the table being sorted. If you have a really large number of users and large databases, don’t be surprised if you need 1GB or more of temporary space.
  Paging File—Depending on your installation, you may need to extend the size of your Windows NT Server paging file. This paging file determines the maximum number of clients that can connect to and use your SQL Server databases. This occurs because SQL Server needs the additional resources, and each connection to SQL Server uses its own thread. Each thread needs additional storage for its internal structures and to process the requested data.
  Total Disk Space—This should be at least three times the estimated database size, and five times the estimated database size is not out of the question. This rule takes into account the additional space needed to create new devices as your database grows beyond your initial estimate. It also includes the additional space needed to rebuild an index. Rebuilding a clustered index, for example, requires enough temporary storage to contain the table plus 1.21 times the size of the original index.


Calculating Your Database Size

You can determine the amount of storage required for your database based on the amount of storage required for each table in your database, each clustered index, and each nonclustered index. The formulas for calculating database sizes for tables that use fixed-length columns are the easiest to calculate and give the most accurate results. Database tables that contain variable-length columns must be averaged. The same principle applies to tables that use text or image columns. This sidebar presents some of the formulas you can use to estimate your database sizes.

Calculating your database size requires that you first calculate the size of your table and then calculate the size of your indexes. Then, combine these values. The values that follow in bold are overhead values used internally by SQL Server, unless noted differently.

To calculate table size, follow these steps:

1.  Obtain the number of pages required to hold the data in the table.
Determine the size of a single data row. This can be expressed for a data row with a fixed-length column (defined as NOT NULL), as follows:
RowSize = 4 + (Sum of bytes of all fixed length columns)

For a column with a data type defined as NULL, or for a data row with mixed, fixed, and variable-length data columns, the formula can be expressed as the following:
RowSize = (4 + (Sum of bytes of all fixed-length columns) +
(Sum of bytes of all variable-length columns)) +
((((4 + (Sum of bytes of all fixed-length columns) +
(Sum of bytes of all variable-length columns)) / 256) + 1 +
(Number of variable-length columns + 1) + 2)
2.  Determine the number of data pages required to hold the rows of data. This can be expressed as the following:
NbrOfDataPages = (Number of rows in the table) / (2016 / RowSize)

The value 2016 is the size of a data page based on the default fill factor size. If you change the fill factor percentage to 100, you should not subtract 2 when calculating the row size. If you change the percentage to a value other than 100, the new data page size can be expressed as (2016 * Percentage / 100). If you use a fill factor of 60 percent, the new data page size is 1210. This value is actually (2016 * 60 / 100) = 1209.6, but you should round up calculations for data pages.
3.  Determine the number of data pages required to hold your clustered index. First, determine the size of the clustered index. This varies based on whether the clustered index uses fixed- or variable-length columns. For a fixed-length column, the formula can be expressed as the following:
ClusteredIndexSize = 5 + (Sum of bytes of all fixed-length columns)

For a variable-length column, the formula can be expressed as the following:
ClusteredIndexSize = (5 + (Sum of bytes of all fixed-length columns) +
(Sum of bytes of all variable-length columns)) +
((((5 + (Sum of bytes of all fixed-length columns) +
(Sum of bytes of all variable-length columns)) / 256) + 1) +
(Number of variable-length columns + 1) + 2)
4.  Determine the number of data pages required to hold the clustered index. This can be expressed as the following:
While ((NbrOfIndexPages = (NbrOfDataPages / (2016 /
ClusteredIndexSize)--2)) > 1)
(SumOfNbrOfClusteredIndexPages = SumOfNbrOfClusteredIndexPages +
NbrOfIndexPages)
5.  Determine the number of data pages required to hold your nonclustered index. First, determine the size of the leaf index row. This varies based on whether the nonclustered index uses fixed- or variable-length columns. For a fixed-length column, the formula can be expressed as the following:
SizeOfLeafRow = 7 + (Sum of bytes of all fixed-length columns)

For a variable-length column, the formula can be expressed as the following:
SizeOfLeafRow = (9 + (Sum of bytes of all fixed-length columns) +
(Sum of bytes of all variable-length columns) + 1) + ((((5 + (Sum of bytes of all fixed-length columns) + (Sum of
bytes of all variable-
length columns)) / 256) + 1))
6.  Determine the number of leaf pages in the index, which can be expressed as the following:
NbrOfLeafPages = (2016 / SizeOfLeafRow)
7.  Determine the size of non-leaf rows. This can be expressed as the following:
SizeOfNonLeafRows = SizeOfLeafRow--4
8.  Determine the number of pages used by the non-leaf data. This can be expressed as the following:
While ((NbrOfNonLeafPages = (NbrOfLeafPages / (2016 /
SizeOfNonLeafRows)--2))> 1)
(SumOfNbrOfNonLeafPages = SumOfNbrOfNonLeafPages +           NbrOfNonLeafPages)
SumOfNbrOfNonclusteredIndexPages = SumOfNbrOfNonLeafPages +
NbrOfLeafPages
9.  Repeat Steps 1 through 8 for each nonclustered index for the table.
10.  Calculate the total number of data pages used by your table. This can be expressed as the following:
TotalNbrOfPages = NbrOfDataPages + SumOfNbrOfClusteredIndexPages +
SumOfNbrOfNonclusteredIndexPages
11.  Calculate the total size of your table in megabytes. This can be expressed as the following:
TableSizeInMB = (TotalNbrOfPages * 2048) / 1048576

Exceptions to the rules:

These calculations for variable-length fields use the maximum length, but if you know what the average size is of your variable-length fields, you can use that in the summary calculations. And if you use text or image fields, the size of the column will be a minimum of 16 bytes (for the pointer to the text/image page) plus the number of data pages. A data page for text or image can be calculated as:

NbrOfDataPages = LengthOfData (Rounded to nearest 2KB
value) / 1800

Along with the size of your paging file, you should consider the type of storage you plan to use. You should avoid any device that uses Programmed I/O (PIO) or any disk subsystem that uses the ATAPI.SYS driver, including all EIDE or IDE disk drives. Any disk subsystem that uses the ATAPI.SYS driver can support only one outstanding I/O request at a time, which severely limits your SQL Server performance. The best choice is to use a hardware RAID device based on a SCSI disk controller, because this provides the maximum performance. If you cannot afford a hardware RAID solution, consider a software-based solution that also uses a SCSI subsystem. If your budget is sufficient, purchase two identical hardware RAID devices. Create your database devices on one of these RAID drives, and then build a mirror image (using the SQL Mirror command) on the second RAID drive to increase performance even more.


Previous Table of Contents Next