Press "Enter" to skip to content

Drive Failure

SQL Wayne cautions you to think about drive failure:

Now is when the MTBF comes in. If all of the drives were from the same batch, then they have approximately the same MTBF. One drive failed. Thus, all of the drives are not far from failure. And what happens when the failed drive is replaced? The RAID controller rebuilds it. How does it rebuild the new drive? It reads the existing drives to recalculate the checksums and rebuild the data on the new drive. So you now have a VERY I/O intensive operation going on with heavy read activity on a bunch of drives that are probably pushing end of life.

This is where it’s important to keep spares and cycle out hardware.