Detecting RAID Failures

Randolph West has an interesting after-the-fact test to determine whether data on a RAID array which experienced major failure is recoverable:

We took a look through some random files on the disk. I was looking for evidence that the VMDK file that had been copied was taken from a RAID array in a corrupt state, namely that one of the disks had failed, and that the second disk had failed during the rebuild of the array.

The easiest way to see this is to look for a JPEG or PNG file over 256 KB in size. Most RAID block sizes are multiples of 64 KB, usually 128 KB or 256 KB. Each block is split over the individual physical disks, with a parity bit, so for a particular block of data, if the RAID array has failed, you will see a corrupt image, or the image won’t be viewable at all.

Randolph presents an interesting smoke test here.  Read the whole thing.

Related Posts

Running SQL Server On Cluster Shared Volumes

Sreekanth Bandarla continues a series on clustered shared volumes: In the previous part of this series, we have seen what a cluster shared volume is and what are the advantages and other considerations to keep in mind when deploying CSVs for SQL Server workloads. In this article, I will walk though actual installation of a failover cluster Instance […]

Read More

Switching To Managed Disks In Azure

Chris Seferlis walks us through an easy method to convert unmanaged disks to managed disks in Azure: First off, why would you want a managed disk over an unmanaged one? Greater scalability due to much higher IOPs and storage limits. There’s no longer the need to add additional storage accounts when you’re adding disk space, […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031