Music server RAID drive failure?
As many of you know only too well by now, by fits and starts I've been setting up an old computer as a music server. It stores the files on a pair of identical Western Digital SATA 250 GB internal drives controlled by a Silicon Image PCI card and set up as a RAID with drive "0" mirrored to drive "1."
A few minutes ago, the computer froze as I compressed some newly-copied tracks to .ape. I shut it down and did a cold boot; as it came back up, I got a warning message about "incomplete RAID set," and, upon entering the setup screen, was presented with the following:
Set0 SiI Mirrored Set
1 WDC 2500JS-00SGB0 Current
0 WDC 2500JS-00NCB1 Rebuild
Without making any changes, I shut the computer back down and left it that way pending receipt of advice.
Do I understand this correctly to mean that the drive that I was using as my "master" (or whatever you want to call it--the source being mirrored to the other drive) has developed a fault and must be replaced? Needless to say, (a) if so, I'm fervently hoping that the other drive has, as intended, retained good copies of my data, and (b) I will deeply appreciate any guidance my fellow forum members can offer, be it confirming my "take" on the situation or suggesting how to proceed from here.
longjohn
05-23-2007, 10:58 AM
Nothing should take the place of whatever documentation you've got on your RAID hardware, and you should definitely check whatever log files may have been generated, but it looks like drive 0 failed and is being rebuilt with the data from drive 1. You should let the rebuild complete and replace drive 0 when possible, at which point it will rebuild again. You MUST follow the RAID controller manufacturer's instructions when replacing the drive, whatever those instructions are. I highly doubt you are using hot-plug drives, which will mean shutting the system down to do the replacement.
The data on the good drive (1) should be all but current; since you shut it down due to a lockup though, you may have lost the most recent writes.
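To make the "Current" and "Rebuild" labels concrete, here is a rough sketch of what a mirror rebuild amounts to: the controller copies everything from the member it trusts onto the member it does not. This is only a toy model, not how the Silicon Image firmware actually works; the `Member` class, the `rebuild` function, and the block names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Toy stand-in for one drive in a mirrored set (hypothetical model)."""
    name: str
    status: str  # "Current" or "Rebuild", as shown on the setup screen
    blocks: list = field(default_factory=list)


def rebuild(source: Member, target: Member) -> int:
    """Make target match source block for block; return how many blocks changed."""
    if source.status != "Current" or target.status != "Rebuild":
        raise ValueError("rebuild needs a Current source and a Rebuild target")
    copied = 0
    for i, block in enumerate(source.blocks):
        if i >= len(target.blocks):
            target.blocks.append(block)   # target is missing this block
            copied += 1
        elif target.blocks[i] != block:
            target.blocks[i] = block      # target holds stale data here
            copied += 1
    # Drop anything on the target that lies beyond the end of the source.
    del target.blocks[len(source.blocks):]
    target.status = "Current"
    return copied


# Drive 1 is trusted; drive 0 missed the last writes before the freeze.
drive1 = Member("drive 1", "Current", ["bach-01", "bach-02", "bach-03"])
drive0 = Member("drive 0", "Rebuild", ["bach-01", "stale"])
copied = rebuild(drive1, drive0)
print(copied, drive0.status, drive0.blocks)
```

The point of the sketch is the direction of the copy: whatever is on the "Rebuild" member gets overwritten, which is why it matters that the controller has picked the right drive as "Current" before you let it proceed.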
By saying drive 0 "failed" I mean it could have been detected by the RAID hardware (or an OS RAID driver) as being in a "pre-failure" state, which explains how it could be rebuilt. Usually this means some predefined error threshold was exceeded although the drive is still functional (for now). Obviously if it had failed in the sense of seized, there wouldn't be any rebuilding going on.
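Roughly, the distinction I'm drawing looks like the sketch below. The function name, the error counter, and the threshold value are assumptions for illustration only; your controller's real criteria are whatever its documentation says they are.

```python
def classify_drive(error_count: int, threshold: int, responds: bool) -> str:
    """Toy classification of a drive's health (hypothetical criteria)."""
    if not responds:
        # Seized or dead: nothing can be written to it, so no rebuild is possible.
        return "failed"
    if error_count > threshold:
        # Still working, but it has crossed the predefined error threshold.
        return "pre-failure"
    return "healthy"


print(classify_drive(error_count=12, threshold=10, responds=True))
print(classify_drive(error_count=0, threshold=10, responds=False))
```

A drive in the "pre-failure" state can still accept a rebuild, which is consistent with what your setup screen shows.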
Please don't rely on this or other info from forum members, informed as it may be. The controller manufacturer's documentation has to be consulted and treated as authoritative.
longjohn
05-23-2007, 11:07 AM
The odd thing about this is usually the failed or pre-failed drive would generate a warning and the RAID driver would prompt you to replace it before rebuilding. The fact that it shows as rebuilding indicates the RAID controller thinks it's a replacement drive. Ordinarily a rebuild doesn't start until a new drive appears in place of the failed one, or what the controller thinks is a new drive. It may have been the shutdown that caused the system to think a new drive was in there. I would seek the advice of the manufacturer, e.g. tech support.
The odd thing about this is usually the failed or pre-failed drive would generate a warning and the RAID driver would prompt you to replace it before rebuilding. The fact that it shows as rebuilding indicates the RAID controller thinks it's a replacement drive. Ordinarily a rebuild doesn't start until a new drive appears in place of the failed one, or what the controller thinks is a new drive. It may have been the shutdown that caused the system to think a new drive was in there. I would seek the advice of the manufacturer, e.g. tech support.
Ooops--didn't make myself clear. This was in a configuration screen that I entered manually at bootup, like CMOS. I don't think it was actually rebuilding. I think what it was doing was prompting me to *start* rebuilding. Having already been bitten once in this game, I decided to maintain the status quo until quite sure of what I was doing.
longjohn
05-23-2007, 01:09 PM
Well, before doing anything you need to determine whether that's the case: whether drive 0 is idle (whether "failed" or not) and the controller waiting on a replacement, or if drive 0 is actively being rebuilt. That will determine what you do next. Your documentation should tell you how to figure this out.
DaveN
05-23-2007, 01:34 PM
You are getting excellent advice from longjohn. An abundance of caution is the minimum that you should apply when dealing with RAID. One false move and you lose the entire array.
At least you are using RAID-1. I used RAID-0 on my first attempt, not realizing that the loss of a single drive would kill the entire array with no hope of recovery.
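For anyone wondering why the two levels behave so differently, here is a small sketch. It is a simplified model with invented helper names (`stripe`, `mirror`, `read_striped`, `read_mirrored`), and a lost drive is represented as `None`; real controllers stripe fixed-size blocks rather than whole tracks.

```python
def stripe(data, n_drives=2):
    """RAID-0 style: deal chunks out across the drives in turn."""
    drives = [[] for _ in range(n_drives)]
    for i, chunk in enumerate(data):
        drives[i % n_drives].append(chunk)
    return drives


def mirror(data, n_drives=2):
    """RAID-1 style: every drive holds a full copy."""
    return [list(data) for _ in range(n_drives)]


def read_striped(drives):
    """Reassemble striped data; impossible if any drive is missing."""
    if any(d is None for d in drives):
        return None
    out = []
    for i in range(max(len(d) for d in drives)):
        for d in drives:
            if i < len(d):
                out.append(d[i])
    return out


def read_mirrored(drives):
    """Read from the first surviving copy, if there is one."""
    for d in drives:
        if d is not None:
            return list(d)
    return None


tracks = ["a", "b", "c", "d", "e"]

raid0 = stripe(tracks)
raid0[0] = None                 # lose one drive
print(read_striped(raid0))      # None: half of every file is gone

raid1 = mirror(tracks)
raid1[0] = None                 # lose one drive
print(read_mirrored(raid1))     # the full list, read from the survivor
```

With striping, each drive holds only every other chunk, so one dead drive leaves nothing usable. With mirroring, the surviving drive still has everything.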
You are getting excellent advice from longjohn. An abundance of caution is the minimum that you should apply when dealing with RAID. One false move and you lose the entire array.
You betchum, and I do appreciate it (from both of you)! I well know about that "one false move" bit; when I set the array up initially, I'd already been copying CDs to one of the drives for a couple of months. In the space of maybe a minute, all was gone. What's at risk *now* is what I've managed to redo in about the past month. Let's just say, I do *not* want to copy and label Bach's complete organ music for a *third* time!
I'll need to dig to see what kind of documentation I have for my (extremely cheap) controller card. If memory serves, it was kinda minimal. Unfortunately, my budget for this project has been next to nil, probably one reason I've been having so many problems with it....:sigh:
At least you are using RAID-1. I used RAID-0 on my first attempt, not realizing that the loss of a single drive would kill the entire array with no hope of recovery.
Ouch.