RAID-6 fail - what are my options? :(

dulcificum (n00b, joined Nov 22, 2012, 55 messages)
This thread is gutting but let's get on with it.

To start with, I'll get all the obvious stuff out of the way. Yes, I should have had backup. Yes I understand what proper backup is. Call me an idiot all you want - I accept it. Of course I would go back and buy 5 TiB of cheap disks to have another copy of my data. I backed up about 500 GiB of the data but not most of it. Bits and pieces were not properly backed up.

To this extent, everything is my own fault. I've experienced a few catastrophic data failures over the years but this is by far the worst. Also, I know almost nothing about Linux software RAID so I'll sound woefully ignorant. Anyway, I take responsibility but, for now, I just want to do the best I can and I really could do with some help.

Here's what happened:

I had 5 x 1.5 TiB Samsung drives in a QNAP TS-639 Pro.
They were in an ext3 RAID-6.
I added a new, sixth, 1.5 TiB Samsung disk to the array using the admin panel.
It started adding but, about 24 hours in I came back to find the QNAP had reset itself.
There were no power issues - I hadn't had a power cut and it was on a UPS but the unit wouldn't respond to anything. Not even ping.
QNAP guys told me to reset the firmware.
No dice - superblocks were missing on drive 2, 3, 4 and 5.
I gave them SSH access, making it very clear that they should not do any write operations, so that I could attempt my own data recovery if the shit really hit the fan.
They ran commands for three days on my NAS box. Mainly it seemed to be e2fsck. The connection dropped a few times during e2fsck commands which they claim messed it up even more.
From what I understand, this borked it even more.

So this is where I am now. 4.5 TiB of data down the drain. A lifetime of files lost. Obviously, I bought the QNAP and used RAID-6 to try and keep this stuff ad infinitum. Obviously I was aware that RAID != backup but it was the best I could afford.

Now what do I do? I tend to think my best bet is R-Studio or something similar? But I'm not sure how to go about it...

First things first - I guess I need to buy 6 USB HDD enclosures. And a USB hub. Plug all HDDs into a PC. Buy R-Studio and run it on the same PC. Then what? Are there fairly straightforward data recovery options?

Obviously, there are A TON of files I want to get back. But even getting a directory tree would be incredibly useful. What are my best hopes in this situation? What hardware do I need to buy to make an attempt at this?

Or is it just a lost cause? Should I format the drives, declare data zero and start my life again?

As any of you who have any idea how I feel know, I'd massively appreciate ANY help or advice. Any info you need or any commands I should run, please let me know. I'm out of my depth and completely lost here. I'm listening.

Wah wah wah.
 
Since you speak of USB enclosures, I'm guessing you only have laptops on hand, is that the case ?

Here people will recommend first to buy new hard drives to clone your current ones on, then work on the clones.
 
As said your best bet is getting a pro recovery service (not geek squad;) ). One issue is raid6 actually makes data recovery even harder but not impossible.

Sorry for your loss. Try to use this post to explain to others why raid is not backup. I highly recommend an online service like Crashplan, one price unlimited storage. I lost a good bit of data once and now I get nervous if I do not have at least 2 backups of the working file.
 
Sorry to hear about your catastrophe, that's a lot of data to lose. I also have a Qnap 639pro, I just recently upgraded from 2TB to 3TB WD Red drives. Over a year and a half ago I experienced a bit of an issue with my Qnap that scared the crap out of me, so I bought some drives and enclosures, threw them in an Antec 1200 along with a perc 5i raid controller and duplicated the data from my Qnap. I know not everyone can afford to do that, but it highlights what a lot of people don't understand, which has been mentioned enough times in this thread: RAID is not a backup. One easy way to back up data stored on a NAS is to buy a small inexpensive program called Second Copy, install it on a PC and set it to copy data from the NAS to the PC, and then back that up to an external drive or flash drive if the data is not too big, or to the cloud. Hope you get all or some of your data back.
 
Thanks for your answers guys. Still struggling to come to terms with this :(

Not sure whether to waste 100s of hours and £100s trying to get nothing back or just call it quits and start again from scratch :( :(

Since you speak of USB enclosures, I'm guessing you only have laptops on hand, is that the case ?

Here people will recommend first to buy new hard drives to clone your current ones on, then work on the clones.

I have a PC and a laptop. But I don't have many bays in the PC and I don't want to mess about with drives hanging out - I thought USB enclosures would be the most straightforward option.

Clone drives - sure. Or I will just work from them onto new media. I'm just wondering about the best logistical approach. I assume all drives need to be connected simultaneously and in the right order. I only have a couple of USB ports - could I just buy 6 enclosures + backup drives + a couple of hubs? Anything else I'd need?

If you want to spend money on it, there are professional recovery services.

I don't have anywhere near that kind of cash. I can drop £100 or so - I'm guessing this would be a five figure job...

Wait qnap ran fsck via remote connection and did not nohup it. Wow.

I imagine as much. I thought it did that by default but I guess not.

Or at least run it in screen, that is malpractice on an unknown remote connection.
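For anyone wondering what the nohup/screen point means in practice: a long-running job started over SSH dies with the session unless it is detached from it. A rough sketch of the idea, assuming a POSIX shell on the box. The helper name run_detached and the log path are made up for illustration, and nothing here is what QNAP support actually ran:

```shell
# run_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: start COMMAND so that it survives the SSH session
# dropping. nohup makes it ignore the hangup signal; output goes to
# LOGFILE and stdin is detached from the terminal.
run_detached() {
    log="$1"
    shift
    nohup "$@" >"$log" 2>&1 </dev/null &
    echo "started PID $!, logging to $log"
}

# Example (read-only check, -n answers "no" to every repair prompt):
#   run_detached /root/fsck.log e2fsck -n /dev/md0
```

Running the job inside screen or tmux gets the same result and also lets you reattach later to watch progress.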

Do I have any recourse?

Oh and as for cloud backup - it isn't really an option. I get about 1-2 Mbps down and less up - it would take literally a year to back my stuff up..
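A quick sanity check on that estimate, using the 4.5 TiB figure from the first post and treating the line as a steady 1 Mbps upload (the real upload rate is lower, so this is optimistic). upload_days is just a throwaway helper name:

```shell
# upload_days SIZE_TIB RATE_MBPS
# Days needed to push SIZE_TIB tebibytes through a link that sustains
# RATE_MBPS megabits per second (protocol overhead ignored).
upload_days() {
    awk -v tib="$1" -v mbps="$2" 'BEGIN {
        bits = tib * 1099511627776 * 8      # 1 TiB = 2^40 bytes
        secs = bits / (mbps * 1000000)      # 1 Mbps = 10^6 bits/s
        printf "%.0f\n", secs / 86400
    }'
}

upload_days 4.5 1    # prints 458
upload_days 4.5 2    # prints 229
```

So well over a year at 1 Mbps, and longer still at the actual upload speed.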
 
Not knowing what device they ran e2fsck on is not good (did they go for /dev/dm-0 or /dev/sdaX). That being said having e2fsck drop like that is even worse. Ugh

You can try this below.

https://raid.wiki.kernel.org/index.php/RAID_Recovery

With that said something tells me there might be a problem with the QNAP unit. Degraded drives shouldn't cause a kernel panic. There's no easy answer here.
A) You need to get the drive locations and order.
B) Forgo the USB hub and enclosures. USB isn't meant for RAID. Either buy a cheap computer with enough ports, build your own NAS, or get the QNAP unit back online in a reliable state. I would go with buying a cheap computer or building my own NAS personally. **
C) Get the drives reconnected properly and see if you can recreate the RAID array while salvaging your data.
D) Pray....a lot but keep a drink handy.

** Personally I don't believe in off-the-shelf non-enterprise NAS units and I hate it when people recommend them. You don't know if the PSU is made in China or the MB is made in Uzbekistan. It's best to either buy an OEM certified enterprise grade unit (which comes with a nice warranty), or build it yourself. Having your NAS unit reboot itself isn't good, period. Just thinking about it makes my blood pressure rise.
 
Cheers. They used /dev/md0

They also tried all the stuff in that link but it didn't work because (I think) four of the superblocks were missing.

Is this a lost cause then? I know I won't get 100% of the info back but can I even hope for 50%? 10%?
 
Cheers. They used /dev/md0

They also tried all the stuff in that link but it didn't work because (I think) four of the superblocks were missing.

Is this a lost cause then? I know I won't get 100% of the info back but can I even hope for 50%? 10%?

Do you know if they tried pulling the alternate superblock information before e2fsck?
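For reference, ext3 keeps backup copies of the superblock at the start of certain block groups (with the usual sparse_super layout: group 1 and the powers of 3, 5 and 7). A small sketch that lists where they should be, assuming 4 KiB blocks and therefore 32768 blocks per group. The real values are unknown here and would normally come from dumpe2fs, or from mke2fs -n, which only prints what it would do. backup_superblocks is an illustrative name:

```shell
# backup_superblocks BLOCKS_PER_GROUP GROUP_COUNT
# Print the block numbers of the backup superblocks for a sparse_super
# ext2/ext3 filesystem whose first data block is 0 (true for 4 KiB blocks).
backup_superblocks() {
    bpg="$1"
    ngroups="$2"
    {
        # Group 1 always carries a backup if it exists.
        [ "$ngroups" -gt 1 ] && echo 1
        # So do groups that are powers of 3, 5 and 7.
        for base in 3 5 7; do
            g="$base"
            while [ "$g" -lt "$ngroups" ]; do
                echo "$g"
                g=$((g * base))
            done
        done
    } | sort -n | while read -r g; do
        echo $((g * bpg))
    done
}

backup_superblocks 32768 10
```

Each number can then be tried read-only with something like e2fsck -n -b 32768 -B 4096 against the assembled array, or better, against an image of it.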
 
Not knowing what device they ran e2fsck on is not good (did they go for /dev/dm-0 or /dev/sdaX). That being said having e2fsck drop like that is even worse. Ugh

You can try this below.

https://raid.wiki.kernel.org/index.php/RAID_Recovery

With that said something tells me there might be a problem with the QNAP unit. Degraded drives shouldn't cause a kernel panic. There's no easy answer here.
A) You need to get the drive locations and order.
B) Forgo the USB hub and enclosures. USB isn't meant for RAID. Either buy a cheap computer with enough ports, build your own NAS, or get the QNAP unit back online in a reliable state. I would go with buying a cheap computer or building my own NAS personally. **
C) Get the drives reconnected properly and see if you can recreate the RAID array while salvaging your data.
D) Pray....a lot but keep a drink handy.

** Personally I don't believe in off-the-shelf non-enterprise NAS units and I hate it when people recommend them. You don't know if the PSU is made in China or the MB is made in Uzbekistan. It's best to either buy an OEM certified enterprise grade unit (which comes with a nice warranty), or build it yourself. Having your NAS unit reboot itself isn't good, period. Just thinking about it makes my blood pressure rise.




Have you ever owned an off the shelf NAS? I believe in them, I've owned three of them for the past four years, and but for one power supply failing they have been trouble free. Sure you can hand pick the parts and build your own but then it usually requires some Linux knowledge if you want to have something at least as good as an off the shelf Qnap, Synology, Netgear or Thecus. Off the shelf NASes have come a long way in recent years and while there is even quite a difference in what you get for your money amongst the brands I have listed, they all work very well.
 
With that said something tells me there might be a problem with the QNAP unit. Degraded drives shouldn't cause a kernel panic. There's no easy answer here.
A) You need to get the drive locations and order.
B) Forgo the USB hub and enclosures. USB isn't meant for RAID. Either buy a cheap computer with enough ports, build your own NAS, or get the QNAP unit back online in a reliable state. I would go with buying a cheap computer or building my own NAS personally. **
C) Get the drives reconnected properly and see if you can recreate the RAID array while salvaging your data.
D) Pray....a lot but keep a drink handy.

I don't think there was a kernel panic. And the array was never degraded. If it was I wouldn't have had an issue.

A) I can do this easily.
B) This isn't really an option for me - way too time/money intensive. If it's possible to do it with my existing PC instead that would be massively preferable. I won't be able to run R Studio on the drives in the QNAP unfortunately so I need them on my main PC.
C) Once I have them all connected to one computer, is this pretty easy with something like R Studio?
 
Once I have them all connected to one computer, is this pretty easy with something like R Studio?

mdadm has the recovery options built in. Although I am not sure about recovering a partially expanded array. I have never tested that. I have recovered raid6 arrays with 3 or more drives kicked out of the array without much difficulty but never partially expanded.
 
I imagine it hung between adding the drive and expanding the capacity to the new size although there doesn't seem to be any log to verify this.

Can you briefly talk me through your method or how I could best do this without having to spend much more than on 6 enclosures, USB hubs and new disks?
 
best of luck, do update the thread if you recover.

did you attempt to expand before the raid6 rebuild was done after adding the new drive?
 
I'll investigate more what I need to buy (please keep recommendations coming as I'm essentially completely lost now) and see what I can do in the next couple of weeks.

I don't know where it got to. I assume it did the expansion but didn't add the new drive. They instruct you to use the admin panel which should automatically do the expand step after the drive has been added but I have no way of finding out how far in the process it went.

I think QNAP said something about only five drives being there when they started trying to recover it but I may be wrong.
 
Have you ever owned an off the shelf NAS? I believe in them, I've owned three of them for the past four years, and but for one power supply failing they have been trouble free. Sure you can hand pick the parts and build your own but then it usually requires some Linux knowledge if you want to have something at least as good as an off the shelf Qnap, Synology, Netgear or Thecus. Off the shelf NASes have come a long way in recent years and while there is even quite a difference in what you get for your money amongst the brands I have listed, they all work very well.

That's just my personal opinion. If you want to buy off the shelf NAS solutions by all means go for it.
 
I don't think there was a kernel panic. And the array was never degraded. If it was I wouldn't have had an issue.
When you have missing superblocks, mdadm can't determine what RAID level you were running or which metadata version is on the drives. Trust me, the array is degraded in every sense of the word.

B) This isn't really an option for me - way too time/money intensive. If it's possible to do it with my existing PC instead that would be massively preferable. I won't be able to run R Studio on the drives in the QNAP unfortunately so I need them on my main PC.
If you don't have a choice then USB hub it is. Although in the future please remember that what seems to save you time and money in the beginning can sometimes bite you in the rear later.

C) Once I have them all connected to one computer, is this pretty easy with something like R Studio?
I've never used it.
 
That's just my personal opinion. If you want to buy off the shelf NAS solutions by all means go for it.

I understand, I am just trying to understand your 'hate' towards them, what is it based on? What you've read or what you've experienced? Not trying to pick a fight, just curious.
 
I understand, I am just trying to understand your 'hate' towards them, what is it based on? What you've read or what you've experienced? Not trying to pick a fight, just curious.

I wouldn't call it hate. It's more of a desire to not repeat history. When I used to work for small-businesses I always would encounter the off the shelf NAS. In every case there was always something up. Either it didn't support drives bigger than X, or it would have some annoying behavior. I've seen quite a few die as well. After being called into work at all sorts of hours in the morning to rebuild something I advised should be replaced months ago, I developed a visceral reaction to cheap (translation: ill-suited for the job) storage solutions.

I'll admit it: I'm lazy. I don't like opening up boxes or calling technical support. In fact my goal is to never wait on hold to talk to someone with an unusually thick accent ever again.

Scenarios such as these are avoidable. Therefore I usually recommend people to spend the money when it comes to storage. You save tons of time in the long run. If something goes bad it's either easily replaceable or there's a nice warranty that will do the replacement for you.

I never quite understood the logic:

Put the top of the line Intel chip in an HTPC? Check!
Buy the top of the line video card for resolutions your monitor doesn't even support? You damn right. I'm on it.
Buy a decent HBA which will lay the ground work for all of your most sensitive data? Naw f@$! that! Gimme that on-board $13 shit. I'm strapped for cash.
 
When you have missing superblocks, mdadm can't determine what RAID level you were running or which metadata version is on the drives. Trust me, the array is degraded in every sense of the word.

But it might still be recoverable?

If you don't have a choice then USB hub it is. Although in the future please remember that what seems to save you time and money in the beginning can sometimes bite you in the rear later.

Tru dat!

So have you done something like this before with RAID-5/RAID-6?

Let me get my two options clear:

1) Cheap and dirty
  • 6 USB enclosures
  • USB hubs or no?
  • Connect to PC, identify and run R Studio

2) Better way - Build a box around the drives
  • Whole new PC including mobo, PSU, boot drive, SATA card, etc.
  • Insert drives
  • As above, run R Studio

Would both these options work?
What are the main advantages of #2?
Can anyone estimate my chances of recovering 5%/10%/50% of the contents??
 
I won't pretend to be familiar with that particular brand of NAS, but Linux software RAID is something near and dear. I've definitely suffered my fair share of failures, and I can relate to your frustrations.

About a year ago, a few of my friends and I picked up a subscription to CrashPlan, and it works wonderfully for our purposes. As you mentioned though, you don't feel you have the bandwidth to support this solution.

Regarding your drives... you can always try to force an assembly with the drives in random order. I've done this with varying degrees of success. Keep in mind, what I'm suggesting is by no means a "safe" solution. Given the limited budget though, your safety net is likely to have a few elephant sized holes. =)

If you want to simply recover the data, it doesn't matter much whether you go with USB enclosures or plug the drives directly into a machine. For a long term solution, USB-connected drives would be far from a recommendation.
 
Yeah, this would never be a long term solution. I'd just back everything I can get up onto some huge spare drive (might use eSATA for that or my USB bandwidth will be suffering) and then put them back in the NAS.

With all the fuckenation this array has taken from the e2fsck and god knows what else the QNAP people did to the drives, is it at all feasible that I'll find anything on them? Or is RAID too picky for this to be viable?

If anyone else has had any degree of success using R Studio to recover highly-borked striped RAID arrays, I'd love to hear. Thanks again everyone who's helping out in this difficult time :(
 
@OP

For the recovery, go with the cheap option since you have the disks already, but going forward you'd be better off building your own NAS.

Personally, I've only ever lost data on Linux software RAID once, and I did it in style - 10TB worth of style. Luckily, most of what I lost was either easily replaceable or not very important, i.e. media. What happened was that the RAID superblocks got corrupted so that the array didn't know what RAID level it was supposed to be. The fact that the drives had been physically moved around probably didn't help either.

It would have been helpful if the RAID subsystem at least kept historical backups of the RAID superblocks, but what's past is past. :)
 
If you don't have any luck with the other methods, you can always try recreating the array from the command line as a last-ditch effort.

Something like:
Code:
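# Braces (bash) keep the listed order; a [abedc] glob would expand alphabetically. --assume-clean skips the initial resync.
mdadm -C /dev/md0 --assume-clean --chunk=512 -l6 -n5 -x0 /dev/sd{a,b,e,d,c}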

Of course it helps to know:

a) the original drive order, and
b) the chunk size.

Can't say that it will work for you, but I was able to resurrect my 8-drive RAID-6 array from a botched operation long enough to copy the critical stuff off.

Best of luck to ya!
 
Two things. First, this is why I run ZFS: the off-the-shelf non-enterprise NAS systems are junk, other than Drobo, though I have only had limited experience with Drobo.

Next, I would clone the drives and build a recovery machine, not USB! I lost 18TB of data but it was due to an idiot female stealing both my primary and secondary system. I had a hot 24TB spare!

Check out NAS4Free when you get the time. It makes ZFS easy, and you can get crazy reliability. When I have my data on ZFS I don't worry about a thing; I snapshot my files every 15 minutes, etc.

Cheers!
 
If you don't have any luck with the other methods, you can always try recreating the array from the command line as a last-ditch effort.

Something like:
Code:
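# Braces (bash) keep the listed order; a [abedc] glob would expand alphabetically. --assume-clean skips the initial resync.
mdadm -C /dev/md0 --assume-clean --chunk=512 -l6 -n5 -x0 /dev/sd{a,b,e,d,c}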

Of course it helps to know:

a) the original drive order, and
b) the chunk size.

Can't say that it will work for you, but I was able to resurrect my 8-drive RAID-6 array from a botched operation long enough to copy the critical stuff off.

I'm pretty sure this is stuff the QNAP people repeatedly tried but it failed due to the superblock issue.

Two things. First, this is why I run ZFS: the off-the-shelf non-enterprise NAS systems are junk, other than Drobo, though I have only had limited experience with Drobo.

I know a little about ZFS but what extra safeguards (not extra flexibility/features) does it offer that would be relevant in this instance?

Next, I would clone the drives and build a recovery machine, not USB! I lost 18TB of data but it was due to an idiot female stealing both my primary and secondary system. I had a hot 24TB spare!

What is the issue with using USB enclosures just to image the drives?
 
You asked about ZFS features: copy-on-write, snapshots and checksums. Oh, and in my experience data has never been lost due to a power failure or corruption. Check the wiki out here: http://en.wikipedia.org/wiki/ZFS

I don't trust the USB controllers, also speed issue.
 
Sorry for my ignorance - what specifically do you not trust about them? It's not possible to daisy chain SATA/e-SATA is it?

If I do go USB, will it be okay with USB hubs?
 
I have had various issues with USB over the years, and prefer to use dedicated. No, it isn't possible without extra controller cards.
 
If I wasn't going to go USB:

I run an ITX system on an Asus P8H67-I Deluxe with one boot drive. So I have three SATA ports + 1 eSATA port free. Could I just slot in a cheap 4 port SATA card into the PCIe and use that? As it's ITX, I'll just have the drives airmounted hanging out of the case :)

A 400W PSU should be fine for this, right? I use onboard graphics.
 
For drive imaging/recovery I'd go SATA only. I have used USB2 but it is a lot slower. That was with a single drive that I was imaging using R-Studio.

Hopefully you can get your data back.
 
The problem here is that you cannot even tell if the online capacity expansion was complete or not. The process rearranges almost all data on all drives and if interrupted somewhere can leave a lot of the data totally garbled. I wonder how the Qnap guys did an fsck when they couldn't even reassemble the raid. Best thing you can do now is to make copies of all drives and work from there and reassemble the devices in all configurations possible. If the on-disk format changes halfway through the disks I'm not sure if there is a tool that can help you.
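Trying every drive order by hand gets tedious fast: five members means 120 orderings, six means 720. Below is a sketch that only prints candidate mdadm commands rather than running anything (written for bash or dash, since it uses local). The loop device names, the chunk size of 64 and the function names are assumptions for illustration. The metadata version and any data offset would also have to match the original array, and none of those are known from this thread. Only ever run such commands against copies of the drives:

```shell
# permute CHOSEN ITEM...
# Print every ordering of the ITEMs, one per line. CHOSEN holds the
# items already placed and should be "" on the first call.
permute() {
    local chosen="$1" dev other rest
    shift
    if [ "$#" -eq 0 ]; then
        echo "$chosen"
        return
    fi
    for dev in "$@"; do
        rest=""
        for other in "$@"; do
            [ "$other" = "$dev" ] || rest="$rest $other"
        done
        # $rest is left unquoted on purpose so it splits back into items
        permute "${chosen:+$chosen }$dev" $rest
    done
}

# print_create_candidates CHUNK_KIB DEVICE...
# Dry run: echo one "mdadm --create" line per device ordering.
# --assume-clean stops mdadm from resyncing (rewriting) parity.
print_create_candidates() {
    local chunk="$1" order
    shift
    permute "" "$@" | while read -r order; do
        echo "mdadm --create /dev/md0 --assume-clean --level=6 --raid-devices=$# --chunk=$chunk $order"
    done
}
```

Calling print_create_candidates 64 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 prints 120 lines. After each create, a read-only check such as e2fsck -n on the resulting array shows whether that ordering yields a recognisable filesystem. If the expansion stopped partway through, no single ordering may fit the whole array.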
 
At this point, with so much unknown about the state of your array members, your best chance for recovery is not to screw with it and go to a professional recovery service such as Drivesavers.
 
Unfortunately that's really not an option :(

Then you can try some of the options that have been mentioned above, but before you do ANYTHING please make images of the original drives (best to another set of identical drives, or at least make logical image files) so no matter what you do you can always go back to the condition it is in now. How important is the data that was on the array?
 
I will image the drives. But is that even really necessary if I'm restoring to another drive? - I won't be making any writes at all on the original drives.

It's hard to tell how important it was without a directory tree. It's not anything commercially important and is mainly personal documents, photos, music, art, etc. But I had a lot of good memories going back 10+ years on there which now might be gone forever. I don't mind if I lose the DVD rips but I'd be devastated if all of it is gone.
 
I will image the drives. But is that even really necessary if I'm restoring to another drive? - I won't be making any writes at all on the original drives.

It's hard to tell how important it was without a directory tree. It's not anything commercially important and is mainly personal documents, photos, music, art, etc. But I had a lot of good memories going back 10+ years on there which now might be gone forever. I don't mind if I lose the DVD rips but I'd be devastated if all of it is gone.

With the luck you have had so far, the absolute last thing you need now is any member drive failing. Make the images as soon as you can so you have them. You can then try and recover data but like you have heard, no one knows what condition the filesystem is in and if it is recoverable at all at this point. Since you don't plan on going with a professional recovery service at this point you have nothing to lose (as long as you have images you can go back to).
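A minimal sketch of the imaging step, assuming the drives still read without errors (if any of them throws read errors, GNU ddrescue is the better tool since it retries and keeps a map of the bad areas). The function name and the paths in the example are placeholders:

```shell
# image_drive SOURCE IMAGE
# Copy SOURCE (a whole disk such as /dev/sdX, or any file) to IMAGE and
# verify the copy byte for byte. On Linux, marking the source read-only
# first with "blockdev --setro /dev/sdX" guards against stray writes.
image_drive() {
    src="$1"
    dst="$2"
    # Refuse to clobber an image that already exists.
    if [ -e "$dst" ]; then
        echo "refusing to overwrite $dst" >&2
        return 1
    fi
    # 1 MiB blocks keep the copy reasonably fast.
    dd if="$src" of="$dst" bs=1048576 || return 1
    # cmp exits non-zero at the first differing byte.
    cmp "$src" "$dst"
}

# Example, one call per member drive:
#   image_drive /dev/sdb /mnt/backup/disk1.img
```

Writing to image files on one or two large disks works as well as cloning to identical drives, as long as there is enough space for all six images.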
 