The Single Point of Failure

As I started to write my blog post for last Fri­day, I found my RAID array was dead. Despite my best inten­tions, I fell vic­tim to the sin­gle point of failure.

It started off as a curios­ity. I came home one evening to find that my Drop­box account didn’t sync. Turned out to be due to a loss of com­mu­ni­ca­tion between my iMac and my RAID sys­tem. A flick of a switch and I was back in busi­ness, and I chalked it up to some ran­dom power fluctuation.

That was my first mis­take. The sys­tem is behind a UPS and noth­ing else seemed out of place. Even if the power had gone out, the UPS would keep things chug­ging along.

Troubleshooting the Problem

Later, I expe­ri­enced the com­mu­ni­ca­tion prob­lem between my RAID array and my iMac a few more times. Within two days, my RAID array sim­ply would not oper­ate. I thought of the most likely prob­lems and the worst case problems.

Of all the prob­lems to have, the nicest one would be due to a bad cable. Maybe the eSATA port I had installed in my iMac by OWC had gone bad. I tried switch­ing to using a FireWire cable and the sys­tem worked fine — for about two min­utes. Then it failed again. So much for the easy problem.

The next step was to contact the RAID enclosure vendor, Oyen Digital. I didn’t want to tinker around with this thing. Instead, I’d rather just buy another enclosure and move my disks into it. Every time it initially started working, I could see everything was in its place. There were no warnings of drive failures, so I believe my data was safe. The problem was with the enclosure itself.

The sup­port tech sug­gested that it could be due to a power sup­ply fail­ure. That was only $7.95, so I bought it and had it shipped overnight. $30 for overnight, but well worth it if this resolved my problem.

It didn’t.

You’d think my next step would be to buy another enclo­sure, but it’s not that sim­ple. Most RAID enclo­sures are pro­pri­etary to an extent because of the encod­ing chip inside. The com­pany that man­u­fac­tured my RAID enclo­sure has since upgraded its prod­uct and uses a dif­fer­ent chip. I can­not buy another enclo­sure, from this ven­dor or any other, that will work with the disks in my array. I have to send it back for repair, which means that I’ll be out of ser­vice for at least a week con­sid­er­ing ship­ping times, per­haps longer.

The Sin­gle Point of Fail­ure Strikes Again

I’ve spent most of my life in the Infor­ma­tion Tech­nol­ogy busi­ness. That means I know how to pro­tect data. Of course, it also means that I suf­fer from Voca­tional Irony — a pro­fes­sional who is unable to help him­self, much like how the cobbler’s chil­dren have no shoes.

  • I know that power out­ages can cause dam­age, so I have a UPS
  • I know that data gets cor­rupted, so I keep backups
  • I know that disks fail, so I have a RAID system

The problem is that there is only so much risk you can mitigate before you spend all of your money trying to prevent the inevitable. I know that a RAID system is not invulnerable, but I believed that one of the disks inside the array would fail before the enclosure itself failed. That’s where I was wrong. The enclosure, of course, is the single point of failure in my system. It’s the Achilles’ heel. If it fails, all else fails.
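To put rough numbers on that, here is a minimal sketch of why the enclosure dominates the risk. Everything in it is an assumption for illustration: the function names are made up, failures are treated as independent, and the failure probabilities are placeholders rather than measurements of any real hardware. The disks get credit for redundancy because the array can survive some of them failing. The enclosure gets none, so its failure probability carries straight through to the whole system.

```python
from math import comb


def raid_set_survival(p_disk: float, n_disks: int, tolerated: int) -> float:
    """Probability that at most `tolerated` of `n_disks` independent disks fail.

    p_disk is a hypothetical per-disk failure probability over some period.
    """
    return sum(
        comb(n_disks, k) * p_disk**k * (1 - p_disk) ** (n_disks - k)
        for k in range(tolerated + 1)
    )


def system_survival(p_enclosure: float, p_disk: float,
                    n_disks: int, tolerated: int) -> float:
    """The enclosure sits in series with the disk set: if it dies, everything is offline."""
    return (1 - p_enclosure) * raid_set_survival(p_disk, n_disks, tolerated)


if __name__ == "__main__":
    # Placeholder numbers: 4 disks, one failure tolerated, 5% chance each part fails.
    disks_only = raid_set_survival(0.05, 4, 1)
    whole_box = system_survival(0.05, 0.05, 4, 1)
    print(f"disk set survives:     {disks_only:.4f}")
    print(f"whole system survives: {whole_box:.4f}")
```

With any numbers you plug in, the whole system can never be more reliable than the one part that has no backup of its own.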

Yes, I have backups of my photos. Most of them. I’m not so sure about some of the most recent images, but they’re still on my CF cards. That means I would have lost only my edits and metadata, and the images themselves are safely backed up somewhere.

Back­ups are never com­plete, though. My RAID sys­tem also holds a plethora of other data. Music, eBooks and movies are among the most preva­lent. Much of this I can get from iTunes, but not all of it. Then there are pro­grams that I’ve bought, spe­cific videos and other train­ing from indi­vid­ual sources. Again, I can down­load some of that, but not all of it.
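One way to find out how incomplete a backup is would be to compare the two folder trees. The sketch below is only an illustration: the function name and the example paths are hypothetical, and it assumes the backup is a plain file-for-file copy rather than an archive format. It lists every file on the source that is missing from the backup or has a different size there. Matching sizes do not prove the contents match, so treat it as a quick check, not a verification.

```python
import os
from pathlib import Path
from typing import List


def missing_from_backup(source: Path, backup: Path) -> List[Path]:
    """Return relative paths of files under `source` that have no
    same-sized copy at the same relative path under `backup`."""
    missing = []
    for root, _dirs, files in os.walk(source):
        for name in files:
            src_file = Path(root) / name
            rel = src_file.relative_to(source)
            dst_file = backup / rel
            # Flag files that are absent from the backup or differ in size.
            if (not dst_file.is_file()
                    or dst_file.stat().st_size != src_file.stat().st_size):
                missing.append(rel)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical mount points; substitute your own volumes.
    for rel in missing_from_backup(Path("/Volumes/RAID"), Path("/Volumes/Backup")):
        print(rel)
```

Running something like this before the main storage fails is the point: it turns "most of them" into an actual list.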

I didn’t spend the money on a duplicate system for my backup. It was always something I was going to do, but I hadn’t done it yet. The fact is that this stuff costs money and I spent the last year laid off from my previous employer and working to make ends meet, so the backup system I wanted was a luxury. At least it seemed so at the time. It seems like a necessity right now.

It’s Not Enough to Have a Backup

Hav­ing a backup, whether par­tial or com­plete, isn’t enough. You can’t run your sys­tem (in most cases) from a backup. You need to be able to restore that backup some­where. Until I get my RAID enclo­sure repaired or replaced, I have nowhere to restore my backups.

The res­o­lu­tion to this prob­lem is to spend money.

First, I need to get a larger backup system, one big enough to hold all of my data files. The current system of a couple of terabyte drives is both inadequate and inelegant. Since my RAID array offers 6 terabytes of usable storage, I’ll get another 6 terabyte system for backup even though I’m not using all of that capacity. In fact, I may get a larger backup system because I foresee my data storage needs growing as I buy more digital entertainment and because I’m now shooting with a D800 that creates 36-megapixel files.

Sec­ond, I need to return my RAID enclo­sure to Oyen Dig­i­tal for repair. Of course, the war­ranty was good for two years and that ended last Novem­ber. Noth­ing of mine ever seems to break down while it’s under war­ranty. I’ve no idea how much that will cost. Con­sid­er­ing that the enclo­sure itself sells for under $300, I’m hop­ing the repair will be under that price.

Finally, I need to buy a new storage system. This one is really going to cost me money. Repairing my old enclosure just buys me time. It’s already proven that RAID systems can and do fail. I’m not sure if Oyen Digital will be able to repair it in another couple of years, as technology moves forward and the parts necessary to repair it may become unavailable. Sustaining a technological product beyond its end of life is unwise.

Ide­ally, I’d like to have a SAN. If money were no object, that’s what I’d buy for my home as I’m also in the process of pur­chas­ing this tech­nol­ogy in my day job. Real­is­ti­cally, I just can’t afford it. I know I’m going to buy some­thing else that will also have a sin­gle point of fail­ure and ulti­mately die.

Deal­ing with Setbacks

If there’s a moral to this story, it’s that prob­lems arise despite your best prepa­ra­tions and inten­tions. All you can do is accept the prob­lems and deal with them. Although I’m both­ered by the enclo­sure fail­ure, I also real­ize that my data isn’t gone. Repair­ing the enclo­sure is the next step, but not the last pos­si­bil­ity. If the folks at Oyen Dig­i­tal find there is some rea­son they can­not repair my enclo­sure (which I think is unlikely at this stage), then there are data recov­ery folks who could get my infor­ma­tion and move it to another drive. Now that would be costly, but it’s yet another option.

As I’ve shared this story on social media over the past couple of days, the inevitable happened. People crawled out of the woodwork to tell me all the things I did wrong, why I should never buy a proprietary RAID system, and other stuff like that.

I ignore those people.

First, telling some­one what they did wrong isn’t really help­ful in a sit­u­a­tion like this one. It’s more likely those folks just like to use another person’s mis­ery to make them­selves feel supe­rior. Besides, I already know what I did wrong and I accepted the risks asso­ci­ated with it.

Second, just try to avoid getting a proprietary RAID system these days. People warn about Drobo as a proprietary system, but so are the other enclosures on the market. By that, I mean that the drives cannot be yanked out of an enclosure from one vendor and put into the enclosure of another vendor as if nothing else happened. It all boils down to the chip used to encode them and those chips vary. There are some folks who will build their RAID out of Linux systems, but that’s just a geeked-out conundrum of its own. Most people don’t want to build such a system. It takes up quite a bit more space, runs much slower and uses more power. It isn’t better. It’s just a different kind of problem in itself.

Finally, there’s risk in every­thing. We all have dif­fer­ent ways of deal­ing with it. I’m actu­ally still happy that I used this RAID enclo­sure because it still saved my data from com­plete loss. Get­ting access to it may be cum­ber­some right now, but my data isn’t blown away. That lit­tle thought tells me that I didn’t really do every­thing wrong. In a week or so, I’ll be chug­ging along just fine after spend­ing money on some new disk system.

There is still light at the end of the tunnel.

Launch of Space Mountain at Walt Disney World's Magic Kingdom

About William

Author, Photographer and IT Manager. I have a fondness for chocolate. I also own Suburbia Press and Aperture vs Lightroom.

  • http://www.facebook.com/al.kawasaki.9 Al Kawasaki

    Hope you get up to speed soon. I’ve hummed that tune “someday I’m going to…” Money is what prevents me from having the “right” data storage solution. Had a 600GB RAID 5 a few years ago. Switched to external drives, backing up to external drives. Would like to have a 6TB storage system and a matching or larger backup system.

    • http://www.orlandolocal.com William Beem

      All I can do is urge you to just do it. Money never seems well-​spent until it’s saved your bacon. I’m run­ning again now and the backup drive is hum­ming to get caught up. I’ll relax when the data is on both systems.