Grease and hard drive change

Originally published 2007 in Atomic: Maximum Power Computing
Last modified 03-Dec-2011.

 

You know how mechanics put a little sticker in the corner of your windscreen to remind you when your car will need another service? Hard drives should come with something similar.

Because, one way or another, all hard drives are going to die.

Personally, I start feeling nervous about my drives when they hit their second birthday. Since they've then spent almost all of those two years cheek-by-jowl with other drives in the disk-farm PCs I favour, this is not entirely irrational. But, thanks to a couple of recent studies, I now know that it's less rational than I thought it was.

Google's study (PDF) of more than a hundred thousand drives over five years is useful as much for what it says about how hard it is to figure this stuff out, as for what it actually found.

It turns out that working drives hard, or running them warmer than recommended, doesn't seem to have much of an impact on their life. And the popular idea that failures follow a "bathtub curve", in which any drive that doesn't die in the first three months is likely to live for five years, also seems to be invalid. Drives actually just slowly wear out over their lives, like other mechanical devices.

The Google study found, as you'd expect, that S.M.A.R.T. errors are strong predictors of imminent drive failure. As an early-warning system, though, S.M.A.R.T. turned out to be close to useless. Any well-used hard drive will have at least a couple of S.M.A.R.T. warning flags (just based on the accumulated on-time hour count), yet 36% of Google's dead drives hadn't shown any S.M.A.R.T. warnings at all.

I found this confusing when I first wrote this piece, since there's absolutely no technical reason why S.M.A.R.T. shouldn't be able to warn of many, if not most, imminent drive failures. Then I read this Usenet post, which alleges that the hard drive manufacturers' marketing departments just overruled the engineers and made them, in essence, turn off S.M.A.R.T.'s early warning features, without telling anyone.

I don't know whether this is actually true, but it certainly fits the evidence. Thanks a lot, marketing guys! Yet again, you've made the world just that little bit more awful!

(I wrote a bit more about this in this blog post. This other post, about failure rates for individual drives and RAID arrays, is also marginally relevant.)

Aaaanyway, a similarly huge Carnegie Mellon University hard drive study reached much the same conclusions as the Google one.

And both studies also found that we're not all hallucinating - the very long lifespans suggested by hard disk manufacturers' "million hour" mean time between failures (MTBF) figures do not, in fact, mean that any real drive is likely to last for anything like a million hours of operation (that's about 114 years!).

MTBF numbers aren't meant to be taken as an actual lifespan estimate, as I explain in this old piece (it's also covered in the Wikipedia Mean Time Between Failures article), but that's not what we're talking about here. Even using standard MTBF analysis, drive failure rates are much higher than the manufacturers glibly allege.

Annual replacement rates (ARRs) for hard drives are usually specified by the manufacturers as being well below 1%. But the real ARRs can actually be above 10%.
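For a rough feel for how a "million hour" MTBF relates to an annual rate, here's a minimal Python sketch. It assumes a drive that runs 24 hours a day and a constant failure rate (the standard exponential model). Neither assumption comes from the studies, and the function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; ignores leap years


def mtbf_in_years(mtbf_hours: float) -> float:
    """Convert an MTBF figure in hours to years of round-the-clock running."""
    return mtbf_hours / HOURS_PER_YEAR


def implied_annual_failure_rate(mtbf_hours: float) -> float:
    """Chance of failure within one year of 24/7 use.

    Assumption: failures follow an exponential model, i.e. the failure
    rate is constant over the drive's life.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


mtbf = 1_000_000  # the manufacturers' "million hour" figure
print(f"{mtbf_in_years(mtbf):.0f} years")
print(f"{implied_annual_failure_rate(mtbf):.2%} failures per year")
```

On those assumptions, a million-hour MTBF works out to roughly 114 years of running, and to an annual failure rate of a bit under 0.9%, which is the sort of sub-1% figure the datasheets quote.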

If you've got a 0.5% ARR drive (pretending, for simplicity, that the rate doesn't change over time; actually, the probability of failure rises as the drive gets older) then it's very likely (97.5%) to still be alive after five years.

If you've got a 5% drive, though, then there's a 23% chance it'll die in its first five years.

And if you're unlucky enough to have bought a 13% drive, then it's more likely to be dead after five years than it is to be alive.
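The five-year figures above come from compounding a constant annual rate: the chance of surviving n years is (1 - rate) to the power n. Here's a short Python sketch of that arithmetic, using the same simplifying assumption that the rate doesn't change over time. The function names are mine, not anything from the studies.

```python
def survival_probability(annual_failure_rate: float, years: int) -> float:
    """Chance a drive is still alive after `years`, at a constant annual rate."""
    return (1.0 - annual_failure_rate) ** years


def break_even_rate(years: int) -> float:
    """Annual rate at which survival over `years` is exactly 50%."""
    return 1.0 - 0.5 ** (1.0 / years)


for rate in (0.005, 0.05):
    alive = survival_probability(rate, 5)
    print(f"{rate:.1%} per year: {alive:.1%} alive, {1 - alive:.1%} dead after 5 years")

print(f"Coin-toss rate over 5 years: {break_even_rate(5):.1%} per year")
```

The break-even calculation is the useful bit: under this model, a drive only becomes more likely dead than alive at five years once its annual rate is a shade under 13%.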

You can only figure these numbers out for a given model of drive at the end of the period, of course. Different drive models vary in reliability, and despite the devout beliefs of various geeks about different manufacturers' products, there's actually no detectable relationship between brand and reliability.

(Which, yes, does mean that there's a strong case for buying drives based on price, even if you've always sworn by Brand X on account of those two Brand Ys that died on you in quick succession.)

On balance, though, I now think it's perfectly reasonable to give your drives three years before you replace them. Four years is OK for penny-pinchers, and five years or more should be fine for Aunt Nora's PC that's only turned on for two hours a week.

When you reach the end of that period, it's time to put aside an evening for plugging fresh drive(s) into the computer and imaging the old drive(s) across, using some flavour of Norton Ghost or the freeware DriveImage XML or something. XXCLONE is also free for personal use, and looks pretty neat.

This procedure gets slower with each passing year, as drives increase in capacity much faster than drive interfaces increase in bandwidth. But it's still easy enough to do.

If you've got more than one hard drive, by the way, and your update leaves you with your drives in a different order - say you upgraded your old Parallel ATA C drive to a new SATA one, or just muddled the connectors up when you swapped new for old - then you're likely to get a terrifying boot error when you reset the computer. Windows, for instance, is likely to bleat about not being able to find HAL.DLL.

Don't panic. That error just means your new boot disk is not first in the boot order any more. Fix that in your motherboard's BIOS setup program (usually accessed by pressing Delete after the POST beep) and you'll be away.

Oh, and if you've got a multiple-hard-drive computer, it also pays to label the drives on whatever side faces out when the case is open, so you know which one is which without playing deductive games.

I use a silver pen.

Stop laughing.
