Atomic I/O letters column #124

Originally published 2011, in Atomic: Maximum Power Computing
Reprinted here December 11, 2011

Last modified 16-Jan-2015.

Brain transplants

Is swapping the controller board on a dead hard drive to try to recover the data as bad an idea as the data recovery companies claim?

A friend of my wife's has an Asus netbook, on which she stores everything, and for which she (of course) has no backup whatever - as you say, any data you have which isn't backed up is data you don't want to keep. And of course, the 2.5" hard disk in it gave up the ghost, so she came to me for help.

Powering up the netbook resulted in a delay and "no hard disk detected"-type messages. Pulling it out I found it was a SATA disk, so I stuck it into a drive dock attached to my main machine (which runs Linux). The drive is identified but times out or responds with garbage to commands sent to it.

The drive is spinning and sounds fine (not varying in speed/etc), so I figured it was the electronics that had gone bad. The friend isn't willing to spend what the professional data recovery companies want to recover it (seems $600 or so is the going rate if it doesn't need the clean-room treatment) but would like to try to get digital photos off of it.

I searched eBay for an identical drive (Seagate ST9160310AS, 160GB from their Momentus 5400.5 series) with the same model, PN (part number?), and firmware version and found one for not much, so it's on its way to me. I've not replaced the board on a hard drive before, though I used to build computers for a living and have disassembled/destroyed enough hard disks for paranoid customers looking for proof their data is wiped that it seems simple enough.

In searching for prices on data recovery, I found a number of firms with advice against swapping controller boards on these drives - though clearly they have a vested interest in you not saving your data yourself. I was wondering, is there anything to these warnings? One reason these warnings sound like marketing to me is that frequently these companies will have exactly the same text on pages for essentially every hard drive model number in existence, like in a link farm.

Examples:
"Please note, that in many cases, a simple PCB swap will not fix the hard drive. Most circuit boards are unique to their hard drives! There are extra steps that have to be taken to insure that a foreign PCB is able to adapt with another hard drive. Some of the steps are impossible without experience and professional equipment."
"The controller board of most drives stores unique adaptive data that can only be associated with the drive that it was originally a part of. In other words, it's very seldom that you can just swap out a controller board from one drive to another and get the dead drive functioning again."
"It is essential that you do not attempt to replace a faulty controller board with one from a working drive. Information on the board may be specific to the drive it has been attached to and may not function anywhere else."

Are these warnings just attempts to get people to use their services rather than try replacing the logic board themselves? And do you have any other advice on me attempting this other than taking ESD precautions?

Charles

Answer:
This question keeps coming around, and it's difficult to answer, mainly because hard-drive technology keeps getting more and more refined. To drag out the overused automotive analogy, modern hard drives are like modern cars, in that they're much more reliable than they used to be - vastly more reliable, if you're talking failures-per-megabyte - but more and more opaque when something goes wrong.

That said, it is definitely still possible to swap the controller board on a modern drive for an identical board - which very probably does mean the same revision and same firmware - and get a drive from which you can recover most, if not all, of the data. Note that this is only about recovering data, not resurrecting the drive; a modern drive with a new controller board on it probably won't be very reliable. If we're being charitable, that could be what the recovery companies are talking about when they say you can't "fix" a drive just by swapping in a new board. You should only expect the drive to be fixed enough for data recovery, at best.

Whether or not the board-swap makes the dead drive readable, after you recover whatever you can from it, you should rip that drive apart, as is traditional, and extract the fridge magnets and wall decorations.

The list of ways a board-swap can go wrong keeps getting longer, even if you don't screw up in some elementary way, like skipping anti-static precautions and annihilating the donor drive just by touching it.

A drive with a new controller board attached to it should, for instance, be expected to be significantly unhappy because of good old-fashioned bad-block mapping, where failing blocks on an otherwise healthy drive are invisibly replaced with standby blocks from a section of the drive set aside for this purpose. If the controller board keeps track of the block mapping, swapping in a new controller board can obviously cause problems. At worst, these problems can prevent you from even reading some data on the drive, let alone successfully writing to it.
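To make the bad-block problem concrete, here's a toy model in Python. It is only a sketch of the idea, not how any real drive firmware works: the `ToyControllerBoard` class, the block numbers and the "photo" contents are all invented for illustration, and it assumes the remap table lives on the board, which the paragraph above only raises as a possibility.

```python
class ToyControllerBoard:
    """Hypothetical board that keeps its own bad-block remap table."""

    def __init__(self):
        self.remap = {}  # failing block number -> spare block number

    def physical_block(self, logical_block):
        # Follow the remap table if this block was ever replaced by a spare.
        return self.remap.get(logical_block, logical_block)


# Platters of the "dead" drive: block 7 went bad, so its data now lives in spare 1000.
platters = {6: "photo-a", 7: "<unreadable>", 1000: "photo-b"}

original_board = ToyControllerBoard()
original_board.remap[7] = 1000      # the original board remembers the remap

donor_board = ToyControllerBoard()  # the donor knows nothing about this drive


def read(board, logical_block):
    """Read a logical block through whichever board is bolted on."""
    return platters[board.physical_block(logical_block)]


print(read(original_board, 7))  # photo-b
print(read(donor_board, 7))     # <unreadable>
print(read(donor_board, 6))     # photo-a
```

Blocks that were never remapped read the same through either board, which is why a swap can still get you most of the data even when it can't get you all of it.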

There are more subtleties in recent drives, though. Like, the controller board may be calibrated at the factory to match the individual oddities of the particular head assembly in that one drive. The fantastically small tolerances and staggeringly high data densities in modern drives require a lot of this sort of subtle trickery.

Again, though, you may at least be able to read some of the drive provided your board is a close enough match. And if it doesn't work, you can at least put the board back on the donor drive and have that work.

(In a follow-up e-mail, Charles informed me that, annoyingly, the "dead" drive decided to work again for long enough for him to recover the photos from it without having to swap the board at all. Then he just put the was-to-be-board-donor drive into the netbook unmolested.)

Or buy a piano

I have a cat, called Boris. Boris walks on my keyboard. All the time. Sometimes I come back to the PC and I swear he's been trying to see if one cat can write Shakespeare faster than a million monkeys.

I usually deal with this by just opening an empty browser tab and leaving that as the active window. Boris hasn't figured out how to do Ctrl-Alt-Del yet, so about the worst he can do to a browser tab is sit on F1 and open hundreds of Google Chrome Help tabs. But there has to be a better way.

I tried a couple of "anti toddler" utilities that're supposed to lock your keyboard until you press some key combination that toddler hands can't reach, but they just... didn't work, at least not on 64-bit Win7. So rather than troubleshoot the software, I grabbed the chance to solve the problem in hardware!

Surely you could easily make a switch that interrupts the keyboard signal, for USB or PS/2. They both only have 4 conductors, and I bet you could just interrupt one data wire with a basic single-pole toggle switch. But I also remember being warned never to plug or unplug PS/2 devices with the computer turned on, because it could blow a motherboard fuse or something.

So could a $2 keyboard cable switch cost me $200 for a new motherboard? Do you need to switch all 4 wires?

Nicholas

Answer:
You could probably harmlessly disable a PS/2 keyboard by switching only one data pin; pin 1 on the mini-DIN connector is data for a PS/2 keyboard, and pins 2 and 3 on a USB plug are data for USB. You can in theory blow a PS/2-port fuse by connecting and disconnecting the power and ground pins, but if you don't switch those you should be fine. USB is more robust, but after re-"connecting" your keyboard you'll have to wait a couple of seconds before you can use it.

Tell you what I'd do in this situation, though: Buy the finest, cheapest KVM (Keyboard, Video, Mouse) switch you can find, and just don't plug anything into one of its sets of sockets. Switch from your keyboard to an empty socket, and the keyboard's locked out.

The PS/2-and-VGA switch in the picture cost me $AU6.05 delivered on eBay. It runs from keyboard-port power, and works fine with my monster vintage IBM keyboard. Like other dirt-cheap KVMs, this one came with no cables, but all you need for this project is one PS/2 cable. That'll only cost you a couple of bucks delivered.

Going back to boring old software, there's also that "PawSense" thing that specifically detects "cat-like typing" (specifically, multiple adjacent keys pressed at once) and plays an alarming noise when it happens. But PawSense is commercial software, and nobody ever seems to have made a free version.
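If you felt like rolling your own, the "multiple adjacent keys pressed at once" idea can be sketched in a few lines of Python. This is emphatically not PawSense's actual algorithm, which I haven't seen. The simplified key grid, the `looks_like_cat` name and the threshold of three keys are all assumptions made up for this example, and you'd still need some way to find out which keys are currently held down.

```python
# Rough QWERTY layout; each key's position is (row, column) on a simple grid.
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]
POSITION = {key: (r, c) for r, row in enumerate(ROWS) for c, key in enumerate(row)}


def adjacent(a, b):
    """True if two different keys are neighbours on the simplified grid."""
    (r1, c1), (r2, c2) = POSITION[a], POSITION[b]
    return a != b and abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def looks_like_cat(keys_down, threshold=3):
    """Guess 'cat' when at least `threshold` held keys each touch another held key."""
    held = [k for k in keys_down if k in POSITION]  # ignore keys not on the grid
    clustered = [k for k in held if any(adjacent(k, other) for other in held)]
    return len(clustered) >= threshold


print(looks_like_cat({"f", "g", "h", "v"}))  # True: a paw-sized clump
print(looks_like_cat({"a", "l"}))            # False: two keys far apart
```

A human holding several keys at once usually holds keys that are spread out, whereas a paw mashes a clump, so even this crude check separates the two cases above.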

We never had these problems with punched cards

I have an external hard drive that I'm re-purposing, so I've taken all the data off of it and reformatted. I know that formatting always takes up some space, and there's some difference between the powers-of-10 the manufacturers use to calculate space versus the powers-of-2 the operating system uses (you've mentioned this yourself on one or two occasions... maybe three or four), but what I saw afterwards has me really confused.

The advertised disk space on the drive's sticker is 320GB. I used the NTFS file system when I formatted it, and now Windows reports that there's 298GB of available space. All good so far, and what I was expecting.

However, it also reports there's 319,966,433,280 bytes of free space (I've attached a screen shot to illustrate). So, what gives? Why is Windows telling me there's almost the entire 320 gigabytes free, but it's only letting me access 298? If it makes a difference, this is on Windows 7 32-bit.

Michael

(Screenshot: drive-capacity number confusion. I dream of the day when this stuff is obscure trivia.)

Answer:
On exactly one thing, the people who insist a megabyte is a million bytes agree with the people who insist a megabyte is 1,048,576 bytes: A byte is eight bits. So byte-counts are the same for both systems.

If we use the kilo-, mega-, giga- prefixes to mean powers of ten, and kibi-, mebi-, gibi- et cetera to mean powers of two, your very-nearly-320-billion bytes adds up to 298 gibibytes, and 320 gigabytes.
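If you'd like to check the arithmetic yourself, here's a minimal Python sketch. The byte count is the one from Michael's letter; the constant and function names are just ones I've picked for the example.

```python
# The same byte count expressed with decimal (SI) and binary (IEC) prefixes.
GIGABYTE = 10**9   # giga: a power of ten
GIBIBYTE = 2**30   # gibi: a power of two


def describe(byte_count):
    """Return the byte count in gigabytes and in gibibytes."""
    return byte_count / GIGABYTE, byte_count / GIBIBYTE


free_bytes = 319_966_433_280  # the figure Windows reported to Michael
gb, gib = describe(free_bytes)
print(f"{free_bytes:,} bytes = {gb:.2f} GB = {gib:.2f} GiB")
```

The two results land just under 320 and just under 298, which are the two numbers Windows was showing: same bytes, different divisor.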

A byte isn't necessarily eight bits, by the way. Early computers used bytes with several odd lengths, and there's also the concept of the "word", which is the amount of data a processor deals with at a time. Just the other day I was reading the first of Charles Sheffield's "Proteus" books. On its first page, people in a future world where all medicine has been superseded by wholesale bodily re-engineering are taken aback by something that requires "four billion words" of computer storage.

A modern PC has, at most, 64-bit words, making four billion words only 29.8 gibibytes. Even if you say that's RAM, not drive space, it's trivially little for a future world with space colonies and people turning themselves into birds.

If we presume the future computers have some monstrous word-size like 2^24 bits, then four billion words becomes a more impressive 7,812,500 gibibytes. But drive capacity has been increasing, and cost-per-megabyte has been falling, on a pretty good exponential curve since about 1980.

If that continues, we'll have drives exceeding 7,812,500 gibibytes before 2030.
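Here's the word-size arithmetic as a short Python sketch, plus a rough look at what "before 2030" would demand. The word sizes come from the text above. The 3 TB starting capacity and the 2011 starting year are my assumptions for illustration, not measured data, so treat the doubling figure as a ballpark.

```python
import math

GIBIBYTE = 2**30


def words_to_gib(words, word_bits):
    """Storage needed for `words` words of `word_bits` bits each, in GiB."""
    return words * word_bits / 8 / GIBIBYTE


FOUR_BILLION = 4 * 10**9
pc_gib = words_to_gib(FOUR_BILLION, 64)      # 64-bit words
big_gib = words_to_gib(FOUR_BILLION, 2**24)  # monstrous 2^24-bit words

# Assumptions, not from the column: a 3 TB (decimal) drive as the 2011 baseline.
ASSUMED_START_BYTES = 3 * 10**12
ASSUMED_START_YEAR = 2011


def doubling_period_needed(target_bytes, target_year):
    """Years per capacity doubling needed to reach target_bytes by target_year."""
    doublings = math.log2(target_bytes / ASSUMED_START_BYTES)
    return (target_year - ASSUMED_START_YEAR) / doublings


period = doubling_period_needed(big_gib * GIBIBYTE, 2030)
print(f"{pc_gib:.1f} GiB, {big_gib:,.0f} GiB, one doubling every {period:.2f} years")
```

Under those assumptions, capacity would need to double a bit faster than once every two years to get there by 2030. Change the starting numbers and the answer moves accordingly.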

(Incidentally, this is one of the few sci-fi imagination failures that didn't snare Star Trek, at least by the time of The Next Generation. TNG and later series referred to computer storage with a unit called the "quad", which remained as fuzzily defined as every other unit, technology, chemical element and subatomic particle on the show.)
