Wednesday, May 06, 2009

Retrospect restore failing, network flaky – a hardware problem?

Maybe it’s the incipient dementia, but I’m having a hard time telling hardware problems from software problems these days.

It didn’t use to be this way. Even ten years ago, if something went wrong, it was almost always a software problem. The only exception was the slowly dying drive, but you could usually hear that going.

Now, who can say? Systems can run hot, solder isn’t what it used to be, and quality is an issue everywhere. Software is very complex now, and software changes can make latent hardware issues into active problems.

It’s also true that hardware is much older than it used to be. Moore’s Law failed a while back; my 6 yo XP box just keeps on being useful. I don’t ask much of it since we’re largely an OS X shop, but it’s good enough for basic work.

It all adds up. Oh, and the dementia too. Being an OS X shop means it’s been a very long time since I’ve had to think about BIOS age, memory maps, interrupts, and the like.

My latest experience is a case in point. It began when I replaced my old USB backup drive and enclosure with a LaCie 1TB drive/enclosure. My old XP box wouldn’t boot! It simply hung in early startup. I found I had to turn the drive off to boot, then turn it on again when XP was up. Then it all worked.

Ominous.

Next I started getting oddball network problems. I beat them back and things seemed to settle down, but then a Retrospect Professional restore of a 50GB iTunes Library failed with a typically cryptic Retrospect error code of "-519". I had to throttle my 100 Mbps network connection back to 10 Mbps to get the restore to work.

That got my attention. I can’t live with unreliable backup/restore.

In some earlier testing I’d eliminated cabling and my Netgear gigabit switch as contributors. So the problem lay in my 3 yo G5 iMac or my 6 yo XP box. Neither had had major software changes recently, so I bet on hardware. Since some network glitches had required power cycling the XP box I put my bet on that.

So I bought the Intel PWLA8391GT PRO/1000 GT PCI Network Adapter. It came from Amazon in about 2 days (free shipping!) in a plain package with a single DVD. Nothing fancy here.

So I swapped out the old 100 Mbps SMC NIC for the Intel, rebooted, and got a … turquoise screen.

Nothing. The drives were spinning, the CPU fan was spinning, but the system locked pre-BIOS! I pulled the card, restarted and things looked good.

So then, in desperation, I moved the NIC to a different slot, rebooted, and looked through all my BIOS settings. I made one change. The BIOS had previously been set to manage devices; now I set it to ignore PnP devices and let the OS handle them.

So I did two things at once – but I wasn’t trying to identify the root cause. I wanted the thing to work.

I then restarted with the LaCie 1TB drive attached and … it worked.

I’m really getting tired of figuring this stuff out.

The Intel adapter requires drivers, so I installed from the CD and … wait for it … found a bug.

Immediately.

It’s a gift.

The installer bombed with a poorly written complaint about my “S:” drive.

Turns out I’d mapped the “My Documents” folder to a (now inaccessible) network share that I’d mapped to the “S:” drive. So of course there was nothing there. Even when I dismounted the “S:” drive, the installer still bombed. I had to reset “My Documents” to the default setting.

So, pretty dumb coding on the installer. On the other hand, once the install completed, I was impressed by the diagnostics suite. The NIC, cabling and network passed every test.

These hardware diagnostic tests are critical in the modern era, so this utility was a definite plus.

I then repeated the restore that had previously failed at the 200MB mark. This time it went easily past 1GB, with a throughput of about 830 MB/min (probably limited by the USB drive).
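For anyone who wants to sanity check that number, here is a rough back-of-envelope sketch. It assumes decimal units (1 MB = 10^6 bytes, 1 GB = 1000 MB) and that the figure Retrospect reports is the actual data rate over the wire; neither is something I've verified, and the function names are just mine.

```python
# Back-of-envelope check of the restore throughput quoted above.
# Assumptions (not verified against Retrospect): decimal units,
# i.e. 1 MB = 10**6 bytes and 1 GB = 1000 MB, and no compression.


def mb_per_min_to_mbit_per_s(mb_per_min: float) -> float:
    """Convert megabytes per minute to megabits per second."""
    return mb_per_min / 60 * 8


def restore_minutes(size_gb: float, mb_per_min: float) -> float:
    """Estimate how many minutes a restore of size_gb takes at a steady rate."""
    return size_gb * 1000 / mb_per_min


if __name__ == "__main__":
    rate = 830.0        # MB/min, the throughput reported for the restore
    library_gb = 50.0   # size of the iTunes Library being restored

    print(f"{rate:.0f} MB/min = {mb_per_min_to_mbit_per_s(rate):.1f} Mbit/s")
    print(f"{library_gb:.0f} GB at that rate: {restore_minutes(library_gb, rate):.1f} min")
```

Under those assumptions the restore is moving roughly 110 Mbit/s, a bit more than a 100 Mbps link can carry even before protocol overhead, so the old NIC could not have sustained it. The whole 50GB library should take about an hour.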

So I think my problem is solved. Was it really a problem with the PCI slot? Or the old NIC? Or the 1TB USB peripheral causing some problem with the 8 yo BIOS? Was the fix the new card, changing the BIOS settings, or moving to a new slot?

I don’t know.

Don’t care.

My network’s much faster now…
