Tuesday, December 20, 2011

Mac drive diagnostics: TechTool Pro and Drive Genius find problems OS X missed

I've been having suspicious application crashes lately. In the Mac world, that suggests a hardware problem. (Once upon a time software was the cause, but these days it's hardware.)

I knew from some backup issues that I had 3 unreadable files. That suggests my 24+ month old 1TB iMac drive is dying youngish. It was time for some diagnostics, so I plugged in my old Apple keyboard and mouse. (Most non-trivial diagnostic work requires a wired keyboard and mouse; Apple's Bluetooth keyboard/mouse drivers may be unavailable when needed.)

After I deleted the bad (happily non-critical) files I ran Disk Utility - but the drive passed. Then I ran Apple Hardware Test, both the extended test and loop mode. Still no problems.

I didn't believe it. Something had to be wrong. So I checked out DiskWarrior, Drive Genius and TechTool Pro - 3 reputable diagnostic apps. They're all $100. DiskWarrior has a good reputation, but Drive Genius has a trial version, so I ran that first. It found about 58 bad blocks -- out of 1.8 billion. That seems a modest number, but Drive Genius said I needed to replace the drive. (Incidentally, Drive Genius has a built-in uninstall feature -- very nice. Yes, the Mac needs an OS level uninstaller.)

I decided to get a second opinion. Andy M clued me in to a MacUpdate bundle, so I got TechTool Pro 6 for $50 (plus a bunch of other apps I don't care about). It one-upped Drive Genius; as it found bad blocks it told me which of them held files (none in this case).

TTP also found bad blocks - 56 (so two fewer than Drive Genius, but I don't make much of that either way).
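For a sense of scale, here is a quick back-of-the-envelope calculation of what those two counts amount to as a share of the drive. It only uses the figures above; the 1.8 billion total is the rounded number the tools reported, and the function name is just for illustration.

```python
# Bad-block counts from the two scans, against the rounded total
# block count reported for the drive ("1.8 billion").
TOTAL_BLOCKS = 1_800_000_000


def bad_block_fraction(bad_blocks: int, total_blocks: int = TOTAL_BLOCKS) -> float:
    """Return bad blocks as a fraction of all blocks on the drive."""
    return bad_blocks / total_blocks


for tool, count in (("Drive Genius", 58), ("TechTool Pro", 56)):
    frac = bad_block_fraction(count)
    # Print both the raw fraction and the same value as a percentage.
    print(f"{tool}: {count} bad blocks = {frac:.2e} of the drive ({frac * 100:.7f}%)")
```

Either way the share is only a few hundred-millionths of the drive, so the count alone says little. What matters is that any bad blocks are visible at all.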

I wasn't sure what to make of this. After all, 56 out of 1.8 billion is minuscule. Unfortunately, a modern SATA drive shouldn't have any bad blocks. The excellent TTP manual explains why ...
... TechTool Pro should not normally report bad blocks for these types of drives. The drive controller in them automatically tries to map out bad blocks as they are encountered. It will do this unless either the bad block is in a critical area that cannot be mapped out at the moment or the bad block table is full. If this occurs, TechTool Pro will report a bad block and you will ultimately need to do a low level reinitialization of the drive. When the drive is reinitialized, the entire platter is accessible so that bad blocks can be mapped out if possible no matter where they occur...

.. You can use Apple's Disk Utility to reinitialize your drive. Be sure to choose the Security Option to "zero out data." Choosing this option will map out bad blocks, if possible, during the reinitialization. This may take several hours (depending on the size of your drive). If the reinitialization is successful, the drive should be fine at that point. We suggest, however, that you do a Surface Scan a few times in the next month or two just to be sure no new bad blocks are developing. If they are, then the drive is probably failing and you should consider replacing it. If a low level reinitialization fails, this indicates the drive is faulty and needs to be replaced...
So the bad blocks I see now are probably a small fraction of the number that have already been mapped out. I'm seeing the overflow, including blocks that went bad after they'd been written to.
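The "overflow" idea can be sketched as a toy model. This is not how any real drive firmware works in detail; `ToyDrive` and its `spare_blocks` capacity are made up for illustration, and the model covers only the "bad block table is full" case from the manual, not the "critical area" case.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDrive:
    """Toy model of a drive that silently remaps bad blocks until it runs out of spares."""

    spare_blocks: int  # hypothetical capacity of the remap table
    remapped: set = field(default_factory=set)
    visible_bad: set = field(default_factory=set)

    def block_fails(self, block: int) -> None:
        """Record a failing block: remap it if a spare is left, otherwise expose it."""
        if block in self.remapped or block in self.visible_bad:
            return  # already accounted for
        if len(self.remapped) < self.spare_blocks:
            self.remapped.add(block)  # hidden from scanning tools
        else:
            self.visible_bad.add(block)  # this is what a surface scan reports


drive = ToyDrive(spare_blocks=3)
for failing_block in (10, 20, 30, 40, 50):
    drive.block_fails(failing_block)

print("silently remapped:", sorted(drive.remapped))
print("reported by a scan:", sorted(drive.visible_bad))
```

In this model a scan only ever sees the blocks that failed after the spares ran out, which is why a small visible count can sit on top of a larger hidden one.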

My i5 iMac is 24 months and 2 weeks old - so it's past even my AMEX extended warranty (by two weeks!). If the drive were user serviceable (like my old G5 iMac!) I'd simply replace it. Since it's not user serviceable I'll probably bring a new drive and the machine to FirstTech in Minneapolis for a $200, 24 hour turnaround replacement. I'll make a bootable clone before I do that. (My usual Carbon Copy Cloner backups are to an encrypted image for offsite transfer, so not bootable.)

If that doesn't work out I'll try the reinitialization. (See Update - this drive is on death row.)

Update 12/21/2011: Various notes and reflections the day after ...
  • 24 months is a short lifespan for a hard drive. I bought this machine early in its lifecycle, so I wonder if there will be more failures in this product line.
  • Modern drives don't write to bad blocks. Based on the dates of the affected files, the blocks under them went bad in the past month, after the data was written. That fits with Carbon Copy Cloner not complaining until recently. (See my backup issue post for a twist to this story.)
  • I'm glad I bought TechTool Pro - I think I'll get good use out of it. From what I know now though, I didn't really need it. In a modern drive a single bad block in a file, especially a relatively recently written file, means replacement. Carbon Copy Cloner told me 3 files were bad.
  • The TTP manual suggests reformatting. I suspect that might work if there was an initial formatting problem, but in this case I know existing blocks are going bad. This drive is on death row.
  • Carbon Copy Cloner complained about bad sectors in files during backup, but Time Machine didn't. That may be because Time Machine only reads files that have changed?
  • Most of the bad sectors are in unused parts of the drive. I suspect they were randomly distributed, but in the parts of the drive that have been used (about 500GB of 1TB) the drive firmware hid them as they were discovered.
  • Disk Utility and Apple Hardware Test didn't find any problems, but both Drive Genius 3 and TechTool Pro failed the drive and Carbon Copy Cloner complained too. The SMART diagnostics still pass the drive - even today! I'm a bit surprised; this isn't rocket science. Apple could do better. TechTool Pro reports more SMART diagnostics than Disk Utility -- my drive was complaining about heat (it's cold in this room!).
  • It's clearly worth running TechTool Pro or an equivalent drive scan on a new drive, and then every few months. (11/23/11: TTP just crashed during a routine drive scan. I'm not impressed.)
  • I think Windows scandisk/chkdsk are superior diagnostic tools to Disk Utility.
  • The TechTool Pro DVD includes an image for burning a PPC DVD. Nice touch. I still have an old PPC.
  • It's good to know the drive is dying before it dies. I have time to do extra backups and to move selected files to other machines -- including my Aperture and iPhoto Libraries and perhaps iTunes.
  • When you consider that iMac 27" hard drives are NOT user serviceable, the iMac is more expensive than it seems. The iMac G5 was entirely user serviceable. Design has its price.
  • Since I know the drive is dying I've disconnected my clone backup. It's my known good repository. I'll take it to my office then create a new clone, then disconnect that clone. I won't be saving data to this machine, I'll be treating it as a "guest" machine until I get it serviced. I may turn off Time Machine too.
  • OWC (Other World Computing - great Mac shop) showed me how to find my Model Identifier (in System Profiler); it's iMac11,1. I can only go to 2TB of storage. I'm not confident that their options are correct however.
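
A backup tool that copies every file has to read every block of every file, which is presumably how Carbon Copy Cloner tripped over the bad sectors. The same idea can be sketched in a few lines of Python. This is only an illustration, not what Carbon Copy Cloner actually does; `find_unreadable` and `walk_files` are hypothetical helper names, and reads served from the OS cache may not touch the disk at all.

```python
import os
from typing import Iterable, Iterator, List, Tuple

CHUNK = 1024 * 1024  # read 1 MiB at a time


def walk_files(root: str) -> Iterator[str]:
    """Yield the path of every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def find_unreadable(paths: Iterable[str]) -> List[Tuple[str, str]]:
    """Read each file end to end; return (path, error message) for each failure."""
    failures = []
    for path in paths:
        try:
            with open(path, "rb") as fh:
                while fh.read(CHUNK):
                    pass  # discard the data; we only care whether the read succeeds
        except OSError as exc:
            # A bad sector under a file typically surfaces as an I/O error here.
            failures.append((path, str(exc)))
    return failures
```

Calling `find_unreadable(walk_files(some_folder))` on a folder of interest returns an empty list when every file reads cleanly. It says nothing about bad blocks in free space, which is where a surface scan still earns its keep.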

2 comments:

Anonymous said...

There is a many decades old saying that 100% of electronic devices fail in their first day of use and in their last day of use. The point is that electronic devices do fail and are least likely to after the initial burn-in period. But electric surges, high operating temperatures from a failing fan, and other external factors can cause a device to fail.

Anakowi said...

Thanks. I appreciate that you detailed your actual problem, the approach you took and the result.

I've read the same marketing-spin tech reviews several times at a dozen different sites - each extoll the virtues of Disk Warrior, Drive Genius and TTP to the height of confusion. This is the first account that actually gives me a good idea of what to look for in my situation and how to tackle the problem. Thanks. Gonna buy TTP for diagnostics.