Wednesday, May 25, 2005

Obscure OS X Tiger upgrade bug: runaway process related to automated file defragmentation?

MacInTouch Home Page

This may occur specifically with upgrades from older systems that may date back to 10.1. Some OS crashes may also trigger it. Sounds complex:
Several people followed up on a strange problem with "runaway" processes in Mac OS X:

[MacInTouch Reader] I had the same problem with the unkillable "update" process regularly hijacking the entire processor on my iMac G3 400MHz after upgrading to Tiger over a Panther installation. It would occur every 12 hours or so and could only be stopped with a forced reboot. This discussion at Apple put me on the right track to solving the problem. Specifically, checking the system log in Console revealed repeated errors such as:

"hotfiles_evict: err 28 relocating file 27611"

suggesting that the system was getting "stuck" while relocating a certain file. The discussion also suggested using hfsdebug to identify the file and delete it manually.

Unfortunately on my iMac, hfsdebug was unable to identify the file while the "update" process was running, so I had to record the file number from the error message, then run hfsdebug in the Terminal after force-rebooting the iMac. I was then able to locate and manually trash the offending file in the Finder.
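Recording the file numbers by hand can be automated. The following is a minimal sketch, assuming the log message always has the form quoted above; the function name `find_stuck_files` and the sample lines are illustrative and not part of the original report.

```python
import re

# Pattern taken from the message quoted above; that other systems log
# exactly this format is an assumption.
HOTFILES_RE = re.compile(r"hotfiles_evict: err (\d+) relocating file (\d+)")

def find_stuck_files(lines):
    """Return unique (error number, file number) pairs in order of first appearance."""
    seen = []
    for line in lines:
        match = HOTFILES_RE.search(line)
        if match:
            pair = (int(match.group(1)), int(match.group(2)))
            if pair not in seen:
                seen.append(pair)
    return seen

# Hypothetical log excerpt built from the two messages quoted in this post.
sample_log = [
    "kernel: hotfiles_evict: err 28 relocating file 27611",
    "kernel: hotfiles_evict: err 28 relocating file 27611",
    "kernel: hotfiles_evict: err 28 relocating file 20",
]

for err, file_id in find_stuck_files(sample_log):
    print(f"err {err} while relocating file {file_id}")
```

To scan a real log, pass an open file object instead of `sample_log`. The file numbers it reports are the ones to look up with hfsdebug after rebooting.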

This went on for a few days, with the "update" process reappearing every 12 hours or so, getting stuck on a new file. I noticed that the stuck file identification numbers were increasing (about a dozen files between 14931 and 112976), and that all of the files were quite old, created around 2001 or 2002. (I had not performed a clean install since the Public Beta days.)

Well, after a few days of catching the "update" process in the act and manually deleting the problem files, I can happily report that the problem has disappeared and my ageing iMac is good as new!

["Xratchy"] The "update" problem has nothing to do with "System Update". The update program will have added lines to your system.log. Check them out for

hotfiles_evict: err 28 relocating file 20

The meaning for the error number can be found in

/System/Library/Frameworks/Kernel.framework/Versions/A/Headers/sys/errno.h
#define ENOSPC 28
/* No space left on device */
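The same lookup can be done without opening the header. This is a small sketch using Python's standard `errno` module. It reports the mapping for the platform it runs on; error number 28 is ENOSPC on Mac OS X and Linux, but the wording of the description varies by system.

```python
import errno
import os

code = 28  # error number from the hotfiles_evict log line

# errno.errorcode maps numeric values to their symbolic names.
name = errno.errorcode.get(code, "unknown")

# os.strerror returns the platform's own description of the error.
print(f"{code} = {name}: {os.strerror(code)}")
```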

It's a bug in the kernel's disk-optimizing (hot-file defragmentation) code, where the optimizing process fails to find a free contiguous block of disk space.

It's triggered when no contiguous free block of the size the process needs is available: the process then simply retries the search, which brings your system to a halt.
The optimizing process is also triggered by opening a file, not only by reading or writing.
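To see why "no space left" can be reported on a disk that is far from full, here is a toy model, not the kernel's code. It assumes the volume is a simple list of used/free allocation blocks, and the names `longest_free_run` and `can_relocate` are made up for this illustration.

```python
def longest_free_run(bitmap):
    """Length of the longest run of free blocks (False = free, True = used)."""
    best = run = 0
    for used in bitmap:
        run = 0 if used else run + 1
        best = max(best, run)
    return best

def can_relocate(bitmap, blocks_needed):
    """A file can be moved only if one contiguous free run is large enough."""
    return longest_free_run(bitmap) >= blocks_needed

# Toy volume: every other block is in use, so half the volume is free
# but no two free blocks are adjacent.
fragmented = [i % 2 == 0 for i in range(20)]
free_total = fragmented.count(False)

print("free blocks:", free_total)
print("room for a 4-block file:", can_relocate(fragmented, 4))
```

In this model ten blocks are free, yet a four-block file cannot be placed. A search that just retries in that situation will never succeed.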

Try to make sure to have 10% disk space free at all times (if it's error 28). (Virtual memory can eat a lot of disk space.) Check your disk for errors after the crash (with your Tiger disk - Panther doesn't find/fix them all).
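The 10% rule of thumb is easy to check from a script. This is a minimal sketch using `shutil.disk_usage` from Python's standard library; the helper names and the choice of "/" as the volume to check are assumptions for illustration.

```python
import shutil

def meets_threshold(free, total, minimum=0.10):
    """True if free/total is at least the given fraction (10% by default)."""
    if total <= 0:
        return False
    return free / total >= minimum

def has_headroom(path="/", minimum=0.10):
    """Check the volume containing `path` against the free-space threshold."""
    usage = shutil.disk_usage(path)
    return meets_threshold(usage.free, usage.total, minimum)

usage = shutil.disk_usage("/")
print(f"free: {usage.free} of {usage.total} bytes")
print("at least 10% free:", has_headroom("/"))
```

Note that this only measures total free space. As the error above shows, a volume can pass this check and still lack a large enough contiguous block.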
