Monday, June 02, 2008

Leopard (10.5) Sparse Bundle .IMG files are packages, not files

This is surprising. I will have to test with Retrospect and see how it treats these things. I assume Retrospect will see these IMG objects as Packages, and thus back up only the changed "band".

mac.column.ted: Leopard still holds some small surprises - MacFixIt

...So why was the bundle image format added in Leopard? Because there was a significant problem with plain sparse images. A sparse image is essentially a single file. When backing up your drive, a backup utility thus sees the image as a single file, regardless of how many files are stored within the image. Further, any addition or subtraction you make to the image (such as adding even a measly 5K text document) registers the image as a modified file. This means that, if an image file were 1GB in size, the entire 1GB would need to be recopied to a backup each time the image was modified, even if the only change to the image was a 5K file addition. Not very efficient. And unnecessarily time consuming.

The sparse bundle format avoids this dilemma. Essentially, the bundle format divides the content of the image file into smaller separable bands. The image still appears as a single file in the Finder. However, it is actually a package. If you select Show Package Contents from the image's contextual menu in the Finder, you will find a bands folder containing the individual band segments (as shown in the figure below). Each band, at least in my testing, was 8MB or less. Assuming your backup software recognizes and works correctly with the bundle format, only the modified bands are copied over when backing up the image. This means that backing up the aforementioned 1GB image, with a 5K file addition, would require copying only 8MB or less!...

...Apple, in Disk Utility's Help pages, recommends using the sparse bundle format whenever you want to create "a blank disk image for storage." Indeed, Apple takes its own advice and uses the new format for FileVault (rather than the sparse image format used by FileVault in Tiger)...

Dividing a disk image into arbitrary bands inside a package is a clever mitigation of a problem that's had many variations over the years.
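To make the band idea concrete, here is a minimal Python sketch of what a band-aware backup would have to do: look inside the package's bands folder and pick out only the band files modified since the last backup. This is my own illustration, not how Retrospect or any other product works. The function names and the use of file modification times as the change test are assumptions; the only thing taken from the article is that the package contains a "bands" folder holding the band segments.

```python
import os


def changed_bands(bundle_path, since):
    """Return (band_name, size_in_bytes) for bands modified after `since`.

    `bundle_path` is the package directory (what Finder shows as one file).
    `since` is a POSIX timestamp, e.g. the time of the previous backup.
    Assumption: a band counts as changed if its mtime is newer than `since`.
    """
    bands_dir = os.path.join(bundle_path, "bands")
    changed = []
    with os.scandir(bands_dir) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_file():
                continue
            info = entry.stat()
            if info.st_mtime > since:
                changed.append((entry.name, info.st_size))
    return changed


def bytes_to_copy(bundle_path, since):
    """Total size of the bands an incremental backup would need to copy."""
    return sum(size for _, size in changed_bands(bundle_path, since))
```

Calling bytes_to_copy with the path to a bundle and the timestamp of the last backup gives a rough idea of how much an incremental pass would move. For the 1GB image with a 5K addition described above, that should be a band or two rather than the whole gigabyte, assuming the backup tool really does descend into the package.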

Update 3/9/09: I looked into these as a way to share an iPhoto Library between multiple users. It looks like Retrospect Pro does NOT back up .sparsebundle images correctly. Yech.

Update 5/6/09: Hoisted from comments (DocIceT):

Re your attempt to make sparsebundles work with Retrospect, I had some partial success.

Firstly, the backup needs to include what Retrospect sees as the top-level directory of the bundle. Finder shows this as the name of the package file.

More interestingly, the restore works if it is done to the original drive. If the bundle gets restored to a different drive then the bundle is not seen as a mountable file system any more.

With that said, there is some kind of permissions change going on when restoring to a different drive and I had to tweak that manually. This could break some part of the OS X structure for making those bundles work.
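If you want to see what actually changed in a restored bundle, one option is to compare the restored package against the original, file by file. The Python sketch below is only an illustration under my own assumptions: the function names are made up, and it checks nothing beyond which paths exist and what their permission bits are. It does not look at ownership, ACLs, or extended attributes, any of which could also be involved in the problem described above.

```python
import os
import stat


def mode_map(root):
    """Map each path under `root` (relative to it) to its permission bits."""
    modes = {".": stat.S_IMODE(os.lstat(root).st_mode)}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            modes[rel] = stat.S_IMODE(os.lstat(full).st_mode)
    return modes


def compare_bundles(original, restored):
    """Return (missing, extra, mode_changes) between two bundle directories.

    missing: paths present in the original but not in the restored copy
    extra: paths present only in the restored copy
    mode_changes: (path, original_mode, restored_mode) where the bits differ
    """
    a, b = mode_map(original), mode_map(restored)
    missing = sorted(set(a) - set(b))
    extra = sorted(set(b) - set(a))
    mode_changes = sorted(
        (path, oct(a[path]), oct(b[path]))
        for path in set(a) & set(b)
        if a[path] != b[path]
    )
    return missing, extra, mode_changes
```

Run compare_bundles on the original package and the restored one. An empty result does not prove the restore is good, but a non-empty one points at the files whose permissions need manual tweaking.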

3 comments:

Dave Walker said...

Indeed -- bundles, as a rule, rock.

DocIceT said...

Re your attempt to make sparsebundles work with Retrospect, I had some partial success.

Firstly, the backup needs to include what Retrospect sees as the top-level directory of the bundle. Finder shows this as the name of the package file.

More interestingly, the restore works if it is done to the original drive. If the bundle gets restored to a different drive then the bundle is not seen as a mountable file system any more.

With that said, there is some kind of permissions change going on when restoring to a different drive and I had to tweak that manually. This could break some part of the OS X structure for making those bundles work.

Mirko said...

Does anyone know how the data is organized technically? It would be, to my understanding of things, sub-optimal to really create hundreds of 8MB files physically scattered throughout the disk. The performance of reading from and writing to the image would be bad, I think. So I guess there is something going on that takes care of contiguous physical data layout, as is done e.g. on video DVDs, where the UDF filesystem shows you four 1GB files, which are physically just one contiguous stream of data when read byte-by-byte from the raw block device.
Do you know whether sparsebundles are implemented like that in Leopard or Snow Leopard?