Sunday, April 23, 2006

Why PNG sucks: It's the metadata, stupid

I've been experimenting with various file formats to use with iPhoto and RAW images. iPhoto generates TIFF files that are completely uncompressed and thus quite large, it doesn't provide JPEG fine-tuning (lossy compression control), JPEG2000 is still not widely used, etc. PNG seemed a good option: reasonable sizes, well specified, decent color tables, etc.

But PNG has a lethal flaw, as documented by the Library of Congress digital preservation project:
PNG

The PNG specification allows labeled text (ASCII or UTF-8) elements to be embedded in text chunks and predefines a few standard keywords (element labels): Title, Author, Description, Copyright, Creation Time, Software, Disclaimer, Warning, Source, Comment. The compilers of this resource are not able to assess the degree to which such metadata is found in practice or whether other keywords are in common use. An attempt in 2000 to develop open source tools to convert EXIF images (including EXIF metadata) to PNG seems to have been abandoned. See http://pmt.sourceforge.net/exif/drafts/d020.html. Without such tools and agreed practices, PNG can not rank highly for self-documentation.
In other words, there are no standards, or even exif-like pseudostandards, for embedding metadata (time image acquired, etc) in PNG images. Obviously there are no standards in common use for associating a portable metadata XML file (for example) in a bundle with a PNG image.
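To make the LOC description concrete, here is a minimal Python sketch of the text chunks it mentions. It assumes only the basic PNG chunk layout (length, type, data, CRC) and the uncompressed tEXt chunk (keyword, a zero byte, then Latin-1 text). The function names and the sample values are my own illustrations, not part of any tool. It shows that the container can carry the predefined keywords; it says nothing about whether any application reads or writes them.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: 4-byte length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def make_png_with_text(fields: dict) -> bytes:
    """Return a 1x1 grayscale PNG with one tEXt chunk per keyword (illustrative helper)."""
    # IHDR: width, height, bit depth, color type 0 (gray), compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    out = PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
    for keyword, text in fields.items():
        payload = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
        out += make_chunk(b"tEXt", payload)
    # One scanline: filter-type byte 0 followed by a single gray pixel.
    out += make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    out += make_chunk(b"IEND", b"")
    return out


def read_text_chunks(png: bytes) -> dict:
    """Walk the chunk list and collect keyword/text pairs from tEXt chunks."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    found = {}
    while pos + 8 <= len(png):
        length, chunk_type = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # length field + type + data + CRC
    return found
```

Calling `read_text_chunks(make_png_with_text({"Title": "Beach", "Author": "Someone"}))` round-trips the two fields. The sketch ignores the compressed (zTXt) and international (iTXt) variants, and none of the EXIF fields have a predefined keyword, which is the gap the LOC passage is pointing at.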

So 'save as PNG' means toss out the metadata. PNG is worthless as an image archival format. Shame.

BTW, this LOC site is an exceptional reference. In contrast to the PNG tragicomedy, note the discussions of JPF (still in the twilight zone) metadata ("self-documentation"). There's some very sophisticated thinking there; unfortunately, no one supports this yet ...
All JPEG 2000 files are made up of "boxes," as described in the Notes below, including an XML box typically used for metadata. Regarding JPX_FF, Annex N of Part 2 of the specification provides detail about metadata and offers but does not require a specification based on DIG35 elements. This metadata specification includes four broad metadata categories: (1) image creation ("how," e.g., about the camera), (2) content description ("who," "what," "when," and "where"), (3) history ("how the image got to its present state," i.e., provenance metadata in the digital library lexicon), and (4) intellectual property rights (IPR) metadata (which may be used in conjunction with technological protection systems). Additional boxes inherited from JP2_FF include one for a unique identifier for the image or identifier-references to other digital objects, e.g., a UUID, and another for IPR metadata, possibly redundant with that included in the XML box.
JPF wraps JPEG2000 compression formats, which are proprietary and have lots of IP issues. Contrast now to Adobe's open DNG (described as a "subtype of TIFF 6"):
See TIFF_6. Additional metadata may be embedded in a file using tags from (1) TIFF/EP or EXIF_2_2 (see also TIFF_UNC_EXIF), (2) IPTC (TIFF tag 33723), and (3) XMP (TIFF tag 700). Regarding TIFF/EP and EXIF, the specification states that TIFF/EP stores the tags in IFD 0, while TIFF_UNC_EXIF stores them in a separate IFD. Either location is allowed but the EXIF location is preferred. Proprietary metadata that may be used by camera manufacturer's raw convertors is to be placed under private tags, in private IFDs (Image File Directories), and/or a private MakerNote. (pp. 12-13)
TIFF 6 appears to be the preferred format for image archiving by the Library of Congress. I can't tell what version of TIFF iPhoto produces. It's not compressed, but that doesn't say much. It does contain EXIF metadata, so maybe it's TIFF_UNC_EXIF.
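One way to check what a given TIFF actually contains is to read the first image file directory (IFD 0) and look for the tags the LOC text names. The following is a rough Python sketch under a few assumptions: it reads classic TIFF only (not BigTIFF), looks only at IFD 0, and the function names are hypothetical. Tags 700 (XMP) and 33723 (IPTC) come from the passage above; 259 (Compression, where 1 means uncompressed) and 34665 (pointer to a separate EXIF IFD) are standard TIFF/EXIF tag numbers.

```python
import struct

# Tag numbers: 700 and 33723 are named in the LOC text; 34665 is the
# standard pointer to a separate EXIF IFD.
METADATA_TAGS = {
    700: "XMP",
    33723: "IPTC",
    34665: "EXIF IFD",
}
COMPRESSION_TAG = 259  # value 1 means "no compression"


def read_ifd0(tiff: bytes):
    """Return (endian, {tag: 4-byte value field}) for the first IFD."""
    if tiff[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    entries = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, _ftype, _n, value = struct.unpack(endian + "HHI4s", tiff[start:start + 12])
        entries[tag] = value
    return endian, entries


def summarize_metadata(tiff: bytes) -> dict:
    """Report which metadata tags are present in IFD 0, plus the compression code."""
    endian, entries = read_ifd0(tiff)
    summary = {name: tag in entries for tag, name in METADATA_TAGS.items()}
    if COMPRESSION_TAG in entries:
        # Compression is a SHORT, stored left-justified in the value field.
        summary["Compression"] = struct.unpack(endian + "H", entries[COMPRESSION_TAG][:2])[0]
    return summary
```

Feeding it the bytes of an exported file (for example `summarize_metadata(open(path, "rb").read())`, with `path` pointing at whatever iPhoto wrote) would show whether the EXIF data sits in a separate IFD, which is what TIFF_UNC_EXIF implies, and whether Compression is 1.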

On review I'd say we need iPhoto to export to DNG with compression and use DNG as a native lossless format.

5 comments:

Anonymous said...

Actually, XMP metadata can be embedded in PNG. MetadataTouch supports it.

Anonymous said...

I've also been burned by PNG. Out of all of the file formats I've used (.avi, Quicktime, JPEG, BMP, PNG), PNG is the only one that has gone corrupt on me. I lost an entire folder of images yesterday. Thankfully it was backed up, because if it wasn't, those frames would have been missing from my film.

HUFFyuv also let me down. I no longer use either.

Anonymous said...

PNG is the worst format for photo archival... period!

Anonymous said...

This guy clearly doesn't know how PNG works.

Robert Frunzke said...

Even back in 2006, you were able to embed XMP and IPTC data in PNG files. Just as in JPEG and TIFF and others. The embedding mechanisms are similar.

So, what do you think: does PNG still suck?