Image file formats

Friday 4 November 2005

A conversation with a friend yesterday turned to photography. He said he wanted to gather statistics from the EXIF data in his photos. I said I didn’t know how EXIF data was stored, and would have to go look it up. David responded,

That’s because you’re Ned.

which I can’t argue with.

Turns out EXIF data is actually stored as a TIFF file embedded in a JPEG record!
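
As a rough illustration of that nesting, here is a minimal Python sketch that walks the segments at the front of a JPEG and returns the embedded TIFF bytes. It assumes the usual layout, in which the Exif data lives in an APP1 segment (marker 0xFFE1) whose payload starts with the six bytes "Exif\0\0". The function name `find_exif_tiff` is made up for this example, and the sketch ignores rarer cases such as fill bytes or length-less markers before the image data.

```python
import struct

def find_exif_tiff(jpeg_bytes):
    """Return the TIFF bytes from the APP1/Exif segment, or None if absent."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        # Each segment: 2-byte marker, then a 2-byte big-endian length
        # that counts the length field itself but not the marker.
        marker, length = struct.unpack(">HH", jpeg_bytes[pos:pos + 4])
        if marker >> 8 != 0xFF:
            raise ValueError("lost sync: expected a marker at offset %d" % pos)
        if marker == 0xFFDA:
            # Start of scan: compressed image data follows, stop looking.
            break
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xFFE1 and payload.startswith(b"Exif\x00\x00"):
            return payload[6:]
        pos += 2 + length
    return None
```

Something like `find_exif_tiff(open("photo.jpg", "rb").read())` (with a hypothetical photo.jpg) should hand back bytes beginning with "II" or "MM", the TIFF byte-order mark.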

I’m probably more interested in file formats than your average guy. I’m fascinated by the different choices made by the designers of these formats. For example, JPEG is a sequentially-read record-oriented format. The file is composed of chunks, each of which has a tag number and a length. PNG files are also record-oriented, but the records are identified by a four-character id. In a clever hack, four bits of record metadata are stored in the 0x20 bit of each character (the bit that distinguishes upper case from lower case in ASCII), so the case of the letters in the tag is significant in interesting ways. Typical tags include tIME, pHYs, and bKGD.
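
To make the PNG case trick concrete, here is a small Python sketch that decodes the four property bits from a chunk name. The meanings of the four bits (ancillary, private, reserved, safe-to-copy) come from the PNG specification's chunk naming conventions; the function name `png_chunk_properties` and the dictionary keys are just illustrative choices.

```python
def png_chunk_properties(chunk_type):
    """Decode the four property bits hidden in a PNG chunk name's letter case."""
    if len(chunk_type) != 4 or not (chunk_type.isascii() and chunk_type.isalpha()):
        raise ValueError("chunk type must be four ASCII letters")
    # Bit 0x20 is set for lower-case ASCII letters and clear for upper-case.
    flags = [bool(byte & 0x20) for byte in chunk_type.encode("ascii")]
    return {
        "ancillary": flags[0],     # lower case: a decoder may ignore the chunk
        "private": flags[1],       # lower case: not a publicly registered chunk
        "reserved": flags[2],      # expected to be upper case in current PNG
        "safe_to_copy": flags[3],  # lower case: editors may copy it blindly
    }
```

So tIME reads as ancillary, public, and not safe to copy (which makes sense: a modification time is stale once an editor changes the image), while pHYs is ancillary, public, and safe to copy.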

TIFF files are a bit harder to pick apart: they include byte offset pointers within the file, so reading the file may involve jumping to the end only to be directed back toward the beginning to find the data you want.
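
The pointer-chasing can be sketched in a few lines of Python. This assumes the classic TIFF layout: an 8-byte header holding a byte-order mark ("II" or "MM"), the magic number 42, and the offset of the first directory (IFD), followed somewhere in the file by 12-byte directory entries. The function name `read_tiff_ifd` is invented for the example, and the last field of each entry is returned raw: depending on the entry's type and count it is either the value itself or yet another offset to follow.

```python
import struct

def read_tiff_ifd(data):
    """Return (tag, type, count, value_or_offset) tuples from the first IFD."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF: bad magic number")
    # The header only tells us where the directory is; jump there.
    (entry_count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    entries = []
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i
        entries.append(struct.unpack(order + "HHII", data[start:start + 12]))
    return entries
```

Nothing stops a writer from putting the directory at the very end of the file and the data it points to near the beginning, which is exactly the back-and-forth described above.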

David and I also discussed RAW files. We both knew they were a straight capture from the CCD, and that there’d been no compression loss or “developing” interpretation, but couldn’t put our finger on exactly how that differed from typical image files.

The downside of RAW files is that different cameras use slightly different formats. Adobe has a format called DNG, Digital Negative, which is designed to retain all the benefits of RAW files, but without all the vendor-specific differences.


...and your camera can store a thumbnail as a JPEG inside the TIFF data inside the JPEG block...

Since you are so interested in file formats, I thought you'd like to learn more about RAW. David Coffin has done most of the hard work involved in reading different RAW formats and created a free utility called dcraw. Most of the imaging software companies have started their decoders from his work. Anyways, take a look for yourself.

