2005.11.15 09:38 "[Tiff] libtiff support for Microsoft Office Document Imaging", by Brad Hards

G'day all,

Per a previous post, I've been looking at the Microsoft Office Document Imaging (.mdi) file format. It is heavily based on TIFF, with the following changes:

  1. Different magic number
  2. Additional tags, mostly non-trivial blobs
  3. Additional compression formats

The integration into libtiff is almost sure to be messy, because so much of it needs to be reverse engineered. So I'm seeking some advice on the best approach to this, with minimal changes to the library.

I understand that I can add tag-specific handling from my code using TIFFMergeFieldInfo(), with additional parsing of those blobs in the application.

I understand that I can add additional compression formats using TIFFRegisterCODEC().

So the question is really about the magic number. I think it is bad style to add the MDI magic number without actually supporting MDI properly. So for tiffdump, I'm thinking of adding two more command-line options: one that turns off the magic number check, and another that turns on byte swapping.

For the library proper, I'd like to add another option to the mode string accepted by TIFFClientOpen() that disables the magic number check, and to extend the meaning of the l and b modes to be "if we are opening and the magic number check is disabled, use little or big endian (respectively) to interpret the file".
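
To make the proposal concrete, here is a minimal sketch of the header logic I have in mind. It is plain C and does not touch libtiff: choose_byte_order(), read_u32() and the skip_magic_check / forced parameters are hypothetical names made up for illustration, not existing library API. It only assumes the classic TIFF header layout (two byte-order bytes, the version number 42, then the 32-bit offset of the first IFD).

```c
#include <stdint.h>

/* Byte order chosen for a file; ORDER_UNKNOWN means "refuse to open". */
typedef enum { ORDER_UNKNOWN = 0, ORDER_LITTLE, ORDER_BIG } byte_order;

/*
 * Hypothetical helper, not part of libtiff.
 *   hdr              first 4 bytes of the file
 *   skip_magic_check nonzero models the proposed "disable magic check" option
 *   forced           'l' or 'b'; only consulted when the check is disabled
 */
static byte_order choose_byte_order(const unsigned char hdr[4],
                                    int skip_magic_check, char forced)
{
    if (skip_magic_check) {
        /* The header cannot be trusted, so the caller must say which
         * byte order to use; without that we cannot interpret the file. */
        if (forced == 'l') return ORDER_LITTLE;
        if (forced == 'b') return ORDER_BIG;
        return ORDER_UNKNOWN;
    }

    /* Normal path: classic TIFF magic, "II" + 42 or "MM" + 42. */
    if (hdr[0] == 'I' && hdr[1] == 'I' && hdr[2] == 42 && hdr[3] == 0)
        return ORDER_LITTLE;
    if (hdr[0] == 'M' && hdr[1] == 'M' && hdr[2] == 0 && hdr[3] == 42)
        return ORDER_BIG;
    return ORDER_UNKNOWN;
}

/* Read a 32-bit value (e.g. the first IFD offset at bytes 4..7)
 * in the chosen byte order. Anything other than ORDER_BIG is read
 * as little endian. */
static uint32_t read_u32(const unsigned char *p, byte_order order)
{
    if (order == ORDER_BIG)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

With the check enabled, a file with an unrecognised magic number is rejected as it is today; with the check disabled and 'l' given, the same file is read as little endian and the IFD offset is taken from bytes 4 to 7 as usual. The same two switches map directly onto the two tiffdump options.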

How does this look? Suggestions? Thoughts?

Brad