2012.10.17 12:39 "[Tiff] Documentation of tag 700 (XMP)", by Gary McGath

2012.10.17 13:27 "Re: [Tiff] Documentation of tag 700 (XMP)", by Joris Van Damme

Gary,

2012/10/17 Gary McGath <developer@mcgath.com>:

The documentation of TIFF tag 700 (XMP) at (http://www.awaresystems.be/imaging/tiff/tifftags/xmp.html) says that the field type is BYTE. However, Adobe's XMP specification, part 3, Table 12, says: "The field type should be UNDEFINED (7) or BYTE (1)."

Using Aware's information led to a bug in JHOVE that falsely rejects quite a lot of valid TIFF files that use type 7; this will be fixed in the next release now that it's been called to my attention. (JHOVE is, by design and intent, a nitpicky curmudgeon.)

Could whoever is responsible for the Aware site fix the information; or if there's something that overrides the Adobe spec I'm looking at, could someone call my attention to it? Thanks either way.

Thank you for bringing this to our attention. I'll fix the information.

I'd go as far as saying that UNDEFINED is probably the only really correct datatype. It is to be used for data with mixed application-defined types. For all practical purposes, BYTE boils down to similar handling in TIFF codecs, but conceptually it is less correct. Any real-world decoder should be built to allow the BYTE datatype, though, for any tag data that is conceptually UNDEFINED, because otherwise it would reject a vast majority of what is perfectly sensible 'out there'.
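To make that concrete, here is a minimal sketch in Python of the kind of lenient type check I have in mind. The type codes (BYTE = 1, UNDEFINED = 7) and tag 700 are as quoted above; the table and the function name are hypothetical, purely for illustration, and not taken from any existing codec or from JHOVE.

```python
# Field type codes as quoted from the Adobe XMP specification above.
BYTE = 1
UNDEFINED = 7

XMP_TAG = 700

# Hypothetical table: tag number -> field types the reader expects.
EXPECTED_TYPES = {
    XMP_TAG: {UNDEFINED},
}


def is_acceptable_type(tag, field_type):
    """Return True if field_type should be accepted for tag.

    Any tag that is conceptually UNDEFINED also accepts BYTE, because
    both are handled as a plain sequence of 8-bit values.
    """
    expected = EXPECTED_TYPES.get(tag)
    if expected is None:
        # Unknown tag: this sketch passes no judgement on it.
        return True
    if UNDEFINED in expected:
        expected = expected | {BYTE}
    return field_type in expected
```

With this, tag 700 is accepted with either type 7 or type 1, while something like SHORT (3) is still flagged.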

This being said, the information on the AWare Systems site is absolutely not intended to be taken as a specification. Actually, the TIFF specification itself can hardly be taken as a specification, being both terribly outdated and a sort of enumeration of classes that are specific combinations of properties, rather than viewing TIFF as being *any* sensible combination of independent properties. If the TIFF specification itself were to be interpreted as both strict and exhaustive, the vast majority of real-world applications of TIFF could not be viewed as legit, and we'd lose substantial functionality.

So, in the end, it's actually common sense and a good understanding of "the spirit of TIFF" that should be viewed as the specification, rather than "the letter of TIFF". Alternatively, one could try to escape some of this mess by pointing out that the TIFF specification hardly even tries to specify anything at all; rather, its intent is to clarify by giving typical application examples. But still, that point of view hardly fills any void at all when push comes to shove.

So, no, I can't point you to any definite specifications, for anything TIFF related. The best I could do would lead you to a document that tries to establish old-style JPEG compression in TIFF and doesn't even mention that this is overridden and has been for two decades or so. The override is only apparent if by accident you stumble upon a web page that says so... This may seem unrelated, but it's just a typical example of the negligence that is key here, and goes to show no "specification" in this field can be taken to override common sense.

Which is why I've always been a bit sceptical of JHOVE, really. The best TIFF authorities on this mailing list sometimes argue amongst themselves about what is legit TIFF, and they build on quite some experience and understanding of the concepts. So an automated tool that rules over TIFFs being legit or not does seem ambitious, even if I do realize that JHOVE as is can be useful, and could for example serve in pointing out best practices for good interchange. Then again, if that is the intent, it should probably be built to allow the BYTE type for any tag that also allows the UNDEFINED type, if it is to have any real-world purpose.

Best regards,

Joris Van Damme
AWare Systems

info@awaresystems.be
http://www.awaresystems.be/