2003.11.17 00:45 "[Tiff] TIFF, v3.6.0. Fetching "custom" tags", by Chris Losinger

2003.11.18 06:59 "[Tiff] TIFF, v3.6.0. Fetching "custom" tags", by Ross Finlayson

Hi,

I figure I should look at the source code here and see what is happening with the tags and fields and see how they work.

The primary access point to getting and setting fields for a directory of a TIFF are TIFFGetField and TIFFSetField as prototyped in the header file tiffio.h and implemented in tif_dir.c.

The TIFFGetField function calls TIFFVGetField. The V is presumably for varargs: the comment for TIFFVGetField says "like TIFFGetField, but taking a varargs parameter list". The TIFFVGetField function calls a function through a function pointer that is an element of the TIFFTagMethods struct, which is itself an element of the TIFF struct. The TIFFTagMethods struct contains three function pointers; one is prototyped as a TIFFVGetMethod and another as a TIFFVSetMethod.

So, when TIFFGetField is called, a va_list is set up (varargs being a compiler facility, from stdarg.h) and TIFFVGetField is called. In TIFFVGetField, the TIFFFieldInfo struct for the requested tag is looked up using _TIFFFindFieldInfo. If that is non-null, i.e. the TIFFFieldInfo is found, and either the tag is a codec pseudo-tag or the TIFFFieldSet function returns non-zero (meaning something has called TIFFSetFieldBit for that field or class/category of fields, so the directory holds contents for it), then the get-field function pointer from the TIFFTagMethods struct is called with the TIFF*, the tag, and the va_list from TIFFVGetField.
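To make that call chain concrete, here is a minimal sketch of the pattern: a "..." function that builds a va_list, a va_list function that checks the field is known and set, and a dispatch through a function pointer. Every toy_ name and the single width field are made up for illustration; this is not the libtiff source, only the shape of it as I read it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in libtiff. */
typedef struct toy_tiff toy_tiff;
typedef int (*toy_vget_method)(toy_tiff *, int tag, va_list ap);

struct toy_tiff {
    toy_vget_method vgetfield; /* plays the role of the tag-methods pointer */
    unsigned width;            /* one "field" stored in the directory */
    int width_is_set;          /* plays the role of the field-set bit */
};

enum { TOY_TAG_WIDTH = 256 };

/* Default getter: reads one pointer argument and stores the value there. */
int toy_default_vget(toy_tiff *t, int tag, va_list ap)
{
    if (tag != TOY_TAG_WIDTH)
        return 0;
    *va_arg(ap, unsigned *) = t->width;
    return 1;
}

/* va_list version: check the field is known and set, then dispatch. */
int toy_vget_field(toy_tiff *t, int tag, va_list ap)
{
    assert(t != NULL && t->vgetfield != NULL);
    if (tag != TOY_TAG_WIDTH || !t->width_is_set)
        return 0;
    return t->vgetfield(t, tag, ap);
}

/* "..." version: builds the va_list and forwards it. */
int toy_get_field(toy_tiff *t, int tag, ...)
{
    va_list ap;
    int ok;

    va_start(ap, tag);
    ok = toy_vget_field(t, tag, ap);
    va_end(ap);
    return ok;
}
```

A caller would write something like toy_get_field(&t, TOY_TAG_WIDTH, &w) with w declared as unsigned; the call returns 0 until width_is_set is non-zero.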

The TIFFVGetMethod is installed in the TIFFTagMethods member of the TIFF struct when the codec is set up. The encoder is set up, once the image is opened, when the compression tag is set with TIFFSetField(tif, TIFFTAG_COMPRESSION, ...). So now the TIFFGetField function has called TIFFVGetField, which is calling the xx_TIFFGetField function of the codec, as defined in tif_jpeg.c, tif_lzw.c, etcetera.

Each codec's implementation of that TIFFVGetMethod, which returns an int, either handles the tag itself, fills in the va_list arguments, and returns, or returns the result of calling the TIFFVGetMethod that was saved as an element of the codec state struct during codec setup.

So at this point, TIFFGetField was called and the codec has either handled the field itself or passed it on to the saved TIFFVGetMethod. Aha, the TIFFVGetMethod is not installed in the state struct on codec setup, but rather in TIFFInitxx for the codec xx, e.g. TIFFInitJPEG. So what happened when the codec was installed was that it copied the original TIFFVGetMethod from the TIFFTagMethods struct of the TIFF struct into the codec state struct, e.g. JPEGState, and installed the codec's own TIFFVGetMethod in the TIFFTagMethods of the TIFF struct in its place.
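The save-the-old-pointer, install-your-own arrangement can be sketched as below. The toy_ names, the quality pseudo-tag, and the tag numbers are all hypothetical; I am only illustrating the chaining, not quoting tif_jpeg.c.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in libtiff. */
typedef struct toy_tiff toy_tiff;
typedef int (*toy_vget_method)(toy_tiff *, int tag, va_list ap);

typedef struct {
    toy_vget_method vgetparent; /* the getter that was installed before us */
    int quality;                /* a codec-private pseudo-tag value */
} toy_codec_state;

struct toy_tiff {
    toy_vget_method vgetfield;
    toy_codec_state *codec;
    unsigned width;
};

enum { TOY_TAG_WIDTH = 256, TOY_TAG_QUALITY = 65537 };

/* Default getter: knows only the fields stored in the directory. */
int toy_default_vget(toy_tiff *t, int tag, va_list ap)
{
    if (tag != TOY_TAG_WIDTH)
        return 0;
    *va_arg(ap, unsigned *) = t->width;
    return 1;
}

/* Codec getter: handles its own tag, defers everything else to the parent. */
int toy_codec_vget(toy_tiff *t, int tag, va_list ap)
{
    if (tag == TOY_TAG_QUALITY) {
        *va_arg(ap, int *) = t->codec->quality;
        return 1;
    }
    return t->codec->vgetparent(t, tag, ap);
}

/* Codec init: remember the old getter, then take its place. */
void toy_init_codec(toy_tiff *t, toy_codec_state *sp)
{
    assert(t != NULL && sp != NULL && t->vgetfield != NULL);
    sp->vgetparent = t->vgetfield;
    t->vgetfield = toy_codec_vget;
    t->codec = sp;
}

int toy_get_field(toy_tiff *t, int tag, ...)
{
    va_list ap;
    int ok;

    va_start(ap, tag);
    ok = t->vgetfield(t, tag, ap);
    va_end(ap);
    return ok;
}
```

After toy_init_codec, a request for the codec's tag is answered by the codec, and a request for the width falls through to the default getter.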

So now the codec's implementation of a TIFFVGetMethod is calling the function that was originally installed in the TIFF struct by TIFFDefaultDirectory, which was initially called from tif_open.c through TIFFOpen->TIFFFdOpen->TIFFClientOpen. Also called in TIFFDefaultDirectory() is a function _TIFFSetupFieldInfo. Also in TIFFDefaultDirectory is a reference to the _TIFFextender, which has to do with the tag extensions.

The default function for getting the field is implemented as _TIFFVGetField. In _TIFFVGetField there is a big switch statement for many of the tags, and that is where the va_list arguments are populated with values that are stored explicitly in the TIFFDirectory struct. Mostly, the variable argument is set to the value as it is stored in the TIFFDirectory, so it's important to call TIFFGetField with the correct argument types. At the default label of the switch is the code that handles the tags that aren't explicitly enumerated in the switch. This code could presumably handle all fields, except that over time there have been various specifications and usages of the fields that require special handling.

Once again the _TIFFFindFieldInfo method is called on the tag type to get the TIFFFieldInfo struct. There is enough information in the TIFFFieldInfo struct for most purposes of knowing the valid type and count of elements of the field, mostly.

Newt lines from Aliens: "They mostly come at night,... mostly." "Ahhh!" "Aiiieeee!" "Rip-ley!"

So anyways here in _TIFFVGetField now there is a const TIFFFieldInfo* named fip. It should be non-null because _TIFFFindFieldInfo already returned non-null back in TIFFVGetField. In the function here it is checked for null, and also that the fip's field_bit member is set to FIELD_CUSTOM, or else an error is reported and zero is returned. It seems that the field_bit member of the TIFFFieldInfo struct returned from _TIFFFindFieldInfo is expected to be FIELD_CUSTOM or else it is not handled. Where it is, the function continues...

There is a for loop that increments from zero to one less than the td_customValueCount member of the TIFFDirectory struct. This is presumably set in TIFFSetField, which is of course directly related to TIFFGetField. So anyways for each one it is referencing a TIFFTagValue struct that is stored in the td_customValues member, presumably of type TIFFTagValue*, of the TIFFDirectory struct. When it finds a TIFFTagValue whose info member has a field_tag equal to the tag that was an argument to TIFFGetField way back when, then _TIFFVGetField does one of three things. If the TIFFFieldInfo's field_passcount is set, then a short value is stored through the first va_arg as the length of the data, and the second of the variably-many arguments is set to the TIFFTagValue's value member. Otherwise, if the TIFFFieldInfo's field_type is TIFF_ASCII, the argument is set to a char pointer to the TIFFTagValue's value member. If neither, it returns saying "TIFFGetField... pass by value not imp."
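Roughly, the loop has this shape. This is a sketch with made-up toy_ names and a cut-down field info struct, not the actual _TIFFVGetField code; the error reporting is reduced to returning 0.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical, cut-down versions of the field info and tag value structs. */
typedef enum { TOY_ASCII, TOY_LONG } toy_type;

typedef struct {
    int tag;
    toy_type type;
    int passcount;  /* if true, the caller receives a count and a pointer */
} toy_field_info;

typedef struct {
    const toy_field_info *info;
    int count;
    void *value;
} toy_tag_value;

typedef struct {
    int custom_count;
    toy_tag_value *custom;
} toy_directory;

/* Linear search of the custom values; returns 1 if the tag was delivered. */
int toy_vget_custom(const toy_directory *dir, int tag, va_list ap)
{
    int i;

    assert(dir != NULL);
    for (i = 0; i < dir->custom_count; i++) {
        const toy_tag_value *tv = &dir->custom[i];

        if (tv->info->tag != tag)
            continue;
        if (tv->info->passcount) {
            /* first argument gets the count, second gets the data pointer */
            *va_arg(ap, unsigned short *) = (unsigned short)tv->count;
            *va_arg(ap, void **) = tv->value;
            return 1;
        }
        if (tv->info->type == TOY_ASCII) {
            *va_arg(ap, void **) = tv->value;
            return 1;
        }
        return 0;  /* single pass-by-value items are not handled */
    }
    return 0;      /* tag not present in the custom values */
}

int toy_get_custom(const toy_directory *dir, int tag, ...)
{
    va_list ap;
    int ok;

    va_start(ap, tag);
    ok = toy_vget_custom(dir, tag, ap);
    va_end(ap);
    return ok;
}
```

Note that the caller has to know from the field info whether to pass one pointer or two; nothing in the va_list itself says which.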

There doesn't seem to be implementation of things like the extension tag being a plain old uint32 or something.

So it looks like the tag extension logic lives where the customValues are set, and has to do with TIFFTagValue.

On the flip side is TIFFSetField. TIFFSetField calls TIFFVSetField. TIFFVSetField checks to see if the tag is OK to change with the OkToChange function, which basically checks whether the file is open for writing, the directory is in the middle of being written, and changing the tag would upset the codec or data write functions. Then it calls the TIFF's TIFFTagMethods.vsetfield, the codec's, which in turn often calls the _TIFFVSetField function. I think the extender functions have added TIFFFieldInfo structs that are returned by _TIFFFindFieldInfo.

The _TIFFVSetField function sets any explicitly stored variable in the TIFFDirectory, and for other tags the default label of the switch looks in the customValues for the TIFFTagValue and sets its value if it is already there, or else grows the array through _TIFFrealloc.
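The find-it-or-grow-the-array step might look like the following. This is a sketch under my own assumptions: the toy_ names are made up, plain malloc/realloc stand in for the library's allocators, and the value is simply copied as raw bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down custom value storage. */
typedef struct {
    int tag;
    int count;
    void *value;
} toy_tag_value;

typedef struct {
    int custom_count;
    toy_tag_value *custom;
} toy_directory;

/* Store a private copy of nbytes of data under tag.
 * Returns 1 on success, 0 if memory could not be allocated. */
int toy_set_custom(toy_directory *dir, int tag, int count,
                   const void *data, size_t nbytes)
{
    toy_tag_value *tv = NULL;
    void *copy;
    int i;

    assert(dir != NULL && data != NULL && nbytes > 0);

    copy = malloc(nbytes);
    if (copy == NULL)
        return 0;
    memcpy(copy, data, nbytes);

    /* Reuse the slot if the tag is already present. */
    for (i = 0; i < dir->custom_count; i++) {
        if (dir->custom[i].tag == tag) {
            tv = &dir->custom[i];
            break;
        }
    }

    /* Otherwise grow the array by one element. */
    if (tv == NULL) {
        toy_tag_value *grown = realloc(dir->custom,
            (size_t)(dir->custom_count + 1) * sizeof *grown);
        if (grown == NULL) {
            free(copy);
            return 0;
        }
        dir->custom = grown;
        tv = &dir->custom[dir->custom_count++];
        tv->tag = tag;
        tv->value = NULL;
    }

    free(tv->value);  /* drop any previous value; free(NULL) is harmless */
    tv->count = count;
    tv->value = copy;
    return 1;
}

/* Release everything held by the directory. */
void toy_free_custom(toy_directory *dir)
{
    int i;

    for (i = 0; i < dir->custom_count; i++)
        free(dir->custom[i].value);
    free(dir->custom);
    dir->custom = NULL;
    dir->custom_count = 0;
}
```

Setting the same tag twice replaces the stored value rather than adding a second entry, which matches the "sets its value if it is already there" behaviour described above.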

I think it is key to examine _TIFFFindFieldInfo. I look at the list on globals.html of the doxygen output and see _TIFFFindFieldInfo is declared in tif_dir.h and implemented in tif_dirinfo.c. I think I've found an error here: the _TIFFFindOrRegisterInfo function is declared in tif_dir.h, but it's implemented as _TIFFFindOrRegisterFieldInfo in tif_dirinfo.c.

The _TIFFFindFieldInfo goes through the TIFFFieldInfo* member of the TIFF struct called tif_fieldinfo, tif_nfields many of them. Then, here in tif_dirinfo.c are other functions to consider.

The tif_fieldinfo array is initialized by the _TIFFSetupFieldInfo function, which merges in the elements of the static fieldInfo array using _TIFFMergeFieldInfo. Then, presumably, the codec merges in its own TIFFFieldInfo entries. Then, _TIFFFindOrRegisterFieldInfo installs private field info structs.
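As a rough picture of merge-then-find, here is a sketch with made-up toy_ names. It only appends and then searches linearly; whether the real merge also sorts the entries, and how the real find matches data types, I have left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down field info and TIFF handle. */
typedef struct {
    int tag;
    const char *name;
} toy_field_info;

typedef struct {
    const toy_field_info **fields; /* array of pointers to field infos */
    int nfields;
} toy_tiff;

/* Append pointers to n field infos; returns 1 on success, 0 on failure. */
int toy_merge_field_info(toy_tiff *t, const toy_field_info *info, int n)
{
    const toy_field_info **grown;
    int i;

    assert(t != NULL && info != NULL && n > 0);
    grown = realloc(t->fields, (size_t)(t->nfields + n) * sizeof *grown);
    if (grown == NULL)
        return 0;
    t->fields = grown;
    for (i = 0; i < n; i++)
        t->fields[t->nfields + i] = &info[i];
    t->nfields += n;
    return 1;
}

/* Walk all nfields entries looking for the tag; NULL if not registered. */
const toy_field_info *toy_find_field_info(const toy_tiff *t, int tag)
{
    int i;

    for (i = 0; i < t->nfields; i++)
        if (t->fields[i]->tag == tag)
            return t->fields[i];
    return NULL;
}
```

The built-in table would be merged first, then a codec's or an extender's table, after which a lookup of a private tag succeeds where before it returned NULL.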

The TIFFFieldInfo struct is as so:

http://www.tiki-lounge.com/~raf/tiff/libtiff-doxygen/structTIFFFieldInfo.html

from tiffio.h:

typedef struct {
        ttag_t  field_tag;              /* field's tag */
        short   field_readcount;        /* read count/TIFF_VARIABLE/TIFF_SPP */
        short   field_writecount;       /* write count/TIFF_VARIABLE */
        TIFFDataType field_type;        /* type of associated data */
        unsigned short field_bit;       /* bit in fieldsset bit vector */
        unsigned char field_oktochange; /* if true, can change while writing */
        unsigned char field_passcount;  /* if true, pass dir count on set */
        char    *field_name;            /* ASCII name */
} TIFFFieldInfo;

typedef struct _TIFFTagValue {
     const TIFFFieldInfo  *info;
     int             count;
     void           *value;
} TIFFTagValue;

Also declared there in tiffio.h is the TIFFMergeFieldInfo function.

There is much to consider: whether or not the specific field or a category of fields (e.g. image dimensions) is set in the image is checked and set at various points in the logic.

Maybe we could introduce a union pointer of TIFF data types to the TIFFTagValue, switch around the TIFFDataType, etcetera.
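Something along these lines is what I have in mind. It is only a sketch of the idea: the toy_ names, the reduced set of data types, and the helper function are all hypothetical and nothing like this is in the library as far as I have read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced set of data types, standing in for TIFFDataType. */
typedef enum { TOY_BYTE, TOY_ASCII, TOY_SHORT, TOY_LONG, TOY_DOUBLE } toy_data_type;

/* One pointer, viewed according to the data type. */
typedef union {
    void     *raw;
    uint8_t  *u8;
    char     *ascii;
    uint16_t *u16;
    uint32_t *u32;
    double   *f64;
} toy_value_ptr;

typedef struct {
    toy_data_type type;
    int count;
    toy_value_ptr value;
} toy_tag_value;

/* Read element i as a double by switching on the data type.
 * Returns 1 on success, 0 if i is out of range or the type is not numeric. */
int toy_value_as_double(const toy_tag_value *tv, int i, double *out)
{
    assert(tv != NULL && out != NULL);
    if (i < 0 || i >= tv->count)
        return 0;
    switch (tv->type) {
    case TOY_BYTE:   *out = tv->value.u8[i];  return 1;
    case TOY_SHORT:  *out = tv->value.u16[i]; return 1;
    case TOY_LONG:   *out = tv->value.u32[i]; return 1;
    case TOY_DOUBLE: *out = tv->value.f64[i]; return 1;
    default:         return 0;  /* e.g. TOY_ASCII has no numeric reading */
    }
}
```

With the type carried alongside the pointer, a getter could hand back a single uint32 by value instead of giving up on anything that is not counted or ASCII.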

There's quite a bit more to the field logic, I'll try and help figure it out.

I fixed up the tiffoj2j.c file a little bit.

Ross F.