2003.11.18 02:13 "[Tiff] TIFF, v3.6.0. Fetching "custom" tags", by Ross Finlayson

The intent is that new tags encountered should auto-register themselves. I checked with photometric.tif, and sure enough TIFFGetField() didn't work. I looked into the code, and it seems that in tif_dirinfo.c 1.21 the line dp->tdir_tag = IGNORE was added, causing the newly defined tag to be ignored. This is something that I had removed in rev 1.17. Why was it re-introduced? Even in 1.17 I wasn't clear on why the code had ever been there, so perhaps there is a reason you added it?

With that line removed, the fetching of the new tag works fine. In fact it is even reported by tiffinfo (named "Tag 34665"):

Oh, yes, you are right. Now I recall that it worked previously.

That change was made to fix bug #358.

I haven't committed the change to tif_dirinfo.c yet, pending feedback on why the IGNORE setting was there.

Please commit the change. I shall reopen bug 358. It seems we need another solution for that problem.

About the fields, I'm trying to figure that out, also. I'm writing a tiffoj2j program, and part of that is copying all of the tags of the input IFD to the output IFD, then reading raw strips or tiles from the input and writing them to the output, so that the contents of the input IFD are copied to the output IFD.

This is similar to what tiffcp does, except that tiffcp only copies a subset of the tags/fields, which is hardcoded into the program, while it processes the input data to strip and tile it and change the compression. I think an improved tiffcp program would be good; sometimes tiffcp crashes.

So I went about writing a function to copy an IFD, Image File Directory. What it does is copy over the "basic" tags, for example those specifically describing the image data contents as might be used by the codec in the SetupEncode function of the codec. For example, image height, image width, tile dimensions, bits per sample and samples per pixel, fill order, planar configuration, YCbCr sampling factors, and other tags are copied from one IFD of the input to a new one of the output. I use a macro COPYTAG that defines something like:

uint32 xuint32=0;

if(TIFFGetField(input, TIFFTAG_IMAGEWIDTH, &xuint32) != 0){
    TIFFSetField(output, TIFFTAG_IMAGEWIDTH, xuint32);
}

It checks the return value of TIFFGetField against zero to see if the tag was set, and if it was, then it is set in the output file.

After those tags are copied over, the compression tag is copied over.

/* Compression is a SHORT tag, so it is fetched into a 16-bit variable. */
uint16 xuint16=0;
if(TIFFGetField(input, TIFFTAG_COMPRESSION, &xuint16) != 0){
    TIFFSetField(output, TIFFTAG_COMPRESSION, xuint16);
}

Here it is key that the variable passed in is actually of the type expected by _TIFFVSetField and _TIFFVGetField (the V presumably marks the va_list variant, as in vprintf, rather than "virtual"), or else varargs might trash it. Those are the functions called by the installed codec, whose get and set field functions are in turn called by the TIFFGetField and TIFFSetField functions.
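To make the type-matching point concrete, here is a minimal sketch of a COPYTAG-style macro over a toy varargs getter and setter. Everything prefixed toy_ or TOY_ is hypothetical and only stands in for the libtiff calls; it is not libtiff code. It assumes a platform where a 16-bit value is promoted to int when passed through "...", which is why the setter reads an int for the 16-bit tag while the getter writes through a 16-bit pointer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical stand-ins for two tags and a tiny "directory". */
enum { TOY_TAG_WIDTH = 256, TOY_TAG_COMPRESSION = 259 };

typedef struct {
    uint32_t width;       int width_set;
    uint16_t compression; int compression_set;
} ToyDir;

/* Setter: values arrive by value, so a 16-bit argument has been promoted to int. */
static int toy_set_field(ToyDir *d, int tag, ...)
{
    va_list ap;
    int ok = 1;
    va_start(ap, tag);
    switch (tag) {
    case TOY_TAG_WIDTH:
        d->width = va_arg(ap, uint32_t);
        d->width_set = 1;
        break;
    case TOY_TAG_COMPRESSION:
        d->compression = (uint16_t) va_arg(ap, int);
        d->compression_set = 1;
        break;
    default:
        ok = 0;
    }
    va_end(ap);
    return ok;
}

/* Getter: the caller must pass a pointer of exactly the type read here. */
static int toy_get_field(const ToyDir *d, int tag, ...)
{
    va_list ap;
    int ok = 0;
    va_start(ap, tag);
    switch (tag) {
    case TOY_TAG_WIDTH:
        if (d->width_set) { *va_arg(ap, uint32_t *) = d->width; ok = 1; }
        break;
    case TOY_TAG_COMPRESSION:
        if (d->compression_set) { *va_arg(ap, uint16_t *) = d->compression; ok = 1; }
        break;
    }
    va_end(ap);
    return ok;
}

/* Copy one tag if it is set; the temporary has the type the tag expects. */
#define TOY_COPYTAG(in, out, tag, type)                 \
    do {                                                \
        type tmp_ = 0;                                  \
        if (toy_get_field((in), (tag), &tmp_) != 0)     \
            toy_set_field((out), (tag), tmp_);          \
    } while (0)

/* Returns 1 if both tags arrive intact in the output directory. */
static int toy_copy_demo(void)
{
    ToyDir in = {0}, out = {0};
    assert(toy_set_field(&in, TOY_TAG_WIDTH, (uint32_t) 640));
    assert(toy_set_field(&in, TOY_TAG_COMPRESSION, 5));
    TOY_COPYTAG(&in, &out, TOY_TAG_WIDTH, uint32_t);
    TOY_COPYTAG(&in, &out, TOY_TAG_COMPRESSION, uint16_t);
    return out.width == 640 && out.compression == 5;
}
```

Passing the type as a macro argument keeps the temporary's declaration next to the call, so a mismatch between tag and variable type is at least visible at the point of use.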

So anyways, after the compression is copied over then there are various tags that have to do with the compression. For example for CCITTFax there are the T.4 and T.6 options, for LZW and Deflate the predictor mode, for JPEG the JPEGTables etcetera.

Then at this point the codec for the compression has had its cleanup and setup functions called in I believe both input and output. Sometimes those set fields by themselves.

Assuming this has gone well, I then try to copy all the other tags. The tags that are specified in TIFF 6.0 are pretty much each defined in a TIFFFieldInfo struct. See http://www.tiki-lounge.com/~raf/tiff/libtiff-doxygen/structTIFFFieldInfo.html; I ran doxygen on libtiff, http://www.tiki-lounge.com/~raf/tiff/libtiff-doxygen.tar.gz. When the _TIFFVGetField function is called, it checks whether the tag is known by going through the static array of TIFFFieldInfo structures. When the input image was opened, as each tag was encountered through the TIFFOpen or TIFFAdvanceDirectory function, the field was set in the TIFFDirectory struct of the TIFF struct. The TIFFDirectory struct contains fields for each of the "predefined", "known" tags, and TIFFVSetField and TIFFVGetField just set and read those. So the COPYTAG macro works for those, except that it requires variable declarations because of how TIFFGetField and TIFFSetField use varargs or stdargs, what have you, to pass and receive argument lists of variable type and count. The argument types passed to TIFFGetField and TIFFSetField have to match the expected ones or the result is undefined.

Then I think there is a method in place for getting and setting the fields generically, based upon the fields' definitions as TIFFFieldInfo structs. Those contain enough information for generic logic to store the fields in variable buffers instead of specific members of the TIFFDirectory, which otherwise would need a specific and explicit member for each possible field. It has one for the standard fields, mostly, but assuming TIFF version 875 with four hundred million standard tags, it would not be feasible to continue in that way, in the sense of storing a sparse matrix.

So I think the private tags and things like TIFFIgnoreSense have been added to begin supporting, and transitioning to, this better method of getting and setting fields. It's mostly better but makes debugging more difficult because the TIFFDirectory doesn't have explicit variables to inspect.

I don't yet understand how the function works for the "private" tags, the "general purpose specification-compatible" field handler. In the TIFF file the tag is there with its data type and count, so the private tag handler can use that information to store the field's content in the TIFFDirectory structure with no other knowledge of its content.
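As a way of picturing that idea, here is a minimal sketch of storing a field knowing only its tag number, data type, and count. This is not how libtiff implements it; the names toy_type_size, ToyField, ToyCustomDir, toy_store_field, toy_find_field, and toy_free_dir are all hypothetical. The type codes and sizes are the ones from the TIFF 6.0 data type table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Byte size of one value for each TIFF 6.0 data type code (0 if unknown). */
static size_t toy_type_size(uint16_t type)
{
    switch (type) {
    case 1: case 2: case 6: case 7: return 1;  /* BYTE, ASCII, SBYTE, UNDEFINED */
    case 3: case 8:                 return 2;  /* SHORT, SSHORT */
    case 4: case 9: case 11:        return 4;  /* LONG, SLONG, FLOAT */
    case 5: case 10: case 12:       return 8;  /* RATIONAL, SRATIONAL, DOUBLE */
    default:                        return 0;
    }
}

/* One generically stored field: no per-tag struct member is needed. */
typedef struct {
    uint16_t tag, type;
    uint32_t count;
    void *data;              /* count * toy_type_size(type) bytes */
} ToyField;

typedef struct {
    ToyField *fields;
    size_t n;
} ToyCustomDir;

/* Append a copy of the raw value bytes; returns 1 on success, 0 on failure.
   For brevity this sketch does not replace an existing entry for the tag. */
static int toy_store_field(ToyCustomDir *d, uint16_t tag, uint16_t type,
                           uint32_t count, const void *raw)
{
    size_t bytes = toy_type_size(type) * (size_t) count;
    ToyField *grown;
    void *copy;

    assert(d != NULL && raw != NULL);
    if (bytes == 0)
        return 0;
    copy = malloc(bytes);
    if (copy == NULL)
        return 0;
    memcpy(copy, raw, bytes);
    grown = realloc(d->fields, (d->n + 1) * sizeof *grown);
    if (grown == NULL) {
        free(copy);
        return 0;
    }
    d->fields = grown;
    d->fields[d->n].tag = tag;
    d->fields[d->n].type = type;
    d->fields[d->n].count = count;
    d->fields[d->n].data = copy;
    d->n++;
    return 1;
}

/* Linear search by tag number; returns NULL if the tag was never stored. */
static const ToyField *toy_find_field(const ToyCustomDir *d, uint16_t tag)
{
    size_t i;
    for (i = 0; i < d->n; i++)
        if (d->fields[i].tag == tag)
            return &d->fields[i];
    return NULL;
}

static void toy_free_dir(ToyCustomDir *d)
{
    size_t i;
    for (i = 0; i < d->n; i++)
        free(d->fields[i].data);
    free(d->fields);
    d->fields = NULL;
    d->n = 0;
}
```

With a store like this, copying an unknown tag such as 34665 from input to output reduces to walking the list and writing each entry back with the same type and count. The debugging cost mentioned above shows up too: the values live in opaque buffers rather than named members.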

It's pretty hairy, a complete reimplementation might make use of a different method, but the idea is to streamline within the existing libtiff an implementation that well handles the general case without undue disturbance of the pool.

Then, after all the tags are copied over, each strip or tile, depending on whether the image is tiled, is read in its raw (compressed) form from the input and written in its raw form to the output. That enables the libtiff logic to update the strip offsets where the output file is at a different file pointer offset than the input file, for example.
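The offset bookkeeping can be sketched with in-memory buffers standing in for the two files. This is a toy model under stated assumptions, not the tiffoj2j code: toy_copy_segments is a hypothetical name, and in libtiff itself the reading and writing would go through TIFFReadRawStrip/TIFFWriteRawStrip (or the raw tile equivalents), which maintain the offsets internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n raw segments from `in` to `out` without decoding them.
   in_off[i] and counts[i] play the role of StripOffsets and StripByteCounts
   in the input; out_off[i] receives the new offset of segment i in the output.
   *out_pos is the current write position in `out` and is advanced.
   Returns 1 on success, 0 if the output buffer is too small. */
static int toy_copy_segments(const unsigned char *in,
                             const uint32_t *in_off, const uint32_t *counts,
                             size_t n,
                             unsigned char *out, size_t out_cap,
                             size_t *out_pos, uint32_t *out_off)
{
    size_t i;

    assert(in != NULL && out != NULL && out_pos != NULL && out_off != NULL);
    for (i = 0; i < n; i++) {
        if (*out_pos > out_cap || counts[i] > out_cap - *out_pos)
            return 0;
        memcpy(out + *out_pos, in + in_off[i], counts[i]);
        out_off[i] = (uint32_t) *out_pos;   /* offset differs from the input's */
        *out_pos += counts[i];
    }
    return 1;
}
```

The byte counts carry over unchanged because the data is never decompressed; only the offsets have to be recomputed for the output.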

So between copying all the explicit tags and the data segments, I see a need to copy all the private tags that do not refer to offsets to data within the input that would require an extension to handle fields like the data segments. I look to the private tag mechanism and don't quite understand it yet, it definitely seems like the right idea but I haven't spent enough time on it to understand it. How do I copy the private tags from input to output, or more generally, what is the best way to copy an input IFD to an output IFD?

I attach this file tiffoj2j.c, it contains an implementation of copying the explicitly implemented tags, and I'm trying to figure out how to copy all the tags from the input to the output, to help ensure that important "private" tag information is not lost. Otherwise its use is to prototype functions to get JPEG data out of OJPEG.

Also I attach a tif_ojpeg.oj2j.diff file. I use it on the stock tif_ojpeg.c file to enable TIFFGetField and TIFFSetField on the OJPEG fields. I wrote to Scott Marovich, who wrote tif_ojpeg.c, about this, and we are discussing it. The OJPEG tables are not copied over to the output file in tiffoj2j.c, because they contain offsets instead of data.

If you could help describe the implementation of the general purpose field handler, it would help us to consider what ways we might change the implementation to better handle extensions without unduly affecting existing software.

The field reading, writing, getting, and setting is implemented in a variety of the source files, tiff.h, tiffio.h, tiffiop.h, tif_dir.h, tif_dir.c, tif_dirread.c, tif_dirwrite.c, the xtag contribution, tif_extension.c, etcetera.

It appears that we should examine tif_extension.c and its related modifications. Frank, please outline the tiff tag extension mechanism.

Thanks, have a nice day,

Ross F.