2017.02.06 20:43 "[Tiff] Qs about support for more than 2^16 IFDs and writing performance", by Dinesh Iyer

2017.02.07 00:04 "Re: [Tiff] Qs about support for more than 2^16 IFDs and writing performance", by Bob Friesenhahn

I am developing an application that uses libTIFF for reading/writing TIFF files and I have the following questions about libTIFF. I apologize in advance if these questions have been asked but I could not find answers to them in the archive.

  1. I noticed that libTIFF does not support reading images from IFDs beyond the 65535th in a TIFF file. The TIFF specification does not appear to impose such a limit, but libTIFF does. I have a few clients with 100,000-image TIFF stacks that they would like to be able to read.
  2. I noticed that libTIFF does appear to support writing more than 65536 images into a TIFF file. Is this the expected behaviour or an oversight?

It is likely that the arbitrary limit is intended to foil looping IFDs (a denial-of-service opportunity) while also limiting the performance and memory impact of the check. The linear scan is perhaps not the fastest algorithm, but it should not take terribly long (much less than a second) to scan 65535 entries.
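To make the loop check concrete, here is a minimal sketch of a capped linear scan over previously seen IFD offsets. It is not libtiff's actual code: the names `ifd_tracker`, `ifd_tracker_add`, and `MAX_SEEN_IFDS` are hypothetical, and the cap simply mirrors the limit discussed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative cap, mirroring the limit discussed in this thread. */
#define MAX_SEEN_IFDS 65535u

/* Hypothetical tracker of IFD offsets already visited (not libtiff's struct). */
typedef struct {
    uint64_t offsets[MAX_SEEN_IFDS];
    size_t   count;
} ifd_tracker;

/* Returns 1 if the offset is new and was recorded.
 * Returns 0 if the offset was already seen (an IFD loop) or the table is full. */
static int ifd_tracker_add(ifd_tracker *t, uint64_t offset)
{
    /* Linear scan: compare against every offset recorded so far. */
    for (size_t i = 0; i < t->count; i++) {
        if (t->offsets[i] == offset)
            return 0;
    }
    /* The fixed cap bounds both memory use and the cost of each scan. */
    if (t->count >= MAX_SEEN_IFDS)
        return 0;
    t->offsets[t->count++] = offset;
    return 1;
}
```

In this sketch a single check is linear in the number of directories seen so far, so walking N directories in sequence costs on the order of N*N/2 comparisons in total.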

  3. Lastly, when I tested writing a large number of images into a TIFF file, I found that the performance was very slow. It took ~2.5 hrs to write 66200 images, each of size just 64x64. I came across this GitHub post about why the performance is poor and a possible fix for this: https://github.com/escabe/libtiff/commit/58b4c3ba4478987ecfe1e793b9d925e59eecfa36 Is this patch recommended?

It looks like this patch caches the IFD offset lists in the writer as well, and also improves lookup time for the case where the offset searched for is the one just stored.

Perhaps an algorithm like a Bloom filter (https://en.wikipedia.org/wiki/Bloom_filter) could be used to improve efficiency, but I am not sure of how much code is required for an implementation. The Bloom filter can have false positives so an exhaustive scan would sometimes be needed to eliminate false positives.
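As a rough indication of the amount of code involved, here is a minimal Bloom filter sketch keyed on IFD offsets. Everything in it is an assumption made for illustration: the names `bloom`, `bloom_init`, and `bloom_test_and_set`, the filter size, the number of probes, and the choice of a splitmix64-style mixing function. None of it comes from libtiff or from the patch above.

```c
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS   (1u << 20)  /* 1 Mi bits = 128 KiB; illustrative size */
#define BLOOM_HASHES 3u          /* number of probes per offset; illustrative */

/* Hypothetical filter type (not part of libtiff). */
typedef struct {
    uint8_t bits[BLOOM_BITS / 8];
} bloom;

static void bloom_init(bloom *b)
{
    memset(b->bits, 0, sizeof b->bits);
}

/* splitmix64-style mixer used to spread offsets over the bit array. */
static uint64_t mix64(uint64_t x)
{
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

/* Records the offset in the filter.
 * Returns 0 if the offset was definitely not seen before.
 * Returns 1 if it was possibly seen before; because of false positives,
 * the caller must then confirm with an exhaustive scan. */
static int bloom_test_and_set(bloom *b, uint64_t offset)
{
    uint64_t h  = mix64(offset);
    uint32_t h1 = (uint32_t)h;
    uint32_t h2 = (uint32_t)(h >> 32) | 1u;  /* odd step for double hashing */
    int maybe_seen = 1;

    for (uint32_t i = 0; i < BLOOM_HASHES; i++) {
        uint32_t bit  = (h1 + i * h2) % BLOOM_BITS;
        uint8_t  mask = (uint8_t)(1u << (bit & 7u));
        if (!(b->bits[bit >> 3] & mask)) {
            maybe_seen = 0;               /* a clear bit proves the offset is new */
            b->bits[bit >> 3] |= mask;
        }
    }
    return maybe_seen;
}
```

A caller would fall back to the exhaustive scan of stored offsets only when `bloom_test_and_set` returns 1, so the common case of a fresh offset is answered in constant time.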

  4. Are there plans to lift the 65536 IFD restriction for TIFF files and also improve the writing performance anytime soon? Or is the recommendation to patch my local copy of the TIFF library?

The 65536 IFD restriction is likely arbitrary.

The most important question is whether you have profiled the code for your slow test case to see why it is so slow to write. Are you able to post simple portable code for your test case so that others can observe the same issue?

It is quite possible that the writing bottleneck is something other than this part of the code.

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/