2018.01.15 19:17 "[Tiff] Strategies for multi-core speedups", by Larry Gritz
I have applications in which the I/O for TIFF files dominates the runtime. The big expense is the decompression, not the raw read/write. Almost always there are cores to burn, but I'm finding it hard to figure out how to leverage that with libtiff.
As a point of comparison, the OpenEXR library libIlmImf has API calls that will read multiple adjacent scanlines or tiles at once. The library maintains a thread pool, so those multi-scanline reads or writes actually compress/decompress the scanlines/tiles in parallel. This dramatically speeds up reads and writes of whole files when using the multi-scanline/tile API (though obviously it does not come into play when the app reads or writes a single scanline or tile at a time).
I tried to do this myself using libtiff -- for the common case of needing to read an entire image, my aim was to read all the raw strips (TIFFReadRawStrip, serially), then dole them out to different threads to decompress in parallel (and when writing, compress in parallel, then TIFFWriteRawStrip each one serially). If libIlmImf is any indication, doing this on our typical 12 or 16 core machines ought to speed up TIFF I/O by an order of magnitude, easily.
But my plan was thwarted because it sure seems that the API for the codecs is inherently stateful and non-reentrant.
Has anybody tried something like this? Any advice on how to compress/decompress multiple strips or tiles simultaneously? Is there some workaround, or perhaps I just don't understand the libtiff internals enough to see the solution?