2018.06.29 10:08 "[Tiff] Writing tiles out of order", by Roger Leigh

2018.06.29 11:27 "Re: [Tiff] Writing tiles out of order", by Roger Leigh

On 29/06/18 11:27, Even Rouault wrote:

On Friday 29 June 2018 11:08:00 CEST Roger Leigh wrote:

Right now I'm using libtiff to write tiled images completely in order: each tile of each plane in ascending order, adding new IFDs as needed. One of my users has asked if it's possible to stripe writes across directories. That is, for example, to write Tile 0 of IFDs 0-12, then Tile 1 of IFDs 0-12 etc.
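To make the two write orders concrete, here is a minimal sketch in plain C that maps a running write step to an (IFD, tile) pair for each scheme. The `write_target` type and both function names are hypothetical helpers for illustration only; they are not part of libtiff, and the sketch assumes every IFD has the same number of tiles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of libtiff. */
typedef struct {
    size_t ifd;   /* directory index */
    size_t tile;  /* tile index within that directory */
} write_target;

/* In-order writing: all tiles of IFD 0, then all tiles of IFD 1, ... */
static write_target sequential_target(size_t step, size_t n_tiles)
{
    assert(n_tiles > 0);
    write_target t = { step / n_tiles, step % n_tiles };
    return t;
}

/* Striped writing: tile 0 of every IFD, then tile 1 of every IFD, ... */
static write_target striped_target(size_t step, size_t n_ifds)
{
    assert(n_ifds > 0);
    write_target t = { step % n_ifds, step / n_ifds };
    return t;
}
```

With 13 IFDs (0-12), steps 0 to 12 of the striped order touch tile 0 of each IFD in turn, and step 13 returns to IFD 0 for tile 1. Every step therefore lands in a different IFD from the previous one, which is why the cost of switching directories matters here.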

Looking at the API, it's possible to visit previously written IFDs, but you have to call TIFFRewriteDirectory, which gets expensive to duplicate when you have 10k-100k tiles per image and will be rewriting that many times over. If I reopen an old IFD and then call TIFFWrite*Tile, should I expect this to work, and update the existing IFD tile offsets+sizes in place, or is this strictly forbidden?

Update in place, and switching between IFDs, works. GDAL does this in some scenarios. But as you pointed out, if your IFDs have many tiles/strips, loading and rewriting the offset and byte count arrays each time you switch IFDs can be expensive in time.
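A back-of-envelope estimate of that cost can be sketched as follows. It assumes that every switch loads and then rewrites both arrays in full, that one switch happens per tile written (the striped order), and that each array entry is 4 bytes (classic TIFF LONG) or 8 bytes (BigTIFF LONG8). The function names are hypothetical, and the result is a rough upper-level figure, not a measurement of what libtiff actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes occupied by the offset array plus the byte count array of one IFD. */
static uint64_t stripe_arrays_bytes(uint64_t n_tiles, uint64_t entry_size)
{
    return 2u * n_tiles * entry_size;
}

/* Rough bytes read + written on the arrays alone when switching IFD
 * before every tile write (assumption: full load and full rewrite). */
static uint64_t striped_switch_io_bytes(uint64_t n_ifds, uint64_t n_tiles,
                                        uint64_t entry_size)
{
    assert(entry_size == 4 || entry_size == 8);
    uint64_t switches = n_ifds * n_tiles;   /* one switch per tile written */
    return switches * 2u * stripe_arrays_bytes(n_tiles, entry_size);
}
```

Under these assumptions, 13 IFDs of 10,000 tiles each with 8-byte entries come to roughly 41.6 GB of array traffic, and the figure grows with the square of the tile count, so 100,000 tiles per IFD would be about a hundred times worse.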

Just to clarify, this is only time [I/O] expensive when reading and updating the tile offsets+sizes?

Or is it also wasteful of space by duplicating the whole IFD each time I do an update?

Looking at GDAL, I see you calling TIFFRewriteDirectory in frmts/gtiff/geotiff.cpp. If you're only writing additional tiles, would it be sufficient to call TIFFCheckpointDirectory to update the already written directory in place? Or will this only work before the first call to TIFFWriteDirectory? Or would TIFFWriteDirectory be possible to use here? (Apologies for the confusion, this seems a bit of a non-standard thing to do and it's not covered by the documentation.)

If performance is terrible, what you could potentially do is create the structure of the TIFF (i.e. all IFDs but without writing any tile/strip), close it, and then re-open it in update mode as many times as you have IFDs, and use a given TIFF handle for a given IFD. Multiple handles in update mode on the same file should probably be considered undefined behaviour, but I believe (untested though!) that given the current implementation of libtiff, that should work, since it uses TIFFSeekFile(tif, 0, SEEK_END) to get the offset of a new tile/strip (assuming the implementation of seek() you use actually seeks in the file, and doesn't cache the file size). Of course, do not write in a multi-threaded way...

Thanks, I'll investigate this too. I've already wrapped libtiff in a mutex so should be relatively safe... On the other hand, for large images with multiple planes for Z/T/C, I might run out of file descriptors!
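Whether one handle per IFD fits within the process limit can be checked up front. The following is a minimal sketch for POSIX systems using `getrlimit(RLIMIT_NOFILE)`; the function name `handles_fit` and the `reserve` parameter (descriptors kept back for everything else the process has open) are hypothetical, and the sketch assumes each TIFF handle costs exactly one file descriptor.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Returns 1 if n_handles descriptors fit under the soft RLIMIT_NOFILE
 * while leaving `reserve` descriptors free, 0 if they do not,
 * and -1 if the limit could not be queried. */
static int handles_fit(size_t n_handles, size_t reserve)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    if (rl.rlim_cur < (rlim_t)reserve)
        return 0;
    return (rlim_t)n_handles <= rl.rlim_cur - (rlim_t)reserve;
}
```

For an image with Z x T x C planes, `n_handles` would be the product of the three dimensions. If the check fails, one option is to keep only a bounded pool of handles open at a time and accept the directory-switching cost for the rest.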

Thanks,

Roger