2017.05.22 19:34 "[Tiff] Support for fsync", by Roger Leigh

Hi folks,

While doing some profiling and performance testing for our libtiff-using application, I realised that we couldn't accurately profile the write times because libtiff never issues a full flush of the data to disc when you call TIFFClose. Unless you use TIFFFdOpen, you have no way to get at the open fd to issue the fsync yourself. And doing this cross-platform means doing all of that for Windows as well. This meant we could have several gigabytes of pending buffered IO which never contributed to the measured execution time.

This also has important implications for data integrity; it would be nice to have the option to checkpoint the data. While TIFFFlush[Data]() looks like it should fit the bill, and does write out the data, it *doesn't* (and, for performance reasons, likely shouldn't) also do an fsync/fdatasync to ensure that the data reached stable storage. TIFFFlush is more like fflush(3) than fsync(2), in that it submits the IO requests, but it doesn't make any guarantees that they actually completed.
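
For anyone less familiar with the fflush(3)/fsync(2) distinction, here is a minimal POSIX-only sketch using plain stdio rather than libtiff. The helper name write_durably is made up for illustration; only fopen, fwrite, fflush, fileno, fsync and fclose are real calls.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* fsync (POSIX) */

/* Hypothetical helper: write buf to path and wait until the kernel
 * reports the data as committed to the storage device.
 * Returns 0 on success, -1 on any failure. */
int write_durably(const char *path, const char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    int ok = fwrite(buf, 1, len, fp) == len;

    /* Step 1: fflush only hands the user-space buffer to the kernel.
     * The data may still be sitting in the page cache afterwards. */
    ok = ok && fflush(fp) == 0;

    /* Step 2: fsync blocks until the kernel reports the file's data
     * as written to the device, or returns an error. */
    ok = ok && fsync(fileno(fp)) == 0;

    /* fclose can also fail, so its result counts too. */
    if (fclose(fp) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Timing a write with and without step 2 is where the difference shows up: without it, the call can return long before the IO has completed.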

Would it be worth considering the addition of a new function, e.g. TIFFSync[Data], which would simply call fsync(2) or its Windows equivalent on the open file? It could potentially also call TIFFFlush internally to be sure it's committing a consistent state. This would mean I can issue TIFFSync() prior to TIFFClose, or TIFFSyncData as I write out each IFD, and be sure that the data writes have completed (or failed) when the call returns.
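
To make the idea concrete, here is a rough sketch of the platform-specific part, operating on a plain C runtime file descriptor. The name portable_sync_fd is hypothetical and nothing here is existing libtiff API; it assumes fsync(2) on POSIX systems and the CRT's _commit() on Windows.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <io.h>       /* _commit */
#else
#include <unistd.h>   /* fsync */
#endif

/* Hypothetical helper, not part of libtiff: ask the OS to commit all
 * pending writes for fd to stable storage.
 * Returns 0 on success, -1 on failure (errno is set). */
int portable_sync_fd(int fd)
{
    assert(fd >= 0);   /* caller must pass an open descriptor */

#ifdef _WIN32
    int rc = _commit(fd);   /* CRT counterpart of fsync */
#else
    int rc = fsync(fd);
#endif

    /* A failure here means durability is NOT guaranteed. */
    if (rc != 0)
        perror("portable_sync_fd");
    return rc;
}
```

With a descriptor the application opened itself and handed to TIFFFdOpen, the checkpoint would then be TIFFFlush(tif) followed by portable_sync_fd(fd) before TIFFClose. One caveat, as far as I understand it: libtiff's Win32 I/O layer treats the value passed to TIFFFdOpen as a HANDLE rather than a CRT descriptor, so FlushFileBuffers would be the call to use there instead of _commit.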

Any other thoughts?

Roger