2021.12.16 09:01 "[Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

2021.12.16 20:23 "Re: [Tiff] Ensuring buffer alignment when creating tiff files", by Milian Wolff

So, do these two things mean that a patch would be accepted upstream to make libtiff write the buffers in an aligned fashion? What would that API look like? For our purposes, a minimal non-optional API that always ensures the strip buffer offsets are BitsPerSample-aligned would be enough. Would that be acceptable upstream? Or does it have to be user-configurable? If so, could you please give me a rough outline of your expectations? Then I will work on this and prepare a patch.

In my opinion, if this is done, it should be done by default by libtiff without any extra API exposure. Files written by libtiff would then be a bit larger.

Great, then I'll look into this and prepare a patch accordingly.
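To make the alignment requirement above concrete, here is a minimal sketch of the padding arithmetic. It is not libtiff code: the helper names (sample_alignment, align_up, strip_padding) are hypothetical, and it assumes "BitsPerSample-aligned" means aligned to the sample size rounded up to whole bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one sample of the given bit depth (e.g. 16 -> 2).
 * Assumption: alignment is the sample size rounded up to whole bytes. */
static uint64_t sample_alignment(uint16_t bits_per_sample)
{
    assert(bits_per_sample > 0);
    return ((uint64_t)bits_per_sample + 7) / 8;
}

/* Round offset up to the next multiple of alignment. */
static uint64_t align_up(uint64_t offset, uint64_t alignment)
{
    assert(alignment > 0);
    uint64_t remainder = offset % alignment;
    return remainder == 0 ? offset : offset + (alignment - remainder);
}

/* Number of padding bytes to write before a strip so that it starts at an
 * aligned file offset. This is the "a bit larger" cost per strip. */
static uint64_t strip_padding(uint64_t offset, uint16_t bits_per_sample)
{
    return align_up(offset, sample_alignment(bits_per_sample)) - offset;
}
```

Under this assumption the padding is at most one byte less than the sample size per strip, so the file size overhead stays small.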

If there is more than one file, then there will be another copy. If there are multiple applications reading from the same file at approximately the same time, then mmap will provide a win if the system memory is large enough.

This is unrelated to the issue at hand, but it piques my interest nevertheless: Can you please expand on this "another copy"? We have some files A, B, C, and then each has potentially multiple image buffers, say A1, A2,... Then we create read-only mmaps for these buffers, say M_A1, M_A2,..., M_B1,... On access we'll trigger a page fault that will make the kernel load the data from the disk to fill the page mapping. But there's only going to be one single "copy" of a given file page here in this scenario. Where is the "other copy"?
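For reference, the read-only mapping scenario described above could look roughly like the following POSIX sketch. The strip_view type and the map_strip/unmap_strip helpers are hypothetical names for illustration, not part of libtiff or of our code base. Since mmap() only accepts page-aligned file offsets, the sketch maps from the start of the containing page and points into it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: a read-only view of one image buffer in a file. */
typedef struct {
    void *base;          /* page-aligned start of the mapping, for munmap() */
    size_t mapped_len;   /* full mapping length, for munmap() */
    const uint8_t *data; /* first byte of the requested buffer */
} strip_view;

/* Map len bytes starting at file offset `offset` read-only.
 * Returns 0 on success, -1 on failure. */
static int map_strip(const char *path, off_t offset, size_t len, strip_view *out)
{
    assert(path != NULL && out != NULL && len > 0 && offset >= 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* mmap() requires a page-aligned file offset, so map from the start of
     * the containing page and remember how far into it the buffer begins. */
    off_t slack = offset % (off_t)page;
    size_t mapped_len = len + (size_t)slack;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    void *base = mmap(NULL, mapped_len, PROT_READ, MAP_PRIVATE, fd, offset - slack);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    out->base = base;
    out->mapped_len = mapped_len;
    out->data = (const uint8_t *)base + slack;
    return 0;
}

/* Remove the mapping from this process; the kernel may keep the pages cached. */
static void unmap_strip(strip_view *view)
{
    assert(view != NULL);
    if (view->base != NULL)
        munmap(view->base, view->mapped_len);
    view->base = NULL;
    view->data = NULL;
    view->mapped_len = 0;
}
```

No data is read when map_strip() returns; the first access through view.data triggers the page fault that makes the kernel load the page from disk.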

Doing a munmap() only removes the memory mapping from the current process address space. It does not remove anything from system memory.

Sure, but that isn't another copy. If anything, it means the now-useless data remains in memory a bit longer. Furthermore, once memory pressure rises, these pages can easily be repurposed, which is what happens nicely on all systems. And this works, for read-only pages, even without the munmap, since the mapping can be recreated as needed later.

Linux and macOS cope well without the madvise in our case. Eviction rates are also never an issue with what we are throwing at the system. It's only Windows that is problematic in some cases, due to its bad eviction of dirty pages. But clean read-only mapped segments from tiff files or other formats that are directly mmappable are mostly fine in our experience.

That is good if it is your experience. In the past I have seen bad things happen (e.g. process uses hardly any CPU and sleeps much of the time) when using mmap to read many files quickly. The page eviction is often done by a single-threaded kernel thread which needs to consult information maintained by the MMU and use that data to decide which pages to retire based on how long it has been since they were last used, and based on current memory pressure.

We've been testing our software on mid/low end laptops sporting 2 CPU cores that are ~7 years old up to relatively modern Ryzen CPUs with 12 cores. I have not seen any contention issues yet, so either it's simply modern hardware and software being faster, or the effect is small enough to not be noticeable in the grand scheme of things for us.

Cheers

Milian Wolff | milian.wolff@kdab.com | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts