2019.04.24 13:20 "Re: [Tiff] TIFFWriteScanLine - buffers to RAM before flushing to disc?", by Bob Friesenhahn

and set up a buffer to hold a single scanline, which I reuse:

uchar *buf = (uchar *)_TIFFmalloc(scanlineSize);

I then loop over each row in my OpenCV matrix and copy a line to that buffer. The "line" is ~150k pixels wide; "rows" is much smaller in this case, say 2500.

for (size_t y = 0; y < myOpenCV.rows; y++) {
    memcpy(buf, myOpenCV.data + myOpenCV.step * y, scanlineSize);
    TIFFWriteScanline(composite_tiff, buf, y, 0);
}

Why allocate a buffer with _TIFFmalloc() and memcpy() each row into it when you could pass the row address (myOpenCV.data + myOpenCV.step * y) directly as 'buf' to TIFFWriteScanline()?
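
To illustrate the copy-free approach, here is a minimal sketch of the row-address arithmetic. It does not link against LibTIFF or OpenCV: StridedImage, rowPtr() and writeAllRows() are hypothetical names invented for this example, and the callback stands in for whatever actually writes a row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a strided image such as an OpenCV matrix:
// `data` points at the first byte and `step` is the distance in bytes
// between the starts of consecutive rows.
struct StridedImage {
    std::uint8_t *data;
    std::size_t step;
    int rows;
};

// Address of row y. This is the pointer that could be handed straight to
// the scanline writer instead of copying the row into a scratch buffer.
inline std::uint8_t *rowPtr(const StridedImage &img, int y) {
    assert(y >= 0 && y < img.rows);
    return img.data + img.step * static_cast<std::size_t>(y);
}

// Calls writeRow(ptr, y) once per row without copying any pixel data.
// Stops at the first row for which writeRow returns false and returns
// the number of rows written successfully.
template <typename WriteRow>
int writeAllRows(const StridedImage &img, WriteRow writeRow) {
    int written = 0;
    for (int y = 0; y < img.rows; ++y) {
        if (!writeRow(rowPtr(img, y), y)) {
            break;
        }
        ++written;
    }
    return written;
}
```

With LibTIFF the callback would presumably wrap TIFFWriteScanline(composite_tiff, p, y, 0), which returns 1 on success and -1 on error. This assumes each row occupies at least scanlineSize contiguous bytes starting at its row address, which the memcpy() in the original loop already relies on.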

What strip size is being used for your TIFF? Are you enabling any TIFF predictors which might require re-organizing the data prior to it being written?

You said that compression is not being used. If compression is enabled, then buffering of the compressed data may be required, and some compressors might suffer from arbitrary limits.

If I put a breakpoint before and after the loop, I see the memory grow with each call to TIFFWriteScanline(). Just before TIFFClose(), the file still shows as 0 KB in Windows.

How are you observing/measuring this memory growth?

TIFFClose() can take several seconds, and then the file in Windows shows the expected size and the memory drops all the way back.

In my limited experience, Microsoft Windows is not very good at reporting the properties of files which are currently being written.

If LibTIFF is memory mapping the file, that would support what I'm seeing. Is there a way to turn that off?

LibTIFF only (optionally) uses memory mapping when reading from a file. The reason is that reading a TIFF file may require a lot of random access, and memory mapping makes that random access appear almost "free". However, simply memory mapping a file does not bring any data into memory. Accessing memory that corresponds to a not-yet-touched portion of the mapped file causes a page fault: the kernel suspends the program, reads the needed memory pages (e.g. 4 KB at a time) from the file, and then lets the program continue running.
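
The on-demand behaviour can be seen in a small POSIX sketch (not LibTIFF code, and not applicable as-is on Windows, where the file-mapping API differs). readByteViaMmap() is a hypothetical helper written for this example: it maps a whole file read-only, which by itself reads nothing, and then touches a single byte, which is what makes the kernel page that part of the file in.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Maps `path` read-only and returns the byte at `offset` (0..255),
// or -1 if the file cannot be opened, is empty, or is too short.
int readByteViaMmap(const char *path, std::size_t offset) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0 ||
        offset >= static_cast<std::size_t>(st.st_size)) {
        close(fd);
        return -1;
    }
    const std::size_t len = static_cast<std::size_t>(st.st_size);

    // Establishing the mapping only reserves address space; no file data
    // is required to be read at this point.
    void *base = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (base == MAP_FAILED) {
        return -1;
    }

    // First access to this page: if it is not yet resident, the kernel
    // handles the page fault by reading it from the file, then resumes us.
    int value = static_cast<const unsigned char *>(base)[offset];

    munmap(base, len);
    return value;
}
```

Reading one byte near the end of a large file this way should only pull in the page or pages around that offset, not the whole file, although the kernel is free to read ahead.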

Bob
--
Bob Friesenhahn
bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

Public Key, http://www.simplesystems.org/users/bfriesen/public-key.txt