2017.06.27 20:51 "[Tiff] Excessive memory allocation while chopping strips", by Nicolas RUFF

2017.06.28 15:22 "Re: [Tiff] Excessive memory allocation while chopping strips", by Olivier Paquet

2017-06-28 10:36 GMT-04:00 Even Rouault <even.rouault@spatialys.com>:

  • It's not that simple (it should ideally have a few pairs of eyes go over it).

Reviews are always welcome. I'd be open to a more formal review process if that's needed, but it would require people to volunteer to review proposed patches in a timely manner. My feeling is that the libtiff development community has not been large and active enough for that lately.

Indeed. I barely have time to keep up with the mailing list myself. And we have yet to formally move to git, something which should have been done years ago.

  • It made reading large strips a little slower.

Agreed, although any memory copying caused by realloc() should still be significantly faster than the typical I/O acquisition time.

True. I have code that re-reads directories from memory, but I don't think it will hit your changes. In any case, I was just pointing out the non-triviality of these fixes.

We could short-circuit all of this non-trivial progressive allocation and reading logic in the case where we know the file size. I wanted something that would work without knowing the file size, but I agree that my use cases, where getting the file size is a slow operation, are not common. Since you suggest a TIFFOpen() flag for something else below, I wonder if we couldn't also have a TIFFOpen() flag meaning "calling TIFFGetFileSize() is a costly operation on this file handle, please do not do it unless strictly required".
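To make that idea concrete, here is a minimal sketch of the progressive-allocation approach in plain C, with illustrative placeholder names (read_fn and read_declared_size are not libtiff internals): the buffer grows in bounded chunks as data actually arrives, so a hostile file declaring a 4 GB strip only costs as much memory as the bytes it really contains.

    #include <stdint.h>
    #include <stdlib.h>

    #define CHUNK (1024 * 1024)  /* grow the buffer 1 MiB at a time */

    /* Placeholder for the underlying I/O callback; not a libtiff type. */
    typedef size_t (*read_fn)(void *ctx, void *buf, size_t n);

    /* Read up to 'declared' bytes through 'rd', growing the buffer in
       CHUNK-sized steps instead of trusting 'declared' up front.
       Returns the byte count actually read; *bufp receives the buffer
       (caller frees it), or NULL on allocation failure. */
    static size_t read_declared_size(read_fn rd, void *ctx,
                                     uint64_t declared, uint8_t **bufp)
    {
        uint8_t *buf = NULL;
        size_t got = 0, cap = 0;

        while (got < declared) {
            if (got == cap) {
                uint64_t want = declared - got;
                size_t grow = want < CHUNK ? (size_t)want : CHUNK;
                uint8_t *nbuf = realloc(buf, cap + grow);
                if (nbuf == NULL) { free(buf); *bufp = NULL; return 0; }
                buf = nbuf;
                cap += grow;
            }
            size_t n = rd(ctx, buf + got, cap - got);
            if (n == 0)
                break;  /* EOF before 'declared' bytes: the tag lied */
            got += n;
        }
        *bufp = buf;
        return got;
    }

When the file size is cheap to obtain, the whole loop can be short-circuited by clamping 'declared' to file_size - offset beforehand, which is effectively what the mmap'ed path already does.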

I think it best to do this in fewer different ways. If your new code ends up used only in the rare cases where getting the file size is expensive, bugs in it are less likely to be found. It's already bad enough that it is not used in the mmap case, which I think is the default. Even though your use case is not common, that is no reason to make bugs in the new code even harder to find.

  • It does not help with compressed strips, as I understand it.

Hum, it should apply to both uncompressed and compressed strips/tiles AFAIR. And this changeset doesn't apply to the mmap'ed case, since that one knows the file size and already validates the strip size against it.
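For reference, the validation that a known file size makes possible amounts to something like the following (illustrative names, not libtiff internals, though the mapped-file path performs an equivalent check):

    #include <stdint.h>

    /* A strip is plausible only if its declared extent lies entirely
       within the file.  Written this way to avoid overflow in
       offset + bytecount. */
    static int strip_fits_in_file(uint64_t offset, uint64_t bytecount,
                                  uint64_t file_size)
    {
        return offset <= file_size && bytecount <= file_size - offset;
    }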

I think you misunderstood me. I meant the problem that the uncompressed strip size can be quite large and, as far as I know, can't be validated without actually decompressing the data. A hard limit on strip size would cover that neatly, and sane applications will keep it reasonable anyway.
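Until such a limit exists in the library itself, an application can impose one, since TIFFStripSize() computes the decompressed strip size from the tags before any data is read. A minimal sketch, where the 256 MiB cap is an arbitrary example rather than a libtiff default:

    #include <tiffio.h>

    #define MAX_SANE_STRIP_SIZE (256 * 1024 * 1024)  /* arbitrary cap */

    /* Allocate and read one strip, refusing files whose declared
       decompressed strip size exceeds the cap. */
    static void *read_strip_checked(TIFF *tif, uint32 strip)
    {
        tmsize_t size = TIFFStripSize(tif);
        if (size <= 0 || size > MAX_SANE_STRIP_SIZE)
            return NULL;
        void *buf = _TIFFmalloc(size);
        if (buf != NULL && TIFFReadEncodedStrip(tif, strip, buf, size) == -1) {
            _TIFFfree(buf);
            buf = NULL;
        }
        return buf;
    }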

Most of the time, it seems people use the TIFF utilities (tiffcp, tiffsplit, tiff2pdf, etc.) for fuzzing. Most of the bugs are in the utilities themselves.

Then having them default to accepting only "sane" files should avoid a lot of false reports. People who actually hit a crappy file could always override that.

I didn't dare suggest setting stricter limits in libtiff by default (with a flag to retain the current behaviour, as you suggest), but that would certainly be better protection than opt-in mechanisms. I'm certainly +1 for that.

If we can't have both "security" and "reading crappy files from some obscure piece of software from the 90s", I think it's reasonable to pick the former as the default. I'd like to hear other people's thoughts on that.

Olivier