AWARE [SYSTEMS] Imaging expertise for the Delphi developer
AWare Systems, Imaging expertise for the Delphi developer, Home TIFF and LibTiff Mailing List Archive

LibTiff Mailing List

TIFF and LibTiff Mailing List Archive
November 2007

Previous Thread
Next Thread

Previous by Thread
Next by Thread

Previous by Date
Next by Date

Contact

The TIFF Mailing List Homepage
This list is run by Frank Warmerdam
Archive maintained by AWare Systems



Valid HTML 4.01!



2007.11.02 20:26 "Recent bug fixes in CVS head", by Frank Warmerdam

Folks,

I've made some non-trivial bug fixes in libtiff CVS head that I'm a bit
nervous about, so I'm reporting them here.  Basically, there was a bug
reported with GDAL updating a predictor=2 based LZW file "in place".
In working through this I found a number of problems.   Some core problems
with the predictor stuff not working in mixed read/write access on a file.
But also problems with the rewrite tiles in place logic (easy to
corrupt the file), and a weird issue in tif_dirwrite.c with BEENWRITING
being set, but if the last action was a read then _rawcc was nonzero
and the code would append this read-buffer data under the assumption it
was an unflushed write buffer.

The changelog entries look like this:

2007-11-02  Frank Warmerdam  <warmerdam@pobox.com>

	* tif_write.c: Rip out the fancy logic in TIFFAppendToStrip() for
	establishing if an existing tile can be rewritten to the same location
	by comparing the current size to all the other blocks in the same
	directory.  This is dangerous in many situations and can easily
	corrupt a file.  (observed in esoteric GDAL situation that's hard to
	document).  This change involves leaving the stripbytecount[] values
	unaltered till TIFFAppendToStrip().  Now we only write a block back
	to the same location it used to be at if the new data is the same
	size or smaller - otherwise we move it to the end of file.

	* tif_dirwrite.c: Try to avoid writing out a full readbuffer of tile
	data when writing the directory just because we have BEENWRITING at
	some point in the past.  This was causing odd junk to be written out
	in a tile of data when a single tile had an interleaving of reading
	and writing with reading last.  (highlighted by gdal
	autotest/gcore/tif_write.py test 7.

	* tif_predict.c: use working buffer in PredictorEncodeTile to avoid
	modifying callers buffer.
	http://trac.osgeo.org/gdal/ticket/1965

	* tif_predict.c/h: more fixes related to last item, keeping a
	distinct pfunc for encode and decode cases as these were getting
	mixed up sometimes.
	http://trac.osgeo.org/gdal/ticket/1948

2007-11-01  Frank Warmerdam  <warmerdam@pobox.com>

	* tif_predict.c/h, tif_lzw.c, tif_zip.c: Improvements so that
	predictor based encoding and decoding works in read-write update
	mode properly.
	http://trac.osgeo.org/gdal/ticket/1948


Andrey, I think you wrote the "can we write a tile back to it's old
location" logic in TIFFAppendToStrip() that looked like this:

	if (td->td_stripoffset[strip] == 0 || tif->tif_curoff == 0) {
		/*
		 * No current offset, set the current strip.
		 */
		assert(td->td_nstrips > 0);
		if (td->td_stripoffset[strip] != 0) {
			/*
			 * Prevent overlapping of the data chunks. We need
			 * this to enable in place updating of the compressed
			 * images. Larger blocks will be moved at the end of
			 * the file without any optimization of the spare
			 * space, so such scheme is not too much effective.
			 */
			if (td->td_stripbytecountsorted) {
				if (strip == td->td_nstrips - 1
				    || td->td_stripoffset[strip + 1] <
				    td->td_stripoffset[strip] + cc) {
					td->td_stripoffset[strip] =
					    TIFFSeekFile(tif,0, SEEK_END);
					td->td_stripbytecountsorted = 0;
				}
			} else {
				uint32 i;
				for (i = 0; i < td->td_nstrips; i++) {
					if (td->td_stripoffset[i] >
					    td->td_stripoffset[strip]
					    && td->td_stripoffset[i] <
					    td->td_stripoffset[strip] + cc) {
						td->td_stripoffset[strip] =
						    TIFFSeekFile(tif, 0, SEEK_END);
					}
				}
			}

			if (!SeekOK(tif, td->td_stripoffset[strip])) {
				TIFFErrorExt(tif->tif_clientdata, module,
				    "Seek error at scanline %lu",
				    (unsigned long)tif->tif_row);
				return (0);
			}
		} else
			td->td_stripoffset[strip] = TIFFSeekFile(tif, 0, SEEK_END);
		tif->tif_curoff = td->td_stripoffset[strip];
	}

Based on problems I was experiencing, and an inspection of the code this
seems like very risky logic.  In particular, the collision checking only
checks against other blocks from the same image - not against directory
contents, or blocks of other images in the same file.  Also, the
"compare to every other block extent" logic could become prohibabitively
expensive in large files with many strips/tiles.  So I stripped this out
and replaced it with:

	if (td->td_stripoffset[strip] == 0 || tif->tif_curoff == 0) {
             assert(td->td_nstrips > 0);

             if( td->td_stripbytecount[strip] != 0
                 && td->td_stripoffset[strip] != 0
                 && td->td_stripbytecount[strip] >= (uint64) cc )
             {
                 /*
                  * There is already tile data on disk, and the new tile
                  * data we have to will fit in the same space.  The only
                  * aspect of this that is risky is that there could be
                  * more data to append to this strip before we are done
                  * depending on how we are getting called.
                  */
                 if (!SeekOK(tif, td->td_stripoffset[strip])) {
                     TIFFErrorExt(tif->tif_clientdata, module,
                                  "Seek error at scanline %lu",
                                  (unsigned long)tif->tif_row);
                     return (0);
                 }
             }
             else
             {
                 /*
                  * Seek to end of file, and set that as our location to
                  * write this strip.
                  */
                 td->td_stripoffset[strip] = TIFFSeekFile(tif, 0, SEEK_END);
             }

             tif->tif_curoff = td->td_stripoffset[strip];

             /*
              * We are starting a fresh strip/tile, so set the size to zero.
              */
             td->td_stripbytecount[strip] = 0;
	}

This logic requires that stripbytecount[] comes into TIFFAppendToStrip()
unaltered (it was previously being reset to zero in TIFFWriteEncodedStrip/Tile)
and only rewrites a tile in the same location if the new tile data is the
same size or smaller.  I'm pretty sure this is how this code worked at one
point years ago, yet you saw some reason to substantially restructure it.
Can you recall why you did, and whether you see a problem with the approach
I have implemented.

So far I have only made the changes in libtiff CVS HEAD (libtiff4), but I
think some/most of the changes will need to go back into libtiff 3.9 branch
before the 3.9.0 release.  I'm hoping for some feedback and testing before then
though.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org