2008.02.18 00:40 "[Tiff] TIFFRead errors on faxes", by Mick O'Neill

2008.02.18 06:40 "RE: [Tiff] TIFFRead errors on faxes", by Mick O'Neill

Thanks Frank - I think I've discovered exactly where the error is in the TIFF file - the StripByteCount value is actually an offset to the end of the strip (so the strip byte count would be this value - StripOffset + 1).

Now, my next question is: should there be a mod in libtiff to handle this, or do I need to do it outside the library? Inside the library itself, it is easy to determine that this is occurring - if StripOffset + StripByteCount overlaps a region of the file that is already claimed by another directory entry or its data, then it has occurred. Doing this outside of the library is not as simple, but also feasible, of course.
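A rough sketch of the outside-the-library approach, assuming the caller has already read the strip offset, the stored byte count, and the file size by some other means. It uses a simpler test than the overlap check described above (it only compares against the file size), and the function names are hypothetical, not part of libtiff. The "+ 1" follows the formula given earlier in this message and has not been verified against the files themselves.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the strip [offset, offset + byte_count) lies inside the file. */
bool strip_fits(uint64_t offset, uint64_t byte_count, uint64_t file_size)
{
    return offset <= file_size && byte_count <= file_size - offset;
}

/*
 * Hypothetical helper (not a libtiff function). If the stored
 * StripByteCounts value is implausible as a length but plausible as the
 * offset of the last byte of the strip, return the reinterpreted length.
 * Otherwise return the stored value unchanged.
 */
uint64_t repaired_strip_byte_count(uint64_t offset, uint64_t stored,
                                   uint64_t file_size)
{
    if (strip_fits(offset, stored, file_size))
        return stored;               /* already plausible, leave it alone */
    if (stored >= offset && stored < file_size)
        return stored - offset + 1;  /* treat the stored value as an end offset */
    return stored;                   /* some other kind of damage, do not guess */
}
```

With the values from page 2 of fax1.tif (offset 30797, stored count 68197, file size 68445), the stored count does not fit, so the helper returns 68197 - 30797 + 1 = 37401. A count that already fits is passed through untouched, so well-formed files are not affected.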

The product that generates these files, Faxman, has been around for quite a while and is reasonably popular. In particular, we have integrated their SDK with our ECM, so for us the problem is compounded. I need to develop a solution, as we and our clients will have literally tens of thousands of these files, which need to be rendered to PDF for archival and/or publishing.

I'm just after an opinion on which way I should attack this - is the solution WANTED as a part of libtiff, or should we just handle it in our own software?


Mick O'Neill

-----Original Message-----
From: Frank Warmerdam [mailto:warmerdam@pobox.com]
Sent: Monday, 18 February 2008 1:15 PM
To: Mick O'Neill

> Once again, I have some TIFF files that cannot be read. These TIFFs have come in through the Faxman faxing software, and when I try to read them through the 3.9.0 beta library, Page 2 reports a "Read error on line xxx: Expected bbb, got ccc" error. Brava Reader and Autovue can view these images ok.

> Sample tiff files available from http://midimick.com/temp/fax1.tif and http://midimick.com/temp/fax2.tif

I just looked at fax1.tif. The file is 68445 bytes in length. The first directory (page) looks fine. The second is:

Directory 1: offset 68247 (0x10a97) next 0 (0)
SubFileType (254) LONG (4) 1<0>
ImageWidth (256) SHORT (3) 1<1728>
ImageLength (257) SHORT (3) 1<2287>
BitsPerSample (258) SHORT (3) 1<1>
Compression (259) SHORT (3) 1<3>
Photometric (262) SHORT (3) 1<0>
FillOrder (266) SHORT (3) 1<2>
StripOffsets (273) LONG (4) 1<30797>
SamplesPerPixel (277) SHORT (3) 1<1>
RowsPerStrip (278) SHORT (3) 1<2287>
StripByteCounts (279) LONG (4) 1<68197>
XResolution (282) RATIONAL (5) 1<200>
YResolution (283) RATIONAL (5) 1<192>
Group3Options (292) LONG (4) 1<4>
ResolutionUnit (296) SHORT (3) 1<2>
Software (305) ASCII (2) 34<ImageMan by Data Techniq ...>

Note that the StripOffsets value is 30797 and the strip size is 68197. This would only be reasonable if the file was at least 68197+30797 = 98994 bytes in length, but in fact it is much shorter than that.

My conclusion is that the file is corrupt and was incorrectly constructed.

I suspect other packages are more forgiving of this problem and successfully decode the second page from the available data while it seems that libtiff is promptly giving up on the second page due to the io error.
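To illustrate what "decoding from the available data" could look like, here is a minimal sketch of a forgiving reader clamping the strip length to the bytes that actually exist in the file. This is an assumption about how such packages might behave, not a description of any particular product, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the number of strip bytes that can really be read,
 * given the declared offset and byte count and the actual file size.
 */
uint64_t available_strip_bytes(uint64_t offset, uint64_t byte_count,
                               uint64_t file_size)
{
    if (offset >= file_size)
        return 0;                            /* strip starts past end of file */
    uint64_t remaining = file_size - offset; /* bytes from offset to EOF */
    return byte_count < remaining ? byte_count : remaining;
}
```

For page 2 of fax1.tif this gives 68445 - 30797 = 37648 bytes, which would also take in the directory and tag data stored after the strip, so a decoder relying on it would need to stop at the end of the fax data on its own.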

I observe that tiff2rgba does handle the first page just fine.

BTW, packages that only operate on the first page will likely be fine.


and watch the world go round - Rush    | President OSGeo, http://osgeo.org