2000.11.15 19:19 "Has something changed with error detection in group4 data?", by Randall Myers
Hello TIFF folks,
I have a problem I hope you can help with.
I recently began integrating libtiff 3.5.5 into a product suite that has been around for several years under Unix and NT. The previous production library was 3.4beta029, and I've used a variety of earlier ones as well. Testing 3.5.5 revealed an apparent tolerance for errors in group4 data that I had not seen before.
I occasionally run across corrupt group4 (single-strip) TIFF images in the input stream, and I have scripted a simple screening procedure that employs a slightly modified tiffinfo utility with the -D switch (to read the data). If tiffinfo has a problem decoding the data, there will be a call to the error/warning reporting function, and a non-zero exit code enables the script to take some action.
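For anyone who wants to reproduce the screening step, here is a rough sketch of the idea in Python rather than my actual script. It assumes a tiffinfo on the PATH that, like my modified build, exits non-zero when the error/warning handler has been called (I would not count on an unmodified tiffinfo doing so). The names screen_image and screen_all are made up for illustration.

```python
import subprocess


def screen_image(path, tool=("tiffinfo", "-D")):
    """Run the decoder on one image and return (ok, diagnostics).

    Assumption: the tool exits non-zero when it hit a decode
    error or warning, as the modified tiffinfo does.
    """
    try:
        result = subprocess.run(
            [*tool, path],
            stdout=subprocess.DEVNULL,   # only the diagnostics matter here
            stderr=subprocess.PIPE,
            text=True,
            errors="replace",            # tolerate odd bytes in messages
        )
    except OSError as exc:
        # Tool missing or not executable: treat as a failed screening.
        return False, str(exc)
    return result.returncode == 0, result.stderr.strip()


def screen_all(paths, tool=("tiffinfo", "-D")):
    """Return [(path, diagnostics), ...] for every image that failed."""
    rejected = []
    for path in paths:
        ok, diagnostics = screen_image(path, tool)
        if not ok:
            rejected.append((path, diagnostics))
    return rejected
```

Calling screen_all on a batch of file names returns the images to set aside, along with whatever the decoder wrote to stderr. Passing a different tool tuple (for example, the path to a tiffinfo built against another libtiff revision) makes it easy to compare what each build reports for the same file.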
But tiffinfo built with the 3.5.5 library doesn't seem to notice all of the errors that are detected when I use an older library. In the worst case, the 3.5.5 library will decode a group4 image with no complaint at all when older libraries will issue errors and warnings and when Wang/Kodak Imaging will flatly refuse to deal with the image.
I have built all of the versions of libtiff I could find at ftp.onshore.com and experimented with each of them. I have to go back to 3.4beta035 to get the behavior I am used to seeing (and, FWIW, 3.5.4, 3.5.3, and 3.5.2 just dump core when I read one of these corrupted images). Beginning with 3.4beta036, I see fewer errors reported (and occasionally, as mentioned, NO error reporting where other readers do find errors). I have built vanilla libtiff under NT and on two different Unix architectures, and I have avoided compiler optimizations. All that seems to matter is the revision of the library.
Documentation for 3.4beta036 says these things which might be relevant:
> a bug was fixed in the G3/G4 decoder for data where lines terminate with a v0 code
> the routines for reading image data now provide more useful information when a read error is encountered
So far I am unable to supply example images that demonstrate the worst case (no errors detected when they clearly exist), because all examples I've found contain readable confidential content. However, I have one partial example for which there is no such issue, and it is published at http://www.echolake.com/example/example1.tif. Group4 decode errors are reported for this image by both older and current libraries, but the errors occur in different places within the data, and the old libraries report many more of them than the newer libraries do.
In the past I have been able to correlate the behavior (with regard to corrupt images) of "tiffinfo -D" and that of the readers which our customers use (Microsoft Imaging). Is this a correlation I should be able to rely upon? I have found other viewers which will display these images, but it's not because they succeed in decoding them correctly where libtiff does not. These images, past a certain point, are just garbage.
I am daunted by the g4 codec and would greatly appreciate any help or suggestions you might offer.
Data General Imaging Development