TIFF and LibTiff Mailing List Archive
October 1999

Archive maintained by AWare Systems






Thread

1999.10.04 17:05 "libtiff problems with Group 4", by Joel Schumacher
1999.10.05 09:13 "Re: libtiff problems with Group 4", by Frank D Cringle
1999.10.06 22:33 "Re: libtiff problems with Group 4", by Kevin D Quitt
1999.10.07 10:41 "Re: libtiff problems with Group 4", by Frank D Cringle
1999.10.07 15:02 "Re: libtiff problems with Group 4", by Frank D Cringle

1999.10.04 17:05 "libtiff problems with Group 4", by Joel Schumacher

Dear Sam Leffler,

We currently have an imaging project to scan and store documents as
TIFF files.  To make a long story short, we're trying to OCR them and
our OCR company says there are problems with some of our TIFF images,
and so does tiffinfo.

I turned on DEBUG mode and recompiled tiffinfo to get it to show exactly
what it's decoding, then ran it with the -D option on a problem file:

0000001F/9: V0         0        1
0000000F/8: V0         0        1
00000007/7: V0         0        1
00002003/14: V0         0       1
00001001/13: V0         0       1
00000800/12: EOL        0       0000000
Fax4Decode: 169439.tif: Bad code word at scanline 3291 (x 0).
Fax4Decode: Warning, 169439.tif: Premature EOL at scanline 3291 (got 0, expected 2544).
00000010/13: VL         2       000010
00000000/7: EOL        0        0000000
Fax4Decode: 169439.tif: Bad code word at scanline 3292 (x 2542).
Fax4Decode: Warning, 169439.tif: Premature EOL at scanline 3292 (got 2542, expected 2544).
00000008/8: Pass       0        0001
00000000/12: EOL        0       0000000
Fax4Decode: Warning, 169439.tif: Premature EOL at scanline 3294 (got 0, expected 2544).
00000000/7: EOL        0        0000000
Fax4Decode: Warning, 169439.tif: Premature EOF at scanline 3295 (x 0).
Fax4Decode: Warning, 169439.tif: Premature EOL at scanline 3295 (got 0, expected 2544).

At the point where it's failing (the last 16 bytes of the data), we see
this:

21fb0 - 21fbf : FF FF FF FF FF FF FF FF FF FF FF F0 01 00 10 00

The problem seems to be at the F0 01 00 10.  In binary, that's

    1111 0000 0000 0001 0000 0000 0001 0000

So the leading 1 bits are V0 codes, and then you get to the
000000000001 000000000001 where it fails.
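
For anyone who wants to reproduce the binary expansion above, here is a
minimal sketch. It assumes the bytes are read most significant bit first
(the usual FillOrder of 1); the helper name to_bit_string is made up for
this illustration and is not part of libtiff.

```python
def to_bit_string(data: bytes, group: int = 4) -> str:
    """Render bytes MSB-first as a bit string, split into groups for reading."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return " ".join(bits[i:i + group] for i in range(0, len(bits), group))


# The four bytes quoted from the hex dump in the message.
tail = bytes.fromhex("F0 01 00 10")
print(to_bit_string(tail))
```

The printed groups can be compared directly against the binary line in
the message.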

This is an EOFB (end of facsimile block) codeword.  It is a 24-bit
codeword defined as follows:
   000000000001000000000001

As you can see, it's not interpreting this properly.  It sees 7 0's
in a row and reports it as an invalid codeword.

The EOFB is defined in section 2.4.1.1 of Recommendation T.6 and is
also described on page 52 of the TIFF 6.0 spec.
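
As a rough diagnostic, the EOFB described above can be located by
searching the tail of the strip data for the 24-bit pattern. This is
only a sketch for inspecting a hex dump, not a decoder: it assumes
MSB-first bit order, it does not track codeword boundaries, and the
function names find_eofb and only_pad_bits_after are hypothetical.

```python
EOL = "000000000001"   # 12-bit end-of-line code
EOFB = EOL + EOL       # 24-bit end-of-facsimile-block, as quoted in the message


def _bits(data: bytes) -> str:
    """Expand bytes into a string of '0'/'1' characters, MSB first."""
    return "".join(f"{byte:08b}" for byte in data)


def find_eofb(data: bytes) -> int:
    """Return the bit offset of the first EOFB pattern, or -1 if absent."""
    return _bits(data).find(EOFB)


def only_pad_bits_after(data: bytes, offset: int) -> bool:
    """True if everything after the EOFB at `offset` is zero padding."""
    return "1" not in _bits(data)[offset + len(EOFB):]


# The last 16 bytes of the strip, copied from the hex dump in the message.
tail = bytes.fromhex("FF FF FF FF FF FF FF FF FF FF FF F0 01 00 10 00")
offset = find_eofb(tail)
print(offset, only_pad_bits_after(tail, offset))
```

On the 16 bytes from the dump, the pattern starts right after the run of
1 bits and is followed only by zero bits, which is consistent with an
EOFB plus padding.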

So it would seem we're encountering an end of page before we've
decoded the 3300 lines that RowsPerStrip and ImageLength define.
It looks like we've decoded 3290 lines and then run into an EOFB.
This is not necessarily a problem, as the TIFF 6.0 spec states on
page 53:
                                                                                
   If a TIFF reader encounters EOFB before the expected number of               
   lines has been extracted, it is appropriate to assume that the               
   missing rows consist entirely of white pixels.  Cautious readers             
   might produce an unobtrusive warning if such an EOFB is followed             
   by anything other than pad bits.                                             
                                                                                
   Readers that successfully decode the RowsPerStrip (or TileLength             
   or residual ImageLength) number of lines are not required to                 
   verify that an EOFB follows.  That is, it is generally appropriate           
   to stop decoding when the expected lines are decoded or the EOFB             
   is detected, whichever occurs first.  Whether error indications or           
   warnings are also appropriate depends upon the application and               
   whether more precise troubleshooting of encoding deviations is               
   important.
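
To make the quoted behaviour concrete, here is a small sketch of what a
reader might do once it hits an early EOFB: keep the rows it has and
fill the remainder with white. It is an illustration under stated
assumptions, not libtiff's code: rows are packed 1-bit-per-pixel bytes,
white is assumed to be the zero bit (this depends on
PhotometricInterpretation), and fill_missing_rows is a made-up name.

```python
def fill_missing_rows(rows, expected_rows, width, white_byte=0x00):
    """Pad a list of packed 1-bit rows with all-white rows.

    Returns the padded list and the number of rows that were added, so
    the caller can decide whether a warning is appropriate.
    """
    row_bytes = (width + 7) // 8                 # bytes per packed row
    missing = max(0, expected_rows - len(rows))  # never negative
    white_row = bytes([white_byte]) * row_bytes
    return list(rows) + [white_row] * missing, missing


# Figures from the message: 3290 rows decoded, 3300 expected, 2544 pixels wide.
decoded = [bytes(318)] * 3290                    # placeholder row data
image, added = fill_missing_rows(decoded, 3300, 2544)
print(len(image), added)
```

Returning the count of added rows leaves the warn-or-stay-silent choice
to the application, as the spec text suggests.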

That said, it doesn't seem like you're even recognizing the EOFB,
much less objecting to it arriving before the expected number of
lines has been decoded.

Can you verify that this is indeed a problem with libtiff?

And do you also interpret the EOFB coming early as something that
shouldn't be a problem and should be handled?
______________________________________________________________________
Joel Schumacher                    JCPenney Co. - UNIX Network Systems
jschumac@uns-dv1.jcpenney.com      12700 Park Central Pl   M/S 6021
(972) 591-7543                     Dallas TX  75251