2000.01.29 01:15 "tiff2ps: G4 compression in PostScript]", by Paul Kaiser

2000.01.29 18:27 "Re: tiff2ps: G4 compression in PostScript]", by Helge Blischke

I have been investigating the G4 compression issue intensely for the past week.

Adobe Acrobat will create a PDF using G4 compression for monochrome images if so requested. Then, if you export that PDF as PostScript, the image still retains G4 compression and uses the /CCITTFaxDecode filter in the PostScript code.

I am therefore using such a PDF exported as PS as a reference. The PDF/PS was created from a bitmap TIF, and I compare its image data against the same bitmap saved as a TIF with G4 compression.

When I compare the image data in the PS file to the actual image data from the G4 compressed TIF, they are VERY similar. However, every now and then there is extra data in the TIF file that is not present in the PS file.

I cannot figure out what extra data is written in the TIF file as compared to the bitmap data written in the PS file. The G4 specs from ITU specify that there may be some padding at the end of the transmission, but that doesn't seem to be the problem, as the observed differences are not all at the end of the file.
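
A comparison of this kind can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and it assumes both compressed streams have already been extracted into byte strings (the PS data decoded from its hex or ASCII85 form, the TIF data read from its strip or tile offsets). Because G4 data is bit-oriented, everything after the first mismatch will usually differ, so only the first offset is reported.

```python
def first_mismatch(a: bytes, b: bytes) -> int:
    """Return the offset of the first differing byte, or -1 if both are identical.

    Hypothetical helper: 'a' and 'b' are the raw G4 streams taken from
    the PS file and from the TIF file respectively.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One buffer is a prefix of the other: they diverge where the shorter ends.
    return -1 if len(a) == len(b) else min(len(a), len(b))
```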

Anyway, if I figure anything out, I'll let the list know. I hope we can crack this one soon; it is very important to me!

Paul,

the differences between the Acrobat-generated G4 compressed data and the original G4 TIFF data are easily explained:

1st, TIFF images are often tiled or made up of strips, where each tile or strip is compressed separately (as the TIFF spec requires). An Acrobat compressed image is much like a TIFF image made up of a single strip.

2nd, by default Acrobat does not use the uncompressed flag (i.e. there are no scanlines that are not compressed, even if compression will bloat this part of the image).

3rd, by default, Acrobat does not align scanlines to byte boundaries.

4th, Acrobat generates an EOFB sequence at the end of the image data. Though the TIFF spec does require this as well, I've encountered many G4 compressed TIFF images which lack this EOFB. Normally this does not matter, as a reader is not required to check for this sequence, but is required to stop decoding once the required number of scanlines (rows) is decoded or the EOFB is found (whichever occurs first).
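
The four points above map fairly directly onto the parameter dictionary of the PostScript /CCITTFaxDecode filter (/K, /Columns, /Rows, /Uncompressed, /EncodedByteAlign, /EndOfBlock). The following Python sketch illustrates the idea; it is not what tiff2ps or Acrobat actually does, and the function names are hypothetical. It assumes the TIFF tag values (ImageWidth, ImageLength, RowsPerStrip, T6Options) have already been read, and that each strip is fed to its own filter instance.

```python
def strip_row_counts(image_length: int, rows_per_strip: int) -> list[int]:
    """Rows in each strip (point 1); the last strip may be shorter."""
    rows_per_strip = min(rows_per_strip, image_length)  # default is 2**32 - 1
    full, rest = divmod(image_length, rows_per_strip)
    return [rows_per_strip] * full + ([rest] if rest else [])


def has_eofb(data: bytes) -> bool:
    """True if the strip ends with EOFB (two EOL codes) plus zero padding (point 4)."""
    bits = "".join(f"{b:08b}" for b in data[-16:]).rstrip("0")
    return bits.endswith("000000000001" * 2)


def ccitt_decode_dict(columns: int, rows: int, t6_options: int = 0,
                      byte_align: bool = False, end_of_block: bool = True) -> str:
    """Build a /CCITTFaxDecode parameter dictionary for one G4 strip."""
    def ps_bool(value: bool) -> str:
        return "true" if value else "false"

    # Bit 1 of the TIFF T6Options tag says uncompressed mode may occur (point 2).
    uncompressed = bool(t6_options & 2)
    return (f"<< /K -1 /Columns {columns} /Rows {rows}"
            f" /Uncompressed {ps_bool(uncompressed)}"
            f" /EncodedByteAlign {ps_bool(byte_align)}"    # point 3
            f" /EndOfBlock {ps_bool(end_of_block)} >>")    # point 4
```

For a strip that lacks the EOFB, one would pass end_of_block=False so that the filter stops after /Rows scanlines instead of waiting for the end-of-block pattern, e.g. ccitt_decode_dict(1728, rows, end_of_block=has_eofb(strip_data)) for each entry of strip_row_counts(...).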

Besides that, the T6 specification is somewhat unclear on the conditions where the horizontal encoding mode should change.

Helge

H.Blischke@srz-berlin.de
H.Blischke@srz-berlin.com
H.Blischke@acm.org