AWARE [SYSTEMS] Imaging expertise for the Delphi developer
AWare Systems, Imaging expertise for the Delphi developer, Home TIFF and LibTiff Mailing List Archive

LibTiff Mailing List

TIFF and LibTiff Mailing List Archive
December 2009

Previous Thread
Next Thread

Previous by Thread
Next by Thread

Previous by Date
Next by Date

Contact

The TIFF Mailing List Homepage
This list is run by Frank Warmerdam
Archive maintained by AWare Systems



Valid HTML 4.01!



Thread

2009.12.22 06:41 "PATCH: proposed changes to some JPEG routines", by Lee Howard
2009.12.24 02:11 "Re: PATCH: proposed changes to some JPEG routines", by Lee Howard

2009.12.22 06:41 "PATCH: proposed changes to some JPEG routines", by Lee Howard

Hello everyone,

As you may recall from a month ago or so I have been doing some work 
with an application that will be using libtiff's JPEG-in-TIFF support.  
The application is HylaFAX.  Color fax image data is compressed as 
JPEG.  In order to put a multi-page fax into one file (as well as to 
make use of valuable data-space in TIFF headers), HylaFAX is wrapping 
the JPEG images with TIFF.

Unfortunately, there are some obstacles (libtiff bugs) in this purpose 
which has required me to seek out and fix the problems - at least for my 
purpose.  I would like this message to start a conversation about 
getting these bugs fixed (and ultimately concluding with that) because I 
do not want to continually distribute a patched libtiff in order my 
users to be able to use these new features.

The primary bugs are these:

1)  When generating a JPEG-compressed TIFF file and writing the JPEG 
data into the TIFF with TIFFWriteRawStrip libtiff automatically 
generates a JPEGTables TIFF tag without being requested by the 
application and said JPEGTables TIFF tag contains no valid data, just 
2000 null bytes.

2)  It does not seem possible for the application to produce the 
JPEGTables tag independent from libtiff.  In other words, libtiff must 
use libjpeg to compress the bitmap data into JPEG in order for a valid 
JPEGTables tag to be produced.  The application cannot suggest a 
JPEGTables tag to libtiff.

Consequently the invalid JPEGTables tag makes it impossible for the 
viewer applications to render the image properly... and there does not 
seem to be a workaround other than to decompress the JPEG first and then 
to recompress it into the TIFF via libtiff (which would introduce some 
unnecessary loss).

3)  When running a JPEG-compressed TIFF file through the tiff2pdf tool 
(which basically just changes the TIFF wrapper for a PDF wrapper since 
PDF can have JPEG-compressed image data in it), the resulting PDF file 
is not viewable in Acrobat Reader, evince, or Ghostscript, although xpdf 
does handle it fine.


I've attached a patch file against libtiff 4.0.0beta5.  By no means is 
this patch file a suggestion of what should be committed to the code 
repository.  Instead, I attach the patch for your reference.  It shows 
the changes that I made in order to accomplish what I needed.  I added 
comments as I felt appropriate for understanding.  I would like to 
discuss my changes more verbosely here, now.  I would appreciate any 
comments that you may have.  I hope that the end-result will be 
sufficient changes to fix the three bugs I've listed above.

In tif_jpeg.c you'll notice two places where I've suggested that some 
sections of code should not apply to RGB photometric data.  As I ended 
up using YCbCr photometric data these points are not as critical to me, 
but I'm pointing them out to you now in order to simply raise my concern 
that something looks incorrect there.

At the end of tif_jpeg.c is the most-critical change that I made in 
order to resolve my point #1 above (which made my point #2 
unimportant).  I'm not sure if TIFFSetFieldBit(tif, FIELD_JPEGTABLES) 
actually needs to be called at this point.  I suspect that this field 
bit will get set later if it is used by the application or by the 
library, and so I think that during this memory-allocation process at 
the outset of the TIFF creation it is not important to set the field 
bit.  In any case, removing this one line solves the majority of my grief.

I found tiff2pdf.c to be extremely well-documented in parts, and wholly 
lacking of documentation in other parts.  In particular, the code 
doesn't provide human-readable information that details what it's doing 
as it does it.  So in tiff2pdf.c you'll notice that I've added some 
comments to help readers know what JPEG headers are being parsed by the 
code at the indicated points.  I think this kind of documentation is good.

The solution to my point #3 above is to write the JPEG SOI headers into 
the TIFF strip data rather than to skip them.  Those two lines of code 
solve problem #3.  Then the PDF can be viewed in all PDF readers I have 
tested.

My last change in tiff2pdf.c is to question the correctness of a change 
that was made to the code in libtiff 3.8.1 by Dan Cobra.  However, since 
I've switched from RGB photometric to YCbCr this code is no longer a 
problem for me, and, indeed, it's possible that before I settled on 
YCbCr data that my RGB data with which I was testing was incorrect.  
However, it may be a good idea for someone to double-check the sanity of 
that change made in 3.8.1.

I look forward to your feedback.

Thanks,

Lee.

diff -Nru tiff-4.0.0beta5/libtiff/tif_jpeg.c tiff-4.0.0beta5.new/libtiff/tif_jpeg.c
--- tiff-4.0.0beta5/libtiff/tif_jpeg.c	2009-07-07 19:46:29.000000000 -0700
+++ tiff-4.0.0beta5.new/libtiff/tif_jpeg.c	2009-12-19 16:06:06.000000000 -0800
@@ -1082,6 +1082,11 @@
 		return (0);
 	}
 #endif
+/*
+Sampling factors are not relevant for PHOTOMETRIC_RGB.  So if the
+RGB JPEG image has confusing sampling factors we can ignore it.
+*/
+if (sp->photometric != PHOTOMETRIC_RGB) {
 	if (td->td_planarconfig == PLANARCONFIG_CONTIG) {
 		/* Component 0 should have expected sampling factors */
 		if (sp->cinfo.d.comp_info[0].h_samp_factor != sp->h_sampling ||
@@ -1147,6 +1152,7 @@
 			return (0);
 		}
 	}
+}
 	downsampled_output = FALSE;
 	if (td->td_planarconfig == PLANARCONFIG_CONTIG &&
 	    sp->photometric == PHOTOMETRIC_YCBCR &&
@@ -1158,7 +1164,12 @@
 			/* Suppress colorspace handling */
 		sp->cinfo.d.jpeg_color_space = JCS_UNKNOWN;
 		sp->cinfo.d.out_color_space = JCS_UNKNOWN;
+/*
+Sampling factors are not relevant for PHOTOMETRIC_RGB.  So if the
+RGB JPEG image has confusing sampling factors we can ignore it.
+*/
 		if (td->td_planarconfig == PLANARCONFIG_CONTIG &&
+(sp->photometric != PHOTOMETRIC_RGB) &&
 		    (sp->h_sampling != 1 || sp->v_sampling != 1))
 			downsampled_output = TRUE;
 		/* XXX what about up-sampling? */
@@ -2259,7 +2270,15 @@
         if( tif->tif_diroff == 0 )
         {
 #define SIZE_OF_JPEGTABLES 2000
+/*
+The following line assumes incorrectly that all JPEG-in-TIFF files will have
+a JPEGTABLES tag generated and causes null-filled JPEGTABLES tags to be written
+when the JPEG data is placed with TIFFWriteRawStrip.  The field bit should be 
+set, anyway, later when actual JPEGTABLES header is generated, so removing it 
+here hopefully is harmless.
+
             TIFFSetFieldBit(tif, FIELD_JPEGTABLES);
+*/
             sp->jpegtables_length = SIZE_OF_JPEGTABLES;
             sp->jpegtables = (void *) _TIFFmalloc(sp->jpegtables_length);
 	    _TIFFmemset(sp->jpegtables, 0, SIZE_OF_JPEGTABLES);
diff -Nru tiff-4.0.0beta5/tools/tiff2pdf.c tiff-4.0.0beta5.new/tools/tiff2pdf.c
--- tiff-4.0.0beta5/tools/tiff2pdf.c	2009-01-29 13:58:26.000000000 -0800
+++ tiff-4.0.0beta5.new/tools/tiff2pdf.c	2009-12-19 22:31:37.000000000 -0800
@@ -3320,13 +3320,16 @@
 	int j=0;
 	
 	i++;
-	
 	while(i<(*striplength)){
 		switch( strip[i] ){
 			case 0xd8:
+				/* SOI - start of image */
+				_TIFFmemcpy(&(buffer[*bufferoffset]), &(strip[i-1]), 2);
+				*bufferoffset+=2;
 				i+=2;
 				break;
 			case 0xc0:
+				/* SOF - start of frame */
 			case 0xc1:
 			case 0xc3:
 			case 0xc9:
@@ -3365,12 +3368,15 @@
 				}
 				break;
 			case 0xc4:
+				/* DHT - define huffman tables */
 			case 0xdb:
+				/* DQT - define quantization tables */
 				_TIFFmemcpy(&(buffer[*bufferoffset]), &(strip[i-1]), strip[i+2]+2);
 				*bufferoffset+=strip[i+2]+2;
 				i+=strip[i+2]+2;
 				break;
 			case 0xda:
+				/* SOS - start of scan */
 				if(no==0){
 					_TIFFmemcpy(&(buffer[*bufferoffset]), &(strip[i-1]), strip[i+2]+2);
 					*bufferoffset+=strip[i+2]+2;
@@ -5074,10 +5080,16 @@
 		case T2P_COMPRESS_JPEG:
 			written += t2pWriteFile(output, (tdata_t) "/DCTDecode ", 11);
 
+/*
+This code which was added in 3.8.1 breaks at least some PHOTOMETRIC_RGB JPEG-in-TIFF images...
+My guess is that the need condition isn't based purely on the photometric values.
+Dan Cobra - from whom this originated - should probably be consulted.
+
 			if(t2p->tiff_photometric != PHOTOMETRIC_YCBCR) {
 				written += t2pWriteFile(output, (tdata_t) "/DecodeParms ", 13);
 				written += t2pWriteFile(output, (tdata_t) "<< /ColorTransform 0 >>\n", 24);
 			}
+*/
 			break;
 #endif
 #ifdef ZIP_SUPPORT