2003.11.24 10:02 "[Tiff] tiff2pdf contribution", by Finlayson, Ross
On Friday, November 21, 2003, at 01:25 PM, Leonard Rosenthol wrote:
If you could explain the /Rotate field that would be good. I think there should be different rotate flags for view and print.
It tells a PDF renderer what degree to rotate the content. Some renderers do this as a pre-transform on the contents, some do a post-rasterization rotation, etc. Doesn't matter - but it DOES affect all rendering as per the spec.
If you want something different for view vs. print, then you'd want to avoid /Rotate and use PDF 1.5 optional content and CTM's.
That PDF 1.5 sure is a thick spec. I mean, I've never seen an airplane manual or the collected IRS code or anything, but the PDF spec is more than a thousand pages, well rendered in PDF. Each of PDF 1.3, 1.4, and 1.5 was, I thought, a pretty expansive addition, even more so with all that XML business.
I started using a CTM, I thought I could multiply a PDF rectangle by
cos t sin t 0
-sin t cos t 0
0 0 1
but it rotated off of the page, the media box. I tried setting the translation, too, and it wouldn't display in the result. I wrote matrix multiply and copy functions, and they worked great, but I don't have enough knowledge of their use to accomplish what I wanted. So, I just started setting the CTM in the q a b c d e f cm /XObject Do Q sequence in the content stream and found some values that worked for a transformation from the 1x1 rectangle at 0, 0 to fill the image dimensions. That tokenizes to q, a b c d e f cm, /XObject Do, Q: the q operator saves the graphics state, including the CTM; a b c d e f cm concatenates the matrix onto the CTM; the Do operator paints the named XObject; and the Q operator restores the saved state, and with it the initial CTM. To mirror the images I gave them negative dimensions.
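A minimal sketch of that placement sequence, assuming the six matrix numbers are written bare before cm as in PDF content streams. The function name image_placement_ops and its flip_h/flip_v parameters are made up for illustration and are not tiff2pdf code. A negative dimension mirrors the image, so the translation is moved by the same amount to keep the image inside its box.

```python
def image_placement_ops(name, width, height, x=0.0, y=0.0,
                        flip_h=False, flip_v=False):
    """Return content-stream operators that paint an image XObject.

    The image occupies the unit square at the origin, so the matrix scales
    it to width x height and moves it to (x, y). A negative scale mirrors
    the image; the translation is shifted so it stays in the same box.
    """
    a = -width if flip_h else width
    d = -height if flip_v else height
    e = x + width if flip_h else x
    f = y + height if flip_v else y
    # q saves the graphics state, cm concatenates the matrix,
    # Do paints the XObject, Q restores the saved state.
    return "q %g 0 0 %g %g %g cm /%s Do Q" % (a, d, e, f, name)


print(image_placement_ops("Im1", 612, 792))
print(image_placement_ops("Im1", 612, 792, flip_h=True))
```

The second call mirrors the image horizontally: the horizontal scale becomes -612 and the x translation becomes 612.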
The /Rotate would be sufficient for rotating the page, but there are also the four more options to flip or mirror the page.
Not as far as PDF is concerned. /Rotate is there to simply allow people (or software) to fix landscape pages that were done as portrait (or vice versa).
Those functions are handled by CTM...
I went to a book to get some information about matrix multiplication, although it doesn't specifically talk about 2-D transforms using 3x3 matrix multiplication, "Fundamentals of Matrix Computations", 2nd ed., David Watkins. It told me that the product of two 3x3 matrices A and M is a 3x3 matrix B where b_ij = sum over k of a_ik * m_kj, for i, j, and k from one to three.
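The product rule can be sketched in a few lines. The name mat_mul is hypothetical, and the example assumes the PDF convention that a point is a row vector [x y 1] multiplied on the left, so "scale, then translate" is the product scale * translate.

```python
def mat_mul(a, m):
    """Return the 3x3 product B = A * M, where b[i][j] = sum_k a[i][k] * m[k][j]."""
    return [[sum(a[i][k] * m[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
scale = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]        # a = 2, d = 3
translate = [[1, 0, 0], [0, 1, 0], [10, 20, 1]]  # e = 10, f = 20

# Scale first, then translate.
combined = mat_mul(scale, translate)
print(combined)
```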
I got to implementing it, and was frustrated by my lack of knowledge and understanding. Thus, I started sampling values and seeing which ones would be effective. I was able to quickly characterize and categorize, with the help of the PDF 1.5 spec., TIFF 6.0, and the matrix book, the various transforms I thought I required, simply setting elements of the matrix
a b 0
c d 0
e f 1
where a is the initial horizontal scaling, d vertical scaling, and e and f the x and y coordinates of the translation.
Then, I had to figure out the tiles. With the image box for a single image, at least, it has a corner at the origin, so it's easy to determine from its dimensions its width for using the negative width for horizontal flipping of the image. With tiles, they're all over the place. So, what I did, I translated each tile to its location within the larger image cornered at the origin, then I changed the sign of the x or y components and then translated them back into the first quadrant. That was good for TIFF orientation cases 1-4; for 5-8 I also swapped the x and y components. At this point I would almost consider uninvertible matrices. Anyways, once I determined the location and extents of the tiles, where the tiles on the right and bottom edge of the tile set of the unoriented image might have lesser dimensions than the tile dimensions, I passed the x and y coordinates x1, y1 and x2, y2 to the function to orient an image segment in that rectangle. It works great, except I can't figure out why the edge tiles have problems when the x and y coordinates are swapped. Maybe I should do that after I translate the coordinates back into the first quadrant; I'll try that now. It does not work, my fleeting hopes are crushed.
I've implemented it but it doesn't work right on tiled images with TIFF orientation > 4. I have here a tiled JPEG that has orientation 4 and each row is flipped. I'm wondering in what ways people implemented TIFF orientation with tiled images.
Using CTM is the best way...
I would agree. In this case, all the rotations are at right angles and the scaling preserves the aspect ratio, so it's a simpler set of cases: TIFF only has eight orientations, and the simpler implementation is more efficient. Also, I'm lazy and this works already, or so it seems. If only it worked on the tile sets I would consider it implemented; maybe later.
Here are a couple questions. I have a matrix multiply function. Then, I have an imagebox, its coordinates are (0, 0), (x2, y2). Within it are tiles in a rectangular array, each with two coordinates, the lower left and upper right corner. How do I rotate the whole thing 90 degrees counterclockwise, where PDF theta is pi/2, with having the lower left corner stay at (0,0) and the upper right corner going to (y2, x2)? Hmm...
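One possible answer, as a sketch: rotating by pi/2 sends (x, y) to (-y, x), which pushes the box into the second quadrant, so a translation by (y2, 0) brings it back. The origin itself does not stay put; instead the rotated box's lower left corner lands on (0, 0) and its upper right corner on (y2, x2). The helper names rotate_box_ccw, apply and rotate_rect are hypothetical, and a matrix is held as the six numbers (a, b, c, d, e, f) with x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
import math


def rotate_box_ccw(x2, y2):
    """Matrix rotating the box (0,0)-(x2,y2) by 90 degrees counterclockwise
    and shifting it so the rotated box again has a corner at the origin."""
    t = math.pi / 2
    cos_t, sin_t = round(math.cos(t)), round(math.sin(t))  # exactly 0 and 1
    # The rotated box spans x in [-y2, 0]; e = y2 moves it to [0, y2].
    return (cos_t, sin_t, -sin_t, cos_t, y2, 0)


def apply(m, x, y):
    """Transform one point by the matrix (a, b, c, d, e, f)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def rotate_rect(m, x1, y1, x2, y2):
    """Transform a tile given by two corners and re-sort the corners so the
    result is again (lower left, upper right)."""
    px, py = apply(m, x1, y1)
    qx, qy = apply(m, x2, y2)
    return (min(px, qx), min(py, qy), max(px, qx), max(py, qy))


m = rotate_box_ccw(4, 2)
print(m)
print(rotate_rect(m, 0, 0, 4, 2))   # the whole image box
print(rotate_rect(m, 0, 0, 1, 1))   # one tile inside it
```

The same matrix is used for every tile; only the corners need re-sorting afterwards, because a rotation changes which corner is the lower left.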
I determined a method through trial and error to correctly orient the tiles on the output. It doesn't multiply the transformation matrices; instead, it uses case-based swapping and negating of the values. Instead of a 3x3 matrix multiply, with 27 multiplies and 18 additions, there are about zero to ten additions. Of course, it's only feasible because all the angles are orthogonal, multiples of 90 degrees, from zero degrees.
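For comparison, here is a sketch of a case-based approach. It is not the code described above, just one way to do it under stated assumptions: coordinates are raster coordinates (columns rightward, rows downward, row 0 stored first), and the eight cases follow the TIFF 6.0 Orientation definitions (1 = top-left through 8 = left-bottom). The name orient_tile is hypothetical. Edge tiles that are narrower or shorter than the nominal tile size need no special handling as long as their bounds are clipped to the image before the call.

```python
def orient_tile(orientation, width, height, c0, r0, c1, r1):
    """Map a tile from stored raster coordinates to displayed raster coordinates.

    (c0, r0)-(c1, r1) are the column and row bounds of the tile in the stored
    image of size width x height. Returns (left, top, right, bottom) in the
    displayed image, which is height x width for orientations 5-8.
    """
    if not 1 <= orientation <= 8:
        raise ValueError("TIFF orientation must be 1..8")
    swap = orientation >= 5          # 5-8: stored rows run horizontally
    k = (orientation - 1) % 4        # position within each group of four
    flip_u = k in (1, 2)             # mirror left-right
    flip_v = k in (2, 3)             # mirror top-bottom
    if swap:
        u0, u1, umax = r0, r1, height
        v0, v1, vmax = c0, c1, width
    else:
        u0, u1, umax = c0, c1, width
        v0, v1, vmax = r0, r1, height
    if flip_u:
        u0, u1 = umax - u1, umax - u0
    if flip_v:
        v0, v1 = vmax - v1, vmax - v0
    return (u0, v0, u1, v1)


# A 5x3 image with a narrow edge tile covering column 4, rows 0-1.
for orientation in (1, 2, 6, 8):
    print(orientation, orient_tile(orientation, 5, 3, 4, 0, 5, 2))
```

To place the result in PDF user space, where y runs upward, one more flip is needed: the PDF box is (left, display_height - bottom) to (right, display_height - top).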
I added support for separated planar configuration, but not for images in JPEG, OJPEG, or with YCbCr downsampling of the chrominance components.
Now I'm trying to figure out how to improve colorimetry handling. I have to better handle associated and unassociated alpha per the ExtraSamples field. Basically, one is pre-multiplied. I think instead of discarding the alpha component I should subtract it from or add it to the RGB samples, for example in converting strike.tif from the samples pics. Another notion is that of the TransferFunction and TransferRange, along with the GrayResponseCurve and GrayResponseUnits. The WhitePoint, PrimaryChromaticities, TransferFunction, TransferRange, and ReferenceBlackWhite fields go into parameters of the PDF CalGray, CalRGB, and Lab color spaces, somehow, in some way to do with CIE XYZ, LMN, ABC, and other various letters of the alphabet in sequences with arithmetical operations upon them. Luckily, I think I can write the ICC stream directly into the PDF file while knowing absolutely nothing about it.
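On the alpha question, one common way to drop the alpha channel without discarding its effect is to composite each sample over a fixed background. The sketch below assumes a white background and 8-bit samples, neither of which comes from the TIFF itself, and the name flatten_on_white is hypothetical. In TIFF 6.0, ExtraSamples value 1 means associated (pre-multiplied) alpha and value 2 means unassociated alpha.

```python
def flatten_on_white(sample, alpha, associated, maxval=255):
    """Composite one colour sample over a white background.

    associated=True: the sample is already multiplied by alpha, so only the
    background contribution is added. associated=False: the sample is
    multiplied by alpha first.
    """
    background = maxval * (maxval - alpha)        # white weighted by (1 - alpha)
    if associated:
        total = sample * maxval + background
    else:
        total = sample * alpha + background
    return (total + maxval // 2) // maxval        # rounded integer division


print(flatten_on_white(200, 128, associated=False))
print(flatten_on_white(100, 128, associated=True))
```

A fully opaque pixel comes through unchanged and a fully transparent one becomes white in both cases.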
I read TIFF 6.0 and PDF 1.5. There are several things to do with colorimetry support in TIFF 6.0 that unsurprisingly uncoincidentally are named the same thing in PDF 1.5. Some are not. There are a variety of fields in TIFF that have analogs in PDF:
The above are related to the transfer function. The transfer function maps pixel values for a channel to the range 0-65535, a 16 bit unsigned integer. In PDF, it's 0.0-1.0. Each value of the pixel, e.g. 0-255, is mapped to an intensity, in PDF 0.0-1.0. In that way it's kind of like a color map, except the map is from one channel to itself and the function is order preserving. This is expressed in the PDF by a transfer function, in this case a PDF Type 0 function, an interpolated sampled function. It is then referenced in the drawing context as an entry in a dictionary within the ExtGState resources of the page, for use with the gs graphics state operator. For an image of N components, samples per pixel or spp, the TransferFunction field might specify one or N transfer functions: with one function, each component uses the same function; otherwise each uses its own function. The GrayResponseCurve is defined in TIFF 5.0 or earlier, and it is noted that its function is much the same as the TransferFunction.
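A rough sketch of turning one TransferFunction table into a Type 0 function. It assumes the table has already been read into a list of 16-bit values and that the samples can be written as big-endian 16-bit words with BitsPerSample 16. The function name transfer_function_stream is hypothetical, and only the dictionary text and the stream bytes are produced, not a complete PDF object.

```python
import struct


def transfer_function_stream(table):
    """Build the dictionary text and sample bytes of a PDF Type 0 function.

    table holds one 16-bit value (0-65535) per input level, for example
    256 entries for an 8-bit channel.
    """
    samples = struct.pack(">%dH" % len(table), *table)   # big-endian 16-bit
    dictionary = ("<< /FunctionType 0 /Domain [0 1] /Range [0 1] "
                  "/Size [%d] /BitsPerSample 16 /Length %d >>"
                  % (len(table), len(samples)))
    return dictionary, samples


def to_unit_interval(value):
    """Map a 16-bit table value onto the PDF range 0.0-1.0."""
    return value / 65535.0


# An identity ramp for an 8-bit channel: level i maps to i * 257.
ramp = [i * 257 for i in range(256)]
dictionary, samples = transfer_function_stream(ramp)
print(dictionary)
```

For N separate functions, the same routine would be called once per component.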
These tags have to do with dithering the image, that is, for example, using dense collections of dots on a black and white printer to approximate gray colors. Near, it looks like dots; far, like the item. The HalftoneHints describe the lower and upper ends of the range: below the lower value the pixel is maximally tinted, and above the upper value it is the maximum highlight. The Thresholding tag indicates the dithering process. The CellWidth and CellLength have to do with the dithering process. The cell is the block of n x m pixels tiled across the image that can take various values, from each dot being filled to each dot being empty. In the dithering, where there is zero screen angle of the dither, in PDF the cells are squares, but the differing tags in TIFF seem to imply that the cells might be rectangles. PDF has a very comprehensive halftoning model, as described in section 6.4 of PDF 1.5, with spot functions, cell frequencies, and screen angles. I am not sure how to correctly implement the conversion of the TIFF into the PDF while preserving this aspect of the image. I look to the jim*.tif images in the samples pics. The jim___ah.tif file has the Thresholding tag set to 1, an ordered dithered scan. It does not have the CellWidth or CellLength fields set, or libtiff does not handle them, and there are no defaults. The HalftoneHints field is not set. The jim___cg.tif has HalftoneHints set to 203, 8, with photometric interpretation min-is-black. The file jim___gg.tif has HalftoneHints set to 1, 254 with photometric interpretation min-is-white. The dithered image should always be min-is-white, as the intensity is the inkdot intensity. The jim___dg.tif has no HalftoneHints or Thresholding tag; their other fields are equal.
For PDF, generating the halftone image has to do with expressing a halftone dictionary, which is set through the HT entry of a graphics state parameter dictionary. I think it would be a Type 1 or Type 6 halftone dictionary, there being only one component defined for TIFF 6.0 in terms of halftoning.
I think libtiff might not have support for all of these fields, in reading and writing them, yet. I've been looking around to see where to add them, and working on figuring out the tag and field system.
So anyways, there's the dithered TIFF image, it has one component and either ordered dither or randomized diffusion dither. I'm not quite sure, or rather, I have little idea, of what spot functions of the PDF halftoning process those would use, based upon the dithering process. The cell width and length, where known, can be used to determine from the resolution of the image the halftone cell frequency. The HalftoneHints become the PDF threshold values. I'm hoping that someone here will help to explain how to use the information in the TIFF to generate correct PDF.
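The frequency arithmetic might look like the sketch below. It assumes one TIFF dither cell corresponds to one halftone cell, which is a guess rather than something either spec states, and the name halftone_frequency is hypothetical. The resolution unit follows the TIFF ResolutionUnit field, where 2 is inches and 3 is centimeters.

```python
def halftone_frequency(resolution, cell_size, resolution_unit=2):
    """Halftone cells per inch for a given image resolution and cell size.

    resolution is pixels per unit, cell_size is CellWidth or CellLength in
    pixels, resolution_unit is 2 for inches or 3 for centimeters.
    """
    if cell_size <= 0:
        raise ValueError("cell size must be positive")
    pixels_per_inch = resolution * 2.54 if resolution_unit == 3 else resolution
    return pixels_per_inch / cell_size


print(halftone_frequency(300, 4))        # 300 dpi scan, 4-pixel cells
```

With differing CellWidth and CellLength the two directions give different frequencies, which is where the rectangular cells would show up.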
I think DotRange has to do with HalftoneHints, for CMYK images.
Another feature of the colorimetry has to do with the calibrated colorspaces, PDF's CalGray, CalRGB, Lab and ICCBased color spaces. The PDF document, in section 4.5, describes its rendering model where it converts the input data into a 1931 CIE XYZ colorspace which is then converted to the output, for screen, RGB. TIFF defines its RGB colorimetry tags:
The WhitePoint is for the grayscale and RGB images, it is the chromaticity of the image where each of the primaries has its reference white value. The PrimaryChromaticities define the chromaticity of the primary where it is at its reference white and the other components are at their reference black. The ReferenceBlackWhite field contains min and max values for each component defining where begins the "footroom" and "headroom" of the component, below and above which the pixel assumes the minimum or maximum intensity.
PDF has references to this information in the CalGray and CalRGB color spaces, for setting a WhitePoint, BlackPoint, and Gamma, and also for CalRGB a Matrix of nine values that specifies a linear transform from ABC, here RGB, to XYZ.
TIFF also uses the 1931 CIE XYZ, but it says it uses only the XY values or something. See TIFF 6 page 82. Noted on page 83 are some differences from TIFF 5, including defaults. From the description of WhitePoint, "The value is described using the 1931 CIE xy chromaticity diagram and only the chromaticity is specified." The sample values it gives are pretty far off from 1.0 which is the default for the PDF WhitePoint. There is not a TIFF BlackPoint field, the PDF default is [0.0 0.0 0.0].
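The gap between the two forms is the usual chromaticity-to-tristimulus conversion: an xy pair fixes only the colour, and choosing luminance Y = 1 gives the three XYZ values a PDF WhitePoint expects. A minimal sketch, with the hypothetical name xy_to_xyz and D65 used purely as an example input:

```python
def xy_to_xyz(x, y):
    """Convert a CIE 1931 xy chromaticity to XYZ with luminance Y = 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


X, Y, Z = xy_to_xyz(0.3127, 0.3290)     # D65 chromaticity
print("[%.4f %.4f %.4f]" % (X, Y, Z))
```

So the xy values being far from 1.0 is expected; it is the derived Y that equals 1.0.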
The key is getting the PrimaryChromaticities into the CalRGB Matrix field.
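One standard way to do this, sketched under assumptions: each primary's xy pair is converted to XYZ with Y = 1, then the three are scaled so that they add up to the white point. The scaled triples, listed primary by primary, give the nine numbers in the order XA YA ZA XB YB ZB XC YC ZC. The function name calrgb_matrix is hypothetical, the 3x3 system is solved with Cramer's rule to avoid any library dependency, and the sRGB primaries with a D65 white are only example inputs.

```python
def calrgb_matrix(primaries, white):
    """Return the nine CalRGB Matrix values from xy chromaticities.

    primaries is [(xr, yr), (xg, yg), (xb, yb)] and white is (xw, yw).
    """
    def xyz(x, y):
        return [x / y, 1.0, (1.0 - x - y) / y]

    def det(c0, c1, c2):
        # Determinant of the 3x3 matrix whose columns are c0, c1, c2.
        return (c0[0] * (c1[1] * c2[2] - c2[1] * c1[2])
                - c1[0] * (c0[1] * c2[2] - c2[1] * c0[2])
                + c2[0] * (c0[1] * c1[2] - c1[1] * c0[2]))

    r, g, b = (xyz(x, y) for x, y in primaries)
    w = xyz(*white)
    d = det(r, g, b)
    # Cramer's rule: find scales so that sr*r + sg*g + sb*b equals white.
    scales = (det(w, g, b) / d, det(r, w, b) / d, det(r, g, w) / d)
    matrix = []
    for scale, primary in zip(scales, (r, g, b)):
        matrix.extend(scale * component for component in primary)
    return matrix


srgb = calrgb_matrix([(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)],
                     (0.3127, 0.3290))
print(" ".join("%.4f" % v for v in srgb))
```

The three Y entries add up to 1, which matches a white point given with Y = 1.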
The ICC profile should be simple to copy over: assume it is a display profile and copy it into a stream for the ICCBased color space, setting the number of components and the alternate color space based upon the photometric interpretation.
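A sketch of the stream dictionary for that, assuming the photometric values from TIFF 6.0 (0 and 1 for grayscale, 2 for RGB, 5 for separated, taken here to mean CMYK). The name iccbased_dict is hypothetical, and YCbCr and CIELab are left out because it is not obvious which alternate space they should get.

```python
def iccbased_dict(photometric, profile_length):
    """Return the stream dictionary text for an ICCBased colour space stream.

    photometric is the TIFF PhotometricInterpretation value and
    profile_length is the size in bytes of the embedded ICC profile.
    """
    components = {
        0: (1, "DeviceGray"),   # min-is-white
        1: (1, "DeviceGray"),   # min-is-black
        2: (3, "DeviceRGB"),
        5: (4, "DeviceCMYK"),   # separated, assumed CMYK
    }
    if photometric not in components:
        raise ValueError("no ICCBased mapping chosen for this photometric")
    n, alternate = components[photometric]
    return "<< /N %d /Alternate /%s /Length %d >>" % (n, alternate, profile_length)


print(iccbased_dict(2, 3144))
```

The profile bytes themselves would follow unchanged as the stream body, and the color space is then the array [/ICCBased ref] pointing at that stream.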
Advice with regards to these issues is appreciated.