2008.10.31 18:51 "Re: [Tiff] Data ordering for planar config separate", by Richard Nolde
next sample. I'm trying to determine if images that contain multiple strips for each sample can be read by interleaving the reads from each sample within a loop for each row of the image or whether that violates the random access prohibition for compressed data. I'd be reading sequentially within each
Scanlines and strips are padded to byte boundary alignment.
In general, I have found it wise to add a few bytes of padding to the buffers allocated by the original routines in tiffcp to prevent access violations when reading bps > 8, for example: scanlinesize + (spp * (bps + 7) / 8). Adding a few extra bytes is much simpler than treating the end of each row with special code even though there may not be any data in them.
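As a minimal sketch of the padding rule above: the helper name padded_scanline_size is hypothetical (it is not part of libtiff or tiffcp), and scanlinesize is assumed to be the value the caller already obtained, for example from TIFFScanlineSize. The arithmetic follows the expression quoted above literally, including its integer division.

```c
#include <stddef.h>

/* Hypothetical helper, not a libtiff function: number of bytes to allocate
 * for one scanline buffer, adding the slack suggested above,
 * scanlinesize + (spp * (bps + 7) / 8), evaluated with integer division. */
size_t padded_scanline_size(size_t scanlinesize, unsigned spp, unsigned bps)
{
    size_t slack = ((size_t)spp * (bps + 7u)) / 8u;
    return scanlinesize + slack;
}
```

The extra bytes are never required to hold image data; they only give bit-unpacking code room to read slightly past the last sample of a row.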
Compression is at the scanline/strip/tile level so that they are all independent. This means you can retrieve in any order you like. Depending on how the data is stored on disk, some retrieval patterns will be faster than others due to potential disk seeks.
I figured the disk I/O penalty would depend on the layout, but I don't have a good feel for which layout is more common or whether it is worth adding an option to specify the ordering of planes as RRR, GGG, BBB, AAA or RGBA, RGBA. The original read and write routines from tiffcp all use RRR, GGG, BBB, AAA. I would guess that anybody who takes the trouble to use planar == separate probably wants to read one plane at a time. Comments on this are welcome.
The tiffdump of my 32 bit per sample image below indicates that the
strips are being written RRR, GGG, BBB
This is likely an artifact of the order in which GraphicsMagick outputs the strips. Since GraphicsMagick is not planar internally, I have it outputting RRR, GGG, BBB strips. However, I could have done a pass through GM's image representation for each color plane, in which case libtiff would likely store all the red, green, and blue strips together. This would be slower from GM's standpoint (multiple passes through all the pixels) but faster from the standpoint of an application which natively uses planes, or wants to retrieve only one plane.
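For reference, a sketch of how strips are numbered when PlanarConfiguration=2. As far as I know this mirrors what libtiff's TIFFComputeStrip computes; the function names below are hypothetical stand-ins, and length and rowsperstrip are assumed to be non-zero. Note that this is only the index into StripOffsets. Where each strip physically lands in the file depends on the order in which the writer emitted them, which is the point being discussed above.

```c
#include <stdint.h>

/* Hypothetical helper: number of strips needed to cover one plane.
 * The 64-bit intermediate avoids overflow when rowsperstrip is 2^32 - 1. */
uint32_t strips_per_plane(uint32_t length, uint32_t rowsperstrip)
{
    return (uint32_t)(((uint64_t)length + rowsperstrip - 1) / rowsperstrip);
}

/* Hypothetical helper: strip number holding (row, sample) when planes are
 * stored separately. All strips of sample 0 are numbered first, then all
 * strips of sample 1, and so on. */
uint32_t planar_strip_index(uint32_t row, uint32_t sample,
                            uint32_t length, uint32_t rowsperstrip)
{
    return sample * strips_per_plane(length, rowsperstrip)
           + row / rowsperstrip;
}
```

With 100 rows and 16 rows per strip there are 7 strips per plane, so row 0 of the second sample is strip 7 regardless of where it sits on disk.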
I'm using direct calls to TIFFReadScanline and TIFFWriteScanline. The man page for the latter states:
Write data to a file at the specified row. The sample parameter is used only if data are organized in separate planes (PlanarConfiguration=2). The data are assumed to be uncompressed and in the native bit- and byte-order of the host machine. The data written to the file is compressed according to the compression scheme of the current TIFF directory (see further below). If the current scanline is past the end of the current subfile, the ImageLength field is automatically increased to include the scanline (except for PlanarConfiguration=2, where the ImageLength cannot be changed once the first data are written). If the ImageLength is increased, the StripOffsets and StripByteCounts fields are similarly enlarged to reflect data written past the previous end of image.
All of the original routines for planar == separate in tiffcp use the logic
    for (sample = 0; sample < spp; sample++)
        for (row = 0; row < length; row++)
            TIFFWriteScanline... or TIFFReadScanline
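To make the two candidate orderings concrete, here is a small sketch that only records the (row, sample) pairs in the order each loop nest would visit them. It does not touch libtiff; the struct and function names are hypothetical, and the caller is assumed to supply an array with room for length * spp entries. In real code each recorded pair would correspond to one TIFFReadScanline or TIFFWriteScanline call with that row and sample.

```c
#include <stddef.h>

/* Hypothetical record of one scanline access. */
struct scanline_ref {
    unsigned row;
    unsigned sample;
};

/* Plane-major order (RRR, GGG, BBB): the order used by the tiffcp loop above.
 * Returns the number of entries written to out. */
size_t order_plane_major(unsigned length, unsigned spp, struct scanline_ref *out)
{
    size_t n = 0;
    for (unsigned sample = 0; sample < spp; sample++) {
        for (unsigned row = 0; row < length; row++) {
            out[n].row = row;
            out[n].sample = sample;
            n++;
        }
    }
    return n;
}

/* Row-major, interleaved order (RGB, RGB, ...): every sample of a row is
 * visited before moving to the next row. */
size_t order_row_major(unsigned length, unsigned spp, struct scanline_ref *out)
{
    size_t n = 0;
    for (unsigned row = 0; row < length; row++) {
        for (unsigned sample = 0; sample < spp; sample++) {
            out[n].row = row;
            out[n].sample = sample;
            n++;
        }
    }
    return n;
}
```

Both functions visit the same set of scanlines; only the sequence differs. In the interleaved order, consecutive accesses always belong to different planes and therefore to different strips.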
I am also looking for images with Alpha channels or other spp > 3 to see if they work with my code too. I'm using an upper limit of 8 samples per pixel at the moment, which I assume would never be used except possibly for scientific data. Does anyone have evidence of spp > 8?
Did you figure out what the problem was for your bps == 1 images, e.g. tiger-rgb-strip-*01.tif, with the current GM download for Linux? They read fine with the high color version of GraphicsMagick on Winders, but not on Linux.
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/