2003.02.27 14:23 "Re: newbie", by Greg Roberts
I'm new to the TIFF format as well, so ignore me if this makes little to no sense.
- Tags not defined in the spec are common; the spec describes them as private tags on page 8. Organizations can reserve these private tags so that a tag never ends up with two interpretations in different applications. Unless you're dumping tags or know specifically which private tag you're looking for, you should just skip over any unrecognized tags. (And if it's completeness you're after, you'll need to support thousands of private tags :)
- If you're trying to read the actual image data, it is stored either in strips or in tiles. There will be a set of N strips, each of which is ImageWidth * RowsPerStrip "pixels" in area (the last strip may have fewer rows). A tile is basically the same, except that it is not necessarily ImageWidth wide; its dimensions are given instead by tags 322 (TileWidth) and 323 (TileLength).
- For each strip (or tile) you need to read a set of "pixels" starting at the strip offset (tile offset), which is given in the StripOffsets tag (273), or in the TileOffsets tag (324) for tiles. Each "pixel" is actually a set of bytes, the number of which depends on BitsPerSample and SamplesPerPixel. For example, if SamplesPerPixel were 3 and BitsPerSample were 8,8,8, you would need to read 3 bytes for each "pixel" in the strip. For an RGB image with interleaved samples, the first byte you read would be the red component, the second byte green and the third blue.
- Also keep in mind that the data read from the strip offsets is often compressed.
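As a rough illustration of the strip arithmetic above, here is a minimal sketch in C. It assumes the simplest case only: uncompressed data (Compression = 1), 8 bits per sample, interleaved samples (PlanarConfiguration = 1), the whole file already loaded into memory, and tag values already converted to the host byte order. The `tiff_info` struct and the functions `strip_count`, `read_strips` and `pixel_at` are hypothetical names made up for this example, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the tag values used below (not a library type). */
typedef struct {
    uint32_t image_width;          /* tag 256, ImageWidth */
    uint32_t image_length;         /* tag 257, ImageLength */
    uint32_t rows_per_strip;       /* tag 278, RowsPerStrip */
    uint32_t samples_per_pixel;    /* tag 277, e.g. 3 for RGB */
    const uint32_t *strip_offsets; /* tag 273, one offset per strip */
} tiff_info;

/* RowsPerStrip may exceed ImageLength (its default is 2^32 - 1),
 * so clamp it to the image height. */
static uint32_t rows_in_full_strip(const tiff_info *t)
{
    return t->rows_per_strip < t->image_length ? t->rows_per_strip
                                               : t->image_length;
}

/* Number of strips: ImageLength / RowsPerStrip, rounded up. */
static uint32_t strip_count(const tiff_info *t)
{
    assert(t != NULL);
    uint32_t rps = rows_in_full_strip(t);
    if (rps == 0)
        return 0;
    return t->image_length / rps + (t->image_length % rps != 0);
}

/* Copy every strip into out, a row-major buffer of
 * image_length * image_width * samples_per_pixel bytes.
 * Returns 0 on success, -1 if a strip would run past the end of the file. */
static int read_strips(const tiff_info *t, const uint8_t *file,
                       size_t file_size, uint8_t *out)
{
    assert(t != NULL && file != NULL && out != NULL);
    uint32_t rps = rows_in_full_strip(t);
    uint32_t n = strip_count(t);
    size_t row_bytes = (size_t)t->image_width * t->samples_per_pixel;

    for (uint32_t s = 0; s < n; s++) {
        uint32_t first_row = s * rps;
        uint32_t rows = t->image_length - first_row;
        if (rows > rps)
            rows = rps;            /* only the last strip can be shorter */
        size_t nbytes = (size_t)rows * row_bytes;
        size_t off = t->strip_offsets[s];
        if (off > file_size || nbytes > file_size - off)
            return -1;
        memcpy(out + (size_t)first_row * row_bytes, file + off, nbytes);
    }
    return 0;
}

/* Pointer to the first sample of the pixel at (row, col);
 * for interleaved RGB, p[0] is red, p[1] green, p[2] blue. */
static const uint8_t *pixel_at(const tiff_info *t, const uint8_t *img,
                               uint32_t row, uint32_t col)
{
    return img + ((size_t)row * t->image_width + col) * t->samples_per_pixel;
}
```

With the buffer filled this way, `pixel_at(&info, out, row, col)` gives the bytes of one pixel, which is effectively the image_length * image_width matrix. Compressed strips would need to be decoded first, and their sizes on disk come from StripByteCounts (279) rather than from this arithmetic.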
Hope this helps,
----- Original Message -----
From: Carter John-jcarte01
Sent: Wednesday, February 26, 2003 12:50 PM
I'm new to the TIFF format and I have two questions; I'm sure both have been asked before.
- I was sent a file (what seems to be known as a Wang TIFF). It seems to have tags (I assume there's more than one) that aren't defined in the Adobe spec, http://partners.adobe.com/asn/developer/PDFS/TN/TIFF6.pdf. Tag 32934 is the one I have questions about. Since the offset is 0, I assume it's referring to the first few bytes (~150 decimal) at the start of the file, but what should I do with them? Since they're not "core" tags I assume they can be skipped, but completeness would be nice.
- I'm writing an OCR project as a hobby and would like to be able to read TIFF images as well as the BMP format that the program can currently read and write. So far I've managed to process the tags in the file (barring the one above) and stored them in a struct (C language) of long and long*, more or less. But what's next? The link referenced above seems to explain what the tags contain rather than what to do with them. Does anyone know of a site that explains how to get the data stored at the offsets into a matrix of size image_length * image_width?