2004.08.12 15:05 "Re: [Tiff] From bmp to Tiff", by Joris Van Damme

I am able to get bmp data and write to tiff as it is.... But is there way I can cut down the size of Tiff size... For example... say input bmp is 2400 x 3300 at 300 dpi and i want 800 x 1100 size tiff out of it...

Bob,

I believe Andrey's policy will be to answer:

http://www.awaresystems.be/imaging/tiff/faq.html#q2

And he is quite right to do so, because you are looking in the wrong place. There is a very sharp distinction between a codec and an imaging library. The first serves the purpose of converting raster data (mainly) into a file compliant with some file format specification, and vice versa. The second serves the purpose of doing things with rasters (in the broad sense of the word), like dithering, color conversion, color operations, filter operations, and, indeed, resampling.

That is why, for this task, you'll first need image processing code to take care of the resampling. Only in a second stage can you depend on LibTiff to do the encoding to the TIFF file format.

This being said, I really feel like writing today. ;-)

What you are after is called resampling (because these 'pixels' you process are but color samples at specific points in what is really a continuous plane). Scaling down is called downsampling. There are two main issues involved: downsampling technique, and color space.

** Technique **

There are two well-known techniques that I'll mention. (Sure, there are others, but these are mathematically less correct, amount to downsampling correctly plus applying an additional filtering technique, and as such may result in better looking images but should be explored only after understanding the two I will mention.)

The first is called 'nearest neighbour'. It simply means that, for each destination pixel, you calculate the exact coordinates in the source image, round that to the nearest source pixel, and use that color.

Take your example, and calculate the color of destination pixel (x,y):

As (x,y) are indices, not positions, you'll first need to calculate the true position of the 'middle point' of this pixel from them. Thus, you end up with the values (x+0.5,y+0.5).
Next, you take that position 'out of' the destination scale and 'apply' the source scale. That's how the values become ((x+0.5)/800*2400,(y+0.5)/1100*3300).
Next, from that position in the source image, you can calculate the indices of the source pixel that contains that position. You end up with (Trunc((x+0.5)/800*2400),Trunc((y+0.5)/1100*3300)).

Of course, that's just theory; it's a long way from a working and efficient algorithm.
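To make the three mapping steps above concrete, here is a minimal and deliberately naive sketch in Python. It assumes the raster is a plain list of rows, each row a list of pixel values; that representation and the function name nearest_neighbour are just assumptions for illustration, and nothing here comes from LibTiff.

```python
def nearest_neighbour(src, src_w, src_h, dst_w, dst_h):
    """Downsample (or upsample) src by picking the nearest source pixel.

    src is assumed to be a list of src_h rows, each a list of src_w values.
    """
    dst = []
    for y in range(dst_h):
        # Middle point of the destination row, rescaled to source coordinates,
        # then truncated to the index of the source row containing it.
        sy = min(int((y + 0.5) / dst_h * src_h), src_h - 1)
        row = []
        for x in range(dst_w):
            sx = min(int((x + 0.5) / dst_w * src_w), src_w - 1)
            row.append(src[sy][sx])
        dst.append(row)
    return dst


# Tiny example: a 6 x 3 raster reduced to 2 x 1 (the same 3:1 ratio as
# 2400 x 3300 to 800 x 1100). Each value encodes its position as x + 10*y.
src = [[x + 10 * y for x in range(6)] for y in range(3)]
print(nearest_neighbour(src, 6, 3, 2, 1))
```

With a 3:1 ratio, each destination pixel takes the middle pixel of its 3 x 3 source block, and the other eight source pixels are never looked at.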

Note that this nearest neighbour algorithm, in the case of downsampling, 'throws away' a lot of pixel values. That can be a problem if, say, there are sharp one-pixel wide lines that are visually important, for example in a table rendering or technical drawing.

That is why the second algorithm, 'pixel area averaging', is generally preferred. It depends on the same position mapping from destination to source. But instead of mapping just the middle point of the destination pixel, you map each of its four corners. For instance, the pixel with indices (0,0) has corners (0,0), (0,1), (1,0), (1,1). When mapping these four corners to the source, you get a square (or rectangular) region of the source. Now average all colors in this region, weighting each source pixel by the fraction of it that falls inside the region. For instance, if the source rectangle were (0,0), (0,1), (1.5,0), (1.5,1), you would multiply the color of source pixel (0,0) by 1, and the color of source pixel (1,0) by 0.5, add these, and divide the result by the total weight 1+0.5. That is the destination color you are after.
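As a rough sketch of the averaging just described, the following Python maps the corners of each destination pixel into the source and weights every source pixel by the area it contributes. It assumes a single-channel (grayscale) raster stored as a list of rows; the name area_average is only illustrative, and for color data you would repeat the same sum per channel.

```python
import math


def area_average(src, src_w, src_h, dst_w, dst_h):
    """Pixel area averaging on a single-channel raster (list of rows)."""
    x_scale = src_w / dst_w
    y_scale = src_h / dst_h
    dst = []
    for y in range(dst_h):
        # Top and bottom edges of the destination pixel, in source coordinates.
        top, bottom = y * y_scale, (y + 1) * y_scale
        row = []
        for x in range(dst_w):
            left, right = x * x_scale, (x + 1) * x_scale
            total = 0.0
            weight_sum = 0.0
            for sy in range(int(top), min(math.ceil(bottom), src_h)):
                # Height of the overlap between source row sy and the region.
                wy = min(bottom, sy + 1) - max(top, sy)
                for sx in range(int(left), min(math.ceil(right), src_w)):
                    # Width of the overlap between source column sx and the region.
                    wx = min(right, sx + 1) - max(left, sx)
                    total += src[sy][sx] * wx * wy
                    weight_sum += wx * wy
            row.append(total / weight_sum)
        dst.append(row)
    return dst


# The example from the text: a 3 x 1 source reduced to 2 x 1, so the first
# destination pixel covers source x from 0 to 1.5.
print(area_average([[30, 60, 90]], 3, 1, 2, 1))
```

The first destination value is (30*1 + 60*0.5) / 1.5, exactly the weighting worked out in the paragraph above.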

Again, that is a long way from a working and efficient algorithm. Generally, to distill a good algorithm from such a theory, you need to ask yourself what calculations you are doing over and over again, and how you can rewrite things so that you do those calculations only once, storing their results in what is called a lookup table (LUT) that you can then use over and over again without recalculating them. If after that step there's still some floating point math left in the critical path, you also need to rewrite it to use integer math instead. After that you've got yourself something much faster already.
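As one possible illustration of that advice, applied to nearest neighbour only: the source column for a given destination column is the same on every row, so it can be computed once per column and stored in a LUT, and likewise for rows. The sketch below (hypothetical function name, raster again assumed to be a list of rows) also replaces the floating point expression Trunc((x+0.5)/dst_w*src_w) with the equivalent integer expression ((2*x+1)*src_w) // (2*dst_w).

```python
def nearest_neighbour_lut(src, src_w, src_h, dst_w, dst_h):
    """Nearest neighbour resampling with per-axis lookup tables, integer math only."""
    # One entry per destination column: index of the source column to copy.
    col_lut = [((2 * x + 1) * src_w) // (2 * dst_w) for x in range(dst_w)]
    # One entry per destination row: index of the source row to copy.
    row_lut = [((2 * y + 1) * src_h) // (2 * dst_h) for y in range(dst_h)]
    # The inner loop is now nothing more than table lookups and copies.
    return [[src[sy][sx] for sx in col_lut] for sy in row_lut]


src = [[x + 10 * y for x in range(6)] for y in range(3)]
print(nearest_neighbour_lut(src, 6, 3, 2, 1))
```

For the 2400 x 3300 to 800 x 1100 case this computes 800 + 1100 table entries instead of repeating the position mapping for each of the 880000 destination pixels.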

** Color space **

Some people will disagree, but in my opinion, 'blending' colors, or any color calculation like the 'pixel area averaging' described above, is an operation that only makes sense to the human eye. That is why, I think, you need a color space that makes sense to the human eye. CIE L*a*b* is most suitable in my experience. The further away you get from such a perceptually uniform color space, the less pleasing the results. For instance, RGB might still be relatively useful for such algorithms; CMYK might not.
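For what it's worth, here is a sketch of what converting to CIE L*a*b* can look like before averaging. It makes assumptions the text above does not state: that the input is 8-bit sRGB, and that the D65 white point is used. Other RGB flavours need a different matrix and transfer curve, and a real pipeline would also need the inverse conversion to get back to RGB after averaging, which is not shown.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB (assumed) to CIE L*a*b* with a D65 white point (assumed)."""

    def to_linear(c):
        # Undo the sRGB transfer curve to get linear light in 0..1.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Cube root with the linear segment CIE defines near black.
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def average_lab(colors):
    """Unweighted average of a list of (L, a, b) tuples."""
    n = len(colors)
    return tuple(sum(c[i] for c in colors) / n for i in range(3))


black = srgb_to_lab(0, 0, 0)
white = srgb_to_lab(255, 255, 255)
print(average_lab([black, white]))
```

Averaging black and white this way lands on a lightness halfway between the two, which is the kind of 'makes sense to the eye' behaviour argued for above.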

Now, I don't expect this to be very helpful, because the subject is vast. But I hope that I was at least able to provide some pointers for further investigation. The bottom line is:

you're looking in the wrong place
http://www.awaresystems.be/imaging/tiff/faq.html#q2
search on keywords like 'resampling', 'downsampling', 'nearest neighbour', 'pixel area',...

Joris Van Damme
info@awaresystems.be
http://www.awaresystems.be
Download your free TIFF tag viewer for windows here:
http://www.awaresystems.be/imaging/tiff/astifftagviewer.html