2008.11.22 04:10 "Re: [Tiff] Re: LZW compression in your legacy app", by Kevin Myers

> If they are 'naturally noisy' like photographs then LZW would not help much either.
>
> --Toby

Hi Toby,

My images consist primarily of *technical* documents that are *originally* created using a relatively *limited* number of colors. These documents are essentially extremely long strip chart recordings, sometimes using multiple colors for different data curves, symbols, annotations, etc. These images can be separated into two main groups for our discussion here:

  1. Computer-generated TIFF files - strictly limited number of colors within a single image file, no noise or other variations, but colors often vary between different image files, especially those generated by different applications.
  2. Scanned images of paper documents - paper and ink variations, reproduction imperfections, scanner noise, limited scanner resolution, and other factors add extraneous color variations that were not intended to be present in the original document.
Note that my only concerns in this discussion are with color and gray scale images. Black and white images are already handled quite well for purposes of the legacy application by using Group 4 fax compression.
None of the images discussed here are really like photographs of natural scenes. The scanned images have many of the same problems as far as compression is concerned, but the possible solutions are quite different. In a photograph of a natural scene it is normally important to *keep* most of the detailed color variations in the images. With the document images that I am concerned with, the idea is to *get rid* of most of those extraneous color variations because they are simply printing, reproduction, and scanning artifacts that were never intended to be part of the document in the first place.
  5. In a "clean" image such as those generated directly by computer, there are reasonably large areas that consist of single colors, and there are also a lot of repeated color patterns for grid lines, etc. LZW compresses those kinds of image features fairly effectively. So I don't need to do much for those iimages beyond palettizing, as discussed earlier in this thread.
The scanned images are much more difficult to handle. As you noted, LZW won't have much success on images with seemingly random color variations. That is why an appropriate color reduction algorithm is extremely important for the scanned images that I work with. An *ideal* color reduction algorithm would greatly reduce the color variability in the image, returning the image to *almost* the same representation as that of an image that was generated directly by computer, thereby rendering it much more compressible by LZW, and also amenable to palettization.
Now the problem is just how to go about achieving or approaching such an ideal color reduction. I have already tried a number of color reduction algorithms in various graphics applications (e.g. GIMP, Corel, GraphicsMagick) with relatively limited success on the scanned images. The reduced color images were generally either very poor representations of the original colors, or did not reduce the number of colors enough to have the desired impact on compressibility of the images. Most of the algorithms that I tried either appeared to be too simplistic (e.g. simple bit depth reductions), or placed too much emphasis on color occurrence frequency as opposed to color space distance.
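Just to make the distance-driven direction concrete (this is nothing more than a toy sketch, and where the representative colors come from is the hard part I'm glossing over), the core operation is snapping each pixel to the nearest representative in RGB space, with no frequency weighting at all:

    #include <limits.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint8_t r, g, b; } rgb_t;

    /* Index of the representative color nearest to p, by squared
     * Euclidean distance in RGB space (no frequency weighting). */
    static size_t nearest_index(rgb_t p, const rgb_t *reps, size_t nreps)
    {
        size_t best = 0;
        long best_d = LONG_MAX;
        for (size_t i = 0; i < nreps; i++) {
            long dr = (long)p.r - reps[i].r;
            long dg = (long)p.g - reps[i].g;
            long db = (long)p.b - reps[i].b;
            long d = dr * dr + dg * dg + db * db;
            if (d < best_d) { best_d = d; best = i; }
        }
        return best;
    }

    /* Map every pixel to a palette index (assumes nreps <= 256). */
    void reduce_colors(const rgb_t *pixels, uint8_t *out, size_t npixels,
                       const rgb_t *reps, size_t nreps)
    {
        for (size_t i = 0; i < npixels; i++)
            out[i] = (uint8_t)nearest_index(pixels[i], reps, nreps);
    }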
I am working on an algorithm at the present time that I believe might work somewhat better. Although I have done some full time development work in the past, it was with 4GLs, and not too helpful for the much lower level C coding that needs to go into writing an efficient color reduction algorithm. And efficiency *is* quite important when dealing with multi-hundred megapixel images containing millions of distinct colors! Also, since programming is no longer the job that I get paid for, it is difficult to find enough hours in the day (especially *uninterrupted* hours) to make a lot of progress on a complex problem like this, especially when even processing just a single image may be quite time consuming. Of course I try to use smaller test images to help speed up testing, but those don't always adequately show how things will work out with the large images.
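On the efficiency point, one trick that keeps even a census of distinct colors cheap on huge images is a fixed occupancy bitmap over the whole 24-bit color cube (2^24 bits, i.e. 2 MB) instead of a hash table; again just a sketch of the idea:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Count distinct 24-bit RGB colors with one bit per possible
     * color: 2^24 bits = 2 MB, one array access per pixel. */
    size_t count_distinct_colors(const uint8_t *rgb, size_t npixels)
    {
        uint8_t *seen = calloc(1u << 21, 1);   /* 2^24 bits */
        size_t distinct = 0;

        if (!seen)
            return 0;
        for (size_t i = 0; i < npixels; i++) {
            uint32_t c = ((uint32_t)rgb[3 * i] << 16)
                       | ((uint32_t)rgb[3 * i + 1] << 8)
                       |  (uint32_t)rgb[3 * i + 2];
            if (!(seen[c >> 3] & (1u << (c & 7)))) {
                seen[c >> 3] |= (uint8_t)(1u << (c & 7));
                distinct++;
            }
        }
        free(seen);
        return distinct;
    }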
Anyway, yes, I am well aware of the compression issue that you mentioned. Hopefully that is more clear now... :-)

Regards,

Kevin M.