TIFF and LibTiff Mailing List Archive
January 2019

2019.01.13 21:11 "Re: Tiff Digest, Vol 3, Issue 5", by Richard Nolde
2019.01.13 23:12 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <>
2019.01.14 17:44 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by Daniel Mccoy
2019.01.14 17:45 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by Daniel Mccoy
2019.01.15 08:20 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <>
2019.01.15 08:18 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <>

2019.01.13 23:12 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <>

Dear Richard,

thank you very much for your impressive answer.

On 13.01.2019 22:11, Richard Nolde wrote:
> Bob is certainly correct in stating that the issue is that the output is
> written using YCBCR encoding.

One of my previous messages shows the output from tiffinfo for each of the
source files. tiffinfo shows that the source files are indeed written using
YCbCr encoding, so this is true.

But I don't understand why this poses a problem. Since OJPEG does not seem
to be a problem in this case, why does tiffcp (obviously) alter the image
data? Why doesn't it just copy the YCbCr encoded data byte by byte when
merging the images (just altering the directory, endianness, offsets, ...)?

> Why not simply use another compression
> algorithm and why use Graphics Magick at all if you are just compression
> and combining them?

Two questions in one sentence :-)

We can't use another compression algorithm for the 24 BPP files because they
will get huge if we do. On one hand, we can afford the degradation which is
caused by encoding as JPEG with 90% quality if the degradation is guaranteed
to happen only once. On the other hand, the size of those files will be at
least 5 times their current size if we use any other compression than JPEG.
Since we will have to handle about 100,000 of them, this is a problem.

The current compression scheme is well-crafted and approved. The problem is
that degradation must happen only once. We cannot accept tiffcp re-encoding
the image data, and I therefore would like to understand exactly what is
happening here and why tiffcp obviously touches the image data at all.

Likewise, we don't want to use another compression scheme for the 1 BPP
images because ZIP has turned out to be the most efficient for our images,
and as a bonus, it is lossless. Using JPEG compression for this image type
is not an option: it would simplify things, but it would make the images
10 times their current size.

> Tiffcp or tiffcrop can compress an uncompressed file
> on the fly while making a multi-page TIFF from a series of uncompressed
> single page TIFF files.

I think that this is a very good idea. It will need thorough testing,
though, and replacing gm by tiffcp will mean that the whole process has to
be re-approved, documented and so on.

> If your files have various bit depths for which
> the same compression algorithm cannot be used, you would have to first
> combine all of them at a given bit depth and specify a compression
> algorithm that is appropriate for that bit depth. After the subsets are
> combined, ie, all the 1 bit bilevel images with CITT Group3 or Group4
> encoding, all the RGB images with Zip or LZW or ZIP, you call tiffcp
> with no compression algorithm specified to build the final version.

By this, do you mean to just leave out the "-c" switch, as I did? The
manual says (in the section where the -c switch is explained):

"... By default tiffcp will compress data according to the value of the
Compression tag found in the source file."

Maybe there is a general misunderstanding on my side. Why does tiffcp
(re-)compress data at all? As far as I have understood, at least if we don't
specify a compression explicitly, it should just copy the image data from
the source to the destination WITHOUT decompressing to memory first and then
re-compressing when writing the output.

Of course, I have understood that for every type of lossless compression
this makes no difference (besides memory consumption). But for JPEG-encoded
images, it makes a difference. We are searching for a method which allows us
to take an uncompressed 24 BPP image, compress it to JPEG >>> one time <<<,
and merge it with other images >>> without further degradation <<<, i.e.
without decoding and re-compressing it again.
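
To make the idea of copying without re-encoding concrete, here is a minimal
Python sketch of the reading half: it pulls the still-compressed strips out
of the first directory of a classic TIFF by following the StripOffsets and
StripByteCounts tags, without decoding anything. It is an illustration only,
not a description of what tiffcp does internally. The function name
read_raw_strips is hypothetical, and the sketch assumes a strip-based (not
tiled) classic TIFF whose offsets and counts are stored as SHORT or LONG.

```python
import struct

# Tag numbers from the TIFF 6.0 specification.
TAG_STRIP_OFFSETS = 273
TAG_STRIP_BYTE_COUNTS = 279

# Size in bytes of the two field types handled here: SHORT (3) and LONG (4).
TYPE_SIZES = {3: 2, 4: 4}


def read_raw_strips(data: bytes) -> list[bytes]:
    """Return the still-compressed strips of the first IFD of a classic TIFF.

    Hypothetical helper for illustration; nothing is decoded.
    """
    # Byte order mark: "II" = little-endian, "MM" = big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled)")

    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    values = {}
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, ftype, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag not in (TAG_STRIP_OFFSETS, TAG_STRIP_BYTE_COUNTS):
            continue
        size = TYPE_SIZES[ftype]
        code = "H" if ftype == 3 else "I"
        if size * count <= 4:
            # Values that fit in four bytes are stored inline in the entry.
            raw = data[start + 8:start + 8 + size * count]
        else:
            # Otherwise the entry holds an offset to the value array.
            (offset,) = struct.unpack(endian + "I", data[start + 8:start + 12])
            raw = data[offset:offset + size * count]
        values[tag] = struct.unpack(endian + code * count, raw)

    offsets = values[TAG_STRIP_OFFSETS]
    counts = values[TAG_STRIP_BYTE_COUNTS]
    # Slice the compressed bytes out exactly as they are stored in the file.
    return [data[o:o + n] for o, n in zip(offsets, counts)]
```

A lossless merge would additionally have to write these bytes unchanged into
the new file and rewrite the offsets in the new directory; that part is not
shown here.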

I am becoming more and more unsure if this is possible at all.

> If, as I suspect, you want them in a specific order, you run tiffcp once
> on each file (or group of files with the same bit depth) to do the
> compression and then once on the entire list of compressed files without
> modifying the compression.

This is like the method we currently employ, with the difference that we
currently use gm instead of tiffcp to do the first compression (grouping)
step. We can use tiffcp instead of gm, but that eventually wouldn't change

In any case, we would have to make sure that the data of an image which is
already JPEG-compressed does not get JPEG-compressed a second time when we
merge that image with another one.

> Your script can run tiffinfo over each file and grep for the word
> Sample. Based on the result, you can set the compression used in the
> subsequent call to tiffcp in the first pass which writes out a temporary
> file X.tmp.tif into a separate directory. Once all the files have been
> compressed into the temporary directory, you call tiffcp again on the
> temporary files and write the multi-page result file wherever you like.

This would be very elegant - thanks for proposing it. But it all depends on
whether we can make tiffcp join two TIFF files which already are
JPEG-compressed without further loss, i.e. without decoding and encoding
them a second time.
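
The selection step described in the quoted suggestion could look roughly
like the following Python sketch. It assumes that the text printed by
tiffinfo contains "Bits/Sample:" and "Samples/Pixel:" lines, and it returns
a value intended for tiffcp's -c switch. The function name pick_compression
and the lzw fallback are hypothetical; the zip and jpeg:90 choices mirror
the schemes mentioned in this thread.

```python
import re


def pick_compression(tiffinfo_text: str) -> str:
    """Suggest a tiffcp -c value from the text printed by tiffinfo.

    Hypothetical helper for illustration only.
    """
    bits = re.search(r"Bits/Sample:\s*(\d+)", tiffinfo_text)
    samples = re.search(r"Samples/Pixel:\s*(\d+)", tiffinfo_text)
    if bits is None:
        raise ValueError("no Bits/Sample line found")
    bits_per_sample = int(bits.group(1))
    # If no Samples/Pixel line is present, fall back to the TIFF default of 1.
    samples_per_pixel = int(samples.group(1)) if samples else 1

    if bits_per_sample == 1 and samples_per_pixel == 1:
        return "zip"      # lossless; reported above as most efficient for 1 BPP
    if bits_per_sample == 8 and samples_per_pixel == 3:
        return "jpeg:90"  # lossy; quality 90 as described above for 24 BPP
    return "lzw"          # assumption: some lossless fallback for other depths
```

A script would feed the captured output of tiffinfo for each source file
into this function and pass the result to the first tiffcp pass. Whether the
second, merging pass leaves the JPEG data untouched is exactly the open
question of this message and is not answered by the sketch.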

Thank you very much for your advice and expertise,
