

TIFF and LibTiff Mailing List Archive
January 2019


Thread

2019.01.13 21:11 "Re: Tiff Digest, Vol 3, Issue 5", by Richard Nolde
2019.01.13 23:12 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <lists@binarus.de>
2019.01.14 17:44 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by Daniel McCoy
2019.01.14 17:45 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by Daniel McCoy
2019.01.15 08:20 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <lists@binarus.de>
2019.01.15 08:18 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <lists@binarus.de>

2019.01.14 17:44 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by Daniel McCoy

It might be worthwhile to look at the output of "tiffinfo -s", which
shows the strip offsets and strip byte counts. If tiffcp were just
compressing out unused gaps in the file, the number of strips and the
strip byte counts would stay the same, but the strip offsets would
change. If this were the case, you wouldn't even have to build your
multi-page TIFF to check: you could just run "tiffcp" on each of the
files individually and compare the output of "tiffinfo -s" for the
before and after versions. If only the offsets change, then the actual
image data is probably the same and the file has just been
"defragmented".
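
For example (untested, file names are hypothetical), the before/after
check could look like this:

    # Make a plain copy of one page, then compare the strip layout.
    tiffcp page1.tif page1-copy.tif
    tiffinfo -s page1.tif      > before.txt
    tiffinfo -s page1-copy.tif > after.txt
    # As described above: if only the offset values differ, the strips
    # were merely relocated rather than re-encoded.
    diff before.txt after.txt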

Why: some TIFF-writing programs flush incomplete directories to the
file while writing. As the directory grows with each strip that is
added, it has to keep being relocated to the end of the file, leaving
unused gaps between some strips. This can happen with programs that do
not know the whole image beforehand and want partial images to be
recoverable (renderers, scanners, ...). If this is the case, then
running the file through tiffcp would essentially perform garbage
collection on the file, resulting in a smaller file with exactly the
same data in it.
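
If that explanation applies to your files, even a plain copy of a
single page should already shrink it noticeably, e.g. (untested, file
name is hypothetical):

    # Copy the file as-is; if the above is right, the copy is written
    # without the unused gaps and should therefore be smaller.
    tiffcp scanned-page.tif compacted-page.tif
    ls -l scanned-page.tif compacted-page.tif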


Dan McCoy - Pixar

On Sun, Jan 13, 2019 at 3:12 PM Binarus <lists@binarus.de> wrote:
>
> Dear Richard,
>
> thank you very much for your impressive answer.
>
> On 13.01.2019 22:11, Richard Nolde wrote:
> > Bob is certainly correct in stating that the issue is that the output is
> > written using YCBCR encoding.
>
> One of my previous messages shows the output from tiffinfo for each of the
> source files. tiffinfo shows that the source files are indeed written
> using YCbCr encoding, so this is true.
>
> But I don't understand why this is an issue. Since OJPEG does not
> seem to be a problem in this case, why does tiffcp (obviously) alter the
> image data? Why doesn't it just copy the YCbCr-encoded data byte by byte
> when merging the images (adjusting only the directory, endianness, offsets,
> ... accordingly)?
>
> > Why not simply use another compression
> > algorithm, and why use Graphics Magick at all if you are just compressing
> > and combining them?
>
> Two questions in one sentence :-)
>
> We can't use another compression algorithm for the 24 BPP files because
> they will get huge if we do. On the one hand, we can afford the degradation
> which is caused by encoding as JPEG with 90% quality if the degradation is
> guaranteed to happen only once. On the other hand, the size of those files
> will be at least 5 times their current size if we use any other
> compression than JPEG. Since we will have to handle some 100000 of them,
> this is a problem.
>
> The current compression scheme is well-crafted and approved. The problem
> is that degradation must only happen once. We couldn't accept that tiffcp
> would re-encode the image data, and I therefore would like to understand
> exactly what is happening here and why tiffcp obviously touches the image
> data at all.
>
> Likewise, we don't want to use another compression scheme for the 1 BPP
> images because ZIP has turned out to be the most efficient for our images,
> and as a bonus, it is lossless. Using JPEG compression for this image type
> is not an option: it would simplify things, but it would make the images
> 10 times their current size.
>
> > Tiffcp or tiffcrop can compress an uncompressed file
> > on the fly while making a multi-page TIFF from a series of uncompressed
> > single page TIFF files.
>
> I think that this is a very good idea. It will need thorough testing,
> though, and replacing gm with tiffcp means the whole process has to be
> re-approved, documented and so on.
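>
> Just so I am sure what that would look like in practice: something like
> this (untested, file names are hypothetical, and I am assuming the
> "-c jpeg:90" syntax for quality 90 from the man page)?
>
>     # Compress two uncompressed 24 BPP pages and combine them in one step.
>     tiffcp -c jpeg:90 rgb-page1.tif rgb-page2.tif rgb-combined.tif
>     # Same idea for the bilevel pages, with lossless ZIP.
>     tiffcp -c zip bw-page1.tif bw-page2.tif bw-combined.tif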
>
> > If your files have various bit depths for which
> > the same compression algorithm cannot be used, you would have to first
> > combine all of them at a given bit depth and specify a compression
> > algorithm that is appropriate for that bit depth. After the subsets are
> > combined, i.e. all the 1 bit bilevel images with CCITT Group3 or Group4
> > encoding, all the RGB images with ZIP or LZW, you call tiffcp
> > with no compression algorithm specified to build the final version.
>
> By this, do you mean to just leave out the "-c" switch, as I did? The
> manual says (in the section where the -c switch is explained):
>
> "... By default tiffcp will compress data according to the value of
> the Compression tag found in the source file."
>
> Maybe there is a general misunderstanding on my side. Why does tiffcp
> (re-)compress data at all? As far as I have understood, at least if we
> don't specify a compression explicitly, it should just copy the image data
> from the source to the destination WITHOUT decompressing to memory first
> and then re-compressing when writing the output.
>
> Of course, I have understood that for every type of lossless compression
> this makes no difference (besides memory consumption). But for
> JPEG-encoded images, it makes a difference. We are searching for a method
> which allows us to take an uncompressed 24 BPP image, compress it to JPEG
> >>> one time <<<, and merge it with other images >>> without further
> degradation <<<, i.e. without decoding and re-compressing it again.
>
> I am becoming more and more unsure if this is possible at all.
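>
> At least the degradation question should be testable: if I run an
> already JPEG-compressed file through tiffcp and the decoded pixels still
> compare equal, no additional loss was introduced. Roughly (untested,
> file names are hypothetical; tiffcmp is the comparison tool that ships
> with libtiff):
>
>     # Round-trip an already JPEG-compressed page through tiffcp.
>     tiffcp jpeg-page.tif roundtrip.tif
>     # tiffcmp decodes both files and prints any samples that differ;
>     # no reported differences would mean no further degradation.
>     tiffcmp jpeg-page.tif roundtrip.tif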
>
> > If, as I suspect, you want them in a specific order, you run tiffcp once
> > on each file (or group of files with the same bit depth) to do the
> > compression and then once on the entire list of compressed files without
> > modifying the compression.
>
> This is like the method we currently employ, with the difference that we
> currently use gm instead of tiffcp to do the first compression (grouping)
> step. We can use tiffcp instead of gm, but in the end that wouldn't change
> much:
>
> In every case, we would have to make sure that data of an image which is
> JPEG-compressed does not get JPEG-compressed a second time if we merge
> that image with another one.
>
> >
> > Your script can run tiffinfo over each file and grep for the word
> > Sample. Based on the result, you can set the compression used in the
> > subsequent call to tiffcp in the first pass which writes out a temporary
> > file X.tmp.tif into a separate directory. Once all the files have been
> > compressed into the temporary directory, you call tiffcp again on the
> > temporary files and write the multi-page result file wherever you like.
>
> This would be very elegant - thanks for proposing it. But it all depends
> on whether we can make tiffcp join two TIFF files which already are
> JPEG-compressed without further loss, i.e. without decoding and encoding
> them a second time.
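>
> If I read your proposal correctly, the two passes would look roughly
> like this (untested sketch; paths are hypothetical, compression settings
> taken from our current scheme):
>
>     mkdir -p tmp
>     # First pass: choose the compression per file based on its bit depth.
>     for f in pages/*.tif; do
>         bits=$(tiffinfo "$f" | grep 'Bits/Sample' | awk '{print $2}')
>         if [ "$bits" = "1" ]; then
>             tiffcp -c zip "$f" "tmp/$(basename "$f" .tif).tmp.tif"
>         else
>             tiffcp -c jpeg:90 "$f" "tmp/$(basename "$f" .tif).tmp.tif"
>         fi
>     done
>     # Second pass: combine without -c, so each page keeps the compression
>     # recorded in its Compression tag.
>     tiffcp tmp/*.tmp.tif combined.tif
>
> Whether that second pass really leaves the JPEG data untouched is of
> course exactly what I need to be sure about.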
>
> Thank you very much for your advice and expertise,
>
> Binarus
_______________________________________________
Tiff mailing list
Tiff@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/tiff