TIFF and LibTiff Mailing List Archive
Thread: 2019.01.13 23:12 "Re: tiffcp altering image contents (in contrast to what the manual says)?", by <lists@binarus.de>

Dear Richard,

thank you very much for your impressive answer.

On 13.01.2019 22:11, Richard Nolde wrote:

> Bob is certainly correct in stating that the issue is that the output is
> written using YCBCR encoding.

One of my previous messages shows the output from tiffinfo for each of the source files. tiffinfo shows that the source files are indeed written using YCbCr encoding, so this is true. But I don't understand why this poses a problem. Since OJPEG does not seem to be an issue in this case, why does tiffcp apparently alter the image data? Why doesn't it just copy the YCbCr-encoded data byte by byte when merging the images (adjusting only the directory, endianness, offsets and so on)?

> Why not simply use another compression algorithm, and why use Graphics Magick
> at all if you are just compressing and combining them?

Two questions in one sentence :-)

We can't use another compression algorithm for the 24 BPP files because they would get huge if we did. On the one hand, we can afford the degradation caused by encoding as JPEG with 90% quality, provided the degradation is guaranteed to happen only once. On the other hand, those files will be at least 5 times their current size if we use any compression other than JPEG. Since we will have to handle some 100,000 of them, this is a problem. The current compression scheme is well-crafted and approved. The problem is that degradation must happen only once. We cannot accept tiffcp re-encoding the image data, which is why I would like to understand exactly what is happening here and why tiffcp apparently touches the image data at all.

Likewise, we don't want to use another compression scheme for the 1 BPP images, because ZIP has turned out to be the most efficient for our images and, as a bonus, it is lossless. Using JPEG compression for this image type is not an option: it would simplify things, but it would make the images 10 times their current size.

> Tiffcp or tiffcrop can compress an uncompressed file on the fly while making a
> multi-page TIFF from a series of uncompressed single page TIFF files.

I think that this is a very good idea. It will need thorough testing, though, and replacing gm with tiffcp means that the whole process has to be re-approved, re-documented and so on.

> If your files have various bit depths for which the same compression algorithm
> cannot be used, you would have to first combine all of them at a given bit
> depth and specify a compression algorithm that is appropriate for that bit
> depth. After the subsets are combined, i.e. all the 1 bit bilevel images with
> CCITT Group 3 or Group 4 encoding, all the RGB images with Zip or LZW, you call
> tiffcp with no compression algorithm specified to build the final version.

By this, do you mean simply omitting the "-c" switch, as I did? The manual says (in the section where the -c switch is explained): "... By default tiffcp will compress data according to the value of the Compression tag found in the source file."

Maybe there is a general misunderstanding on my side. Why does tiffcp (re-)compress data at all? As far as I have understood, at least when we don't specify a compression explicitly, it should just copy the image data from the source to the destination WITHOUT decompressing it to memory first and then re-compressing it when writing the output.
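Just to make clear what I mean by "copy byte by byte": as far as I understand the libtiff API, the raw strip functions should in principle allow something like the following. This is only a rough, untested sketch for a single, strip-based JPEG directory; the file names are placeholders, error handling is missing, and further tags (JPEGTables, the YCbCr subsampling, resolution, ...) would also have to be carried over:

```c
/* Rough, untested sketch: copy the compressed strips of a strip-based,
 * JPEG-in-TIFF source into a new file without decoding them.
 * File names are placeholders; error handling and several tags
 * (JPEGTables, YCbCr subsampling, ...) are omitted.
 */
#include <tiffio.h>

int main(void)
{
    TIFF *in  = TIFFOpen("page1-jpeg.tif", "r");   /* placeholder name */
    TIFF *out = TIFFOpen("copy.tif", "w");         /* placeholder name */
    uint32 w, h, rowsperstrip;
    uint16 spp, bps, comp, photo, planar;

    TIFFGetField(in, TIFFTAG_IMAGEWIDTH, &w);
    TIFFGetField(in, TIFFTAG_IMAGELENGTH, &h);
    TIFFGetFieldDefaulted(in, TIFFTAG_SAMPLESPERPIXEL, &spp);
    TIFFGetFieldDefaulted(in, TIFFTAG_BITSPERSAMPLE, &bps);
    TIFFGetField(in, TIFFTAG_COMPRESSION, &comp);
    TIFFGetField(in, TIFFTAG_PHOTOMETRIC, &photo);
    TIFFGetFieldDefaulted(in, TIFFTAG_PLANARCONFIG, &planar);
    TIFFGetFieldDefaulted(in, TIFFTAG_ROWSPERSTRIP, &rowsperstrip);

    /* Mirror the source layout in the destination directory. */
    TIFFSetField(out, TIFFTAG_IMAGEWIDTH, w);
    TIFFSetField(out, TIFFTAG_IMAGELENGTH, h);
    TIFFSetField(out, TIFFTAG_SAMPLESPERPIXEL, spp);
    TIFFSetField(out, TIFFTAG_BITSPERSAMPLE, bps);
    TIFFSetField(out, TIFFTAG_COMPRESSION, comp);
    TIFFSetField(out, TIFFTAG_PHOTOMETRIC, photo);
    TIFFSetField(out, TIFFTAG_PLANARCONFIG, planar);
    TIFFSetField(out, TIFFTAG_ROWSPERSTRIP, rowsperstrip);

    /* Copy every strip in its compressed form - no decode, no re-encode. */
    for (tstrip_t s = 0; s < TIFFNumberOfStrips(in); s++) {
        tmsize_t n = TIFFRawStripSize(in, s);
        void *buf = _TIFFmalloc(n);
        TIFFReadRawStrip(in, s, buf, n);
        TIFFWriteRawStrip(out, s, buf, n);
        _TIFFfree(buf);
    }

    TIFFClose(in);
    TIFFClose(out);
    return 0;
}
```

Whether the resulting file would be valid in all cases is exactly what I am not sure about - but this is the kind of copy I had in mind.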
Of course, I understand that for any lossless compression this makes no difference (apart from memory consumption). But for JPEG-encoded images, it makes a difference. We are looking for a method which allows us to take an uncompressed 24 BPP image, compress it to JPEG >>> one time <<<, and merge it with other images >>> without further degradation <<<, i.e. without decoding and re-compressing it again. I am becoming more and more unsure whether this is possible at all.

> If, as I suspect, you want them in a specific order, you run tiffcp once on
> each file (or group of files with the same bit depth) to do the compression
> and then once on the entire list of compressed files without modifying the
> compression.

This is similar to the method we currently employ, with the difference that we currently use gm instead of tiffcp for the first compression (grouping) step. We can use tiffcp instead of gm, but that probably wouldn't change much: in any case, we would have to make sure that the data of an already JPEG-compressed image does not get JPEG-compressed a second time when we merge that image with another one.

> Your script can run tiffinfo over each file and grep for the word Sample.
> Based on the result, you can set the compression used in the subsequent call
> to tiffcp in the first pass, which writes out a temporary file X.tmp.tif into
> a separate directory. Once all the files have been compressed into the
> temporary directory, you call tiffcp again on the temporary files and write
> the multi-page result file wherever you like.

This would be very elegant - thanks for proposing it. But it all depends on whether we can make tiffcp join two TIFF files which are already JPEG-compressed without further loss, i.e. without decoding and encoding them a second time.

Thank you very much for your advice and expertise,

Binarus

_______________________________________________
Tiff mailing list
Tiff@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/tiff