2005.09.23 21:11 "[Tiff] Additional Lossless Compression Schemes", by Frank Warmerdam

2005.09.27 04:25 "Re: [Tiff] Additional Lossless Compression Schemes", by Ron

On Tue, Sep 27, 2005 at 04:06:08AM +0200, Joris wrote:

> Ron,

Personally, if libtiff has good generic support for plugging in alternate compression schemes on demand, I don't really mind which are actually implemented at this stage. So long as it's easy to add the ones that people really want (i.e. enough to submit a patch for CVS now), or the one I really need to get at some wacko non-standard format that one of my client's apps has used... which we may or may not want to ever publish.

I think there is 'good generic support for plugging in alternate compression schemes'. The only thing that's missing is proper documentation. Currently, you need to study the existing code to figure out what goes where and why. But aside from that, the actual plugin functionality is there, in the sense that it's enough to write a single code file and a single line outside that file registering your new compression codec in the library.
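To make the "single code file plus a single registration line" idea concrete, here is a minimal sketch in Python. It is not libtiff's actual API: the names Codec, CODECS, register_codec and decode_strip are hypothetical and exist only for illustration. The only facts borrowed from TIFF are that Compression value 1 means no compression and value 8 is Adobe-style Deflate (a zlib stream).

```python
import zlib
from typing import Callable, Dict, NamedTuple


class Codec(NamedTuple):
    """Hypothetical codec record: a name plus encode/decode callables."""
    name: str
    encode: Callable[[bytes], bytes]
    decode: Callable[[bytes], bytes]


# Hypothetical registry keyed by the value of the TIFF Compression tag.
CODECS: Dict[int, Codec] = {}


def register_codec(scheme: int, codec: Codec) -> None:
    """The 'single line outside the file': make a codec known to the library."""
    if scheme in CODECS:
        raise ValueError(f"compression scheme {scheme} is already registered")
    CODECS[scheme] = codec


def decode_strip(scheme: int, data: bytes) -> bytes:
    """Look up the codec by its scheme code and decompress one strip."""
    try:
        codec = CODECS[scheme]
    except KeyError:
        raise LookupError(f"no codec registered for scheme {scheme}") from None
    return codec.decode(data)


# Compression=1 is "no compression"; Compression=8 is Adobe-style Deflate.
register_codec(1, Codec("none", bytes, bytes))
register_codec(8, Codec("deflate", zlib.compress, zlib.decompress))
```

In this sketch a new scheme costs one Codec definition and one register_codec call, which is the shape of the mechanism described above; how libtiff itself lays this out is something you would still need to read from its source.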

You seem to be concerned with something like 'locally valid compression schemes'. There is no such thing in TIFF. There are 'locally valid tag codes'. You can use any tag code from 65000 and upward to mean something only in your local setup, and you don't have to register. The downside is that these tags cannot be attributed any meaning outside of your local system. But there is no such thing for compression schemes; you do need to get yourself a registered code for that even if you don't plan interchange of files with the world 'out there', so it'll never be as easy as merely getting the plug-in into CVS.
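As a small illustration of the tag ranges mentioned above, the following sketch classifies a tag code. The 65000 boundary comes from the message itself; the 32768 boundary is the start of the private tag range in TIFF 6.0. The function name classify_tag and its labels are made up for this example.

```python
PRIVATE_TAG_MIN = 32768   # TIFF 6.0: tags from here up are "private" tags
REUSABLE_TAG_MIN = 65000  # per the message above: usable locally without registering
TAG_MAX = 0xFFFF          # tag codes are 16-bit unsigned values


def classify_tag(tag: int) -> str:
    """Return a rough label for a TIFF tag code (labels are illustrative only)."""
    if not 0 <= tag <= TAG_MAX:
        raise ValueError("TIFF tag codes must fit in 16 bits")
    if tag >= REUSABLE_TAG_MIN:
        return "local"       # meaning is only valid inside your own setup
    if tag >= PRIVATE_TAG_MIN:
        return "private"     # needs to be registered to an organisation
    return "standard"
```

There is deliberately no matching helper for Compression values, since the point made above is that no locally valid range exists for them.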

As a programmer, I'm mostly concerned about the mechanism to (de)compress the data. That is what I need from libtiff. Standards, de facto and real, that determine what sort of compression to use and how to indicate it can evolve on top of that. I'm not current enough with TIFF tag standards to offer meaningful input on that side of things today.

But I know what I want from the api ;-)

If you do plan interchange, then simply registering and writing the plugin is not going to be enough. You'll have to get the word out, too, and/or wait long enough for most software to upgrade to newer versions of LibTiff that include the plugin.

Yes, the problem of using images with unconventional compression is different from the problem of implementing support for that. But any new format will be 'unconventional' until tested and proven useful, so our chicken has to lay an egg somewhere before we can see if it might hatch.

If we explicitly add support for particular algorithms we will need to address this; if we generically add support for any of them, people can experiment and make informed decisions about which to favour with the energy to popularise.

Last, but certainly not least, many vendors seem unaware of the basic properties of TIFF and/or simply do not care about anything but their own reader part reading whatever crap it is their own writing part is writing. We've seen such things as offsets in data blocks to other data blocks, for example in the Makernote data, the datatype LONG being used for fundamentally mixed datatype stuff, respectable specs built by respectable people ignoring the fact that data blocks are and have to be relocatable for TIFF to work, and that tag scope is limited to a single IFD, etc. Such stuff fundamentally breaks TIFF. Hey, perhaps some people don't like to be reminded, but we've seen the TIFF 6.0 spec itself fundamentally break TIFF, with the very bad specification of what is now referred to as 'old-style jpeg-in-tiff', which still makes life hard and still is generated by some of today's software. Too much freedom and ease of plugging in new stuff, and we may very well soon end up with TIFF getting more broken than not, and basic stuff like concatenating pages and transcoding TIFF files and all not working anymore as a consequence...

I've posted here before about my desire for a bare bones TIFF parser that decomposes generic TIFF and allows handing off data extracted from IFDs for interpretation by other codecs, or extraction of TIFF data from them (e.g. EXIF in JPEG, JPEG in TIFF etc.) so I won't repeat myself too much, but I think compression probably fits in there somewhere too.
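For readers wondering how little such a bare bones parser needs, here is a rough sketch in Python. It only handles the classic TIFF header (byte-order mark, magic number 42, 32-bit offset to the first IFD) and returns the raw 12-byte directory entries without interpreting them, so that something else can decide what they mean. The names IfdEntry and parse_first_ifd are hypothetical, and there is no bounds checking or support for following the chain of IFDs.

```python
import struct
from typing import List, NamedTuple


class IfdEntry(NamedTuple):
    tag: int
    field_type: int
    count: int
    value_or_offset: bytes  # raw 4 bytes, deliberately left uninterpreted


def parse_first_ifd(data: bytes) -> List[IfdEntry]:
    """Decompose the header and first IFD of a classic TIFF byte string."""
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian ("Intel")
    elif order == b"MM":
        endian = ">"   # big-endian ("Motorola")
    else:
        raise ValueError("missing TIFF byte-order mark")

    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF header")

    # An IFD is a 2-byte entry count followed by 12-byte entries.
    (n_entries,) = struct.unpack_from(endian + "H", data, ifd_offset)
    entries = []
    for i in range(n_entries):
        base = ifd_offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, base)
        entries.append(IfdEntry(tag, field_type, count, data[base + 8:base + 12]))
    return entries
```

Anything above this layer (which codec handles tag 259, what a private tag means) can then be layered on without the parser knowing about it.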

Good defaults and good documentation of recognised standards are probably the best we can do to address the 'social' problem of interchangeable TIFF data.

I think we probably will want something "better" as the size and number of images that people manipulate grows, so this is good to think about. Otherwise a shop somewhere will surely implement their own brand of better, perhaps without that vital step...

I personally don't believe any 5% better but essentially still flate kinda compression scheme is worth the trouble. You've noted quite correctly, that every now and then you read good stuff about any of the twelve dozen such flate on steroids schemes, but that none actually prove worthy and break through.

bzip2 is not a deflate variant as I understand it (it is built on the Burrows-Wheeler transform), though 'deflate' does come from the Lempel-Ziv ('LZ') line of work. Since processors seem to be getting faster at a faster rate than bandwidth is increasing, bzip2 is rapidly becoming quite standard (or at least an option) in some circles.
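If you want to see the CPU-versus-size tradeoff on your own data, a quick sketch like the following will do, using only Python's standard zlib and bz2 modules. The helper names measure and compare are made up for this example, and the numbers it produces depend entirely on the payload and the machine, so treat it as a way to experiment rather than as a benchmark.

```python
import bz2
import time
import zlib


def measure(name, compress, decompress, payload):
    """Compress once, time it, and verify the round trip is lossless."""
    start = time.perf_counter()
    packed = compress(payload)
    elapsed = time.perf_counter() - start
    if decompress(packed) != payload:
        raise RuntimeError(f"{name} round trip failed")
    return {"codec": name, "bytes": len(packed), "seconds": elapsed}


def compare(payload: bytes):
    """Run deflate (zlib) and bzip2 at their highest levels on the same data."""
    return [
        measure("deflate", lambda b: zlib.compress(b, 9), zlib.decompress, payload),
        measure("bzip2", lambda b: bz2.compress(b, 9), bz2.decompress, payload),
    ]
```

Feeding compare() the raw strips of a few representative images should show whether the extra CPU time buys enough space to matter for a given workload.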

Whether it is a good idea to add support for it to TIFF really depends on whether someone has a real need for making the tradeoff it entails.

I couldn't say honestly that I have one (yet), but I can easily imagine situations where I or others might.

It's a whole other story for fundamentally new compression modes that we firmly believe are here to stay. I believe we should try and agree on integration for JPEG2000, and old and new JBIG.

I'm not really in a position to comment usefully on that either, except to say that what "we agree on" is probably less important than what "they need". If we don't anticipate and cater to the latter, "they" probably won't do us many favours by way of an elegant solution that we may later have to deal with anyway.

I understand your concern about the proliferation of incompatible (and worse, fundamentally inconsistent) TIFF-esque formats, but I think if the user wants to shoot themselves in the foot, it is the job of the library to make sure the bullet lands right where they aimed it. For something like this, where good standards are continually evolving on a solid foundation, that probably includes allowing them to load their own custom ammo to suit their own feet.

TIFF itself is so simple that even little golems made of beach sand (i.e. hardware) can do it. I see no reason not to hard code those bits of it and leave the rest of the 'man made' problems for men to make and unmake again.

I'm hoping to have some time to put code behind that belief again soon.

cheers,

Ron