2004.11.04 21:10 "Re: [Tiff] is libtiff thread-safe?", by Joris Van Damme

It seems that libtiff is currently thread-agnostic. It doesn't do anything special to support threads, but that is not a disaster either, since its implementation is mostly re-entrant.

There's a concept that some call 'self-threading', which seems to mean that an object encapsulates all the synchronization it needs, such that any number of threads may do anything with that object at any one time.

Another common meaning of the word 'thread-safe' is that any number of threads are allowed to use a library at any one time, but the application must make sure that any single 'object' (a TIFF *, or whatever plays a similar role) is used by only one thread at any one time. Essentially: add a critical section or mutex per main 'object', or arrange the same through application logic, and there is no problem.

Some libraries claim to be thread-safe, and then go on to document initialization and finalization functions that must each be called only once. That means these functions cannot be called by all threads independently. Either extra application logic is needed (e.g. calling them from the single main thread at application initialization and finalization time), or extra synchronization is needed (e.g. a critical section or mutex that protects a flag signalling whether initialization is done).
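The 'flag protected by a mutex' idea is what POSIX packages as pthread_once. Here is a minimal sketch, assuming a POSIX threads environment; lib_init and the helper names are hypothetical stand-ins for some library's once-only initialization, not LibTiff functions.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical library entry point that must run exactly once.
   It is a stand-in for illustration, not a LibTiff function. */
static int lib_init_calls = 0;

static void lib_init(void)
{
    lib_init_calls++;
}

static pthread_once_t lib_once = PTHREAD_ONCE_INIT;

/* Any thread may call this, any number of times; lib_init runs once. */
void ensure_lib_initialized(void)
{
    int rc = pthread_once(&lib_once, lib_init);
    assert(rc == 0);
    (void)rc; /* avoids an unused-variable warning when NDEBUG is set */
}

/* Reports how many times lib_init actually ran. */
int lib_init_call_count(void)
{
    return lib_init_calls;
}
```

Every thread would call ensure_lib_initialized() before its first use of the library. Finalization still needs application logic, for instance doing it in the main thread after all workers have been joined.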

My point: the word 'thread-safe', applied to any library, is mostly nonsense. Speaking of anything beyond the most atomic bit of data, there is no such concept as a simple bilevel (yes or no) thread-safeness. Instead, we should try to figure out exactly what rules need to be respected to use LibTiff in a thread-safe fashion, and exactly what issues there are.

Here's an attempt at starting that, but I don't claim to be absolutely certain about anything.

** Error handling **

Bob pointed out that error handling may be a problem. My personal opinion is that error handling is a problem, but not one related to multi-threading. The LibTiff error and warning handlers don't receive any indication of which particular TIFF * the errors and warnings apply to. That is a problem in any application that uses multiple TIFF *. I've come to the conclusion that most LibTiff users who have this problem simply avoid it by ignoring errors and warnings on that end, and only check the return values of the LibTiff calls. The price they pay is that they have no text description of exactly what error happened, and they cannot handle warnings in any meaningful way at all. 'The file could not be opened because an error occurred while opening the file.'

The problem is not multi-threading related, in that it makes no difference whether you use a single thread or lots of them. On the contrary, the use of multiple threads may actually ease the problem a bit, as Bob pointed out when this was discussed a while ago. In the particular case that an application uses no more than one TIFF * per thread, threadvars (thread-local variables) can be used for the pointers to the error and warning handlers. The situation then becomes the same as if there were only a single TIFF *, and text descriptions of errors and warnings can be 'localized', that is, associated with the correct TIFF *.
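A minimal sketch of that idea, assuming a C11 compiler with _Thread_local. It is a variation on what is described above: instead of keeping handler pointers in threadvars, one handler stores the message text in a thread-local buffer. The handler has the same signature as LibTiff's TIFFErrorHandler, so a real program could install it with TIFFSetErrorHandler. The names last_error_text, clear_last_error and simulate_library_error are hypothetical helpers, and the last one only stands in for the library calling the handler so that the sketch needs no LibTiff.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* One buffer per thread. With at most one TIFF * per thread, "this
   thread's last error" is the same thing as "this TIFF *'s last error". */
static _Thread_local char last_error[256];

/* Same signature as LibTiff's TIFFErrorHandler. */
void thread_local_error_handler(const char *module, const char *fmt, va_list ap)
{
    size_t used;

    assert(fmt != NULL);
    snprintf(last_error, sizeof last_error, "%s: ", module ? module : "(unknown)");
    used = strlen(last_error);
    vsnprintf(last_error + used, sizeof last_error - used, fmt, ap);
}

/* Text of the most recent error seen by the calling thread. */
const char *last_error_text(void)
{
    return last_error;
}

void clear_last_error(void)
{
    last_error[0] = '\0';
}

/* Hypothetical stand-in for the library invoking the installed handler. */
void simulate_library_error(const char *module, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    thread_local_error_handler(module, fmt, ap);
    va_end(ap);
}
```

A thread would call clear_last_error() before a LibTiff call, check the return value, and on failure read last_error_text() for the description. This only works under the stated condition of no more than one TIFF * per thread.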

** RGBA interface and in particular color conversion initialization **

This may or may not be a problem; I'm not knowledgeable on this subject. In particular, LibTiff may or may not be building some LUTs, and it may be that it holds on to these LUTs once they are built. In that case, there may be a problem. But I wouldn't know.

** Using the TIFF *, above mentioned issues aside **

I believe only one thread should access a given TIFF * at any one time. This should either be a consequence of application logic, or be ensured by protecting access to each TIFF * with its own critical section or mutex. Under this condition, I believe there is no problem.
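A minimal sketch of the 'mutex per TIFF *' rule, assuming POSIX threads. The guarded_handle type and its functions are hypothetical, and the handle is a plain void * (a TIFF * in a real program) so that the sketch needs no LibTiff. The demo operation is a deliberately non-atomic counter update, standing in for any call that changes the handle's state.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical wrapper: one mutex per handle. */
typedef struct {
    pthread_mutex_t lock;
    void *handle; /* would be a TIFF * in a real program */
} guarded_handle;

void guarded_init(guarded_handle *g, void *handle)
{
    int rc = pthread_mutex_init(&g->lock, NULL);
    assert(rc == 0);
    (void)rc;
    g->handle = handle;
}

/* Runs op(handle) while holding this handle's own lock. */
int guarded_call(guarded_handle *g, int (*op)(void *handle))
{
    int result;

    pthread_mutex_lock(&g->lock);
    result = op(g->handle);
    pthread_mutex_unlock(&g->lock);
    return result;
}

void guarded_destroy(guarded_handle *g)
{
    pthread_mutex_destroy(&g->lock);
}

/* Demo operation: a non-atomic read-modify-write on the handle's state. */
static int bump(void *handle)
{
    long *counter = handle;
    *counter += 1;
    return 1;
}

typedef struct { guarded_handle *g; int iterations; } demo_arg;

static void *demo_thread(void *p)
{
    demo_arg *a = p;
    for (int i = 0; i < a->iterations; i++)
        guarded_call(a->g, bump);
    return NULL;
}

#define DEMO_MAX_THREADS 16

/* Lets several threads hit one guarded handle; returns the final count. */
long guarded_demo(int nthreads, int iterations)
{
    long counter = 0;
    guarded_handle g;
    pthread_t threads[DEMO_MAX_THREADS];
    int started[DEMO_MAX_THREADS];
    demo_arg arg = { &g, iterations };

    if (nthreads > DEMO_MAX_THREADS)
        nthreads = DEMO_MAX_THREADS;
    guarded_init(&g, &counter);
    for (int i = 0; i < nthreads; i++)
        started[i] = pthread_create(&threads[i], NULL, demo_thread, &arg) == 0;
    for (int i = 0; i < nthreads; i++) {
        if (started[i])
            pthread_join(threads[i], NULL);
        else
            demo_thread(&arg); /* fall back to doing the work inline */
    }
    guarded_destroy(&g);
    return counter;
}
```

Note that the lock serializes all work on that one handle, so this gives safety, not parallel speed.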

** Example of thread-safe usage **

Suppose the application ignores error and warning text descriptions, i.e. the handlers are dead ends, and the application merely checks return values to see if an error occurred. Furthermore, assume that the application does not use the RGBA series of functions, uses the strip/tile interface instead, and assumes responsibility for handling any needed color conversion outside of LibTiff.

Suppose such an application has worker threads. Each thread independently opens a TIFF file, extracts tag values and raster data, performs some operation on them, stores the result in a raster object, closes the TIFF file, and posts the raster object to the main thread for display.

Meaning, there are different threads, each uses the LibTiff library and owns one or more TIFF *; but no single TIFF * is used in multiple threads at the same time, by application design.

I believe that is absolutely safe.

** Example of threading problem **

Suppose a similar application wants to use a similar scheme of different threads, but each is working on a different page in a single TIFF *.

I believe the LibTiff design is not suited to meet this need directly. Instead, the application should be designed such that each thread opens, manages and eventually closes its own TIFF *, even if all these TIFF * access the same file.
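The shape of that 'one handle per thread, same file' design can be sketched as follows, assuming POSIX threads. To keep the sketch free of LibTiff, a stdio FILE * stands in for the TIFF * (fopen and fclose where a real program would call TIFFOpen and TIFFClose), and a fixed-size block of bytes stands in for a page. All names and the demo file layout are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define PAGE_COUNT 4
#define PAGE_SIZE  16

typedef struct {
    const char *path; /* the same file for every worker */
    int page;         /* which "page" this worker handles */
    long sum;         /* result handed back to the caller */
    int ok;
} page_job;

/* Each worker opens, uses and closes its own handle. */
static void *page_worker(void *p)
{
    page_job *job = p;
    unsigned char buf[PAGE_SIZE];
    FILE *f;

    assert(job->page >= 0 && job->page < PAGE_COUNT);
    job->sum = 0;
    job->ok = 0;
    f = fopen(job->path, "rb"); /* TIFFOpen in a real program */
    if (f == NULL)
        return NULL;
    if (fseek(f, (long)job->page * PAGE_SIZE, SEEK_SET) == 0 &&
        fread(buf, 1, PAGE_SIZE, f) == PAGE_SIZE) {
        for (int i = 0; i < PAGE_SIZE; i++)
            job->sum += buf[i];
        job->ok = 1;
    }
    fclose(f); /* TIFFClose in a real program */
    return NULL;
}

/* Writes a demo file in which page n consists of bytes of value n + 1. */
int write_demo_file(const char *path)
{
    FILE *f = fopen(path, "wb");

    if (f == NULL)
        return -1;
    for (int page = 0; page < PAGE_COUNT; page++)
        for (int i = 0; i < PAGE_SIZE; i++)
            fputc(page + 1, f);
    return fclose(f) == 0 ? 0 : -1;
}

/* One thread per page; returns how many pages were processed. */
int process_pages(const char *path, long sums[PAGE_COUNT])
{
    pthread_t threads[PAGE_COUNT];
    page_job jobs[PAGE_COUNT];
    int started[PAGE_COUNT];
    int done = 0;

    for (int i = 0; i < PAGE_COUNT; i++) {
        jobs[i] = (page_job){ path, i, 0, 0 };
        started[i] = pthread_create(&threads[i], NULL, page_worker, &jobs[i]) == 0;
    }
    for (int i = 0; i < PAGE_COUNT; i++) {
        if (started[i])
            pthread_join(threads[i], NULL);
        sums[i] = (started[i] && jobs[i].ok) ? jobs[i].sum : -1;
        if (sums[i] >= 0)
            done++;
    }
    return done;
}
```

No handle is ever shared, so no locking is needed. The cost is that every worker pays for its own open and close.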

** How that threading problem can be important **

So here I am buying a quadruple-processor machine, thinking I'll be able to decode a tiled TIFF four times as fast because each processor can simultaneously work on its own tile.

Wrong. Such a scheme will involve at least four TIFF *. All of these have to be initialized (opening = opening + reading first IFD + jumping to desired IFD and reading that one). Chances are this overhead is quite big compared to the decoding of a tile...

** How, hypothetically, another LibTiff design could do better **

Another LibTiff design could have been to not use the same object for global file handling, particular IFD handling, and particular thread handling all at once. Instead, a TIFF codec library could be designed with dedicated objects for each of these. A file object could support 'spawning' any number of IFD objects, and an IFD object could support 'spawning' any number of tile objects. The quadruple-processor scheme above could then actually work quite well.
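To make the idea concrete, here is a purely hypothetical sketch, assuming POSIX threads. None of these types or functions exist in LibTiff. The file and IFD objects are read-only once created and so can be shared freely, while each thread gets its own tile object for its private state. 'Decoding' is just a byte sum here, as a stand-in for real decompression.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TILE_COUNT 4

/* Hypothetical object split; not LibTiff types. */
typedef struct { const unsigned char *data; size_t size; } file_obj;   /* shared, read-only */
typedef struct { const file_obj *file; size_t tile_bytes; } ifd_obj;   /* shared, read-only */
typedef struct { const ifd_obj *ifd; size_t index; long decoded; } tile_obj; /* one per thread */

/* "Spawns" an IFD object from a file object. Done once, not per thread.
   Bytes beyond a multiple of TILE_COUNT are ignored in this sketch. */
ifd_obj file_spawn_ifd(const file_obj *file)
{
    ifd_obj ifd = { file, file->size / TILE_COUNT };
    return ifd;
}

/* "Spawns" a tile object from an IFD object. Cheap: no file access. */
tile_obj ifd_spawn_tile(const ifd_obj *ifd, size_t index)
{
    assert(index < TILE_COUNT);
    tile_obj tile = { ifd, index, 0 };
    return tile;
}

/* Stand-in for decoding: sums the tile's bytes. It reads shared objects
   but writes only to its own tile object. */
static void *tile_decode(void *p)
{
    tile_obj *tile = p;
    const ifd_obj *ifd = tile->ifd;
    const unsigned char *src = ifd->file->data + tile->index * ifd->tile_bytes;

    for (size_t i = 0; i < ifd->tile_bytes; i++)
        tile->decoded += src[i];
    return NULL;
}

/* One thread per tile, all sharing one file object and one IFD object. */
long decode_all_tiles(const unsigned char *data, size_t size)
{
    file_obj file = { data, size };
    ifd_obj ifd = file_spawn_ifd(&file);
    tile_obj tiles[TILE_COUNT];
    pthread_t threads[TILE_COUNT];
    int started[TILE_COUNT];
    long total = 0;

    for (size_t i = 0; i < TILE_COUNT; i++) {
        tiles[i] = ifd_spawn_tile(&ifd, i);
        started[i] = pthread_create(&threads[i], NULL, tile_decode, &tiles[i]) == 0;
        if (!started[i])
            tile_decode(&tiles[i]); /* fall back to decoding inline */
    }
    for (size_t i = 0; i < TILE_COUNT; i++) {
        if (started[i])
            pthread_join(threads[i], NULL);
        total += tiles[i].decoded;
    }
    return total;
}
```

The point of the split is that the expensive part (opening the file, reading the IFD) happens once, and only the cheap tile objects are multiplied per thread.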

** Bottom line **

There is no single short answer to this question, because there is no such thing as a simple bilevel thread-safeness concept. Instead, both the application design and the LibTiff design should be analyzed to see if and how things work, and whether some extra synchronization is needed, or a little detour such as using multiple TIFF * on the same file. In other words: an application-specific discussion will be needed; there is no universal answer.

Joris Van Damme
info@awaresystems.be
http://www.awaresystems.be