TIFF and LibTiff Mailing List Archive
2003.11.20 05:16 "tiffcp analysis", by Ross Finlayson
I figure I should analyze tiffcp. After all, if I am complaining about
it crashing, I should fix it. Recently I noticed that it crashes
sometimes when compressing JPEG, or reading JPEG images, for example
tiffcp -w 128 -l 128 -c jpeg quad-lzw.tif quad-tile-jpeg.tif.
Also, I figure I can add the OJPEG to JPEG conversion into it so that if
it gets an OJPEG file as input, it could write a new JPEG file as
output, in some far-fetched ideal scenario.
Also, we could see how to copy all the tags without having to type them
in a list and manually update it as tag definitions change, as the tag
definitions are stored in the TIFF with a corresponding TIFFFieldInfo
struct.
Also, it could use raw conversion, for example for merely appending
TIFFs instead of changing the structural details.
The entry point to tiffcp functionality is the main function. A variety
of its variables are stored in static variables. I think it would be
better to have a context or state struct with the variables, and pass it
to all the functions, because it would make it easier to use tiffcp
functions as library functions.
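As a rough sketch of that context struct idea: the names CpContext,
cp_context_init, and cp_wants_tiles are made up for illustration, and
the fields are only my guess at the kind of settings tiffcp keeps in
statics, not a list taken from tiffcp.c.

```c
#include <stdint.h>

/* Hypothetical context struct: one instance per copy job, instead of
 * file-scope statics shared by every caller. */
typedef struct {
    uint16_t config;        /* requested planar configuration, 0 = keep input's */
    uint16_t compression;   /* requested compression scheme, 0 = keep input's */
    uint16_t fillorder;     /* requested fill order, 0 = keep input's */
    uint32_t rowsperstrip;  /* 0 = no explicit request */
    uint32_t tilewidth;     /* 0 = tiles not requested */
    uint32_t tilelength;    /* 0 = tiles not requested */
} CpContext;

/* Reset a context to "copy everything as found in the input". */
static void cp_context_init(CpContext *ctx)
{
    ctx->config = 0;
    ctx->compression = 0;
    ctx->fillorder = 0;
    ctx->rowsperstrip = 0;
    ctx->tilewidth = 0;
    ctx->tilelength = 0;
}

/* Functions take the context explicitly, so two copy jobs with
 * different settings never touch each other's state. */
static int cp_wants_tiles(const CpContext *ctx)
{
    return ctx->tilewidth != 0 && ctx->tilelength != 0;
}
```

With something like this, each caller (or each thread) owns its own
CpContext and hands it down through every function it calls.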
The main function processes its arguments.
Here is the list of the functions besides the macros, usage, and main:
tiffcp()
processG3Options()
processCompressOptions()
pickCopyFunc()
openSrcImage()
nextSrcImage()
cpTag()
cpStripToTile()
cpSeparateBufToContigBuf()
cpImage()
cpContigBufToSeparateBuf()
The tiffcp(TIFF*, TIFF*) function accepts two open TIFF files, in and
out, for reading and writing, and copies the tags and data from in to out.
Back to main, main opens with variables for in and out, getopt
variables, tile and strip rows parameters, a directory offset, mode
array and pointer alias, and it thus progresses.
The b option specifies a bias image to be subtracted from the input.
That is the image processing option; the other options have to do with
the output file configuration: tile and strip configuration, planar
configuration, and compression.
The last argument of the argv argument to main is opened as out, the
output TIFF for writing. This is somewhat dangerous as something
accidental like tiffcp *.tif would copy all the files matching
specification over the last file matched by the specification, as I know
from overwriting a file. UNIX commands are explicit in their usage and
not necessarily designed to protect the users from their actions, and in
this case the file specification is done by the command shell, so glob
couldn't be monitored to warn or ask for confirmation of an errant
wildcard match.
For each of the file name arguments prior to the last one, openSrcImage
is called with that char*, and the TIFF it returns is assigned to in,
the input TIFF file. openSrcImage takes a char** argument and may call
nextSrcImage to set a directory, using the syntax in the argument that
selects a directory. If the specification is not parsed, nextSrcImage
calls exit(-4), which it notes as a syntax error, instead of the more
generic exit(EXIT_FAILURE).
So in has been opened for reading and set to a directory. Then there is
an endless for loop where functions for config, compression, fillorder,
rowsperstrip, tilewidth, tilelength, and g3opts are called, then the
tiffcp function is called, then the next directory is set, or if there
are no more, the loop breaks or returns, the input file is closed, and
the next file is opened through openSrcImage.
I don't see where it calls TIFFClose(out), it just exit(0)'s. It calls
TIFFClose(out) if tiffcp(in, out) fails, and exit(1).
Then after the main function a couple of functions are defined to
process the input arguments. Then a usage function is implemented. I
don't know if usage is always supposed to be called usage and be static
and void so that an external program can load the object image and call
the usage function without calling main. I haven't seen any usage of
that type of usage. The usage function calls TIFFGetVersion; invoking it
would probably cause the libtiff shared or static object to be loaded,
or loaded and linked, or whatever it is that a.out, or ELF, or PE/COFF,
Mach-O, or what have you, do.
Then, there are the CopyField macros, with variously one, two, three, or
four arguments. They resolve to "if (TIFFGetField(...))
TIFFSetField(...)". When the tag and types of arguments are known, they
are handy.
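To show the shape of that get-then-set macro without dragging libtiff
in, here is a toy version. ToyDir, toy_get, toy_set, COPY_FIELD, and
toy_copy_all are all stand-ins I made up; the real macros wrap
TIFFGetField and TIFFSetField and take the tag's value variables as
arguments.

```c
#include <stdint.h>

enum { TOY_NTAGS = 4 };            /* hypothetical tag ids 0..3 */

/* Toy "directory": a presence flag and one 32-bit value per tag. */
typedef struct {
    int      present[TOY_NTAGS];
    uint32_t value[TOY_NTAGS];
} ToyDir;

/* Returns 1 and fills *v only if the tag is set, like a field getter. */
static int toy_get(const ToyDir *d, int tag, uint32_t *v)
{
    if (tag < 0 || tag >= TOY_NTAGS || !d->present[tag])
        return 0;
    *v = d->value[tag];
    return 1;
}

static int toy_set(ToyDir *d, int tag, uint32_t v)
{
    if (tag < 0 || tag >= TOY_NTAGS)
        return 0;
    d->present[tag] = 1;
    d->value[tag] = v;
    return 1;
}

/* Copy a field only when the input actually has it. */
#define COPY_FIELD(in, out, tag, v) \
    do { if (toy_get((in), (tag), &(v))) toy_set((out), (tag), (v)); } while (0)

static void toy_copy_all(const ToyDir *in, ToyDir *out)
{
    uint32_t v = 0;   /* working variable shared by every COPY_FIELD use */
    int tag;
    for (tag = 0; tag < TOY_NTAGS; tag++)
        COPY_FIELD(in, out, tag, v);
}
```

Tags missing from the input stay missing in the output, which is the
whole point of guarding the set with the get.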
The next function is cpTag, presumably for "copy tag". Based on a data
type argument, there is a switch statement on the data type to call the
CopyField macros on temporary or working variables named after the
variable type with v for value, eg floatv, or av for array value, eg
floatav. I think the way it uses the variable represents about as
standard as libtiff field definitions get, considering we are using C: a
typed language where everything is an int.
After the cpTag function is a static array of cpTag structs called
tags. Its elements are a tag, like TIFFTAG_IMAGEWIDTH, the number of
values and the type of values for the tag, where the number, or rather,
count, is an integer and the type is one of the libtiff definitions of
the enumeration of the TIFF field types as specified in TIFF 6.0,
extended from unsigned ints and ASCII strings of TIFF 5.0, ahhermmm.
The static array does not end with a NULL entry. Instead a macro NTAGS
is defined that divides the size of the array by the size of one struct,
so that entries are readily added to the list without specifying the
count of the specified tags.
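The NTAGS idiom looks roughly like this. TagEntry and
count_tags_of_type are names I made up, and the three entries are only
examples; the numeric tag and type codes are the ones from TIFF 6.0.

```c
#include <stddef.h>

/* Stand-in for the tag table entry: tag id, value count, field type. */
typedef struct {
    unsigned short tag;
    unsigned short count;
    int            type;   /* TIFF 6.0 type code: 2 ASCII, 3 SHORT, 4 LONG */
} TagEntry;

static const TagEntry tags[] = {
    { 254, 1, 4 },  /* NewSubfileType, LONG */
    { 274, 1, 3 },  /* Orientation, SHORT */
    { 269, 1, 2 },  /* DocumentName, ASCII */
};

/* Entry count computed by the compiler, so adding a row needs no
 * separate counter and no terminating sentinel entry. */
#define NTAGS (sizeof(tags) / sizeof(tags[0]))

/* Walk the table using NTAGS as the loop bound. */
static size_t count_tags_of_type(int type)
{
    size_t i, n = 0;
    for (i = 0; i < NTAGS; i++)
        if (tags[i].type == type)
            n++;
    return n;
}
```

The trick only works where tags is visible as an array; once it decays
to a pointer, sizeof no longer sees the whole table.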
This leads me into a brief aside about a preprocessor extension I would
like to see: given a string literal "s", it expands to "s", strlen("s"),
except that it runs strlen itself, giving "s", 1, for something like
write(str("s"), out) expanding to write("s", 1, 1, out). Anyways.
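For string literals, standard C can get close to this with sizeof,
which the compiler evaluates without any call to strlen. The macro names
STR_LEN and STR_ARGS and the function write_greeting are made up, and
the argument order assumes fwrite as the write function.

```c
#include <stdio.h>

/* Length of a string literal without its terminating NUL. Only valid
 * for literals and char arrays, not for char pointers. */
#define STR_LEN(s) (sizeof(s) - 1)

/* Expands to three fwrite arguments: pointer, element size 1, count. */
#define STR_ARGS(s) (s), 1, STR_LEN(s)

/* Writes the six bytes of "hello\n" and returns the number written. */
static size_t write_greeting(FILE *out)
{
    return fwrite(STR_ARGS("hello\n"), out);
}
```

Passing a char* instead of a literal compiles but silently gives the
size of the pointer minus one, so it is not a full substitute for the
extension.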
Where the CopyField macros defined above work within the cpTag function,
the CopyTag macro defined next calls cpTag with in and out.
Then, the tiffcp function is implemented. It copies image width,
length, bits per sample, and samples per pixel. Then, it copies
compression, and depending on the compression scheme copies the tags
that go with that scheme. I'm not quite sure yet, but I think there
might be better orders to copy some of the tags, given how the internal
logic of TIFFGetField and TIFFSetField must handle a variety of tag
specifications and their implications.
The tiffcp function then copies over to the output photometric
interpretation, fill order as specified, orientation, rows per strip,
planar configuration as specified, transfer function, colormap,
compression scheme related tags, ICC profile, and a page number.
Then, a copy function is returned from the pickCopyFunc function, chosen
variously from the input, output, length, width, and samples per pixel.
There are then macros defined to construct the cpFunc, readFunc, and
writeFunc.
/*
* Contig -> contig by scanline for rows/strip change.
*/
/*
* Contig -> contig by scanline while subtracting a bias image.
*/
/*
* Strip -> strip for change in encoding.
*/
/*
* Separate -> separate by row for rows/strip change.
*/
/*
* Contig -> separate by row.
*/
/*
* Separate -> contig by row.
*/
static void
cpStripToTile(uint8* out, uint8* in,
static void
cpContigBufToSeparateBuf(uint8* out, uint8* in,
static void
cpSeparateBufToContigBuf(uint8* out, uint8* in,
static int
cpImage(TIFF* in, TIFF* out, readFunc fin, writeFunc fout,
Those are the copy functions; next come the defined read and write
functions. The cpImage function takes a read function and a write
function as arguments: it allocates a buffer to hold the complete image,
fills it with the read function, and writes it to the output with the
write function.
DECLAREreadFunc(readContigStripsIntoBuffer)
DECLAREreadFunc(readSeparateStripsIntoBuffer)
DECLAREreadFunc(readContigTilesIntoBuffer)
DECLAREreadFunc(readSeparateTilesIntoBuffer)
DECLAREwriteFunc(writeBufferToContigStrips)
DECLAREwriteFunc(writeBufferToSeparateStrips)
DECLAREwriteFunc(writeBufferToContigTiles)
DECLAREwriteFunc(writeBufferToSeparateTiles)
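The cpImage pattern described above, boiled down to a toy that runs
without libtiff. ToyImage, ToyReadFunc, ToyWriteFunc, and the toy_*
functions are made-up stand-ins; the real read and write functions work
on TIFF handles and take image dimensions rather than a byte count.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy "image": just a block of bytes. */
typedef struct {
    uint8_t *data;
    size_t   size;
} ToyImage;

typedef int (*ToyReadFunc)(const ToyImage *in, uint8_t *buf, size_t size);
typedef int (*ToyWriteFunc)(ToyImage *out, const uint8_t *buf, size_t size);

/* Read the whole input into buf; fails if the input is too small. */
static int toy_read_all(const ToyImage *in, uint8_t *buf, size_t size)
{
    if (in->size < size)
        return 0;
    memcpy(buf, in->data, size);
    return 1;
}

/* Write buf to the output; fails if the output is too small. */
static int toy_write_all(ToyImage *out, const uint8_t *buf, size_t size)
{
    if (out->size < size)
        return 0;
    memcpy(out->data, buf, size);
    return 1;
}

/* Allocate one buffer for the complete image, fill it through fin,
 * drain it through fout, and free it on every path. */
static int toy_cp_image(const ToyImage *in, ToyImage *out,
                        ToyReadFunc fin, ToyWriteFunc fout, size_t size)
{
    int ok = 0;
    uint8_t *buf = (uint8_t *)malloc(size);
    if (buf == NULL)
        return 0;
    if (fin(in, buf, size))
        ok = fout(out, buf, size);
    free(buf);
    return ok;
}
```

Because the buffer sits between the two callbacks, any reader can be
paired with any writer, at the cost of holding the whole image in
memory.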
Then there are some more copy functions.
/*
* Contig strips -> contig tiles.
*/
DECLAREcpFunc(cpContigStrips2ContigTiles)
DECLAREcpFunc(cpContigStrips2SeparateTiles)
DECLAREcpFunc(cpSeparateStrips2ContigTiles)
DECLAREcpFunc(cpSeparateStrips2SeparateTiles)
DECLAREcpFunc(cpContigTiles2ContigTiles)
DECLAREcpFunc(cpContigTiles2SeparateTiles)
DECLAREcpFunc(cpSeparateTiles2ContigTiles)
DECLAREcpFunc(cpSeparateTiles2SeparateTiles)
DECLAREcpFunc(cpContigTiles2ContigStrips)
DECLAREcpFunc(cpContigTiles2SeparateStrips)
DECLAREcpFunc(cpSeparateTiles2ContigStrips)
DECLAREcpFunc(cpSeparateTiles2SeparateStrips)
Largely the copy functions implement the combinations of read and write
functions to assimilate and express the configuration of strips and
tiles, and separate planes of data and interleaved contiguous planes of
data.
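For the contiguous versus separate distinction, here is a simplified
sketch assuming 8-bit samples and a whole buffer converted at once. The
function names are made up, and the real cpContigBufToSeparateBuf and
cpSeparateBufToContigBuf also deal with row skew, which this ignores.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleaved samples (RGBRGB...) -> one plane per sample (RR..GG..BB..). */
static void contig_to_separate(uint8_t *out, const uint8_t *in,
                               size_t npixels, size_t spp)
{
    size_t p, s;
    for (s = 0; s < spp; s++)
        for (p = 0; p < npixels; p++)
            out[s * npixels + p] = in[p * spp + s];
}

/* One plane per sample -> interleaved samples; the inverse of the above. */
static void separate_to_contig(uint8_t *out, const uint8_t *in,
                               size_t npixels, size_t spp)
{
    size_t p, s;
    for (s = 0; s < spp; s++)
        for (p = 0; p < npixels; p++)
            out[p * spp + s] = in[s * npixels + p];
}
```

Two RGB pixels stored as 1 2 3 4 5 6 become the planes 1 4, 2 5, and
3 6, and converting back restores the original order.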
Then, there is implemented the pickCopyFunc function which returns a
function pointer of one of those cp functions prototyped as a copyFunc.
The pickCopyFunc gets fields from the input and output TIFFs and rejects
unsupported cases. Then, there is some kind of truth table
implementation, some kind of poor man's finite state automaton, that
evaluates to a big switch statement returning the appropriate copyFunc
for the inputs and specifications.
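The truth table can be pictured as packing a few yes/no facts into an
integer and switching on it. This is a cut-down sketch with only two
flags; CopyKind, PACK, and pick_copy_kind are made-up names, the 8-bit
restriction is an invented example of an unsupported case, and the real
function packs more inputs and returns a function pointer.

```c
typedef enum {
    COPY_STRIPS_TO_STRIPS,
    COPY_STRIPS_TO_TILES,
    COPY_TILES_TO_STRIPS,
    COPY_TILES_TO_TILES,
    COPY_UNSUPPORTED
} CopyKind;

/* Pack two 0/1 flags into one small integer usable as a case label. */
#define PACK(in_tiled, out_tiled) (((in_tiled) << 1) | (out_tiled))

static CopyKind pick_copy_kind(int in_tiled, int out_tiled,
                               int bits_per_sample)
{
    /* Reject what the toy copier cannot handle before the table lookup. */
    if (bits_per_sample != 8)
        return COPY_UNSUPPORTED;

    /* Normalize the flags to 0 or 1 so they match the case labels. */
    switch (PACK(in_tiled ? 1 : 0, out_tiled ? 1 : 0)) {
    case PACK(0, 0): return COPY_STRIPS_TO_STRIPS;
    case PACK(0, 1): return COPY_STRIPS_TO_TILES;
    case PACK(1, 0): return COPY_TILES_TO_STRIPS;
    case PACK(1, 1): return COPY_TILES_TO_TILES;
    }
    return COPY_UNSUPPORTED;
}
```

Adding a flag, say for planar configuration on each side, doubles the
number of case labels each time, which is how the switch gets big.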
I see a few trivial things to modify that don't affect the structure at
all; not all the functions are prototyped using the macros, for example.
Another issue is the fields that are copied: we want to extend libtiff
so that tiffcp handles all the fields without modification as the
library is extended.
Another issue is structural reorganization of the functions and some of
their arguments to reduce or completely remove static variables and
enable the tiffcp function to be more readily called from a stable
system program.
Another point would be to increase diagnostics, error and warning
reporting, or even variable reporting around a static variable, or
variable of the state struct. By this point, I'm past having any
motivation to do it myself but I'm quite happy to offer direction.
That's not entirely true, I'd be happy to do it but just don't feel like
it right now.
Another thing is I had this concept about copying OJPEG into new JPEG
files, and also about copying JPEG data without decompressing it. JPEG
data is mostly lossy data, and decompression and recompression almost
always introduce (more) visual artifacts into the image.
There could be matched read and write functions which accomplish this, I
think, I'm not quite sure yet, something to consider.
I don't see the function handling subdirectories generally, per TIFF
Tech Note 1. Subdirectories are often used to store reduced image and
image mask information, and are used in various high profiles as well.
Overall, I think it's sharp but I want to put all the static variables
in a struct passed to each function so it could be compiled into a
library and used in a safe multi-threaded fashion. Then, I guess I will
start trying to neaten it and make its organization consistent, so that
more general ideas such as basic image processing could be readily
implemented. Someone might want to deskew their images, or otherwise
filter the data between read and write, as the bias mask example does,
almost tortuously.
The point is not to make life difficult, it's to make it easy.
The basic function that could be exposed would be the copy directory.
It would copy a directory and its subdirectories, where libtiff has
functional support, so I hear, for one level of subdirectories. It
would copy all the configured and extension tags via their TIFFFieldInfo
or special case handling for legacy tags.
Ross F.