2008.01.24 05:23 "[Tiff] TIffcrop might be called tiffextract", by Richard Nolde

Andy,
In truth it ought to be called Tiffmogrify since it can rotate, flip,
crop, and extract all in one pass. For bilevel images it can also
invert the bits. I would have to check the source to be sure, but I
believe that you can have at least 6 separate zones per image. The
number could easily be increased with a recompile. Since it is derived
from tiffcp, it works with images arranged as tiles or scanlines and
its selection of images within a multipage set is much more flexible
than tiffcp. I think I did drop the ability to subtract a bias image
from each page when I added the other items. The only area in which I
know it to be deficient is in extracting multiple vertical zones, e.g. -E
left or right instead of top or bottom. The regions are extracted
correctly, but when they are written to the output file, they are
aligned one below the other instead of one beside the other. I know how
to fix this but I haven't had the time. I also want to try some
alternative algorithms for processing non-byte aligned selections from
single bit images to see if they would be any faster. When I get the
time to do this, I will fix the vertical slice problem as well.
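
To illustrate what a non-byte-aligned selection from a single-bit image involves, here is a minimal sketch in C. It is not the code used in tiffcrop; the function name extract_bits is hypothetical, and it assumes pixels are packed MSB-first within each byte (TIFF FillOrder 1). Each output byte is assembled from two neighbouring source bytes with a shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy nbits pixels, starting at pixel column start, from a packed
 * 1-bit-per-pixel row into dst so that the first copied pixel lands in
 * the most significant bit of dst[0]. Assumes MSB-first packing.
 * dst must hold (nbits + 7) / 8 bytes. Hypothetical helper. */
static void extract_bits(const uint8_t *src, size_t start, size_t nbits,
                         uint8_t *dst)
{
    assert(nbits > 0);

    size_t out_bytes = (nbits + 7) / 8;
    size_t first = start / 8;               /* first source byte used */
    size_t last = (start + nbits - 1) / 8;  /* last source byte used */
    unsigned shift = (unsigned)(start % 8); /* bit offset inside first byte */

    for (size_t i = 0; i < out_bytes; i++) {
        uint8_t hi = src[first + i];
        /* Only touch the following byte if it is needed and in range. */
        uint8_t lo = (shift != 0 && first + i + 1 <= last)
                         ? src[first + i + 1] : 0;
        dst[i] = (uint8_t)((hi << shift) | (lo >> (8 - shift)));
    }

    /* Zero the unused padding bits in the final output byte. */
    unsigned tail = (unsigned)(nbits % 8);
    if (tail != 0)
        dst[out_bytes - 1] &= (uint8_t)(0xFF << (8 - tail));
}
```

When start is a multiple of 8 the shift is zero and the loop degenerates to a plain byte copy, which is why byte-aligned selections are the cheap case.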

My code128 reader uses the zone and rotation features that I wrote for Tiffcrop to find the bar codes, regardless of which orientation the page was loaded into the scanner. I print the bar codes on documents assembled from data in our database, using an enhanced version of GNU barcode that I wrote (also freely available to anyone who is interested). I don't actually crop the image; I just calculate the boundaries of the regions of interest and confine my searches to the appropriate portions of the buffer by pointer math. The reader can take specific actions based on command line arguments for patterns that indicate valid codes to include, exclude, or use as separators. This guides the generation of multiple separate files from the stack of pages in a single pass. We store the processed files in PDF format, so I convert them to PDF after they are indexed in the database. Originally I had hoped to incorporate some of the logic of tiff2pdf into the utility so as to do it in one pass, but a study of Ross Finlayson's code convinced me that it was a bit more than I wanted to do if I needed to support more than the G3/G4 compressed TIFFs we use. At some point I might have a go at adding PDF export support to the bar code reader, since that only involves bilevel images and I could ignore many of the color related issues.
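
As a rough illustration of confining a search to a region of the page buffer by pointer math rather than cropping, here is a small C sketch. It is not the reader's actual code: struct zone and count_set_pixels are hypothetical names, the buffer is assumed to be packed 1 bit per pixel, MSB-first, and whether a set bit means black or white depends on the image's PhotometricInterpretation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a region of interest, in pixels. */
struct zone {
    size_t x, y;          /* top-left corner */
    size_t width, height; /* extent */
};

/* Count the set bits inside a zone by walking the full page buffer in
 * place. stride is the number of bytes per scanline. No pixels are
 * copied; each scanline of the zone is reached by pointer arithmetic. */
static size_t count_set_pixels(const uint8_t *page, size_t stride,
                               const struct zone *z)
{
    assert(z->x + z->width <= stride * 8); /* zone must fit in a row */

    size_t count = 0;
    for (size_t row = 0; row < z->height; row++) {
        /* Start of the scanline that contains this row of the zone. */
        const uint8_t *line = page + (z->y + row) * stride;
        for (size_t col = z->x; col < z->x + z->width; col++) {
            if (line[col / 8] & (0x80u >> (col % 8)))
                count++;
        }
    }
    return count;
}
```

A real bar code search would replace the counting loop with run-length analysis of the bars, but the addressing of the zone would look much the same.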

If the maintainers are willing to accept a name change, I am open to
suggestions. Tiffcrop, tiffmogr, tiffknife, tiffcarve, tiffhack?

Most high volume copier/scanners create only one strip per image for
G3/G4 compressed images, but I read one scanline at a time. For my
application, which involves files of 250 to 300 pages of mixed letter
and legal size, it is OK to read an entire image (page) at once, but if
you had huge images, you might want to read only the scanlines of the
original that you want to process and write as you go. My need is for
quick rotations for scanning the data looking for barcodes so it would
not be efficient to read it multiple times in different orders to get at
different parts of the file. I have to write the entire page out in the
orientation that shows the bar code and text correctly regardless of how
it was loaded into the scanner even if I am only searching through part
of it. Also any rotation by 90 or 270 degrees with G3/G4 data is going
to require the whole image to be in memory to avoid massive thrashing of
the disk as you read a column out of each scanline. In a VERY memory
constrained system, you might read 8 scanlines at a time and write out
to a full sized buffer for 90/270 rotations, but I'm not that compulsive
and the chances of getting it wrong are too great.
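
To show why a 90 degree rotation wants the whole page in memory, here is a minimal C sketch of a clockwise rotation of a decoded bilevel page. It is an illustration, not the tiffcrop code: rotate90_cw is a hypothetical name, and the page is assumed to be already decompressed into a buffer packed 1 bit per pixel, MSB-first, with rows padded to a byte boundary. Every output row takes one pixel from every source scanline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Rotate a w x h packed 1-bit image 90 degrees clockwise.
 * Returns a newly allocated buffer holding an image h pixels wide and
 * w rows high, and stores its bytes-per-row in *out_stride.
 * The caller frees the result. Returns NULL if allocation fails. */
static uint8_t *rotate90_cw(const uint8_t *src, size_t w, size_t h,
                            size_t *out_stride)
{
    assert(w > 0 && h > 0);

    size_t src_stride = (w + 7) / 8;
    size_t dst_stride = (h + 7) / 8;
    uint8_t *dst = calloc(w, dst_stride); /* w output rows, zero filled */
    if (dst == NULL)
        return NULL;

    for (size_t y = 0; y < h; y++) {
        const uint8_t *line = src + y * src_stride;
        for (size_t x = 0; x < w; x++) {
            if (line[x / 8] & (0x80u >> (x % 8))) {
                size_t dx = h - 1 - y; /* output column */
                size_t dy = x;         /* output row */
                dst[dy * dst_stride + dx / 8] |= (uint8_t)(0x80u >> (dx % 8));
            }
        }
    }
    *out_stride = dst_stride;
    return dst;
}
```

Walking the source in scanline order keeps the reads sequential; the writes are the scattered side, which is tolerable in memory but would thrash badly if the destination were on disk.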

Richard

Andy Cave wrote:
> Hi Richard,
>
> Thanks for the detailed reply - it is interesting to read - I didn't know that tiffcrop could do all that. You should call it tiffextract really, not tiffcrop.
>
> BTW you can do random seeks in files with G3/G4 compressed data, but only to band boundaries. We do this in our FirstPROOF software. I can't remember without looking whether this is with the scanline interface or not, though.
>
> FYI I recently added a barcode reader to FirstPROOF (we support reading & checking of EAN/UPC-A barcodes - at any zoom size (but clearly you need a 'space' between the 'bars') - including calculating the BWR factor and size). It was really fun investigating the subject of barcodes and implementing a reader, never having looked at them before.