2008.08.22 16:27 "Re: [Tiff] creating sparse files......", by Rogier Wolff
On Fri, Aug 22, 2008 at 09:44:53AM -0500, Bob Friesenhahn wrote:
> > I'm stitching kind of large panoramas. This results in big
> > intermediate files. On my last run, which took overnight to stitch,
> > I thought 42 Gb of free disk space would be enough. Wrong! I
> > got over thirty files of over 1.2 Gbytes, filling up the disk.
> >
> > It turns out that most of the files contain lots of zeroes. On Unix
> > this can be stored efficiently by not issuing a "write" with a buffer
> > full of zeroes, but by seeking over the area. The operating system
> > will act as if the area was filled with zeroes.
> This is an interesting issue. While holey files seem like a panacea, there can be some drawbacks. They are best for files which are written just once (like core files) and not so good for files which are expected to be updated in place. For files which are updated in place, the updated hole is quite likely to increase disk fragmentation since now it takes more space and the space will need to be from some other place on disk.
Right. But my modification kicks in if explicit zeroes are written.
libtiff is a library that is commonly used by programs that need TIFF input/output. How many of those are going to explicitly write large amounts of zeroes (you need at least 4k before the OS will leave a block open!) and then edit the TIFF file in place?
My guess is that even an editor like "gimp" just opens and writes the file as it goes. So, do you expect any realistic user to be bitten by this?
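
To make the idea concrete, here is a rough sketch of the "seek instead of write" trick. It is Python for brevity, not the actual libtiff change, and the names `write_sparse`, `write_all` and `BLOCK` are made up for this illustration. The 4096-byte block size is an assumption; the real value depends on the filesystem, and a hole only appears where a skipped block happens to line up with a filesystem block.

```python
import os

BLOCK = 4096  # assumed block size; the real value depends on the filesystem


def write_all(fd, chunk):
    """Write the whole chunk, retrying after short writes."""
    while chunk:
        written = os.write(fd, chunk)
        chunk = chunk[written:]


def write_sparse(fd, data, block=BLOCK):
    """Write data at the current offset of fd, seeking over all-zero blocks.

    Only safe for files that are written once from start to end: seeking
    over a region that already holds data would leave the old data in place.
    """
    zero = bytes(block)
    for start in range(0, len(data), block):
        chunk = data[start:start + block]
        if chunk == zero:
            # A full block of explicit zeroes: skip it instead of writing it.
            os.lseek(fd, block, os.SEEK_CUR)
        else:
            write_all(fd, chunk)
    # If the data ended in a skipped block, the file is still too short;
    # extend it (never shrink it) so the trailing zeroes are not lost.
    pos = os.lseek(fd, 0, os.SEEK_CUR)
    if pos > os.fstat(fd).st_size:
        os.ftruncate(fd, pos)
```

Reading the file back gives exactly the bytes that were passed in, whether or not the filesystem actually left holes. A buffer shorter than one block is always written normally, which matches the "at least 4k" remark above.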
> Fragmentation behavior is quite filesystem dependent.
>
> It is worth considering enabling filesystem compression, or using whole-file compression. Perhaps even just enabling normal TIFF compression (e.g. LZW) is sufficient to eliminate the long spans of zeros.
Yes, enabling compression should work wonders. Somehow I'm stuck with an application suite which suddenly lost the option to pass the compression flags around.
Given that this is the case, having libtiff create sparse files for files that ARE sparse seems like something that is useful in general. For example, in the application suite I'm talking about (hugin/panotools), someone might have decided that for the short-lived temp files, compression and decompression waste CPU cycles.
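
For anyone who wants to check whether a file really ended up sparse, a small sketch along these lines should do on Linux. The helper names `make_holey_file` and `disk_usage` are invented for this example, and it relies on `st_blocks` being counted in 512-byte units, which is what the Python documentation states but which is not available on every platform.

```python
import os


def make_holey_file(path, hole=1 << 20):
    """Create a file with a hole by seeking over `hole` bytes between two writes."""
    with open(path, "wb") as f:
        f.write(b"head")
        f.seek(hole, os.SEEK_CUR)  # skip ahead without writing anything
        f.write(b"tail")


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for path."""
    st = os.stat(path)
    # st_blocks is documented as a count of 512-byte blocks (Unix only).
    return st.st_size, st.st_blocks * 512
```

Comparing the two numbers from `disk_usage` shows how much space the holes saved. On a filesystem that does not support holes the two values will simply be about equal, so the result depends on where the file lives.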
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
Does it sit on the couch all day? Is it unemployed? Please be specific!
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ