TIFF and LibTiff Mailing List Archive
2008.08.22 15:45 "Re: creating sparse files......", by Rogier Wolff

On Fri, Aug 22, 2008 at 10:11:42AM -0300, Toby Thain wrote:
> >static int isallzero (tdata_t buf, tsize_t size)
> >{
> > /* tdata_t is void *, so index through a byte pointer */
> > const unsigned char *p = buf;
> > tsize_t i;
> > for (i=0;i<size;i++)
> > if (p[i]) return 0;
> > return 1;
> >}
>
>
> This appears unnecessarily inefficient. It's going to be much cheaper
> to test whole longwords (or whatever data alignment will allow) than
> byte by byte. (I wonder if something clever can be done with MMX/SSE
> on newer chips?)
Yes, this is the "quick and dirty" implementation of this function.
The advantage of keeping it this simple is that it will ALWAYS work.
No stupid bugs that if the only nonzero byte is in the partial long
just beyond the
Keep in mind that when this runs, the CPU will be getting whole cache
lines from memory. Once the cache has been filled, the CPU runs at, say,
3 GHz, or three cycles per ns, and it will likely run the loop three times
a nanosecond as well. The memory fetching that needs to go on is
likely to cost on the order of tens of nanoseconds per cache line. DDR
still runs at 200 MHz, and you have a two-cycle latency for reads,
right? That's around 10 ns. At that point you can START to receive
data from memory. The whole cache line then takes a few more
cycles....
Anyway, my philosophy is always FIRST get it to work, optimize later.
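
For reference, a minimal sketch of the word-at-a-time check suggested in the quoted text, written so that a nonzero byte in the trailing partial word is not missed. The name isallzero_words is hypothetical and the function is not part of libtiff. It assumes a C99 compiler with a 64-bit integer type and has not been benchmarked against the byte loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if all `size` bytes at `buf` are zero, 0 otherwise.
 * Hypothetical helper name; not a libtiff function. */
static int isallzero_words(const void *buf, size_t size)
{
    const unsigned char *p = buf;
    uint64_t word;

    /* Whole 8-byte words first. memcpy sidesteps alignment and
     * aliasing problems; compilers usually reduce it to one load. */
    while (size >= sizeof word) {
        memcpy(&word, p, sizeof word);
        if (word != 0)
            return 0;
        p += sizeof word;
        size -= sizeof word;
    }

    /* The partial word at the end is checked byte by byte, so a
     * lone nonzero byte in the tail is still detected. */
    while (size > 0) {
        if (*p != 0)
            return 0;
        p++;
        size--;
    }
    return 1;
}
```

Whether this beats the byte loop in practice depends on how much of the time goes to waiting for memory, which is the point made above.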
> >static tsize_t
> >_tiffWriteProc(thandle_t fd, tdata_t buf, tsize_t size)
> >{
> > if (isallzero (buf, size))
> > /* lseek returns the new offset; report size as written */
> > return (lseek ((int) fd, (off_t)size, SEEK_CUR) == (off_t) -1
> > ? (tsize_t) -1 : size);
> > else
> > return ((tsize_t) write((int) fd, buf, (size_t) size));
> >}
>
>
> This penalises all writes to optimise a very special case. Not saying
> it can't be rationalised, but pros and cons can be debated.
And you're calling this an optimization. And I agree. However, in my
case saving a factor of 21 on disk space is, I'd say, "worth it".
I work in data recovery. I modified our "read-the-data-from-the-disk"
program to check for zeroes. Of course it got a command-line flag
that allowed me to turn it off for disks that didn't have many zeroes,
so that things would go faster. Bad move. Or at least unnecessary in
hindsight. Performance is NOT influenced by the check for zeroes, and
the gain by not writing the zeroes is always noticeable.
But, yes, officially it could cost you some CPU cycles, which are not
returned by saving on writes. (Checking for zeroes and seeking is much
more efficient than just doing the write! It not only saves you disk
space, but also CPU cycles!)
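
A minimal standalone sketch of the seek-instead-of-write idea, assuming a POSIX system and a filesystem that supports holes. The name sparse_write is hypothetical and this is not the libtiff write procedure itself. Two details are handled that are easy to get wrong: lseek returns the new file offset rather than a byte count, and a seek past the end does not by itself extend the file, so the length is set explicitly when the hole is the last thing in the file.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: write `size` bytes from `buf` to `fd`, but
 * replace an all-zero buffer by a seek so the filesystem can leave
 * a hole. Returns the number of bytes accounted for, or -1 on error. */
static ssize_t sparse_write(int fd, const void *buf, size_t size)
{
    const unsigned char *p = buf;
    struct stat st;
    off_t pos;
    size_t i;

    /* Any nonzero byte means an ordinary write. */
    for (i = 0; i < size; i++)
        if (p[i] != 0)
            return write(fd, buf, size);

    /* All zero: skip ahead instead of writing. */
    pos = lseek(fd, (off_t) size, SEEK_CUR);
    if (pos == (off_t) -1)
        return -1;

    /* If we are now past the end of the file, extend it. Without
     * this, a hole at the very end would be lost on close. */
    if (fstat(fd, &st) == -1)
        return -1;
    if (pos > st.st_size && ftruncate(fd, pos) == -1)
        return -1;

    return (ssize_t) size;
}
```

On systems where it is available, `ls -ls` shows the allocated blocks next to the file length, which is a quick way to see whether the holes were actually created.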
Roger.
--
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
Does it sit on the couch all day? Is it unemployed? Please be specific!
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ