For small files, we currently slurp the contents into memory before
looking for matches. If the system has Sys::Mmap installed, we could
mmap files that are too large to slurp and possibly see a performance
improvement. Mmapping all files might even be faster than slurping,
since mmap avoids copying the file contents into a separate buffer,
but I'm not sure. Of course, if
the system doesn't have Sys::Mmap installed, ack would use the current
behavior.
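
The slurp-or-mmap fallback described above could be sketched roughly as follows. This is an illustration in Python, not ack's Perl code and not the Sys::Mmap API; the function names and the `slurp_limit` threshold are hypothetical.

```python
import mmap
import os
import re

SLURP_LIMIT = 1 << 20  # hypothetical cutoff: slurp files up to 1 MiB


def _offsets(regex, data):
    # Works for both bytes and mmap objects (any bytes-like buffer).
    return [m.start() for m in regex.finditer(data)]


def find_offsets(path, pattern, slurp_limit=SLURP_LIMIT):
    """Return the byte offsets of every match of a bytes pattern in a file."""
    regex = re.compile(pattern)
    with open(path, "rb") as fh:
        size = os.fstat(fh.fileno()).st_size
        if size == 0:
            return []  # an empty file cannot be mapped
        if size <= slurp_limit:
            return _offsets(regex, fh.read())  # small file: slurp it
        # Large file: map it read-only and search the mapping directly.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return _offsets(regex, mapped)
```

Calling `find_offsets(path, rb"foo", slurp_limit=0)` forces the mmap branch, which makes it easy to time both paths on the same file.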
My second thought was to split the subdirectories to search into two (or
more) sets. Then fork off one ack process per set and run the searches
in parallel. The original ack process would then aggregate the results
as they came in from the child processes.
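
A minimal sketch of that fork-and-aggregate idea, in Python rather than ack's Perl and for POSIX systems only. All names here (`search_tree`, `parallel_search`, the pickle-over-pipe result format) are assumptions made for illustration. Each child searches one set of directories and writes its hits to a pipe, and the parent collects from whichever pipe has data ready.

```python
import os
import pickle
import re
import selectors


def search_tree(root, regex):
    """Return (path, line_number) for every matching line under root."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if regex.search(line):
                            hits.append((path, lineno))
            except OSError:
                continue  # unreadable file: skip it
    return hits


def parallel_search(dir_sets, pattern):
    """Fork one child per directory set and merge the results in the parent."""
    regex = re.compile(pattern)
    sel = selectors.DefaultSelector()
    buffers, pids = {}, []
    for dirs in dir_sets:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:  # child: search, send results, exit without cleanup
            status = 1
            try:
                os.close(read_fd)
                hits = [h for d in dirs for h in search_tree(d, regex)]
                with os.fdopen(write_fd, "wb") as out:
                    pickle.dump(hits, out)
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)  # parent keeps only the read end
        sel.register(read_fd, selectors.EVENT_READ)
        buffers[read_fd] = bytearray()
        pids.append(pid)

    results = []
    while buffers:  # aggregate from whichever child has output ready
        for key, _events in sel.select():
            chunk = os.read(key.fd, 65536)
            if chunk:
                buffers[key.fd] += chunk
                continue
            sel.unregister(key.fd)  # EOF: this child is done
            os.close(key.fd)
            data = bytes(buffers.pop(key.fd))
            if data:
                results.extend(pickle.loads(data))
    for pid in pids:
        os.waitpid(pid, 0)  # reap the children
    return results
```

The order of the merged results depends on which child finishes first, so a caller that needs stable output would have to sort them.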
If nobody's tried these optimizations yet and doesn't see any problems
with them, I might look into sending some patches.
--
Michael
> If nobody's tried these optimizations yet and doesn't see any problems
> with them, I might look into sending some patches.
I'd thought about the parallel versions, but not the memmapped ones.
I didn't pursue it because my assumption has been that ack is typically
I/O bound, not CPU bound. Verifying where time is being spent could
certainly be worthwhile.
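
One rough way to check that assumption is to compare wall-clock time with CPU time for a single search run. The sketch below is in Python and `time_split` is a hypothetical helper, not part of ack. If CPU time is close to wall time, the run is mostly CPU bound; if it is much smaller, the process spent most of its time waiting, which for a file search usually means I/O. Files already in the OS cache will make a run look more CPU bound than a cold one.

```python
import time


def time_split(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu
```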
Right now, my #1 concern is getting the plugin API working. If you
want to go ahead and play with either of these approaches, that'd be
fine, but I can't promise that the API, such as it is, won't change
considerably within the next month or two.
--
Andy Lester => an...@petdance.com => www.petdance.com => AIM:petdance