Since I'm doing so much processing, I wanted to see where the
program is spending its time to make sure I'm not doing anything
silly. I enabled the profiler with debug(debug() | 24) and wrote
code to dump the profile every 2 minutes.

To my surprise, size() was taking double the time of any other
function. I use size() in two places in the code: in a functional
manner, to recurse through an array using sublist, and in my removal
policy, to check whether the cache is full.

I looked at the Sleep source code and saw that size() for an array
works in O(1) time. No issue there. However, I call
hash.keys().size() for a Sleep hash, and I do this for good reason.
You delete an entry from a hash in Sleep by setting its value to
$null. However, the entry doesn't actually go away when you set the
value to $null; periodically Sleep has to clean it up. Whenever the
keys of a hash are requested, Sleep loops through the key set and
eliminates any entries with a null value. So keys(%hash) operates in
O(n) time, and hence size(%hash) works in O(n) time. This totally
killed my performance unnecessarily. I'll say this "feature" is
going to stay; it's part of the trade-off with my hash
implementation.

However, there is an easy workaround if you know you're not going to
null out entries (and hence you don't need the cleanup behavior):

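# obtain size of a hash in O(1) time
# note: this skips the null cleanup, so entries that were set to
# $null but not yet cleaned up are still counted
sub hashsize
{
   return [[$1 getData] size];
}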
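
Here is a rough sketch of how the two approaches compare in use. The
%cache hash and its contents are made up for illustration, and
hashsize is repeated so the snippet runs on its own. I haven't listed
expected output; going by the cleanup behavior described above,
hashsize should keep counting a nulled entry until something asks
for the keys, while size() triggers the cleanup.

```sleep
# hashsize as defined above: read the size of the backing map directly
sub hashsize
{
   return [[$1 getData] size];
}

# hypothetical cache contents, for illustration only
%cache["a"] = "apple";
%cache["b"] = "banana";
%cache["c"] = "cherry";

# both calls report the number of entries; size() walks the keys first
println("size:     " . size(%cache));
println("hashsize: " . hashsize(%cache));

# "delete" an entry by nulling it out
%cache["b"] = $null;

# hashsize does no cleanup, so it may still count the nulled entry
println("hashsize after nulling: " . hashsize(%cache));

# size() requests the keys, which removes null-valued entries
println("size after nulling:     " . size(%cache));
```

If your code does null out entries, stick with size() or call
keys() now and then so the cleanup still happens.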
-- Raphael