Gmail Calendar Documents Reader Web more »
Recently Visited Groups | Help | Sign in
Google Groups Home
sparsehash can shrink even with min_load_factor(0)
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  7 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
ShaunJ  
View profile  
 More options Nov 3, 9:19 pm
From: ShaunJ <sjack...@gmail.com>
Date: Tue, 3 Nov 2009 18:19:45 -0800 (PST)
Local: Tues, Nov 3 2009 9:19 pm
Subject: sparsehash can shrink even with min_load_factor(0)
When a large number of elements are deleted and a few more elements
inserted, the sparsehash can decide to "grow" to remove the deleted
elements, but when it calculates how many buckets it needs for the new
table without the deleted elements, it actually ends up with fewer
buckets than before.

In my case, I'm expecting a bunch of insertions followed by a bunch of
deletions followed by a bunch of insertions and so on, and the number
of buckets is constantly shrinking and growing. I presume that means
the hash function must be called for each element to find its new
bucket whenever the number of buckets changes.

Would it be possible for resize_delta to clean up deleted elements
without changing the number of buckets?

Cheers,
Shaun


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Craig Silverstein  
View profile  
 More options Nov 3, 10:05 pm
From: Craig Silverstein <csilv...@google.com>
Date: Tue, 03 Nov 2009 19:05:26 -0800
Local: Tues, Nov 3 2009 10:05 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)

} Would it be possible for resize_delta to clean up deleted elements
} without changing the number of buckets?

I think the right way to do this is to call
   max_load_factor(xxx)
and
   min_load_factor(xxx)

You can do min_load_factor(0.0) so we never resize smaller.  That
means we'll never clear out deleted elements.  I don't know if there's
a simple way to do that without also resizing the hashtable.  I guess
you could do something like:
   sparse_hash_map<...> new_map(mymap.size());
   for (it = mymap.begin(); it != mymap.end(); ++it)
     new_map->insert(*it);
   mymap.swap(new_map);

This is more or less what the code does internally on a resize.

But I don't know how necessary such behavior is.  My guess is that
will be slower than just calling min_load_factor(0.0) and just keeping
doing your inserts and deletes.  If you find you're getting too
clogged with deleted keys, you can call
  mymap.resize(0)
after you've done a bunch of insertions (so you're at the high end of
the spectrum), to keep the hashtable from getting smaller.

craig


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
ShaunJ  
View profile  
 More options Nov 4, 12:41 pm
From: ShaunJ <sjack...@gmail.com>
Date: Wed, 4 Nov 2009 09:41:47 -0800 (PST)
Local: Wed, Nov 4 2009 12:41 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)
On Nov 3, 7:05 pm, Craig Silverstein <csilv...@google.com> wrote:
> } Would it be possible for resize_delta to clean up deleted elements
> } without changing the number of buckets?

> I think the right way to do this is to call
>    max_load_factor(xxx)
> and
>    min_load_factor(xxx)

> You can do min_load_factor(0.0) so we never resize smaller.  That
> means we'll never clear out deleted elements.

...

Hi Craig,

sparse_hashtable can shrink even with min_load_factor(0); I've seen
this in action. If the number of deleted items make up more than half
the number of elements in the table, the logic of resize_delta can in
fact choose a smaller number of buckets than were allocated
previously. It's this behaviour that I'm trying to prevent.

Cheers,
Shaun


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
ShaunJ  
View profile  
 More options Nov 4, 1:20 pm
From: ShaunJ <sjack...@gmail.com>
Date: Wed, 4 Nov 2009 10:20:44 -0800 (PST)
Local: Wed, Nov 4 2009 1:20 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)
Hi Craig,

Here's a patch that prevents sparse_hashtable from reducing the number
of buckets if it's resizing `just to get rid of all the "deleted"
buckets that are clogging up the hashtable.' Currently, inserting an
element can cause the number of buckets to shrink.

Cheers,
Shaun

2009-11-04  Shaun Jackman  <sjack...@gmail.com>

        * google/sparsehash/sparsehashtable.h (resize_delta): Do not
        shrink the number of buckets when inserting an element.

--- sparsehash-1.5.2/src/google/sparsehash/sparsehashtable.h.orig
2009-05-12 14:22:48.000000000 -0700
+++ sparsehash-1.5.2/src/google/sparsehash/sparsehashtable.h
2009-11-04 10:17:21.058413000 -0800
@@ -555,7 +555,7 @@
     const size_type needed_size = min_size(table.num_nonempty() +
delta, 0);
     if ( needed_size > bucket_count() ) {      // we don't have
enough buckets
       const size_type resize_to = min_size(table.num_nonempty() -
num_deleted
-                                           + delta, 0);
+                         + delta, bucket_count());
       sparse_hashtable tmp(MoveDontCopy, *this, resize_to);
       swap(tmp);                             // now we are tmp
     }

On Nov 4, 9:41 am, ShaunJ <sjack...@gmail.com> wrote:


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Craig Silverstein  
View profile  
 More options Nov 4, 1:39 pm
From: Craig Silverstein <csilv...@google.com>
Date: Wed, 04 Nov 2009 10:39:26 -0800
Local: Wed, Nov 4 2009 1:39 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)

} Here's a patch that prevents sparse_hashtable from reducing the
} number of buckets if it's resizing `just to get rid of all the
} "deleted" buckets that are clogging up the hashtable.' Currently,
} inserting an element can cause the number of buckets to shrink.

Insert causing a shrink is desired behavior: it's because we never
resize a hashtable on delete (in order to preserve iterators an
pointers while deleting).

So I think your patch would mean, in effect, we never shrink the
hashtable regardless of the min_load_factor, which isn't ideal.

I think, though, it makes sense to add a check there that never
resizes lower when min_load_factor is 0.  I'll try to do something
like that for the next release.

craig


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Craig Silverstein  
View profile  
 More options Nov 4, 1:44 pm
From: Craig Silverstein <csilv...@google.com>
Date: Wed, 04 Nov 2009 10:44:29 -0800
Local: Wed, Nov 4 2009 1:44 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)

} So I think your patch would mean, in effect, we never shrink the
} hashtable regardless of the min_load_factor, which isn't ideal.

Never mind, I looked at the code more closely and see that your fix is
right, and my objection is wrong.  I'll do a bit of testing, and look
to get this into the next release.

craig


    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
ShaunJ  
View profile  
 More options Nov 4, 2:45 pm
From: ShaunJ <sjack...@gmail.com>
Date: Wed, 4 Nov 2009 11:45:04 -0800 (PST)
Local: Wed, Nov 4 2009 2:45 pm
Subject: Re: sparsehash can shrink even with min_load_factor(0)
Great. Thanks, Craig.

    Reply    Reply to author    Forward  
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »

Create a group - Google Groups - Google Home - Terms of Service - Privacy Policy
©2009 Google