IntSet

10 views
Skip to first unread message

Caleb Clausen

unread,
Mar 22, 2010, 5:33:54 PM3/22/10
to ruby optimization
Roger, didn't you say you had or could make an IntSet class in your
google_hash library? Somehow, this is a construct that I keep wanting
to have. How efficient is it, in time and space.

Roger Pack

unread,
Mar 23, 2010, 12:59:56 AM3/23/10
to ruby-opt...@googlegroups.com
> Roger, didn't you say you had or could make an IntSet class in your
> google_hash library? Somehow, this is a construct that I keep wanting
> to have. How efficient is it, in time and space.

Did you just want Fixnum's or Bignum? Did you want it in the hash library?
-r

Caleb Clausen

unread,
Mar 23, 2010, 1:34:04 PM3/23/10
to ruby-opt...@googlegroups.com

Well, I guess I'm interested in finding out about what the tradeoffs and
limitations are. Bignums are nice, but I'm really thinking about
efficient storage. I thought I remembered somewhere you (I think it was
you...) saying that space efficiency for integer sets should be 1 bit
per item in the set, but I may be misremembering. Google's docs aren't
being a lot of help to me, I'm afraid.

Roger Pack

unread,
Mar 23, 2010, 5:09:12 PM3/23/10
to ruby-opt...@googlegroups.com
> Well, I guess I'm interested in finding out about what the tradeoffs and
> limitations are. Bignums are nice, but I'm really thinking about
> efficient storage. I thought I remembered somewhere you (I think it was
> you...) saying that space efficiency for integer sets should be 1 bit
> per item in the set, but I may be misremembering. Google's docs aren't
> being a lot of help to me, I'm afraid.

Yeah if you use the GoogleHashSparseIntToInt it's said to use "2 bits
per entry" for the key,
so theoretically it's 2+4(key) + 4(value) per entry. Plus the GC
doesn't have to collect it :)

-r
http://goog-sparsehash.sourceforge.net

Caleb Clausen

unread,
Mar 23, 2010, 6:52:13 PM3/23/10
to ruby-opt...@googlegroups.com

<<Scratch head?>>

Ok, so are you saying its a total of 2+4+4=10 bits per integer stored in
the set? But that doesn't make a whole lotta sense... the 4s make me
think you mean 4 BYTES per key and value, which would add up as
4+32+32=68 bits per integer stored. (Which is a lot more bulky and less
appealing...)

Please help me and my little brain understand...

Roger Pack

unread,
Mar 24, 2010, 11:22:34 AM3/24/10
to ruby-opt...@googlegroups.com
>> Yeah if you use the GoogleHashSparseIntToInt it's said to use "2 bits
>> per entry" for the key,
>> so theoretically it's 2+4(key) + 4(value) per entry.  Plus the GC
>> doesn't have to collect it :)

>
> <<Scratch head?>>

Ahh good point. So instead of Ruby's typical ?? bits + 20 + 20 per
key value pair, you get 2 +4+4 = 10. Of course, even in C land that's
about as small as it's gonna get for a hash.
Yeah you had it right.
-rp

Reply all
Reply to author
Forward
0 new messages