HyperLogLogPlus intersection

Oren Ellenbogen

Apr 23, 2013, 7:39:46 AM
to stream-...@googlegroups.com
Hey,

I'm using HyperLogLogPlus (p=14, sp=25) and was wondering how to implement intersection between two HLL++ instances.
My assumption is that each HLL holds a small cardinality per file (fewer than 10,000 elements), with an intersection of more than 5%, so the estimate should be relatively accurate (based on http://blog.aggregateknowledge.com/2012/12/17/hll-intersections-2/).


Is there a reason why it's currently not part of HLL++? Is there a code snippet available so I could create my own version in HLL++?


Thanks,

Matt Abrams

Apr 23, 2013, 10:43:51 AM
to stream-...@googlegroups.com
Oren -

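If you know that your instances are good candidates for intersection calculation, then you can use inclusion-exclusion:

|A ∩ B| = |A| + |B| - |A U B|

Here |A| and |B| are the cardinality estimates of the two instances, and |A U B| is the cardinality estimate of their merge.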


We didn't offer the method as part of the interface because it does not work in all cases.

Matt
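
A rough sketch of the formula above, in Java since stream-lib is a Java library. The class IntersectionEstimate and its methods are hypothetical names made up for illustration and are not part of stream-lib; the stream-lib calls mentioned in the comments are recalled from its API and should be checked against the version in use. The error bound is a crude worst-case figure under the assumption that each estimate is off by at most the same relative error, and it hints at why the approach does not work in all cases.

```java
// Illustrative only: inclusion-exclusion over three cardinality estimates.
// IntersectionEstimate is a hypothetical helper, not a stream-lib class.
final class IntersectionEstimate {

    /**
     * Estimates |A ∩ B| as |A| + |B| - |A U B|.
     *
     * With stream-lib, the three inputs would presumably come from:
     *   long a = hllA.cardinality();
     *   long b = hllB.cardinality();
     *   long u = hllA.merge(hllB).cardinality(); // merge returns a new estimator
     * (assumption: verify these calls against your stream-lib version).
     */
    static long estimate(long cardA, long cardB, long cardUnion) {
        long raw = cardA + cardB - cardUnion;
        // Estimation error can push the raw value below zero when the true
        // overlap is small; a real intersection size is never negative.
        return Math.max(0L, raw);
    }

    /**
     * Crude worst-case absolute error of the raw estimate, assuming each of
     * the three inputs is off by at most relErr (a fraction, e.g. 0.01).
     */
    static double errorBound(long cardA, long cardB, long cardUnion, double relErr) {
        return relErr * (cardA + cardB + cardUnion);
    }

    public static void main(String[] args) {
        // Hypothetical estimates: two sets of about 10,000 with an overlap of 500.
        long a = 10_000, b = 10_000, u = 19_500;
        System.out.println("intersection ~ " + estimate(a, b, u));
        // With about 1% error per input, the bound is of the same order as
        // the answer itself, so small overlaps are unreliable.
        System.out.println("error bound  ~ " + errorBound(a, b, u, 0.01));
    }
}
```

The result is clamped at zero because the three errors do not cancel: when the true overlap is small relative to |A| and |B|, the raw difference can come out negative or be dominated by noise.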



