HelioSearch - Actual Unique vs. Estimated Unique


Terrance Snyder

Dec 14, 2014, 5:20:59 PM
to helio...@googlegroups.com
I know this has come up a few times, but to support unique counts over massively large datasets (user ID uniques, session uniques, etc.) it would be very nice to include stream-lib or a similar library for estimated-cardinality sketching. Something like HLL or HLL+, so that we don't have to hold every distinct value in a Set<Object>.


I don't mind adding this myself and submitting a patch, but I'd rather keep both options available: compute the 100% correct unique count, or the roughly 98% correct estimated one.
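
To make the trade-off concrete, here is a rough, self-contained sketch comparing an exact Set<Object> count with a minimal HyperLogLog estimate. This is not HelioSearch's code and not stream-lib's API; the class and method names (UniqueCountSketch, HyperLogLog, offer, cardinality) and the precision p = 14 are assumptions picked for illustration. With m = 2^p registers the standard error of plain HLL is about 1.04 / sqrt(m), so roughly 0.8% here, while memory stays fixed at 16 KB of registers no matter how many values are offered.

```java
import java.util.HashSet;
import java.util.Set;

class UniqueCountSketch {

    /** Minimal HyperLogLog with 2^p one-byte registers. Illustrative only. */
    static final class HyperLogLog {
        private final int p;            // number of index bits
        private final int m;            // number of registers (2^p)
        private final byte[] registers; // max rank seen per register

        HyperLogLog(int p) {
            this.p = p;
            this.m = 1 << p;
            this.registers = new byte[m];
        }

        // SplitMix64 finalizer, used to spread the 32-bit hashCode over 64 bits.
        // Assumption: hashCode() collisions are rare enough to ignore here.
        private static long mix(long z) {
            z += 0x9E3779B97F4A7C15L;
            z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
            z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
            return z ^ (z >>> 31);
        }

        void offer(Object value) {
            long h = mix(value.hashCode());
            int idx = (int) (h >>> (64 - p));   // top p bits pick the register
            long rest = h << p;                 // remaining bits, left-aligned
            // Rank = position of the first 1 bit in the remaining 64 - p bits.
            int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - p) + 1;
            if (rank > registers[idx]) {
                registers[idx] = (byte) rank;
            }
        }

        long cardinality() {
            double sum = 0.0;
            int zeros = 0;
            for (byte r : registers) {
                sum += 1.0 / (1L << r);
                if (r == 0) {
                    zeros++;
                }
            }
            double alpha = 0.7213 / (1.0 + 1.079 / m); // bias constant, valid for m >= 128
            double estimate = alpha * m * (double) m / sum;
            // Small-range correction: fall back to linear counting.
            if (estimate <= 2.5 * m && zeros > 0) {
                estimate = m * Math.log((double) m / zeros);
            }
            return Math.round(estimate);
        }
    }

    public static void main(String[] args) {
        Set<Object> exact = new HashSet<>();      // 100% correct, memory grows with uniques
        HyperLogLog sketch = new HyperLogLog(14); // estimated, fixed memory

        // 200,000 events from 100,000 distinct (made-up) user ids.
        for (int i = 0; i < 200_000; i++) {
            String userId = "user-" + (i % 100_000);
            exact.add(userId);
            sketch.offer(userId);
        }
        System.out.println("exact=" + exact.size() + " estimated=" + sketch.cardinality());
    }
}
```

Keeping both paths behind a single option would let users pick exact counts for small result sets and the sketch for very large ones. A real patch would presumably use stream-lib's implementation (or HLL+) rather than a hand-rolled one like this.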

HLL background

Some libraries

Yonik, can you let me know (PM me if you want) what guidance you can give when adding something like this?