HelioSearch - Actual Unique vs. Estimated Unique


Terrance Snyder

Dec 14, 2014, 5:20:59 PM
To: helio...@googlegroups.com
I know this has come up a few times, but to support massively large datasets for uniques (unique user IDs, unique sessions, etc.) it would be very nice to include stream-lib or a similar library for estimated cardinality sketching. Something like HLL, HLL+, or similar, so that we don't have to keep a Set<Object> of every distinct value.


I don't mind adding this myself and submitting a patch, but I'd rather keep both options available: computing a 100% correct unique count vs. a roughly 98% correct estimate.
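
To make the trade-off concrete, here is a rough sketch of the two options side by side: an exact count backed by a HashSet and a minimal HyperLogLog estimate. This is not stream-lib's API and not Heliosearch code. The class name SimpleHll, the hash64 helper, and the precision of 12 bits are all assumptions chosen for illustration. With 2^12 = 4096 registers, the usual HLL standard error of about 1.04/sqrt(m) works out to roughly 1.6%, which is in the same ballpark as the "98% correct" figure above.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Minimal HyperLogLog sketch for illustration only (hypothetical class, not stream-lib). */
class SimpleHll {
    private final int p;            // precision: number of hash bits used as the register index
    private final byte[] registers; // 2^p one-byte registers

    SimpleHll(int p) {
        if (p < 7 || p > 18) throw new IllegalArgumentException("p must be in [7, 18]");
        this.p = p;
        this.registers = new byte[1 << p];
    }

    /** FNV-1a over the UTF-8 bytes, followed by the MurmurHash3 64-bit finalizer to mix the bits. */
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    void add(String item) {
        long h = hash64(item);
        int idx = (int) (h >>> (64 - p));  // top p bits select a register
        // Rank = position of the first 1-bit in the remaining bits, capped for the all-zero case.
        int rank = Math.min(Long.numberOfLeadingZeros(h << p) + 1, 64 - p + 1);
        if (rank > registers[idx]) registers[idx] = (byte) rank;
    }

    long estimate() {
        int m = registers.length;
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1 + 1.079 / m);  // bias constant, valid for m >= 128
        double raw = alpha * m * (double) m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            // Small-range correction: fall back to linear counting.
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }

    public static void main(String[] args) {
        Set<String> exact = new HashSet<>();     // 100% correct, memory grows with cardinality
        SimpleHll approx = new SimpleHll(12);    // fixed 4096 bytes of registers
        for (int i = 0; i < 200_000; i++) {
            String userId = "user-" + (i % 100_000);  // each id appears twice
            exact.add(userId);
            approx.add(userId);
        }
        System.out.println("exact uniques:     " + exact.size());
        System.out.println("estimated uniques: " + approx.estimate());
    }
}
```

The point of keeping both is that the exact set costs memory proportional to the number of distinct values, while the sketch stays at a fixed size no matter how many values are added. A real integration would presumably lean on a tested library rather than hand-rolled code like this.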

HLL background

Some libraries

Yonik, can you let me know (PM me if you want) what guidance you can give when adding something like this?