We are planning to port stream-lib to C++ soon for use in
Bro-IDS[1] (a domain-specific language and runtime for network traffic
analysis), but there is one thing I am particularly curious about. One
of the recent articles that mentioned stream-lib said that LogLog data
structures aren't mergeable, but then I noticed some discussion of
that point in the comments.
What's the final verdict? Are LogLog and HyperLogLog structures
mergeable, and do they maintain the confidence rating after a merge?
I'm asking because most Bro-IDS deployments are clusters now, so any
metrics collection is happening across multiple processes and we need
to be able to merge the values from these individual processes into
one representative value for the entire network. Think something like
the number of unique IP addresses that a particular host communicates
with.
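To make the question concrete, here is a rough Python sketch of the
merge I have in mind. It assumes that two HyperLogLog sketches built
with the same hash function and the same number of registers can be
combined by taking the element-wise maximum of their registers. ToyHLL,
its method names, and the SHA-1 based hash are all made up for
illustration; this is not stream-lib's API or implementation.

```python
import hashlib
import math


class ToyHLL:
    """Toy HyperLogLog-style sketch (hypothetical, for illustration only)."""

    def __init__(self, p=10):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash taken from the first 8 bytes of SHA-1 (an assumption).
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        # Assumed merge rule: element-wise max over registers.
        if self.p != other.p:
            raise ValueError("sketches must use the same register count")
        self.registers = [max(a, b)
                          for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)       # constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


def ip(i):
    # Hypothetical peer addresses, one distinct address per integer.
    return "10.0.%d.%d" % (i // 256, i % 256)


# Two worker processes see overlapping sets of peers for one host.
worker_a, worker_b, whole = ToyHLL(), ToyHLL(), ToyHLL()
for i in range(0, 12000):
    worker_a.add(ip(i))
    whole.add(ip(i))
for i in range(8000, 20000):
    worker_b.add(ip(i))
    whole.add(ip(i))

# Merge on the manager: worker_a now stands for the whole cluster.
worker_a.merge(worker_b)
```

If that assumption holds, the merged registers in worker_a should be
identical to those of a single sketch that saw all 20000 addresses
(whole), so the merged estimate would carry the same error as a
non-distributed count.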
Thanks for creating stream-lib! If everything we're hoping to do
works out, there are going to be a lot of really happy network security
teams using your work.
.Seth
[1] http://www.bro-ids.org/