Porting to C++

Seth Hall

May 10, 2012, 9:41:58 AM
to stream-lib-user
We are starting to look into porting stream-lib to C++ soon for use in
Bro-IDS[1] (a domain-specific language and runtime for network traffic
analysis) but there is one thing I am particularly curious about. I
saw in one of the recent articles that mentioned stream-lib that the
author said that loglog data structures aren't mergeable, but then I
noticed some discussion on that point in the comments.

What's the final verdict? Are LogLog and HyperLogLog structures
mergeable, and do they maintain their error bounds after merging?

I'm asking because most Bro-IDS deployments are clusters now so any
metrics collection is happening across multiple processes and we need
to be able to merge the values from these individual processes into
one representative value for the entire network. Think something like
the number of unique IP addresses that a particular host communicates
with.

Thanks for creating stream-lib! If everything we're hoping to do
works out there are going to be a lot of really happy network security
teams using your work.

.Seth

1. http://www.bro-ids.org/

Eugene Kirpichov

May 17, 2012, 12:33:38 PM
to stream-...@googlegroups.com
Hi Seth,

LogLog and HyperLogLog definitely *are* mergeable, and very easily so: take the element-wise max of their corresponding table entries (registers).
The merge operation is currently implemented:
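For anyone attempting the C++ port, here is a minimal sketch of the register-wise max merge described above. It is an illustration only and is not stream-lib's code: the names `HllSketch`, `add_hash`, and `mix64` are made up for this example, the mixer is the SplitMix64 finalizer standing in for a real hash function, and the estimator uses the standard HyperLogLog constants with only the small-range (linear counting) correction.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 64-bit mixer (SplitMix64 finalizer), used only so the example
// can hash integers; a real port would hash arbitrary byte strings.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Illustrative HyperLogLog sketch with m = 2^p one-byte registers.
class HllSketch {
public:
    explicit HllSketch(unsigned p) : p_(p), regs_(std::size_t{1} << p, 0) {
        assert(p >= 7 && p <= 16);  // the alpha formula below assumes m >= 128
    }

    // Add an item that has already been hashed to 64 bits.
    void add_hash(std::uint64_t h) {
        const std::size_t idx = static_cast<std::size_t>(h >> (64 - p_));  // top p bits
        std::uint64_t rest = h << p_;  // remaining bits, left-aligned
        // rank = 1-based position of the first 1-bit in the remaining bits
        std::uint8_t rank = 1;
        const unsigned max_rank = 64 - p_ + 1;
        while (rank < max_rank && (rest & (std::uint64_t{1} << 63)) == 0) {
            rest <<= 1;
            ++rank;
        }
        regs_[idx] = std::max(regs_[idx], rank);
    }

    // Merge another sketch into this one: element-wise max of the registers.
    // The result is the sketch the union of both input streams would produce.
    void merge(const HllSketch& other) {
        assert(p_ == other.p_);  // sketches must have the same size
        for (std::size_t i = 0; i < regs_.size(); ++i)
            regs_[i] = std::max(regs_[i], other.regs_[i]);
    }

    // Standard HyperLogLog estimate with linear counting for small ranges.
    double estimate() const {
        const double m = static_cast<double>(regs_.size());
        double sum = 0.0;
        std::size_t zeros = 0;
        for (std::uint8_t r : regs_) {
            sum += std::ldexp(1.0, -static_cast<int>(r));  // 2^-r
            if (r == 0) ++zeros;
        }
        const double alpha = 0.7213 / (1.0 + 1.079 / m);
        double e = alpha * m * m / sum;
        if (e <= 2.5 * m && zeros > 0)
            e = m * std::log(m / static_cast<double>(zeros));
        return e;
    }

    const std::vector<std::uint8_t>& registers() const { return regs_; }

private:
    unsigned p_;
    std::vector<std::uint8_t> regs_;
};
```

Because max is commutative, associative, and idempotent, merging the per-process sketches in any order yields exactly the registers a single process would have built from the combined stream, so the merged estimate has the same error characteristics as an unmerged sketch of the same size. All sketches must use the same hash function and the same `p`.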

On Thursday, May 10, 2012, 17:41:58 UTC+4, Seth Hall wrote:

Josh Ferguson

May 17, 2012, 2:28:38 PM
to stream-...@googlegroups.com
Taking the max of each register gives you the sketch of the union, so the merged estimate is the cardinality of the union. You can use that and the fact that

|A AND B| = |A| + |B| - |A UNION B|

to estimate the intersection. I emailed the paper authors and they said they got bounds that were very good using these methods, although I'm not sure that the error rates are proven mathematically. I did some tests and also got very very good results.
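The inclusion-exclusion arithmetic can be sketched in a few lines of C++. This is only an illustration under assumptions: `estimate_intersection` is a made-up helper name, and the three inputs are assumed to be cardinality estimates for A, B, and the merged (register-wise max) sketch of both.

```cpp
#include <algorithm>

// Inclusion-exclusion on cardinality estimates:
//   |A AND B| = |A| + |B| - |A UNION B|
// card_union is assumed to come from the merged sketch of A and B.
// The result is clamped at zero because estimation error can push the
// raw difference slightly negative when the true overlap is small.
inline double estimate_intersection(double card_a, double card_b,
                                    double card_union) {
    return std::max(0.0, card_a + card_b - card_union);
}
```

Note that the absolute error of the result is on the order of the errors of the three input estimates, so the relative error can be large when the intersection is small compared to the union.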

Josh

Seth Hall

May 17, 2012, 3:44:52 PM
to stream-...@googlegroups.com

On May 17, 2012, at 2:28 PM, Josh Ferguson wrote:

> I emailed the paper authors and they said they got bounds that were very good using these methods, although I'm not sure that the error rates are proven mathematically. I did some tests and also got very very good results.

Sounds great! Thanks guys.

.Seth

Matt Abrams

Apr 9, 2013, 8:01:41 AM
to stream-...@googlegroups.com
John -

As far as I'm aware no one has, at least publicly, ported stream-lib to C++.  

Matt


On Mon, Apr 8, 2013 at 10:20 PM, John Regehr <john....@gmail.com> wrote:
Has anything happened with the C++ port of stream-lib? If not, I'm ready to port over the hyperloglog part. Thanks,

John Regehr



Moiz Arafat

Sep 18, 2013, 8:56:43 AM
to stream-...@googlegroups.com
Any updates on porting stream-lib to C++?

-moiz

Kelly Sommers

Sep 18, 2013, 9:56:02 AM
to stream-...@googlegroups.com
I would prefer C, or at least extern "C" wrapper functions, so that I could reuse it from more languages :)

Just my 2c!
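A rough sketch of what such extern "C" wrappers could look like, using the common opaque-handle pattern. Everything here is hypothetical: the `sl_counter_*` names are invented for illustration, and `DistinctCounter` is an exact-counting stand-in for whatever C++ sketch class a port would actually provide.

```cpp
#include <cstdint>
#include <new>
#include <unordered_set>

// Stand-in C++ class (exact counting); a real port would hold a
// HyperLogLog or similar sketch here instead.
class DistinctCounter {
public:
    void add(std::uint64_t item) { seen_.insert(item); }
    std::uint64_t count() const { return seen_.size(); }
private:
    std::unordered_set<std::uint64_t> seen_;
};

// C callers only ever see a pointer to this type, never its contents.
typedef struct sl_counter sl_counter;
struct sl_counter { DistinctCounter impl; };

extern "C" {

// Returns a new counter, or NULL if allocation fails.
sl_counter* sl_counter_create(void) {
    return new (std::nothrow) sl_counter();
}

void sl_counter_destroy(sl_counter* c) { delete c; }

// Returns 0 on success, -1 on failure. C++ exceptions must not cross
// the C boundary, so they are caught and turned into a status code.
int sl_counter_add(sl_counter* c, std::uint64_t item) {
    try {
        c->impl.add(item);
        return 0;
    } catch (...) {
        return -1;
    }
}

std::uint64_t sl_counter_count(const sl_counter* c) { return c->impl.count(); }

}  // extern "C"
```

In a real library the declarations would live in a plain C header (with the `typedef` and function prototypes only), so that C and any language with a C FFI can call into the C++ implementation.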



Matt Abrams

Sep 20, 2013, 4:25:21 PM
to stream-...@googlegroups.com
I don't have time to do the port myself but I'm happy to help anyone
who'd like to give it a shot.

Matt

seth...@gmail.com

Sep 30, 2013, 4:50:24 PM
to stream-...@googlegroups.com

On Apr 8, 2013, at 10:20 PM, John Regehr <john....@gmail.com> wrote:

> Has anything happened with the C++ port of stream-lib? If not, I'm ready to port over the hyperloglog part. Thanks,

We didn't end up porting stream-lib directly, but we're implementing many of the same data structures in Bro now (in C++). And FYI, CardinalityCounter.cc is the HyperLogLog implementation.

https://github.com/bro/bro/tree/master/src/probabilistic

.Seth