Is something like a combiner available?

Roland Gude

Jan 11, 2012, 7:54:44 AM
to peregrine...@googlegroups.com
Hadoop offers combiners to aggregate values between map and reduce.

Peregrine's Job class has a getter and setter for Combiner, but they are never used.
Are they placeholders for a similar feature?
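
For readers unfamiliar with the idea, here is a minimal sketch of what a combiner does, assuming a word-count style job where values for the same key can simply be summed. The class and method names are hypothetical and are not taken from Hadoop or Peregrine; the point is only that one map task's output is collapsed per key before it is sent to the reducers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; this is neither Hadoop's nor Peregrine's API.
class CombinerSketch {

    // Collapse the (key, count) pairs emitted by one map task into a single
    // pair per key, so fewer records have to cross the network to the reducers.
    static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Long> pair : mapOutput) {
            combined.merge(pair.getKey(), pair.getValue(), Long::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Simulated map output: one (word, 1) record per word seen.
        List<Map.Entry<String, Long>> mapOutput = new ArrayList<>();
        for (String word : "a b a c a b".split(" ")) {
            mapOutput.add(Map.entry(word, 1L));
        }
        Map<String, Long> combined = combine(mapOutput);
        System.out.println(mapOutput.size() + " records in, "
                + combined.size() + " records out: " + combined);
    }
}
```

This only works when the reduce function is associative and commutative (sums, counts, min/max), since the reducer later applies the same kind of aggregation to the already combined values.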

Kevin Burton

Jan 11, 2012, 1:39:58 PM
to peregrine...@googlegroups.com
That's what I'm actually implementing now. I'm using a branch named burton-combiner-support for the implementation.

It should be ready in a week or two.

The combiner isn't as high a priority because you don't need it to get a basic MR impl up and running and because we use direct shuffling we have a LOT more throughput and flexibility.

That said, it looks like there's about a 2x reduction in throughput in some situations without one, so it's worth implementing.

Further, the network may actually be used for OTHER tasks so simply wasting bandwidth isn't really a good idea.

Kevin
--

Founder/CEO Spinn3r.com

Location: San Francisco, CA
Skype: burtonator

Skype-in: (415) 871-0687


Roland Gude

Mar 28, 2012, 2:20:01 PM
to peregrine...@googlegroups.com
Any news regarding this?

burtonator

Mar 28, 2012, 2:48:12 PM
to peregrine...@googlegroups.com
I got sort of pulled into a rabbit hole with another bug / feature in peregrine that I wanted implemented.

It turns out that I was using mlock incorrectly, so in order to tighten things down with memory I researched using setrlimit and RLIMIT_MEMLOCK so that the VM would dump core if I ever introduced a memory leak again.

The kernel has a nasty bug where it actually just LOCKED hard which impacted us in production.

... so I prioritized fixing this.

But it turns out that setrlimit with RLIMIT_MEMLOCK requires the process to run as non-root, so I implemented peregrine as setuid, and that caused a whole set of permission issues and bugs to track down.

Now I think I have one more bug which I *think* I have tracked down but I want to rewrite some tests to make sure.

The general thinking is that it's better to have a STABLE but slow runtime than a FAST runtime that crashes and locks up sometimes.

Peregrine would get a bad reputation this way.

Also, the branch I'm working on has a lot of other nice features I implemented.

Ironically, I'm not expecting the combiner to be difficult to implement.

Kevin

Roland Gude

Mar 29, 2012, 3:07:51 AM
to peregrine...@googlegroups.com
Ok,
stabilizing instead of adding new features seems to be a really good decision.

Point is, I have currently convinced my bosses to spend resources (me) on peregrine. Mostly I am trying to evaluate it for our use cases, and I am impressed with most of the results. There are, however, cases where we simply cannot work without a combiner. I am starting to work on it now, but first I need to figure out how it is supposed to work. If you could give me a brief design description of what you had in mind, that would be of great help.

Roland Gude

Apr 3, 2012, 9:23:07 AM
to peregrine...@googlegroups.com
I think I got it working.
I'll upload the patches soon.

burtonator

Apr 20, 2012, 4:36:50 PM
to peregrine...@googlegroups.com
Did you upload anything...? I have the combiner design already in my head... I just wanted to get it working before implementing optimizations.

The next step now is to put it into production for LARGE jobs... like 1TB jobs... then once that is working I can concentrate more on the optimization front.

Also... the arch is stabilizing a lot more, so that's good.

Roland Gude

Apr 23, 2012, 4:03:17 AM
to peregrine...@googlegroups.com
No, I did not. I found a severe issue in memory usage and had to work on a different project for a while, so I could not fix it yet.

Btw, I did it with a new I/O driver with schema consume, which buffers incoming data and then uses combinerunner with a shuffle output to do the work once the buffer is full.
I'm not entirely sure about this design either, but it seems to work (apart from the bug).
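
The buffering approach described above can be sketched roughly as follows. This is not the code from the patch: the class name, the `accept`/`flush` methods, the summing combine step, and the `BiConsumer` that stands in for the shuffle output are all assumptions made for illustration. Records are buffered as they arrive, and once the buffer is full they are combined per key and handed downstream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Hypothetical sketch; names and types are not Peregrine's.
class BufferingCombiner {
    private final int capacity;                           // records to buffer before combining
    private final BiConsumer<String, Long> shuffleOutput; // stand-in for the shuffle writer
    private final List<Map.Entry<String, Long>> buffer = new ArrayList<>();

    BufferingCombiner(int capacity, BiConsumer<String, Long> shuffleOutput) {
        this.capacity = capacity;
        this.shuffleOutput = shuffleOutput;
    }

    // Buffer one incoming record; run the combine step once the buffer is full.
    void accept(String key, long value) {
        buffer.add(Map.entry(key, value));
        if (buffer.size() >= capacity) {
            flush();
        }
    }

    // Combine the buffered records per key (here: by summing), emit one record
    // per key to the shuffle output in key order, then empty the buffer.
    void flush() {
        Map<String, Long> combined = new TreeMap<>();
        for (Map.Entry<String, Long> record : buffer) {
            combined.merge(record.getKey(), record.getValue(), Long::sum);
        }
        combined.forEach(shuffleOutput);
        buffer.clear();
    }

    public static void main(String[] args) {
        BufferingCombiner combiner =
                new BufferingCombiner(4, (k, v) -> System.out.println(k + "=" + v));
        for (String word : "a b a a c".split(" ")) {
            combiner.accept(word, 1);
        }
        combiner.flush(); // emit whatever is left at the end of the map task
    }
}
```

A final `flush()` at the end of the task is needed so a partially filled buffer is not lost. Because each flush combines only what is in the buffer at that moment, the same key can still be emitted more than once, which is fine as long as the reducer aggregates again.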

burtonator

May 1, 2012, 7:26:05 PM
to peregrine...@googlegroups.com
Oh nice... that's cool.  Guess the API is frozen :)

I'm still running into a few bugs at scale... one of which is trying to figure out why a map job is never finishing.  

But I think I may have found it... 

Kevin

Roland Gude

May 2, 2012, 6:34:36 AM
to peregrine...@googlegroups.com
I actually got the memory issue fixed.
Tomorrow I will push the patches to my patch queue (I am out of office today).

It introduces a dependency on commons-pool, however (in order to reuse large ChannelBuffers, which fixes the memory issue).
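
To illustrate why pooling helps here, the following is a minimal sketch of reusing large buffers instead of allocating a fresh one each time. It deliberately does not use commons-pool or ChannelBuffer: `java.nio.ByteBuffer` and the hand-rolled `BufferPool` class are stand-ins chosen so the example has no dependencies, and the names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in: a hand-rolled pool of java.nio.ByteBuffer, not the
// commons-pool API or the ChannelBuffer type used in the actual patch.
class BufferPool {
    private final int bufferSize; // size of every pooled buffer, in bytes
    private final int maxIdle;    // upper bound on buffers kept around for reuse
    private final Deque<ByteBuffer> idle = new ArrayDeque<>();

    BufferPool(int bufferSize, int maxIdle) {
        this.bufferSize = bufferSize;
        this.maxIdle = maxIdle;
    }

    // Hand out an idle buffer if one exists, otherwise allocate a new one.
    synchronized ByteBuffer borrow() {
        ByteBuffer buf = idle.pollFirst();
        return (buf != null) ? buf : ByteBuffer.allocate(bufferSize);
    }

    // Reset the buffer and keep it for reuse, unless the pool is already full,
    // in which case it is dropped and left to the garbage collector.
    synchronized void giveBack(ByteBuffer buf) {
        buf.clear();
        if (idle.size() < maxIdle) {
            idle.addFirst(buf);
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(1 << 20, 4);
        ByteBuffer first = pool.borrow();
        first.putLong(42L);
        pool.giveBack(first);
        ByteBuffer second = pool.borrow();
        System.out.println("same buffer reused: " + (first == second));
    }
}
```

Capping the number of idle buffers keeps the pool itself from becoming a source of unbounded memory use.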

Roland Gude

May 3, 2012, 5:34:22 AM
to peregrine...@googlegroups.com
Patches are available at bitbucket.org/rjtg/peregrine-patches

The relevant patches are combineiodriver.diff and channelbufferpool.diff

burtonator

May 4, 2012, 3:45:13 PM
to peregrine...@googlegroups.com
It may be easier for you to just clone and push somewhere so that we can land actual branches.

I prefer not to land monolithic patches because we lose the commit history.

But it's better than not having it right now :)

Kevin