Disco Master Memory Usage on Large Map/Reduce

Giles Brown

Dec 3, 2015, 1:52:08 PM
to Disco-development
I have an 18x16-CPU Disco cluster and I am trying to run a map/reduce job where the results of the map stage are fairly large.  What I see is that the memory used by the disco master goes up and up after the reduce phase starts.  It eventually rises to a level where it gets killed off by the OOM killer and the job fails.

I'm not sure if this is just me asking too much of the disco cluster or if there is some kind of bug.

Thoughts?

Giles

Erik Dubbelboer

Dec 4, 2015, 4:35:49 AM
to Disco-development
There are two things you can do:

1) Write a combiner function. This function will be almost the same as your reduce function. The output of your map function is sent to this combine function on the same node before its result is sent to another node for the reduce stage. Using a combine function can therefore greatly reduce the amount of data sent between nodes.
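To illustrate the idea (this is a plain-Python sketch of what a combiner does, not the exact Disco combiner signature; the function name `combine` is hypothetical):

```python
from collections import defaultdict

def combine(map_output):
    """Locally aggregate (key, count) pairs before they leave the node.

    Sketch only: collapsing duplicate keys on the mapping node means
    far fewer pairs have to cross the network to the reduce stage.
    """
    totals = defaultdict(int)
    for key, count in map_output:
        totals[key] += count
    return sorted(totals.items())

# Four map outputs collapse to two pairs before any network transfer.
pairs = [("a", 1), ("b", 1), ("a", 1), ("a", 1)]
print(combine(pairs))  # [('a', 3), ('b', 1)]
```

This only helps when map output actually has duplicate keys to collapse, which is why it didn't apply to Giles's pure-partitioning workload below.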

2) Use more partitions. Search for partitions on http://disco.readthedocs.org/en/latest/lib/worker/classic.html The reduce function is run on N partitions (defaults to 1). After this, the results of these reduce partitions are combined and sorted using the unix sort function. Of course this will only work if your result contains many rows. If your reduce function reduces your data to only a couple of rows, partitions won't help, but in that case you also won't need as much memory.
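The core of partitioning is just a deterministic hash of the key into one of N buckets, so each reduce task only has to hold roughly 1/N of the data. A minimal sketch (Disco's classic worker accepts a partition function of a similar shape; `zlib.crc32` is used here only to keep the hash deterministic across runs):

```python
import zlib

def partition(key, nr_partitions):
    """Map a key to one of nr_partitions buckets.

    Deterministic by construction: the same key always lands in the
    same bucket, so all values for a key reach the same reduce task.
    """
    return zlib.crc32(str(key).encode("utf-8")) % nr_partitions
```

Raising `nr_partitions` shrinks each reduce task's working set, at the cost of more intermediate files per node (which is exactly the limit Giles hits below).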

Giles Brown

Apr 16, 2016, 10:57:31 PM
to Disco-development
Hi Erik,
It's been a long time since you wrote your response, but I'm going to post an update here on what I did, in case it helps anyone else.

The combiner approach was not a solution because essentially the operation I am performing is a huge partitioning of the data with no combining or reduction.

I was already using partitions.  I use a pipeline worker rather than a classic worker, but I presume that this doesn't make a big difference here.

The problem I ran into with partitions is that each partition requires a new file, and I was quickly hitting the "too many open files" limit.  I could probably have worked around this by raising the file-descriptor limits on each of the nodes.  Instead I decided to simply divide the work into multiple passes over the same dataset.  The jobs are now working just fine.
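The multi-pass workaround can be sketched like this (plain Python, not the Disco pipeline API; `run_in_passes` and the even split of partitions across passes are illustrative assumptions):

```python
import zlib

def run_in_passes(records, total_partitions, passes):
    """Scan the dataset `passes` times, each pass writing only a slice
    of the partitions, so at most total_partitions / passes output
    files are open at once instead of all of them.

    Assumes total_partitions is divisible by passes.
    """
    per_pass = total_partitions // passes
    results = {}
    for p in range(passes):
        lo, hi = p * per_pass, (p + 1) * per_pass
        for key, value in records:
            part = zlib.crc32(str(key).encode("utf-8")) % total_partitions
            if lo <= part < hi:  # only this pass's slice of partitions
                results.setdefault(part, []).append((key, value))
    return results
```

The trade-off is rereading the input once per pass in exchange for a bounded number of simultaneously open partition files.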

Cheers,
Giles