Keys sorted on Reducers


Russell Carden

Jan 14, 2019, 5:49:17 PM
to Scalding Development
I sort by grouping on a key and then calling sortBy on the values, and then I write out the results. Since I have multiple reducers, I get multiple output files. I have observed that in each reducer's output the keys are also sorted. That is, not only are the values within each group sorted, but the groups themselves are sorted within each file. Is this a consequence of how the results for each group are merged within a reducer?

Oscar Boykin

Jan 14, 2019, 6:12:37 PM
to Russell Carden, Scalding Development
This is a consequence of how map-reduce works. Keys are hash partitioned across reducers, but each reducer then sorts the keys it received before it begins reducing. There is no hash partitioning *within* a reducer.
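
To make the two steps concrete, here is a minimal sketch in plain Python. It is a toy simulation, not Scalding or Hadoop code. The names `partition` and `simulate_shuffle` are made up for illustration, and `zlib.crc32` is only a stand-in for whatever hash the real partitioner applies to the key.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    # Assumption: a deterministic hash modulo the reducer count stands in
    # for the real hash partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def simulate_shuffle(records, num_reducers):
    # Map side: route each (key, value) pair to a reducer by hashing the key.
    buckets = defaultdict(list)
    for key, value in records:
        buckets[partition(key, num_reducers)].append((key, value))

    # Reduce side: each reducer groups its own input and visits the groups
    # in sorted key order. There is no further hashing at this stage.
    output = {}
    for reducer in range(num_reducers):
        groups = defaultdict(list)
        for key, value in buckets[reducer]:
            groups[key].append(value)
        # sorted(...) on the values plays the role of the sort on values.
        output[reducer] = [(key, sorted(groups[key])) for key in sorted(groups)]
    return output


records = [
    ("pear", 3), ("apple", 2), ("fig", 9), ("apple", 1),
    ("kiwi", 5), ("pear", 1), ("fig", 4),
]
result = simulate_shuffle(records, num_reducers=2)
for reducer, groups in result.items():
    # One "file" per reducer; keys appear in sorted order within each one.
    print(f"part-{reducer:05d}", groups)
```

Within each simulated output file the keys come out sorted, but there is no ordering across files, because the hash decides which reducer a key lands on.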

Russell Carden

Jan 15, 2019, 9:13:13 AM
to Scalding Development
Thank you. I see it on the Wikipedia page for MapReduce.