My cascading program sends 30 MB files as output

Srinivas Pachari

Aug 11, 2017, 9:58:42 AM
to cascading-user

I have a Cascading job that outputs 30 files of 25 MB each. Is there any way to have it write files of about 128 MB each instead? I tried -Dmapreduce.job.reduces=1, but it does not seem to work. I also set the number of reducers to 1, and that did not work either. Any guidance would be helpful.

Chris K Wensel

Aug 11, 2017, 12:56:25 PM
to cascadi...@googlegroups.com
If there are no reducers in the final MR job, then that property will be ignored.

You may need to introduce a final GroupBy at the end of the pipeline to force a reducer.

ckw
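
A rough sizing sketch for the question above, using only the figures it quotes (30 files of 25 MB, a 128 MB target). It assumes the final step really does have a reduce phase, as the reply describes, and that records spread evenly across reducers; the class and method names are hypothetical and are not part of Cascading or Hadoop.

```java
public class ReducerSizing {

    // Smallest number of reducers (one output file each) such that every
    // file is at most targetMb, assuming output is spread evenly.
    static int reducersFor(long totalMb, long targetMb) {
        if (totalMb <= 0 || targetMb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Ceiling division without floating point.
        return (int) ((totalMb + targetMb - 1) / targetMb);
    }

    public static void main(String[] args) {
        long totalMb = 30L * 25L;   // 30 files x 25 MB = 750 MB, from the question
        long targetMb = 128L;       // desired file size, from the question

        int reducers = reducersFor(totalMb, targetMb);

        // The property name is the one quoted in the thread.
        System.out.println("-Dmapreduce.job.reduces=" + reducers);
    }
}
```

Under these assumptions, a single reducer would produce one file of roughly 750 MB rather than several 128 MB files, so a reducer count derived from the total output size is likely closer to what was asked for. If the grouping key is skewed, the actual file sizes will vary.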

Srinivas Pachari

Aug 24, 2017, 3:03:54 AM
to cascading-user
Hello Chris,

I have tried that and it does provide 1 file at that step.

But when the sink job is executed, each split seems to be written directly to its own file. The sink job also looks like a MapReduce job. Is there a way to configure the number of reducers for the sink job?

Regards,

Srinivas

Chris K Wensel

Aug 24, 2017, 11:26:11 AM
to cascadi...@googlegroups.com
You will need to provide a code sample so that we can understand what you are attempting.

ckw

