Does Scalding/Cascading optimize multiple independent joins on the same dataset?


Nikita

May 19, 2015, 8:54:21 PM
to cascadi...@googlegroups.com
I have three typed pipes: A, B, and C. I'd like to join B with A and C with A. Will Scalding/Cascading automatically optimize B.join(A.group) and C.join(A.group) to avoid grouping A twice?

Thanks,
Nikita

Oscar Boykin

May 19, 2015, 9:04:15 PM
to cascadi...@googlegroups.com
If you do B.join(A) (no need to write A.group) and C.join(A), that will be two independent map-reduce jobs.

A.group does not do anything unless you do something with the result.

Moreover, A.join(B).join(C) is still optimized into just one map-reduce job (as long as you never go back to a TypedPipe and just keep calling methods on the CoGrouped).
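
As a rough sketch of the two shapes described above (not taken from the thread): the pipes are assumed to be keyed by a Long id, and the value types, the object name JoinSketch, and both method names are hypothetical. The sketch writes .group explicitly on every side; per the reply above, that can be left out.

```scala
import com.twitter.scalding.typed.TypedPipe

// Hypothetical example object; names and value types are assumptions.
object JoinSketch {

  // Two separate joins against A. Per the reply above, these are
  // planned as two independent map-reduce jobs; A's grouping is not shared.
  def twoIndependentJoins(
      a: TypedPipe[(Long, String)],
      b: TypedPipe[(Long, Int)],
      c: TypedPipe[(Long, Double)]
  ): (TypedPipe[(Long, (Int, String))], TypedPipe[(Long, (Double, String))]) = {
    val ba = b.group.join(a.group).toTypedPipe // inner join of B with A
    val ca = c.group.join(a.group).toTypedPipe // inner join of C with A
    (ba, ca)
  }

  // One chained join. The chain stays a CoGrouped until the final
  // toTypedPipe, which is the condition the reply gives for getting
  // a single map-reduce job.
  def oneChainedJoin(
      a: TypedPipe[(Long, String)],
      b: TypedPipe[(Long, Int)],
      c: TypedPipe[(Long, Double)]
  ): TypedPipe[(Long, ((String, Int), Double))] =
    a.group.join(b.group).join(c.group).toTypedPipe
}
```

Note that the chained form is an inner join across all three pipes, so it only returns keys present in A, B, and C. It is not a drop-in replacement for the two separate joins if B-with-A and C-with-A results are needed independently.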


--
Oscar Boykin :: @posco :: http://twitter.com/posco

Nikita Lytkin

May 19, 2015, 10:23:35 PM
to cascadi...@googlegroups.com
Thanks, Oscar.