Does toTypedPipe call break some scalding optimization?

Kostya Salomatin

Oct 12, 2016, 3:05:59 PM
to Scalding Development
Hi scalding experts,

I remember that at some point someone told me not to call .toTypedPipe because it can break some optimizations. That was in the Twitter scalding-users group, so I can't look it up now. I think the scenario was the following:

   pipe.group.toTypedPipe.map

My understanding of this issue is very fuzzy. Can someone give more context on the implications of calling .toTypedPipe?

The reason I'm asking: the code above compiles just fine without the .toTypedPipe call, but IntelliJ loses the type information, which makes it harder to code. So I end up adding .toTypedPipe calls and then removing them once the code is complete. Maybe there is some trick to make IntelliJ understand this implicit conversion?

Thanks,
Kostya

Oscar Boykin

Oct 12, 2016, 3:51:13 PM
to Kostya Salomatin, Scalding Development
This is a *may* happen situation. In most cases you are fine.

Here is a case where it can happen:

pipe.group.sum.map { case (k, v) => (k, v*v) }.join(other)

In this case, since .map is not a method on Grouped, an implicit .toTypedPipe is inserted. But you could have written:

pipe.group.sum.mapValues { v => v*v }.join(other)

In the latter case, since scalding can see that the keys did not change, it can do everything in 1 map-reduce job. But in the former case, since it can't look inside functions, it can't see that the .map didn't change the keys.
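
For reference, the two variants above can be sketched side by side as a small Scalding snippet. This is only an illustration under assumptions: the inputs `pipe` and `other` and the object name are made up, `other` is grouped explicitly before the join, and exact types and imports may differ between Scalding versions.

```scala
import com.twitter.scalding.typed.TypedPipe

object MapVsMapValues {
  // Hypothetical inputs, invented for this sketch.
  val pipe: TypedPipe[(String, Long)] =
    TypedPipe.from(List(("a", 1L), ("a", 2L), ("b", 3L)))
  val other: TypedPipe[(String, String)] =
    TypedPipe.from(List(("a", "x"), ("b", "y")))

  // Variant 1: .map is not defined on the grouped result, so the
  // implicit .toTypedPipe conversion kicks in. The function passed to
  // .map is opaque to scalding, so it cannot tell the keys are unchanged
  // and the data has to be grouped again before the join.
  val viaMap =
    pipe.group.sum
      .map { case (k, v) => (k, v * v) }
      .group
      .join(other.group)

  // Variant 2: .mapValues only touches the values, so the result stays
  // keyed and can be joined directly without leaving the grouped form.
  val viaMapValues =
    pipe.group.sum
      .mapValues { v => v * v }
      .join(other.group)
}
```

Both values describe the same logical result (per-key sum, squared, joined with `other`); the difference is only in how much the planner is able to know about the keys.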

So it is always better to use the most constrained function you can (mapValues vs. map, filter vs. flatMap, etc.), and if you want to compose into joins it is generally better to avoid calling .toTypedPipe.

Since .toTypedPipe is implicit, you should never (or very rarely) need to explicitly call it.

--
You received this message because you are subscribed to the Google Groups "Scalding Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scalding-dev...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kostya Salomatin

Oct 12, 2016, 4:42:48 PM
to Oscar Boykin, Scalding Development
Thanks, that makes sense.

--
Konstantin                              mailto:salo...@gmail.com