Serialization of internal job vals

Kostya Salomatin

Sep 22, 2016, 12:04:48 PM
to Scalding Development
Hey folks,

I've got a question about serialization of internal Job vals (not the values passed in the pipe). Consider this simple example:

class MyJob(args: Args) extends Job(args) {
  val myFilter = new MyFilterClass(...)

  val pipe = ... // read statement
    .filter(myFilter.apply)
}

class MyFilterClass(some data required for filtering) { ... }

My understanding is that myFilter is initialized only once, then serialized and passed to all the workers, and that the job will fail if Scalding does not know how to serialize this object. Is that correct? MyFilterClass is a regular Scala class (not a case class), and I did nothing special to make it serializable, yet the job runs fine. Does Scalding know by default how to serialize all Scala objects without any effort on my part, or does it silently call the constructor every time?

Thanks,
Kostya

Oscar Boykin

Sep 22, 2016, 12:29:02 PM
to Kostya Salomatin, Scalding Development
It uses this code:


which uses Kryo and Java Serialization to make the best effort to serialize functions.
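
For illustration only, here is a minimal sketch of why a plain class cannot be shipped with Java serialization alone, which is presumably where a Kryo fallback helps. This is not Scalding's actual code: PlainFilter, SerializableFilter, and RoundTrip.javaRoundTrip are hypothetical names made up for this example, and only the standard java.io classes are used.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.util.Try

// Hypothetical stand-in for MyFilterClass: a regular class, not Serializable.
class PlainFilter(val threshold: Int) {
  def apply(x: Int): Boolean = x > threshold
}

// Same logic, but explicitly marked Serializable.
class SerializableFilter(val threshold: Int) extends Serializable {
  def apply(x: Int): Boolean = x > threshold
}

object RoundTrip {
  // Write the value with plain Java serialization and read it back.
  // Any exception (e.g. NotSerializableException) is captured in the Try.
  def javaRoundTrip[T <: AnyRef](value: T): Try[T] = Try {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

With these definitions, RoundTrip.javaRoundTrip(new SerializableFilter(3)) succeeds and the copy behaves like the original, while RoundTrip.javaRoundTrip(new PlainFilter(3)) yields a Failure wrapping a NotSerializableException. A serializer that does not require the Serializable marker, such as Kryo, can often handle the second case, which would be consistent with the job in the question running fine.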
