setup function


Pavel Hančar

May 14, 2013, 9:09:38 AM
to disc...@googlegroups.com
 Hello,
Hadoop has a setup method that is called once on the slave before the many calls to the map function start. Is there something like it in Disco?
 Thanks,
 Pavel Hančar

Prashanth Mundkur

May 14, 2013, 9:21:56 AM
to disc...@googlegroups.com
On 15:09 Tue 14 May, Pavel Hančar wrote:
> Hello,
> Hadoop has a setup method
> <http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html#setup%28org.apache.hadoop.mapreduce.Mapper.Context%29>
> called once on the slave before many calls of map function start. Is there
> something like it in disco?

Yes, you can use the map_init function:
https://disco.readthedocs.org/en/latest/lib/worker/classic.html#disco.worker.classic.worker.Worker

--prashanth
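
To illustrate the calling order that map_init is meant to provide, here is a minimal stand-alone sketch. It does not import Disco. It assumes the classic worker's convention of map_init(input_iter, params) and map(entry, params); the names word_map_init, word_map, and run_map_task are hypothetical, and run_map_task is only a stand-in for whatever loop the real worker runs.

```python
# Stand-alone sketch of "init once, then map many times".
# Nothing here imports Disco; run_map_task is a hypothetical
# stand-in for the worker's per-task loop.

calls = []  # records the order in which the functions are invoked


def word_map_init(input_iter, params):
    # Assumed signature: called once per map task, before any map call.
    calls.append("init")


def word_map(entry, params):
    # Assumed signature: called once per input entry, yields (key, value).
    calls.append("map")
    yield entry, 1


def run_map_task(entries, params=None):
    # Hypothetical driver: one init call, then one map call per entry.
    input_iter = iter(entries)
    word_map_init(input_iter, params)
    results = []
    for entry in input_iter:
        results.extend(word_map(entry, params))
    return results
```

With real Disco, the two functions would presumably be handed to the job as the map and map_init arguments instead of being driven by hand as above.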

Pavel Hančar

May 14, 2013, 10:40:11 AM
to disc...@googlegroups.com
Thanks,
but can I pass a variable from map_init to map? I tried to modify the params argument, but it doesn't work.
Pavel





Prashanth Mundkur

May 14, 2013, 11:18:46 AM
to disc...@googlegroups.com
On 16:40 Tue 14 May, Pavel Hančar wrote:
> Thanks,
> but can I pass a variable from map_init to map? I tried to modify params
> argument, but it doesn't work.

Right, you might need to use a global variable :P

The next version of Disco has a better API to support precisely this.
If you want to try a beta version of it, it's in the branch
'devel/scheduler-lib' at github.com/pmundkur/disco. Look at the
'init' function in
https://github.com/pmundkur/disco/blob/devel/scheduler-lib/lib/disco/worker/pipeline/worker.py

Feedback on this API would be much appreciated, especially before it
gets fixed into a formal Disco release.

--prashanth