You should only ever have one dumbo.run() call, and it's normal for
dumbo scripts to get executed twice (there are reasons for this, but
you shouldn't have to worry about those as a dumbo user).
Instead, you should use a runner if you want to have multiple
job.additer(TestMapper(1), opts = [("output", "test1")])
job.additer(TestMapper(2), opts = [("output", "test2")])
class TestMapper:
    def __init__(self, _val):
        sys.stderr.write("Executing __init__" + str(_val) + "\n")
        self.val = _val

    def __call__(self, key, val):
        yield key, self.val  # using init val, not MR input val
if __name__ == "__main__":
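To check what a mapper like this emits without involving Hadoop at all, a minimal local sketch may help. It re-declares TestMapper as in the snippet above and feeds it a couple of made-up records by hand. The run_locally helper and the sample records are purely illustrative assumptions and are not part of dumbo.

```python
import sys


class TestMapper:
    """Mapper that ignores the input value and emits the value given to __init__."""

    def __init__(self, _val):
        sys.stderr.write("Executing __init__" + str(_val) + "\n")
        self.val = _val

    def __call__(self, key, val):
        yield key, self.val  # using init val, not MR input val


def run_locally(mapper, records):
    """Hypothetical helper (not a dumbo API): apply a mapper to (key, value) pairs in-process."""
    out = []
    for key, val in records:
        out.extend(mapper(key, val))
    return out


records = [("a", 10), ("b", 20)]  # made-up input records
first = run_locally(TestMapper(1), records)
second = run_locally(TestMapper(2), records)
print(first)   # [('a', 1), ('b', 1)]
print(second)  # [('a', 2), ('b', 2)]
```

Each instance carries its own constructor argument, which is why two additer calls with TestMapper(1) and TestMapper(2) can share one class and still produce different output.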
Hope this helps,
i = 0
clone = prog.clone()
clone.addopt("param", "iteration=" + str(i))
pass # or set general opts or so
Please blog about it if it works, since this is completely
undocumented functionality as far as I know :)
dumbo.main(runner, starter, variator)
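Since this feature is undocumented, here is a rough sketch of how the variator pieces above might fit together, runnable without dumbo installed. StubProgram is a hypothetical stand-in for whatever program object dumbo hands to the starter and variator; only the clone() and addopt() calls are taken from the fragments above, and their behaviour here is an assumption. The fixed iteration count is also just for illustration.

```python
import copy


class StubProgram:
    """Hypothetical stand-in for dumbo's program object (assumed clone/addopt behaviour)."""

    def __init__(self, name):
        self.name = name
        self.opts = []

    def clone(self):
        # Assumption: a clone is an independent copy of the program and its options.
        return copy.deepcopy(self)

    def addopt(self, key, value):
        self.opts.append((key, value))


def starter(prog):
    pass  # or set general opts or so


def variator(prog, iterations=3):
    """Yield one clone of prog per iteration, each tagged with its iteration index.

    The iterations parameter is an illustrative addition, not something dumbo passes.
    """
    for i in range(iterations):
        clone = prog.clone()
        clone.addopt("param", "iteration=" + str(i))
        yield clone


base = StubProgram("test.py")  # made-up script name
starter(base)
variants = list(variator(base))
for v in variants:
    print(v.name, v.opts)
```

With real dumbo, the stub and the driver lines at the bottom would go away and the functions would be handed over via dumbo.main(runner, starter, variator) as shown above.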