How to attach a data folder when running dumbo


Chao Peng

Apr 3, 2013, 5:38:57 AM
to dumbo...@googlegroups.com
Hi,

I'm trying to run a job that needs a folder containing some jars and some data.
What should I do?
Specifically, I'm using JPype to access the Stanford NLP package.
I have to ship the Stanford folder with the job and don't want to change its directory structure.

Thanks

Chao Peng

 

Klaas Bosteels

Apr 4, 2013, 1:12:38 PM
to dumbo...@googlegroups.com
Hey Chao,

I think you can't really avoid having a -libjar or -file option for each file, but you could put some code in your starter that goes through the directory and adds the options automatically. Alternatively, you could zip the directory and unzip it in your mapper's or reducer's constructor, I guess...

-K
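
A minimal sketch of the first suggestion (generating the options from the directory instead of typing them by hand). The helper name `collect_opts` and the folder name `stanford` are hypothetical, and the sketch assumes the object Dumbo passes to a starter function exposes `addopt(key, value)`; check this against your Dumbo version.

```python
import os


def collect_opts(root):
    """Return (option, path) pairs for every file under root.

    Files ending in .jar are tagged "libjar", everything else "file".
    """
    opts = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            key = "libjar" if name.endswith(".jar") else "file"
            opts.append((key, path))
    return opts


def starter(program):
    # Assumption: `program` is the object Dumbo hands to a starter and it
    # provides addopt(key, value). "stanford" is a hypothetical local folder.
    for key, path in collect_opts("stanford"):
        program.addopt(key, path)
```

Note that files shipped one by one this way usually end up side by side in the task's working directory, so the original directory structure is probably not preserved. If the layout matters, the zip approach is likely the better fit.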





Chao Peng

Apr 5, 2013, 8:39:58 AM
to dumbo...@googlegroups.com
Thanks Klaas,

I'm not trying to avoid -libjar or -file; the problem is that neither of them works for me. I cannot find the jar files from within the MapReduce task. Also, I'm fairly new to dumbo: how can I unzip in the mapper's or reducer's constructor?
I tried printing os.walk('../../') in the mapper, but it returned nothing. How could that happen?

I tried putting the jar files under ~/jars on every datanode, but the task cannot find them.
I also tried putting the jar files in HDFS, but they cannot be found there either.

yield '', os.path.exists("/user/es/data/log/search.log")
yield '', os.path.exists("/home/es/search.log")

Both give "false" as the output.
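
One way to narrow this down is to have the task report what it actually sees. `os.path.exists` only checks the local filesystem of the node running the task, so a path that lives in HDFS would be reported as missing. The sketch below is only a diagnostic aid; the helper name `describe_task_dir` is hypothetical, and the `mapper(key, value)` generator form is the usual Dumbo mapper shape.

```python
import os


def describe_task_dir(path="."):
    """Return the absolute path and the sorted entries of a local directory."""
    abspath = os.path.abspath(path)
    return abspath, sorted(os.listdir(abspath))


def mapper(key, value):
    # Debug only: emit the task's working directory and its contents
    # once per input record, so shipped files can be located.
    cwd, entries = describe_task_dir()
    yield cwd, entries
```

Files attached to the job would be expected to show up in this listing rather than under the path they had on the submitting machine.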

I guess I cannot attach the jar files and then find their path.
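
For the "unzip in the constructor" idea, a rough sketch could look like the following. It assumes the mapper is written as a class (constructor plus `__call__`), that a hypothetical archive `stanford.zip` was shipped with the job and sits in the task's working directory, and that paths inside the archive are relative to the folder root. None of these names come from Dumbo itself.

```python
import os
import zipfile


class Mapper(object):
    def __init__(self, archive="stanford.zip", target="stanford"):
        # Unpack once per task; the archive keeps the directory layout intact.
        if not os.path.isdir(target):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(target)
        self.root = os.path.abspath(target)
        # Collect the jar paths, e.g. to build a classpath for JPype later.
        self.jars = sorted(
            os.path.join(dirpath, name)
            for dirpath, _, names in os.walk(self.root)
            for name in names
            if name.endswith(".jar")
        )

    def __call__(self, key, value):
        # Placeholder logic: report how many jars were unpacked.
        yield value, len(self.jars)
```

Because the paths are resolved after extraction, the code no longer depends on where the folder lived on the submitting machine.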