Greetings to all,
I have been trying to use Hadoop Streaming both with and without
Wukong. In both cases I have not found an easy way to make Ruby gems
available so that a mapper script can require them.
One option is to package the lib directories from the source of each
gem as zip archives and put them in HDFS. But that is error-prone and
breaks down with complex dependencies.
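For reference, the mapper side of that approach can be sketched roughly
as follows. This assumes the archives have already been unpacked into a
local directory next to the script (for example through Hadoop's
distributed cache), with one subdirectory per gem. The directory name
"gems" and both helper names are made up for illustration; they are not
part of Wukong or Hadoop.

```ruby
# Hypothetical helper: find every "<gem>/lib" directory under base_dir.
def gem_lib_dirs(base_dir)
  Dir.glob(File.join(base_dir, '*', 'lib'))
     .select { |dir| File.directory?(dir) }
     .sort
end

# Hypothetical helper: put those directories at the front of the load
# path so that a later `require` can find the gems' files.
# Returns the list of lib directories that were found.
def add_gem_libs_to_load_path(base_dir)
  dirs = gem_lib_dirs(base_dir)
  dirs.each { |dir| $LOAD_PATH.unshift(dir) unless $LOAD_PATH.include?(dir) }
  dirs
end

# At the top of a mapper script (assumed layout: ./gems/<gem name>/lib):
#   add_gem_libs_to_load_path(File.join(Dir.pwd, 'gems'))
#   require 'json'
```

This only handles pure-Ruby gems with a conventional lib directory.
Gems with native extensions or unusual require paths are exactly where
it gets error-prone.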
Any ideas on how to solve this easily? It would be nice if Wukong
handled this gracefully, perhaps by reading a Gemfile from Bundler and
uploading the proper files to HDFS. What do you think? It may be
that I am just not using Hadoop properly and could avoid these
problems altogether; please tell me if that is the case :-)
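To make the Gemfile idea a bit more concrete, here is a rough sketch of
how the directories to upload might be collected. It is not existing
Wukong code. The helper name is hypothetical, and taking the specs from
something like Bundler.load.specs is an assumption that is untested in
this setup. The helper itself relies only on the full_gem_path and
require_paths methods of Gem::Specification.

```ruby
# Hypothetical helper: given a list of gem specifications, return the
# absolute directories that would have to be shipped to the cluster.
# Each spec is expected to respond to #full_gem_path and #require_paths,
# as Gem::Specification does.
def require_dirs_for(specs)
  specs.flat_map do |spec|
    spec.require_paths.map { |path| File.join(spec.full_gem_path, path) }
  end.uniq
end

# Assumed usage, run from the directory that contains the Gemfile:
#   require 'bundler'
#   dirs = require_dirs_for(Bundler.load.specs)
#   # ... then archive `dirs` and upload the archive to HDFS
```

The upload step is left out on purpose, since that is the part where
Wukong would have to decide how to package and distribute the files.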
Apart from that, it would be nice to fix the path of the Hadoop
Streaming jar file for Hadoop version 0.21. See this pull request
for details and minor testing:
https://github.com/mrflip/wukong/pull/4
Thanks a lot :-)
Álvaro Martín Fraguas