How to configure hadoop.home when using different Hadoop distros?

Joe

Jan 20, 2016, 4:10:29 PM
to genie
Hey there,

This may be a stupid question.  I apologize in advance if so.

I am configuring genie.properties and trying to figure out what to set for properties like com.netflix.genie.server.hadoop.home. I am using Genie with Hortonworks (for now), whose packages and their placement differ slightly from Cloudera, EMR, and the rest. Hopefully some or all of this will get streamlined as BigTop gains momentum.

In any case, it's not clear from the documentation what constitutes a "hadoop home" for you guys. Is it where the configs live? Where the binaries live? What exactly should com.netflix.genie.server.hadoop.home point to?

Here is what I mean. When I install the Hortonworks binaries, Hadoop unpacks into the following directories:

[root@localhost hadoop]# find / -name hadoop
/etc/hadoop
/etc/default/hadoop
/etc/bash_completion.d/hadoop
/tmp/kitchen/cookbooks/hadoop
/tmp/kitchen/cache/cookbooks/hadoop
/usr/share/doc/pig-0.12.1.2.1.15.0-946/api/org/apache/pig/backend/hadoop
/usr/share/doc/pig-0.12.1.2.1.15.0-946/api/org/apache/hadoop
/usr/bin/hadoop
/usr/lib/hadoop-yarn/etc/hadoop
/usr/lib/hadoop
/usr/lib/hadoop/bin/hadoop
/usr/lib/hadoop/etc/hadoop


There seems to be some confusion about this across the net anyway. The answer at http://stackoverflow.com/questions/28310727/how-to-find-hadoop-home-path-on-linux points to the directory with all the config files (core-site.xml, etc.).

I'm guessing we want /usr/lib/hadoop for Genie when using the HDP binaries? Its contents:

[root@localhost hadoop]# ls /usr/lib/hadoop
bin                                        hadoop-common-2.4.0.2.1.15.0-946-tests.jar
client                                     hadoop-common.jar
etc                                        hadoop-nfs-2.4.0.2.1.15.0-946.jar
hadoop-annotations-2.4.0.2.1.15.0-946.jar  hadoop-nfs.jar
hadoop-annotations.jar                     lib
hadoop-auth-2.4.0.2.1.15.0-946.jar         libexec
hadoop-auth.jar                            sbin
hadoop-common-2.4.0.2.1.15.0-946.jar


Regards,
Joe Reid

Tom Gianos

Jan 20, 2016, 5:05:41 PM
to genie
Hi Joe,

For us, it is the root of our Hadoop client directory, which contains the following:

```
drwxr-xr-x 8 {user} {group} 4096 Oct 27 19:44 ./
drwxr-xr-x 3 {user} {group} 4096 Oct 27 23:04 ../
drwxr-xr-x 2 {user} {group} 4096 Oct 27 19:44 bin/
drwxr-xr-x 2 {user} {group} 4096 Oct 27 19:44 conf/
drwxr-xr-x 3 {user} {group} 4096 Oct 27 19:44 lib/
drwxr-xr-x 2 {user} {group} 4096 Oct 27 19:44 libexec/
drwxr-xr-x 4 {user} {group} 4096 Oct 27 19:43 share/
drwxr-xr-x 3 {user} {group} 4096 Oct 27 19:44 usr/
```

It is the same directory you would point $HADOOP_HOME at, adding $HADOOP_HOME/bin to your PATH so that you could just say "hadoop ..."
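
As a rough sketch of that idea (not something taken from our setup): the snippet below checks that a candidate directory has an executable bin/hadoop, prints the matching genie.properties line, and exports $HADOOP_HOME and PATH. The helper names looks_like_hadoop_home and genie_hadoop_home_line are made up for illustration, and /usr/lib/hadoop is only the HDP location guessed at in the question, so treat it as an assumption for your install.

```shell
#!/bin/sh

# Hypothetical helper: succeeds if DIR contains an executable bin/hadoop,
# i.e. it looks like the root that $HADOOP_HOME would point at.
looks_like_hadoop_home() {
    [ -n "$1" ] && [ -x "$1/bin/hadoop" ]
}

# Hypothetical helper: prints the genie.properties line for DIR,
# or complains on stderr and returns 1 if DIR does not look right.
genie_hadoop_home_line() {
    if looks_like_hadoop_home "$1"; then
        printf 'com.netflix.genie.server.hadoop.home=%s\n' "$1"
    else
        printf '%s has no executable bin/hadoop\n' "$1" >&2
        return 1
    fi
}

# Assumption: /usr/lib/hadoop is the HDP client root from the question.
candidate=/usr/lib/hadoop

if genie_hadoop_home_line "$candidate"; then
    # Same directory serves as $HADOOP_HOME; bin/ goes on the PATH.
    export HADOOP_HOME="$candidate"
    export PATH="$HADOOP_HOME/bin:$PATH"
fi
```

If the check passes, the printed line can be pasted into genie.properties. The check only looks for bin/hadoop; it does not verify the other directories (lib/, libexec/, and so on) shown in the listing above.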

If you need concrete examples, you can look at the Docker image construction or the final result:
Dockerfile with pertinent line highlighted (you can also see what is unzipped to that location later in the file)

Tom