Running Jaql queries against a remote Hadoop cluster


Beaker

Sep 27, 2010, 9:01:46 AM
to Jaql Users
Is it possible to override options such as fs.default.name? Basically,
I have a pseudo-distributed cluster running on my machine and am able
to run Jaql queries against it. However, I want to run Jaql queries
against a remote cluster instead. How can I do this?

vuk.ercegovac

Sep 27, 2010, 1:36:44 PM
to Jaql Users

Set HADOOP_HOME and HADOOP_CONF_DIR to point to the right place for
your cluster.
The files in HADOOP_CONF_DIR are used to override the parameters needed
to connect to a remote cluster.
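For illustration, the override files in HADOOP_CONF_DIR are standard Hadoop XML config; a minimal sketch might look like the following (the hostname and ports are placeholders, not from this thread -- use your cluster's actual NameNode and JobTracker addresses):

```
<!-- conf/core-site.xml: point HDFS clients at the remote NameNode -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml: point MapReduce clients at the remote JobTracker -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode.example.com:9001</value>
  </property>
</configuration>
```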
Use the --cluster flag for Jaql's shell.
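Putting the steps together, a minimal sketch (the install paths and the shell script location are assumptions -- adjust for your own installation):

```shell
# Assumed install locations; substitute your own.
export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/conf

# Then start the Jaql shell in cluster mode, e.g.:
#   $JAQL_HOME/bin/jaqlshell --cluster
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```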

I always do some sanity tests to make sure my current jaql client
session can see the cluster.
For hdfs:

jaql> hdfsShell("-ls ."); // do you see the files that you expect on the cluster?

For mapreduce:

jaql> [1,2,3] -> write(hdfs("foo"));
jaql> read(hdfs("foo")) -> transform $ + 1; // do you see this job on the map-reduce admin console?

Also, it's always useful to try some bin/hadoop commands to make sure
that the Hadoop cluster is visible from your environment.
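For example (these are standard Hadoop shell commands of that era; their output depends on your cluster, and the hostname below is a placeholder):

```shell
# List the HDFS root via whatever fs.default.name your conf dir points at.
bin/hadoop fs -ls /

# Or address the remote NameNode explicitly to rule out config pickup issues.
bin/hadoop fs -ls hdfs://namenode.example.com:9000/

# Confirm the JobTracker is reachable by listing running jobs.
bin/hadoop job -list
```

If these fail or show your local pseudo-distributed filesystem instead of the remote one, the client is not picking up the intended HADOOP_CONF_DIR.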