[virgil] r152 committed - Created wiki page through web user interface.

Revision: 152
Author: bone...@gmail.com
Date: Tue Jan 24 04:46:36 2012
Log: Created wiki page through web user interface.
http://code.google.com/a/apache-extras.org/p/virgil/source/detail?r=152

Added:
/wiki/hadoop.wiki

=======================================
--- /dev/null
+++ /wiki/hadoop.wiki Tue Jan 24 04:46:36 2012
@@ -0,0 +1,28 @@
+#summary Remote hadoop job deployment
+
+= Introduction =
+
+Virgil allows you to deploy Hadoop jobs to a remote cluster. This page describes how.
+
+= Details =
+
+Similar to the different run modes for Cassandra, Virgil supports two different modes for Hadoop. When started with "bin/virgil", Hadoop jobs run locally within the Virgil JVM. This is convenient for testing, but when executing against a large dataset, jobs should be deployed to a remote cluster.
+
+To deploy jobs to a remote cluster, edit the configuration files in $VIRGIL_HOME/mapreduce/conf. These are the same three files found in $HADOOP_HOME/conf. If Virgil and Hadoop are running on the same machine, you can simply symlink to Hadoop's copies.
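+
+For example, assuming the standard Hadoop configuration file names (core-site.xml, hdfs-site.xml, and mapred-site.xml; check your $HADOOP_HOME/conf for the actual names), the symlinks might look like:
+
+{{{
+# Link Hadoop's configuration into Virgil's mapreduce conf directory
+ln -s $HADOOP_HOME/conf/core-site.xml   $VIRGIL_HOME/mapreduce/conf/core-site.xml
+ln -s $HADOOP_HOME/conf/hdfs-site.xml   $VIRGIL_HOME/mapreduce/conf/hdfs-site.xml
+ln -s $HADOOP_HOME/conf/mapred-site.xml $VIRGIL_HOME/mapreduce/conf/mapred-site.xml
+}}}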
+
+After changing the configuration, start Virgil with:
+
+{{{
+bin/virgil-hadoop -host $CASSANDRA_HOST
+}}}
+
+This version of the shell script puts the Hadoop configuration on Virgil's classpath. Hadoop reads its configuration from the classpath and will deploy the job to the remote cluster described by those configuration files.
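+
+As a rough illustration of the mechanism (a hypothetical sketch, not the actual contents of bin/virgil-hadoop), putting a configuration directory on the classpath amounts to:
+
+{{{
+# Hypothetical sketch: prepend the Hadoop conf directory to the JVM classpath
+CLASSPATH="$VIRGIL_HOME/mapreduce/conf:$CLASSPATH"
+export CLASSPATH
+}}}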
+
+To test your setup, the Virgil release now includes an example in $VIRGIL_HOME/example. Within that directory, you should be able to run the following:
+
+{{{
+./insert-data.sh
+./run-mapreduce.sh
+}}}
+
+That runs the example described on the
