Unable to get scoobi Configuration Arguments to work (New User)

Deepak Jain

Sep 29, 2014, 1:04:06 AM
to scoob...@googlegroups.com
I am starting to learn more about Scoobi, and I have run into a couple of issues.
1. Unable to run the Scoobi my-app example on an actual cluster.

I have a 2-node Hadoop cluster set up. From the Hadoop CLI machine, when I run hadoop fs -ls /, I see the list of directories in HDFS. I have HADOOP_CONF pointing to the Hadoop conf directory. The job runs locally and never on the cluster.

Logs
====


$ export HADOOP_CONF=/etc/hadoop/conf
[dvasthimal@invisio-365818 my-app]$ ls -l /etc/hadoop/conf
total 152
-rw-r--r-- 1 hdfs   hadoop 1744 Sep 26 02:55 capacity-scheduler.xml
-rw-r--r-- 1 hdfs   root   1021 Sep 26 02:58 commons-logging.properties
-rw-r--r-- 1 hdfs   hadoop 1335 Aug 27 20:14 configuration.xsl
-rw-r--r-- 1 root   root    318 Aug 27 20:14 container-executor.cfg
-rw-r--r-- 1 hdfs   hadoop 1989 Sep 26 02:58 core-site.xml
-rw-r--r-- 1 root   root    774 Aug 27 20:14 core-site.xml.rpmnew
..
..
..
..
[dvasthimal@invisio-365818 my-app]$ rm -rf output.scoobie*; sbt "run-main WordCount -Dmapred.max.map.failures.percent=20 -Dmapred.max.reduce.failures.percent=20 input.txt output.scoobie.3 -- scoobi verbose"
[info] Set current project to MyApplication (in build file:/home/dvasthimal/my-app/)
[info] Running WordCount -Dmapred.max.map.failures.percent=20 -Dmapred.max.reduce.failures.percent=20 input.txt output.scoobie.3 -- scoobi verbose
[WARN] NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[INFO] deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
[INFO] deprecation - mapred.map.child.log.level is deprecated. Instead, use mapreduce.map.log.level
[INFO] deprecation - mapred.reduce.child.log.level is deprecated. Instead, use mapreduce.reduce.log.level
[INFO] deprecation - mapred.max.map.failures.percent is deprecated. Instead, use mapreduce.map.failures.maxpercent
[INFO] deprecation - mapred.max.reduce.failures.percent is deprecated. Instead, use mapreduce.reduce.failures.maxpercent
[INFO] deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
[INFO] HadoopMode - SCOOBI version: 0.8.5, commit: 7d0e802, timestamp: 08-07-2014 09:32:43 +1000
[INFO] Sink - Output path: output.scoobie.3
[INFO] Source - Input path: input.txt (1.63 KiB)
[INFO] HadoopMode - ======================================================================
[INFO] HadoopMode - ===== START OF SCOOBI JOB 'WordCount$-0928-215852-1929107525' ========
[INFO] HadoopMode - ======================================================================

[INFO] HadoopMode - Executing map reduce jobs
Mscr(1

  inputs: + GbkInputChannel(Load (1)[String] (source 1) )

          mappers 
          ParallelDo (17)[String,(String,Int),((Unit,Unit),Unit)] (bridge 7fca0) 

          last mappers 
          ParallelDo (17)[String,(String,Int),((Unit,Unit),Unit)] (bridge 7fca0) 

  outputs: + GbkOutputChannel(GroupByKey (18)[String,Int] (bridge 52d0e) , combiner = Combine (19)[String,Int] (bridge 93d66) [sinks: Some(output.scoobie.3)]))

[INFO] HadoopMode - ===== START OF MAP REDUCE JOB 1 of 1 (mscr id = 1) ======

[INFO] deprecation - mapred.jar is deprecated. Instead, use mapreduce.job.jar
[INFO] deprecation - mapred.output.value.groupfn.class is deprecated. Instead, use mapreduce.job.output.group.comparator.class
[INFO] deprecation - mapred.output.key.comparator.class is deprecated. Instead, use mapreduce.job.output.key.comparator.class
[INFO] deprecation - mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
[INFO] deprecation - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
[INFO] deprecation - mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
[INFO] deprecation - mapreduce.partitioner.class is deprecated. Instead, use mapreduce.job.partitioner.class
[INFO] deprecation - mapred.job.name is deprecated. Instead, use mapreduce.job.name
[INFO] deprecation - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
[INFO] deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
[INFO] deprecation - mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
[INFO] deprecation - fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir
[INFO] deprecation - dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir
[INFO] deprecation - fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir
[INFO] deprecation - mapred.temp.dir is deprecated. Instead, use mapreduce.cluster.temp.dir
[INFO] deprecation - dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir
[INFO] deprecation - mapred.system.dir is deprecated. Instead, use mapreduce.jobtracker.system.dir
[INFO] deprecation - mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
[INFO] deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
[INFO] deprecation - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
[INFO] deprecation - dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir
[INFO] MapReduceJob - Total input size: 1.63 KiB
[INFO] MapReduceJob - Number of reducers: 1
[INFO] deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
[INFO] JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
[INFO] deprecation - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
[INFO] FileInputFormat - Total input paths to process : 1
[INFO] JobSubmitter - number of splits:1
[INFO] deprecation - user.name is deprecated. Instead, use mapreduce.job.user.name
[INFO] deprecation - mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
[INFO] deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
[INFO] deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
[INFO] deprecation - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
[INFO] deprecation - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
[INFO] deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
[INFO] deprecation - mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
[INFO] deprecation - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
[INFO] JobSubmitter - Submitting tokens for job: job_local842016996_0001
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737052/7b100dac-bcb3-4c06-a4c8-020b2093f246 <- /home/dvasthimal/my-app/7b100dac-bcb3-4c06-a4c8-020b2093f246
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/env/7b100dac-bcb3-4c06-a4c8-020b2093f246 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737052/7b100dac-bcb3-4c06-a4c8-020b2093f246
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737053/scoobi.metadata.TK20 <- /home/dvasthimal/my-app/scoobi.metadata.TK20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TK20 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737053/scoobi.metadata.TK20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737054/scoobi.metadata.TV20 <- /home/dvasthimal/my-app/scoobi.metadata.TV20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TV20 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737054/scoobi.metadata.TV20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737055/scoobi.metadata.TP20 <- /home/dvasthimal/my-app/scoobi.metadata.TP20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TP20 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737055/scoobi.metadata.TP20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737056/scoobi.metadata.TG20 <- /home/dvasthimal/my-app/scoobi.metadata.TG20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TG20 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737056/scoobi.metadata.TG20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737057/scoobi.mappers-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.mappers-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.mappers-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737057/scoobi.mappers-step_1_of_1
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737058/scoobi.combiners-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.combiners-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.combiners-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737058/scoobi.combiners-step_1_of_1
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737059/scoobi.reducers-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.reducers-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.reducers-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737059/scoobi.reducers-step_1_of_1
[INFO] deprecation - mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
[INFO] Job - The url to track the job: http://localhost:8080/
[INFO] LocalJobRunner - OutputCommitter set in config null
[INFO] MapReduceJob - MapReduce job 'job_local842016996_0001' submitted. Please see http://localhost:8080/ for more info.
[INFO] LocalJobRunner - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
[INFO] LocalJobRunner - Waiting for map tasks
[INFO] LocalJobRunner - Starting task: attempt_local842016996_0001_m_000000_0
[INFO] Task -  Using ResourceCalculatorProcessTree : [ ]
[INFO] MapTask - Processing split: file:/home/dvasthimal/my-app/input.txt:0+1664 (on channel:1)
[INFO] MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
[INFO] MapTask - (EQUATOR) 0 kvi 26214396(104857584)
[INFO] MapTask - mapreduce.task.io.sort.mb: 100
[INFO] MapTask - soft limit at 83886080
[INFO] MapTask - bufstart = 0; bufvoid = 104857600
[INFO] MapTask - kvstart = 26214396; length = 6553600
[INFO] MapTask - the URL of Java (evidenced with the java.lang.String class) is jar:file:/home/dvasthimal/invisio/jdk1.7.0_67/jre/lib/rt.jar!/java/lang/String.class
[INFO] MapTask - the URL of Scala (evidenced with the scala.collection.immutable.Range class) is jar:file:/home/dvasthimal/.sbt/boot/scala-2.10.3/lib/scala-library.jar!/scala/collection/immutable/Range.class
[INFO] MapTask - the URL of Hadoop (evidenced with the org.apache.hadoop.io.Writable class) is jar:file:/home/dvasthimal/.ivy2/cache/org.apache.hadoop/hadoop-common/jars/hadoop-common-2.2.0.jar!/org/apache/hadoop/io/Writable.class
[INFO] MapTask - the URL of Avro (evidenced with the org.apache.avro.Schema class) is jar:file:/home/dvasthimal/.ivy2/cache/org.apache.avro/avro/jars/avro-1.7.4.jar!/org/apache/avro/Schema.class
[INFO] MapTask - the URL of Kiama (evidenced with the org.kiama.rewriting.Rewriter class) is jar:file:/home/dvasthimal/.ivy2/cache/com.googlecode.kiama/kiama_2.10/jars/kiama_2.10-1.6.0.jar!/org/kiama/rewriting/Rewriter.class
[INFO] MapTask - the URL of Scoobi (evidenced with the com.nicta.scoobi.core.ScoobiConfiguration class) is jar:file:/home/dvasthimal/.ivy2/cache/com.nicta/scoobi_2.10/jars/scoobi_2.10-0.8.5.jar!/com/nicta/scoobi/core/ScoobiConfiguration.class
[INFO] deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.mappers-step_1_of_1 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737057/scoobi.mappers-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.mappers-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.mappers-step_1_of_1
[INFO] MapTask - Starting on invisio-365818
[INFO] MapTask - Input is file:/home/dvasthimal/my-app/input.txt:0+1664 (on channel:1)
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/env/7b100dac-bcb3-4c06-a4c8-020b2093f246 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737052/7b100dac-bcb3-4c06-a4c8-020b2093f246
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/env/7b100dac-bcb3-4c06-a4c8-020b2093f246
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/env/7b100dac-bcb3-4c06-a4c8-020b2093f246
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TK20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737053/scoobi.metadata.TK20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TK20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TK20
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TV20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737054/scoobi.metadata.TV20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TV20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TV20
[INFO] LocalJobRunner - 
[INFO] MapTask - Starting flush of map output
[INFO] MapTask - Spilling map output
[INFO] MapTask - bufstart = 0; bufend = 3183; bufvoid = 104857600
[INFO] MapTask - kvstart = 26214396(104857584); kvend = 26213524(104854096); length = 873/6553600
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.combiners-step_1_of_1 (memoise=true)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737058/scoobi.combiners-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.combiners-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.combiners-step_1_of_1
[INFO] MapTask - Finished spill 0
[INFO] Task - Task:attempt_local842016996_0001_m_000000_0 is done. And is in the process of committing
[INFO] LocalJobRunner - map
[INFO] Task - Task 'attempt_local842016996_0001_m_000000_0' done.
[INFO] LocalJobRunner - Finishing task: attempt_local842016996_0001_m_000000_0
[INFO] LocalJobRunner - Map task executor complete.
[INFO] MapReduceJob - Map 100%    Reduce   0%
[INFO] Task -  Using ResourceCalculatorProcessTree : [ ]
[INFO] Merger - Merging 1 sorted segments
[INFO] Merger - Down to the last merge-pass, with 1 segments left of total size: 2186 bytes
[INFO] LocalJobRunner - 
[INFO] deprecation - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.reducers-step_1_of_1 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737059/scoobi.reducers-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.reducers-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.reducers-step_1_of_1
[INFO] ReduceTask - Starting on invisio-365818
[INFO] OutputChannel - Outputs are Some(output.scoobie.3)
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TG20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/step_1_of_1/1411966737056/scoobi.metadata.TG20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TG20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/dist-objs/scoobi.metadata.TG20
[INFO] deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
[INFO] Task - Task:attempt_local842016996_0001_r_000000_0 is done. And is in the process of committing
[INFO] LocalJobRunner - 
[INFO] Task - Task attempt_local842016996_0001_r_000000_0 is allowed to commit now
[INFO] FileOutputCommitter - Saved output of task 'attempt_local842016996_0001_r_000000_0' to file:/tmp/scoobi-dvasthimal/WordCount$-0928-215852-1929107525/tmp-out-step_1_of_1/_temporary/0/task_local842016996_0001_r_000000
[INFO] LocalJobRunner - reduce > reduce
[INFO] Task - Task 'attempt_local842016996_0001_r_000000_0' done.
[INFO] MapReduceJob - Map 100%    Reduce 100%
[INFO] HadoopMode - Map reduce job sinks:  Vector(TextFileSink: output.scoobie.3) 
[INFO] HadoopMode - ===== END OF MAP REDUCE JOB 1 of 1 (mscr id = 1, Scoobi job = WordCount$-0928-215852-1929107525) ======

[INFO] HadoopMode - ======================================================================
[INFO] HadoopMode - ===== END OF SCOOBI JOB 'WordCount$-0928-215852-1929107525'   ========
[INFO] HadoopMode - ======================================================================

Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
[success] Total time: 8 s, completed Sep 28, 2014 9:58:58 PM
[dvasthimal@invisio-365818 my-app]$ 
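
The log above can be checked mechanically for local execution. Job ids of the form job_local<number>_<sequence> and lines from the LocalJobRunner logger come from Hadoop's in-process runner, whereas a job submitted to a YARN cluster gets an id of the form job_<timestamp>_<sequence>. The following is a minimal sketch; the function name ran_in_local_mode and the file name run.log are hypothetical, and the check assumes the sbt output was saved to a file first.

```shell
# Hypothetical helper: succeeds (exit status 0) if a saved job log
# contains markers of Hadoop's local runner.
ran_in_local_mode() {
  # $1 is the path to a saved log file.
  grep -Eq 'job_local[0-9]+_[0-9]+|LocalJobRunner' "$1"
}

# Usage (assumes the sbt output was captured with: sbt "run-main ..." 2>&1 | tee run.log):
#   ran_in_local_mode run.log && echo "job ran with the local runner"
```

If the function succeeds, the job never reached the cluster, whatever the configuration directory was meant to select.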


2. Unable to use the Scoobi configuration arguments.
[-- scoobi VALUE1.VALUE2.VALUE3]
What does this syntax mean? Generally, configs are of the form key=value.
One example is inmemory. If I want to use it, do I specify it as inmemory=true, inmemory.true, or true.inmemory? None of the three worked, and the job always ran in local Hadoop mode instead of on Scala collections.



Regards,
Deepak

Deepak Jain

Sep 29, 2014, 1:04:46 AM
to scoob...@googlegroups.com
Is this the right forum for Scoobi users? I did not find a scoobi-user group.

Eric Torreborre

Sep 29, 2014, 6:15:19 PM
to scoob...@googlegroups.com
Hi, 

1. You can use the HADOOP_CONF_DIR variable to point to the directory containing the cluster configuration.

2. The Scoobi arguments are indeed unusual. If you want to run inmemory, you just write

... -- scoobi inmemory

and if you want "verbose" as well:

... -- scoobi inmemory.verbose
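
In other words, what follows "-- scoobi" is a single token made of flag names joined by dots, not key=value pairs. As an illustration only (this is not Scoobi's own parsing code, and the helper name scoobi_args is hypothetical), a small shell function can build that trailing argument from a list of flag names:

```shell
# Hypothetical helper: join flag names with dots to form the
# trailing "-- scoobi ..." argument shown in the reply above.
scoobi_args() {
  # Run in a subshell so the IFS change does not leak to the caller.
  # With IFS set to ".", "$*" joins the positional parameters with dots.
  ( IFS=.; printf '%s\n' "-- scoobi $*" )
}

scoobi_args inmemory            # prints: -- scoobi inmemory
scoobi_args inmemory verbose    # prints: -- scoobi inmemory.verbose
```

The resulting string goes at the end of the run-main command line, after the application's own arguments.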

Eric.

PS: the forum for scoobi users is here.

Deepak Jain

Sep 30, 2014, 1:42:04 AM
to scoob...@googlegroups.com
1)
It did not work. I still see the Hadoop job submitted in local mode instead of on the cluster. I verified the config dir and it's correct.

$ export HADOOP_CONF_DIR=/etc/hadoop/conf.empty
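
One way to double-check what a configuration directory selects is to look up mapreduce.framework.name in it. In Hadoop 2 this property defaults to local, and a cluster configuration normally sets it to yarn. The sketch below is only a rough check under stated assumptions: the helper name framework_name is hypothetical, it assumes the property lives in mapred-site.xml with <value> directly after <name>, and it uses a crude text match rather than a real XML parser. Note that the directory exported here (/etc/hadoop/conf.empty) is not the one listed in the first post (/etc/hadoop/conf), so it may be worth running the check against both.

```shell
# Hypothetical helper: print the MapReduce framework a conf dir selects.
framework_name() {
  conf_dir="$1"
  site="$conf_dir/mapred-site.xml"
  if [ ! -f "$site" ]; then
    # No mapred-site.xml: Hadoop 2 falls back to its default, "local".
    echo "local"
    return
  fi
  # Flatten the file to one line, then pull the <value> that directly
  # follows <name>mapreduce.framework.name</name> (crude, not a real XML parse).
  value=$(tr -d '\n' < "$site" |
    sed -n 's/.*<name>mapreduce\.framework\.name<\/name>[[:space:]]*<value>\([^<]*\)<\/value>.*/\1/p')
  # Property missing from the file: same default applies.
  echo "${value:-local}"
}

# Check the directory currently exported, falling back to /etc/hadoop/conf.
framework_name "${HADOOP_CONF_DIR:-/etc/hadoop/conf}"
```

A result of local would be consistent with the job_local ids in the log; a result of yarn would suggest the configuration is not reaching the JVM that sbt starts.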


[dvasthimal@invisio-365818 my-app]$ rm -rf output.scoobie*; sbt "run-main WordCount -Dmapred.max.map.failures.percent=20 -Dmapred.max.reduce.failures.percent=20 input.txt output.scoobie.3 -- scoobi verbose"
[info] Set current project to MyApplication (in build file:/home/dvasthimal/my-app/)
[info] Running WordCount -Dmapred.max.map.failures.percent=20 -Dmapred.max.reduce.failures.percent=20 input.txt output.scoobie.3 -- scoobi verbose
[WARN] NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[INFO] deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
[INFO] deprecation - mapred.map.child.log.level is deprecated. Instead, use mapreduce.map.log.level
[INFO] deprecation - mapred.reduce.child.log.level is deprecated. Instead, use mapreduce.reduce.log.level
[INFO] deprecation - mapred.max.map.failures.percent is deprecated. Instead, use mapreduce.map.failures.maxpercent
[INFO] deprecation - mapred.max.reduce.failures.percent is deprecated. Instead, use mapreduce.reduce.failures.maxpercent
[INFO] deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
[INFO] HadoopMode - SCOOBI version: 0.8.5, commit: 7d0e802, timestamp: 08-07-2014 09:32:43 +1000
[INFO] Sink - Output path: output.scoobie.3
[INFO] Source - Input path: input.txt (1.63 KiB)
[INFO] HadoopMode - =====================================================================
[INFO] HadoopMode - ===== START OF SCOOBI JOB 'WordCount$-0929-223928-131376653' ========
[INFO] HadoopMode - =====================================================================

[INFO] HadoopMode - Executing map reduce jobs
Mscr(1

  inputs: + GbkInputChannel(Load (1)[String] (source 1) )

          mappers 
          ParallelDo (17)[String,(String,Int),((Unit,Unit),Unit)] (bridge aea01) 

          last mappers 
          ParallelDo (17)[String,(String,Int),((Unit,Unit),Unit)] (bridge aea01) 

  outputs: + GbkOutputChannel(GroupByKey (18)[String,Int] (bridge 4e21c) , combiner = Combine (19)[String,Int] (bridge ddd72) [sinks: Some(output.scoobie.3)]))
[INFO] JobSubmitter - Submitting tokens for job: job_local1532450561_0001
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572837/32115b15-94fc-461e-a0a9-4c6e2154af4d <- /home/dvasthimal/my-app/32115b15-94fc-461e-a0a9-4c6e2154af4d
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/env/32115b15-94fc-461e-a0a9-4c6e2154af4d as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572837/32115b15-94fc-461e-a0a9-4c6e2154af4d
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572838/scoobi.metadata.TK20 <- /home/dvasthimal/my-app/scoobi.metadata.TK20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TK20 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572838/scoobi.metadata.TK20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572839/scoobi.metadata.TV20 <- /home/dvasthimal/my-app/scoobi.metadata.TV20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TV20 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572839/scoobi.metadata.TV20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572840/scoobi.metadata.TP20 <- /home/dvasthimal/my-app/scoobi.metadata.TP20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TP20 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572840/scoobi.metadata.TP20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572841/scoobi.metadata.TG20 <- /home/dvasthimal/my-app/scoobi.metadata.TG20
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TG20 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572841/scoobi.metadata.TG20
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572842/scoobi.mappers-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.mappers-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.mappers-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572842/scoobi.mappers-step_1_of_1
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572843/scoobi.combiners-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.combiners-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.combiners-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572843/scoobi.combiners-step_1_of_1
[INFO] LocalDistributedCacheManager - Creating symlink: /tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572844/scoobi.reducers-step_1_of_1 <- /home/dvasthimal/my-app/scoobi.reducers-step_1_of_1
[INFO] LocalDistributedCacheManager - Localized file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.reducers-step_1_of_1 as file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572844/scoobi.reducers-step_1_of_1
[INFO] deprecation - mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
[INFO] Job - The url to track the job: http://localhost:8080/
[INFO] LocalJobRunner - OutputCommitter set in config null
[INFO] MapReduceJob - MapReduce job 'job_local1532450561_0001' submitted. Please see http://localhost:8080/ for more info.
[INFO] LocalJobRunner - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
[INFO] LocalJobRunner - Waiting for map tasks
[INFO] LocalJobRunner - Starting task: attempt_local1532450561_0001_m_000000_0
[INFO] Task -  Using ResourceCalculatorProcessTree : [ ]
[INFO] MapTask - Processing split: file:/home/dvasthimal/my-app/input.txt:0+1664 (on channel:1)
[INFO] MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
[INFO] MapTask - (EQUATOR) 0 kvi 26214396(104857584)
[INFO] MapTask - mapreduce.task.io.sort.mb: 100
[INFO] MapTask - soft limit at 83886080
[INFO] MapTask - bufstart = 0; bufvoid = 104857600
[INFO] MapTask - kvstart = 26214396; length = 6553600
[INFO] MapTask - the URL of Java (evidenced with the java.lang.String class) is jar:file:/home/dvasthimal/invisio/jdk1.7.0_67/jre/lib/rt.jar!/java/lang/String.class
[INFO] MapTask - the URL of Scala (evidenced with the scala.collection.immutable.Range class) is jar:file:/home/dvasthimal/.sbt/boot/scala-2.10.3/lib/scala-library.jar!/scala/collection/immutable/Range.class
[INFO] MapTask - the URL of Hadoop (evidenced with the org.apache.hadoop.io.Writable class) is jar:file:/home/dvasthimal/.ivy2/cache/org.apache.hadoop/hadoop-common/jars/hadoop-common-2.2.0.jar!/org/apache/hadoop/io/Writable.class
[INFO] MapTask - the URL of Avro (evidenced with the org.apache.avro.Schema class) is jar:file:/home/dvasthimal/.ivy2/cache/org.apache.avro/avro/jars/avro-1.7.4.jar!/org/apache/avro/Schema.class
[INFO] MapTask - the URL of Kiama (evidenced with the org.kiama.rewriting.Rewriter class) is jar:file:/home/dvasthimal/.ivy2/cache/com.googlecode.kiama/kiama_2.10/jars/kiama_2.10-1.6.0.jar!/org/kiama/rewriting/Rewriter.class
[INFO] MapTask - the URL of Scoobi (evidenced with the com.nicta.scoobi.core.ScoobiConfiguration class) is jar:file:/home/dvasthimal/.ivy2/cache/com.nicta/scoobi_2.10/jars/scoobi_2.10-0.8.5.jar!/com/nicta/scoobi/core/ScoobiConfiguration.class
[INFO] deprecation - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.mappers-step_1_of_1 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572842/scoobi.mappers-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.mappers-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.mappers-step_1_of_1
[INFO] MapTask - Starting on invisio-365818
[INFO] MapTask - Input is file:/home/dvasthimal/my-app/input.txt:0+1664 (on channel:1)
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/env/32115b15-94fc-461e-a0a9-4c6e2154af4d (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572837/32115b15-94fc-461e-a0a9-4c6e2154af4d
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/env/32115b15-94fc-461e-a0a9-4c6e2154af4d
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/env/32115b15-94fc-461e-a0a9-4c6e2154af4d
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TK20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572838/scoobi.metadata.TK20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TK20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TK20
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TV20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572839/scoobi.metadata.TV20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TV20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TV20
[INFO] LocalJobRunner - 
[INFO] MapTask - Starting flush of map output
[INFO] MapTask - Spilling map output
[INFO] MapTask - bufstart = 0; bufend = 3183; bufvoid = 104857600
[INFO] MapTask - kvstart = 26214396(104857584); kvend = 26213524(104854096); length = 873/6553600
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.combiners-step_1_of_1 (memoise=true)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572843/scoobi.combiners-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.combiners-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.combiners-step_1_of_1
[INFO] MapTask - Finished spill 0
[INFO] Task - Task:attempt_local1532450561_0001_m_000000_0 is done. And is in the process of committing
[INFO] LocalJobRunner - map
[INFO] Task - Task 'attempt_local1532450561_0001_m_000000_0' done.
[INFO] LocalJobRunner - Finishing task: attempt_local1532450561_0001_m_000000_0
[INFO] LocalJobRunner - Map task executor complete.
[INFO] Task -  Using ResourceCalculatorProcessTree : [ ]
[INFO] Merger - Merging 1 sorted segments
[INFO] Merger - Down to the last merge-pass, with 1 segments left of total size: 2186 bytes
[INFO] LocalJobRunner - 
[INFO] deprecation - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
[INFO] MapReduceJob - Map 100%    Reduce   0%
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.reducers-step_1_of_1 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572844/scoobi.reducers-step_1_of_1
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.reducers-step_1_of_1
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.reducers-step_1_of_1
[INFO] ReduceTask - Starting on invisio-365818
[INFO] OutputChannel - Outputs are Some(output.scoobie.3)
[INFO] DistCache - trying to pull an object from the cache at path: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TG20 (memoise=false)
[INFO] DistCache - trying to open: file://file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/step_1_of_1/1412055572841/scoobi.metadata.TG20
[INFO] DistCache - trying to open: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TG20
[INFO] DistCache - successfully opened: file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/dist-objs/scoobi.metadata.TG20
[INFO] deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
[INFO] Task - Task:attempt_local1532450561_0001_r_000000_0 is done. And is in the process of committing
[INFO] LocalJobRunner - 
[INFO] Task - Task attempt_local1532450561_0001_r_000000_0 is allowed to commit now
[INFO] FileOutputCommitter - Saved output of task 'attempt_local1532450561_0001_r_000000_0' to file:/tmp/scoobi-dvasthimal/WordCount$-0929-223928-131376653/tmp-out-step_1_of_1/_temporary/0/task_local1532450561_0001_r_000000
[INFO] LocalJobRunner - reduce > reduce
[INFO] Task - Task 'attempt_local1532450561_0001_r_000000_0' done.
[INFO] MapReduceJob - Map 100%    Reduce 100%
[INFO] HadoopMode - Map reduce job sinks:  Vector(TextFileSink: output.scoobie.3) 
[INFO] HadoopMode - ===== END OF MAP REDUCE JOB 1 of 1 (mscr id = 1, Scoobi job = WordCount$-0929-223928-131376653) ======

[INFO] HadoopMode - =====================================================================
[INFO] HadoopMode - ===== END OF SCOOBI JOB 'WordCount$-0929-223928-131376653'   ========
[INFO] HadoopMode - =====================================================================

Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
Not interrupting system thread Thread[process reaper,10,system]
[success] Total time: 8 s, completed Sep 29, 2014 10:39:34 PM
[dvasthimal@invisio-365818 my-app]$ 