Regarding hadoop in thrax-run


bibek....@gmail.com

Oct 18, 2013, 2:49:20 AM
to joshua_...@googlegroups.com
Hi,
I wanted to ask about the map-reduce steps in Hadoop.
There seem to be many redundant steps: first there is spilling, then merging, then spilling and merging again, occasionally with sorting. Is the log format documented anywhere in detail? How can I know whether everything is running smoothly? The job runs for an hour, but I know nothing about what is happening.

Juri Ganitkevitch

Oct 18, 2013, 11:03:19 AM
to Joshua Support
Hi Bibek,

The spilling and merging are Hadoop-level actions. Hadoop maintains in-memory buffers for generated map output (like, say, observed phrase pairs). When the buffer fills up, Hadoop spills its contents to disk, then refills the buffer with new output until the map step is done. Similarly, merging takes place when the reducer is sorting the various map outputs but can't fit all of them into the in-memory buffer.

These steps are Hadoop moving work onto the disks to retain a small memory footprint. I believe there are ways to configure local Hadoop to take advantage of more memory, which would cut down on the amount of on-disk work. I'll put that on our to-do list.
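For reference, the usual knobs in Hadoop 0.20.x are io.sort.mb (the size of the in-memory map output buffer, default 100 MB; larger means fewer spills) and io.sort.factor (how many spill files are merged per merge pass, default 10, which matches the "Merging 10 intermediate segments" lines you may see in the log). A sketch for conf/mapred-site.xml; treat the values as examples and check them against your version's mapred-default.xml, and make sure io.sort.mb still fits in the task JVM heap (mapred.child.java.opts):

```
<?xml version="1.0"?>
<configuration>
    <!-- map-side sort buffer; larger values mean fewer spills -->
    <property>
        <name>io.sort.mb</name>
        <value>400</value>
    </property>
    <!-- number of spill files merged per merge pass -->
    <property>
        <name>io.sort.factor</name>
        <value>50</value>
    </property>
</configuration>
```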

-- Juri


--
You received this message because you are subscribed to the Google Groups "Joshua Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to joshua_suppor...@googlegroups.com.
To post to this group, send email to joshua_...@googlegroups.com.
Visit this group at http://groups.google.com/group/joshua_support.
For more options, visit https://groups.google.com/groups/opt_out.

bibek....@gmail.com

Oct 19, 2013, 2:52:15 PM
to joshua_...@googlegroups.com, bibek....@gmail.com

Is there a way to know how much disk space is left while Hadoop is running? It raised an exception even though there seemed to be sufficient disk space, and then continued to run. Here is the output of the df command:
df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             96120588  50450356  40787496  56% /
none                   4056624       192   4056432   1% /dev
none                   4096464       164   4096300   1% /dev/shm
none                   4096464       100   4096364   1% /var/run
none                   4096464         0   4096464   0% /var/lock
/dev/sda6            282732224 129287792 139082424  49% /data1
/dev/sda7            282732224 162811600 105558616  61% /data2
/dev/sda8            284462680 270011744      1068 100% /data3

Here is the segment of log file.
org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:192)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.writeChunk(ChecksumFileSystem.java:354)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1013)
    at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:74)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at edu.jhu.thrax.hadoop.features.mapred.RarityPenaltyFeature$Reduce.reduce(RarityPenaltyFeature.java:56)
    at edu.jhu.thrax.hadoop.features.mapred.RarityPenaltyFeature$Reduce.reduce(RarityPenaltyFeature.java:45)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:297)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
    ... 22 more
13/10/20 08:26:24 WARN mapred.LocalJobRunner: job_local_0004
org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:192)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.writeChunk(ChecksumFileSystem.java:354)
    at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
    at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
    at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
    at java.io.DataOutputStream.write(DataOutputStream.java:107)
    at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1013)
    at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:74)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:508)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at edu.jhu.thrax.hadoop.features.mapred.LexicalProbabilityFeature$Reduce.reduce(LexicalProbabilityFeature.java:90)
    at edu.jhu.thrax.hadoop.features.mapred.LexicalProbabilityFeature$Reduce.reduce(LexicalProbabilityFeature.java:51)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
Caused by: java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:297)
    at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:190)
    ... 22 more
13/10/20 08:26:26 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:26 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:29 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:29 INFO mapred.LocalJobRunner: reduce > sort
[SCHED] class edu.jhu.thrax.hadoop.features.mapred.RarityPenaltyFeature in state FAILED
[SCHED] class edu.jhu.thrax.hadoop.jobs.OutputJob in state PREREQ_FAILED
[SCHED] class edu.jhu.thrax.hadoop.features.mapred.LexicalProbabilityFeature in state FAILED
13/10/20 08:26:32 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:32 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:35 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:35 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:38 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:38 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:41 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:41 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:44 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:44 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:49 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:49 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:52 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:54 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:55 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:57 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:26:57 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 334
13/10/20 08:26:58 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:00 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:01 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:03 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:05 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:06 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:08 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:09 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:11 INFO mapred.LocalJobRunner: reduce > sort
13/10/20 08:27:12 INFO mapred.LocalJobRunner: reduce > sort

Phu Le

Oct 19, 2013, 3:35:06 PM
to joshua_...@googlegroups.com, bibek....@gmail.com
From your df result, I can see /data3 is full. No?
Can you check if that's the partition your hadoop instance writes jobs' data to?
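One way to check (a sketch; it assumes local-mode Hadoop with the default hadoop.tmp.dir of /tmp/hadoop-<user>, which your mapred-site.xml may override):

```shell
# Resolve the scratch directory local-mode Hadoop writes to
# (default hadoop.tmp.dir is /tmp/hadoop-${USER}; override in mapred-site.xml).
SCRATCH="/tmp/hadoop-$USER"
# If the directory does not exist yet, fall back to its parent so df still works.
[ -d "$SCRATCH" ] || SCRATCH=$(dirname "$SCRATCH")
# Show which partition backs it and how full that partition is.
df -Ph "$SCRATCH"
```

The "Mounted on" column tells you which of the partitions from your df listing Hadoop is actually filling up.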

LTVP

Bibek Behera

Oct 19, 2013, 3:58:19 PM
to Phu Le, joshua_...@googlegroups.com
Yes, you are correct. In which file can I see the partition used by Hadoop for writing jobs' data? And is there a way to modify it?
--
Regards,
Bibek behera
IIT Bombay


Ph no.- 8879005749

Phu Le

Oct 19, 2013, 4:08:53 PM
to Bibek Behera, joshua_...@googlegroups.com
I can suggest this way to change the directory. Assuming you want Hadoop to write to a directory YOUR_BASE_DIR on a partition with sufficient disk space:
- Create/edit this configuration xml file in your working directory:

Change ./working/hadoop-0.20.2/conf/mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value><YOUR_BASE_DIR>/mapred/tmp</value>
    </property>

    <property>
        <name>mapred.local.dir</name>
        <value><YOUR_BASE_DIR>/mapred/local</value>
        <final>true</final>
    </property>

    <property>
        <name>mapred.system.dir</name>
        <value><YOUR_BASE_DIR>/mapred/system</value>
        <final>true</final>
    </property>
</configuration>

Then rerun your experiment.
When it is running, you can issue the command "ls <YOUR_BASE_DIR>/mapred" to verify whether it is writing to that folder as you expected.
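To answer the earlier question about knowing how much disk space is left while Hadoop runs, a simple poll of the partition works (a sketch; YOUR_BASE_DIR is the same placeholder as in the config above):

```shell
# One-shot free-space check on the partition holding Hadoop's job data;
# wrap it in `watch -n 30 ...` (or a sleep loop) to poll during a run.
BASE_DIR="${YOUR_BASE_DIR:-/tmp}"   # substitute your actual base directory
df -Ph "$BASE_DIR" | tail -1 | awk '{print $4, "free on", $6}'
```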

LTVP

P/S: there should be a way to change the template directly in the hadoop shipped with Joshua, so that it defaults to write to that folder. Unfortunately, I don't know exactly. Others might help ;)

Matt Post

Oct 21, 2013, 12:25:07 PM
to joshua_...@googlegroups.com, Bibek Behera
This is a good solution. If you want something more formal, it would be good to submit a bug report on GitHub (github.com/joshua-decoder/joshua/issues). Better yet, make the fix yourself and issue a pull request.