YARN container problem


anurag agrahari

Apr 7, 2015, 3:14:42 AM
to rha...@googlegroups.com
Respected group members,

I am getting an error when running the following code in rmr2.

My rmr.options settings are:

bp <- rmr.options("backend.parameters")
bp$hadoop[1] <- "mapreduce.map.java.opts=-Xmx800M"     # original set reduce.java.opts twice; map opts presumably intended here
bp$hadoop[2] <- "mapreduce.reduce.java.opts=-Xmx2048M" # reduce-task JVM heap
rmr.options(backend.parameters = bp)                   # apply the modified parameters
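For comparison, I have also seen backend parameters written as a named list in rmr2 examples, where each `D` entry becomes a `-D key=value` flag on the Hadoop streaming command line; a sketch of that form (the memory values are illustrative, not tuned recommendations):

    # Alternative named-list form seen in rmr2 examples; values are illustrative.
    rmr.options(backend.parameters = list(
      hadoop = list(
        D = "mapreduce.map.java.opts=-Xmx800M",     # map-task JVM heap
        D = "mapreduce.reduce.java.opts=-Xmx2048M"  # reduce-task JVM heap
      )))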

I have three doubts about this code; maybe one of them is why the program is not working. I have marked each doubted line with an inline comment below.
On Monday, November 26, 2012 at 3:12:09 AM UTC+5:30, cyang05 wrote:
Sure.
I just wonder if we can take control of the number of reducers. (What if there are too many keys to fit in main memory?)
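(As far as I know, the number of reducers can be requested through the same backend.parameters mechanism; a hedged sketch:)

    # Sketch: request 4 reduce tasks from the Hadoop backend.
    # The property is "mapreduce.job.reduces" on Hadoop 2
    # ("mapred.reduce.tasks" on Hadoop 1).
    rmr.options(backend.parameters = list(
      hadoop = list(D = "mapreduce.job.reduces=4")))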

First, the data set is from this page: http://fimi.ua.ac.be/data/
It is the last data set on that page, webdocs.dat.gz (1.4 GB when unzipped).
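To reproduce the small sample used below, one could fetch the archive and keep the first 10,000 lines; a sketch (the URL is from the page above, and the sample file name matches the path used later in this post):

    # Sketch: download webdocs.dat.gz (~1.4 GB uncompressed) and carve out
    # a 10,000-line sample for the local run below.
    download.file("http://fimi.ua.ac.be/data/webdocs.dat.gz", "webdocs.dat.gz")
    writeLines(readLines(gzfile("webdocs.dat.gz"), n = 10000),
               "/home/hduser/data/webdoc-10000.dat")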

The code is:

# 1692082 total records (1.4 GB)

library(rmr2)
library(arules)

## Local apriori: find frequent itemsets with support = 0.5
readRows <- function(file, sep = "\n", split = " ", ...) {
  tt <- strsplit(
    scan(file, what = "list", sep = sep, ...),
    split = split)
  out <- lapply(tt, as.numeric)
  out
}
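A tiny self-contained check of what readRows returns (a list of numeric vectors, one per line):

    # Illustrative check on a throwaway two-line file.
    tmp <- tempfile()
    writeLines(c("1 2 3", "2 4"), tmp)
    readRows(tmp)
    # returns list(c(1, 2, 3), c(2, 4))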

webdoc <- readRows('/home/hduser/data/webdoc-10000.dat')  # Doubt 1: is this path on the local file system or in HDFS?
tr_webdoc <- as(webdoc, "transactions")
fItemL <- apriori(tr_webdoc,
                  parameter = new("APparameter", support = 0.5,
                                  target = "frequent itemsets", maxlen = 5))
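(To eyeball the local result, arules can print the mined itemsets:)

    # Illustrative follow-up: print the locally mined itemsets, highest support first.
    inspect(sort(fItemL, by = "support"))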



## Parallel apriori: find frequent itemsets with support = 0.3
## Reason: the mapper computes "local" frequent-itemset support counts on its
## split, and the reducer accumulates those counts per itemset. An itemset with
## global support 0.5 will not have local support 0.5 on every data node, so the
## mapper mines with a lower local support of 0.3; after the MapReduce job,
## globally frequent itemsets are obtained by eliminating those below the
## higher support of 0.5 (see below).

papriori <-
  function(input,
           output = NULL,
           pattern = " ",
           support = 0.3,
           maxlen = 5) {  # maxlen is important: it bounds the itemset size

    ## papriori map: mine locally frequent itemsets from each batch of lines
    pa.map <- function(., lines) {
      LL <- length(lines)  # note: `=` assignment is not valid inside if(...) in R, so assign first
      if (LL > 5000) {
        fItems <- apriori(
          as(lapply(strsplit(x = lines, split = pattern), unique),
             "transactions"),
          parameter = new("APparameter",
                          support = support,
                          target = "frequent itemsets",
                          maxlen = maxlen))

        recNum <- fItems@info$ntransactions[1]

        # key: itemset; value: its local support count on this split
        keyval(as(items(fItems), "list"),
               fItems@quality$support * recNum)
      } else {
        keyval(list("-1"), LL)  # number of records skipped
      }
    }

    ## papriori reduce: sum the local support counts for each itemset
    pa.reduce <- function(word, counts) {
      keyval(word, sum(counts))
    }

    ## papriori mapreduce
    mapreduce(
      input = input,
      output = output,
      input.format = "text",
      map = pa.map,
      reduce = pa.reduce,
      combine = TRUE)
  }
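(The elimination step that the comment above calls "see below" never appears in the original code; here is a hedged sketch of what it might look like after the job finishes, assuming rmr2's keys()/values() accessors and the record count quoted at the top. The variable names are mine.)

    # Hedged sketch of the missing global-filter step.
    res   <- from.dfs(papriori("/user/hduser/webdoc", pattern = " +"))
    ksets <- keys(res)                 # itemsets, plus the "-1" skip marker
    cnts  <- values(res)               # accumulated local support counts
    N     <- 1692082                   # total records in webdocs.dat
    keep  <- sapply(ksets, function(k) !identical(k, "-1")) & (cnts / N >= 0.5)
    globally.frequent <- ksets[keep]   # itemsets with global support >= 0.5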


rmr.options(backend = "hadoop")
rmr.options(keyval.length = 10000)  # Doubt 2: is this the right way to declare keyval.length?

out.hadoop <- from.dfs(papriori("/user/hduser/webdoc", pattern = " +"))
# Doubt 3: should this be a local or an HDFS path? With a local path the
# command fails with "path not found"; with the HDFS path I get the error below.
 
packageJobJar: [] [/usr/hdp/2.2.0.0-2041/hadoop-mapreduce/hadoop-streaming-2.6.0.2.2.0.0-2041.jar] /tmp/streamjob5530321227424655553.jar tmpDir=null
15/04/07 12:23:11 INFO impl.TimelineClientImpl: Timeline service address: http://n2.example.com:8188/ws/v1/timeline/
15/04/07 12:23:11 INFO client.RMProxy: Connecting to ResourceManager at n1.example.com/172.16.6.181:8050
15/04/07 12:23:12 INFO impl.TimelineClientImpl: Timeline service address: http://n2.example.com:8188/ws/v1/timeline/
15/04/07 12:23:12 INFO client.RMProxy: Connecting to ResourceManager at n1.example.com/172.16.6.181:8050
15/04/07 12:23:12 INFO mapred.FileInputFormat: Total input paths to process : 1
15/04/07 12:23:12 INFO mapreduce.JobSubmitter: number of splits:2
15/04/07 12:23:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1428381133028_0020
15/04/07 12:23:13 INFO impl.YarnClientImpl: Submitted application application_1428381133028_0020
15/04/07 12:23:13 INFO mapreduce.Job: The url to track the job: http://n1.example.com:8088/proxy/application_1428381133028_0020/
15/04/07 12:23:13 INFO mapreduce.Job: Running job: job_1428381133028_0020
15/04/07 12:23:18 INFO mapreduce.Job: Job job_1428381133028_0020 running in uber mode : false
15/04/07 12:23:18 INFO mapreduce.Job:  map 0% reduce 0%
15/04/07 12:23:27 INFO mapreduce.Job: Task Id : attempt_1428381133028_0020_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:322)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

[The same PipeMapRed.waitOutputThreads() stack trace repeats for failed attempts m_000001_0, m_000000_1, m_000001_1, m_000000_2, and m_000001_2.]

15/04/07 12:23:56 INFO mapreduce.Job:  map 100% reduce 100%
15/04/07 12:23:56 INFO mapreduce.Job: Job job_1428381133028_0020 failed with state FAILED due to: Task failed task_1428381133028_0020_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

15/04/07 12:23:56 INFO mapreduce.Job: Counters: 13
    Job Counters
        Failed map tasks=7
        Killed map tasks=1
        Launched map tasks=8
        Other local map tasks=6
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=57372
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=57372
        Total vcore-seconds taken by all map tasks=57372
        Total megabyte-seconds taken by all map tasks=48938316
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
15/04/07 12:23:56 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  :
  hadoop streaming failed with error code 1
Called from: stop("hadoop streaming failed with error code ", retval, "\n")

      


Please help me. I have been stuck on this problem for the past five days.

Thank you in advance.