Hello all,
I am using RHadoop to compute a correlation matrix grouped by a specified key. The code works fine on both the local and Hadoop backends. The problem is that when the dataset grows from 20 million to 50 million rows, the computation time increases accordingly.
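Roughly, the job looks like this (a simplified sketch only; the path, column names, and key field below are placeholders, not my actual code):

```r
library(rmr2)

# Simplified sketch of the job (identifiers are placeholders):
# the map step keys each record by the grouping column, and the
# reduce step computes a correlation matrix over each key's rows.
out <- mapreduce(
  input  = "/user/kostas/modeldata",                 # placeholder HDFS path
  map    = function(k, v) keyval(v$group_key, v),    # re-key by grouping column
  reduce = function(k, rows) {
    m <- cor(rows[, sapply(rows, is.numeric)])       # per-key correlation matrix
    keyval(k, list(m))
  }
)
```

For some keys the per-key correlation in the reducer runs for a long time without emitting anything, which is presumably when the timeout fires.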
I think the problem has to do with the task's response time. The error I am getting is:
Task attempt_201503181218_296033_r_000000_0 failed to report status for 600 seconds. Killing!. Diagnostic information will be saved in userlogs
It is true that computing the correlation matrix for a specific key was taking quite long. From a Google search I found some posts suggesting that I must set additional parameters in mapred-site.xml. Below you may find a relevant link:
If this is the case, how should I change the default timeout? It is currently set to 600,000 ms = 600 s.
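If raising the timeout is indeed the fix, I assume the mapred-site.xml entry would look something like the following (the property is named mapred.task.timeout on MRv1 clusters like this one; on YARN/MRv2 it is mapreduce.task.timeout; the 1,800,000 ms value is just an illustrative choice):

```xml
<!-- mapred-site.xml: raise the task status-report timeout -->
<!-- MRv1 property name; on YARN/MRv2 use mapreduce.task.timeout -->
<property>
  <name>mapred.task.timeout</name>
  <value>1800000</value> <!-- 30 minutes, in milliseconds; 0 disables the timeout -->
</property>
```

Alternatively, if I understand the rmr2 docs correctly, such settings can be passed per job via `backend.parameters = list(hadoop = list(D = "mapred.task.timeout=1800000"))`, which would leave the cluster-wide default untouched. Is one of these the recommended approach, or should the reducer instead report progress periodically during long computations so the task is not killed?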
Below you may find the relevant stderr logs:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
log4j:WARN Please initialize the log4j system properly.
Loading objects:
associateAttributeLevel
associateVariousLevels
attributeAlignment
census
ChisqAttributeAlignment
cmdPath
CorrelClusterAlgorithm
CorrelClusterMapReduceExecutor
CorrelClusterParallelExecutor
curDirectory
currTime
email_to
filesToValidate
fileToUpload
finalTableName
ftpLnx
GenStartFiles
GetFTPcsvFile
GetStoreList
GetStubFileFromServer
host_id
jobNum
lastWeek
listFiles
listFilesTable
lnxHost_id
lnxPassWord
lnxPath
lnxUserName
lnxuserPass
ManipulateSTUB
ManipulationModeldataR
mapperCorrelClusterfunction
mapReduceCorrelaClusterFlag
MapReduceCorrelClusterAlgorithm
modeldata
modelDataFileName
moveFilesToHadoop
.N
numCores
numOfPeriods
outputFolder
passWord
path
path_id
produceCMDFiles
.Random.seed
reducerCorrelClusterfunction
saveRDSName
splitDataPath
storeListTable
storesFile
storesTables
stub
stubData1
stubFileName
stubFiles
substrRight
syshostname
sysuserid
tempMatrix
tempMatrixNumeric
.testLogger
trim.leading
trim.trailing
UpcSelectFileTable
UpcSelectTable
upcToPPGConverter
url_id
userName
Loading objects:
backend.parameters
combine
combine.file
combine.line
debug
default.input.format
Warning: S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found
Please review your hadoop settings. See help(hadoop.settings)
default.output.format
in.folder
in.memory.combine
input.format
libs
map
map.file
map.line
out.folder
output.format
pkg.opts
postamble
preamble
profile.nodes
reduce
reduce.file
reduce.line
rmr.global.env
rmr.local.env
save.env
tempfile
vectorized.reduce
verbose
work.dir
Loading required package: methods
Loading required package: data.table
Attaching package: ‘data.table’
The following object is masked _by_ ‘.GlobalEnv’:
.N
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Loading required package: doParallel
Loading required package: rmr2
Loading required package: plyr
Loading required package: rJava
Loading required package: rhdfs
Error : .onLoad failed in loadNamespace() for 'rhdfs', details:
call: fun(libname, pkgname)
error: Environment variable HADOOP_CMD must be set before loading package rhdfs
Warning in FUN(c("base", "methods", "datasets", "utils", "grDevices", "graphics", :
can't load rhdfs
Loading required package: stringr
Loading required package: bitops
Loading required package: RCurl
Attaching package: ‘RCurl’
The following object is masked from ‘package:rJava’:
clone
Loading objects:
backend.parameters
combine
combine.file
combine.line
debug
default.input.format
default.output.format
in.folder
in.memory.combine
input.format
libs
map
map.file
map.line
out.folder
output.format
pkg.opts
postamble
preamble
profile.nodes
reduce
reduce.file
reduce.line
rmr.global.env
rmr.local.env
save.env
tempfile
vectorized.reduce
verbose
work.dir
Loading required package: grid
Loading required package: lattice
Loading required package: survival
Loading required package: splines
Loading required package: Formula
Loading required package: ggplot2
Attaching package: ‘Hmisc’
The following objects are masked from ‘package:plyr’:
is.discrete, summarize
The following objects are masked from ‘package:base’:
format.pval, round.POSIXt, trunc.POSIXt, units
2015-06-20 10:24:16,7760 ERROR Client fs/client/fileclient/cc/writebuf.cc:154 Thread: 5405 FlushWrite failed: File part-00000, error: Stale File handle(116), pfid 2149.29771415.392936198, off 0, fid 2149.29771415.392936198
log4j:ERROR Could not write to: com.mapr.fs.MapRFsOutStream@7e21e65f. Failing over to local logging
java.io.IOException: stream closed
at com.mapr.fs.MapRFsOutStream.checkClosed(MapRFsOutStream.java:43)
at com.mapr.fs.MapRFsOutStream.write(MapRFsOutStream.java:80)
at com.mapr.fs.MapRFsDataOutputStream.write(MapRFsDataOutputStream.java:46)
at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
at com.mapr.log4j.MaprfsLogAppender.append(MaprfsLogAppender.java:419)
at com.mapr.log4j.CentralTaskLogAppender.append(CentralTaskLogAppender.java:102)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:181)
at com.mapr.fs.LoggerProxy.error(LoggerProxy.java:18)
at com.mapr.fs.Inode.flushPages(Inode.java:421)
at com.mapr.fs.Inode.syncInternal(Inode.java:530)
at com.mapr.fs.Inode.sync(Inode.java:543)
at com.mapr.fs.Inode.closeWrite(Inode.java:553)
at com.mapr.fs.Inode.close(Inode.java:1006)
at com.mapr.fs.MapRFsOutStream.close(MapRFsOutStream.java:223)
at com.mapr.fs.Inode.closeAll(Inode.java:1078)
at com.mapr.fs.BackgroundWork.close(BackgroundWork.java:99)
at com.mapr.fs.MapRFileSystem.close(MapRFileSystem.java:1236)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1626)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:1596)
log4j:WARN com.mapr.log4j.CentralTaskLogAppender@61579cd: closed and disabled due to errors.
java.io.IOException: stream closed
at com.mapr.fs.MapRFsOutStream.checkClosed(MapRFsOutStream.java:43)
at com.mapr.fs.MapRFsOutStream.write(MapRFsOutStream.java:80)
at com.mapr.fs.MapRFsDataOutputStream.write(MapRFsDataOutputStream.java:46)
at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
at com.mapr.log4j.MaprfsLogAppender.append(MaprfsLogAppender.java:419)
at com.mapr.log4j.CentralTaskLogAppender.append(CentralTaskLogAppender.java:102)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:181)
at com.mapr.fs.LoggerProxy.error(LoggerProxy.java:18)
at com.mapr.fs.Inode.flushPages(Inode.java:421)
at com.mapr.fs.Inode.syncInternal(Inode.java:530)
at com.mapr.fs.Inode.sync(Inode.java:543)
at com.mapr.fs.Inode.closeWrite(Inode.java:553)
at com.mapr.fs.Inode.close(Inode.java:1006)
at com.mapr.fs.MapRFsOutStream.close(MapRFsOutStream.java:223)
at com.mapr.fs.Inode.closeAll(Inode.java:1078)
at com.mapr.fs.BackgroundWork.close(BackgroundWork.java:99)
at com.mapr.fs.MapRFileSystem.close(MapRFileSystem.java:1236)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1626)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:1596)
log4j:ERROR Attempted to append to closed appender named [maprfsTLA].
log4j:ERROR Attempted to append to closed appender named [maprfsTLA].
Any assistance is appreciated.
Best Regards,
Kostas