How to set the scoobi temp directory

Amit Jaiswal

Jun 10, 2014, 5:47:02 PM
to scoobi...@googlegroups.com
Hi,

Scoobi by default uses /tmp/scoobi-<username> on HDFS as the base temporary directory. I want to move it to another location that has more space.

The ScoobiConfiguration.setScoobiDir() API is not working as expected. Even after I set the temp directory, the temporary data goes partly into the old temp directory and partly into the new one. It gets worse: sometimes the output of a job is 0 records. I also tried setting the properties scoobi.dir and scoobi.workingdir, but with no luck so far.

Can somebody please check and tell me how to change the temporary directory?
Ticket related to this issue: https://groups.google.com/forum/#!msg/scoobi-dev/ze0nvoTZi8M/HX2ROJ0b1poJ

Thanks,
Amit

Eric Torreborre

Jun 12, 2014, 1:43:48 AM
to scoobi...@googlegroups.com
Hi Amit, 

You might want to have a look at the latest SNAPSHOT.

There are now 4 properties you can set on the configuration to select directories:

  def defaultScoobiDir  = dirPath("/tmp/scoobi-"+sys.props.get("user.name").getOrElse("user"))
  lazy val scoobiDir    = configuration.getOrSet("scoobi.dir", defaultScoobiDir)
  lazy val workingDir   = configuration.getOrSet("scoobi.workingdir", dirPath(scoobiDir + jobId))
  lazy val tempDirName  = configuration.getOrSet("scoobi.tempdirname", dirPath("tmp-out-"+hadoopConfiguration.get(JOB_STEP)+"/"))
  lazy val temporaryDir = configuration.getOrSet("scoobi.tempdir", workingDir+tempDirName)

This means that if you define the "scoobi.tempdir" property you should be able to direct output files wherever you need. *However*, this needs to be done at the beginning of the ScoobiApp, before those lazy vals get initialised. Otherwise they will be initialised with default values, which might be what you are observing now.
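
To illustrate why the ordering matters, here is a minimal, self-contained sketch of the lazy val behaviour. It does not use Scoobi itself: the Conf class, its set/getOrSet methods, the fixed "job-1/" and "tmp-out/" suffixes and the /data/big/tmp/ path are all hypothetical stand-ins that only mimic the shape of the definitions above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a configuration with getOrSet semantics.
// This is NOT Scoobi's class; it only mirrors the definitions quoted above.
class Conf {
  private val props = mutable.Map[String, String]()

  def set(key: String, value: String): Unit = props.update(key, value)

  // Return the stored value, or store and return the default if the key is absent.
  def getOrSet(key: String, default: => String): String =
    props.getOrElseUpdate(key, default)

  lazy val scoobiDir: String    = getOrSet("scoobi.dir", "/tmp/scoobi-user/")
  lazy val workingDir: String   = getOrSet("scoobi.workingdir", scoobiDir + "job-1/")
  lazy val temporaryDir: String = getOrSet("scoobi.tempdir", workingDir + "tmp-out/")
}

object LazyDirDemo extends App {
  // Property set before the first access: the override is picked up.
  val early = new Conf
  early.set("scoobi.tempdir", "/data/big/tmp/")
  println(early.temporaryDir) // /data/big/tmp/

  // Property set after the first access: the lazy val has already cached the default.
  val late = new Conf
  late.temporaryDir // forces initialisation with default values
  late.set("scoobi.tempdir", "/data/big/tmp/")
  println(late.temporaryDir) // /tmp/scoobi-user/job-1/tmp-out/
}
```

In the second case the property does change, but the lazy val keeps the value it computed on first access, so anything that reads the property directly and anything that reads the lazy val can end up disagreeing about where the temporary data lives.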

Eric.