DistCache not caching files locally


Joseph Beynon

Apr 24, 2013, 5:41:49 PM
to scoobi...@googlegroups.com
I got an email from our cluster administrator that my job was reading files from HDFS millions of times. The files it was reading were the scoobi.combiners and scoobi.metadata files.
It looks like in pullObject in DistCache.scala, you are using DistributedCache.getCacheFiles. My understanding is that this method returns the original URIs that were added to the cache (the HDFS path). I think it is meant to be changed to getLocalCacheFiles as the example in http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/filecache/DistributedCache.html shows.
If you need any more information please let me know.

Thanks,
Joe

Eric Torreborre

Apr 24, 2013, 8:17:47 PM
to scoobi...@googlegroups.com
Hi Joseph,

I need to analyse the problem a bit more: if I switch to getLocalCacheFiles, I get an NPE because the files are not found.

Today is a public holiday, so I'll investigate this issue tomorrow (if you know the exact change to make and have time to submit a pull request, that's welcome as well :-)).

Cheers,

Eric.

Joseph Beynon

Apr 25, 2013, 9:48:04 PM
to scoobi...@googlegroups.com
It looks like getLocalCacheFiles isn't supported in Hadoop's local mode and returns null instead. Also, the way pullObject looks up files (by recreating the path and using find) may not work: a locally cached file has a completely different path from the one on HDFS, i.e. it sits in some temp directory on the local disk. So this will be a bit trickier than I'd hoped.
I think a solution looks something like the following, though I'm not sure how to check whether we are running in local mode. One assumption I'm making is that getLocalCacheFiles returns files in the same order as getCacheFiles. The documentation isn't explicit about this, but otherwise the distributed cache would be extremely difficult to use.
  def pullObject[T](configuration: Configuration, tag: String): Option[T] = {
    /* Get distributed cache file. */
    val path = mkPath(configuration, tag)
    val remoteCacheFiles = Option(DistributedCache.getCacheFiles(configuration))
      .getOrElse(Array[URI]())
    val localCacheFiles = Option(DistributedCache.getLocalCacheFiles(configuration))
      .getOrElse(Array[Path]())
      .map { _.toUri }
    val cacheFiles =  if (localMode) // TODO: Not sure how to check if we are in local mode.
        remoteCacheFiles.zip(remoteCacheFiles)
      else
        remoteCacheFiles.zip(localCacheFiles)
    cacheFiles.find(_._1.toString == path.toString).flatMap { case (_, uri) =>
      deserialise(configuration)(new Path(uri.toString))
    }
  }
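
The pairing idea in the snippet above can be tried out without Hadoop. This is only a minimal sketch under the same assumption (the local list is in the same order as the remote list); `CachePairing`, `localFor` and the null fallback for local mode are made-up names for illustration, not Scoobi or Hadoop code.

```scala
import java.net.URI

object CachePairing {
  /**
   * Find the local copy of the cached file whose remote path is `remotePath`.
   * Assumes `remote` and the local array list the same files in the same order.
   * `localOrNull` may be null (as reported for local mode); in that case the
   * remote URIs are paired with themselves.
   */
  def localFor(remote: Array[URI], localOrNull: Array[URI], remotePath: String): Option[URI] = {
    val local = Option(localOrNull).getOrElse(remote)
    remote.zip(local).collectFirst { case (r, l) if r.getPath == remotePath => l }
  }
}
```

The lookup compares against the remote path only; the URI handed back is whatever sits at the same index in the local list, which is why the ordering assumption matters.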

Eric Torreborre

Apr 28, 2013, 10:47:09 PM
to scoobi...@googlegroups.com
Hi Joseph,

I incorporated your changes, thanks (the ScoobiConfiguration has a method to know if the execution is on the cluster or not BTW).

They should be available in the next SNAPSHOT release this afternoon (Sydney time).

Cheers,

Eric.

Alex Cozzi

Aug 9, 2013, 5:34:43 PM
to scoobi...@googlegroups.com
Hi Eric,
This problem has somehow re-emerged on our cluster. The file that is accessed far too often has the form:

/tmp/scoobi-username/scoobi-20130807-173359-RunSteps$-9dc8f6b7-ca4c-4daa-a492-f3ee9a088aa6/dist-objs/scoobi.metadata.TK1158-scoobi-20130807-173359-RunSteps$-9dc8f6b7-ca4c-4daa-a492-f3ee9a088aa6(Mscr-6)

and in the namenode logs it appears as a few million instances of:

2013-08-09 13:15:29,768 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=username@mysite ip=/XX.XX.XX.XX cmd=open src=/tmp/scoobi-username/scoobi-20130807-173359-RunSteps$-9dc8f6b7-ca4c-4daa-a492-f3ee9a088aa6/dist-objs/scoobi.metadata.TV1159-scoobi-20130807-173359-RunSteps$-9dc8f6b7-ca4c-4daa-a492-f3ee9a088aa6(Mscr-7) dst=null perm=null

I looked in the mapper's log for the messages you added to the DistCache, but I saw no WARN messages.
We are running the very latest 0.8.0-ch3 nightly builds of Scoobi.
Do you have some idea of which scoobi code is opening that kind of file and what could be the issue?

Best
Alex



Alex Cozzi

Aug 13, 2013, 12:04:16 AM
to scoobi...@googlegroups.com
I think we found the issue. If you look inside the DistCache.scala file:

 /** Get an object that has been distributed so as to be available for tasks in
* the current job. */
  def pullObject[T](configuration: Configuration, tag: String): Option[T] = {
    val path = mkPath(configuration, tag)

    lazy val remoteCacheFiles = Option(DistributedCache.getCacheFiles(configuration))getOrElse(Array[URI]())
    lazy val localCacheFiles = Option(DistributedCache.getLocalCacheFiles(configuration)).getOrElse(Array[Path]()).map(_.toUri)

    // use the local cached files on the cluster when the local files can be found
    val cacheFiles =
      if (localCacheFiles.nonEmpty) localCacheFiles
      else {
        logger.warn("there are no local cache files, using the distributed cache instead")
        remoteCacheFiles
      }

    cacheFiles.find(uri => uri.toString.endsWith(path.toString)).flatMap { case uri =>
      deserialise(configuration)(path)
    }
  }


Note this line:
  cacheFiles.find(uri => uri.toString.endsWith(path.toString)).flatMap { case uri =>
      deserialise(configuration)(path)
    }

Basically, it searches for the matching uri but then deserialises path instead. Since path carries no scheme, the default URI scheme is used, which is DFS, and the cache is bypassed altogether. We modified the same line to force it to use the local file path:

  cacheFiles.find(uri => uri.toString.endsWith(path.toString)).flatMap { case uri =>
      deserialise(configuration)(new Path(new URI("file://"+uri.toString)))
    }
 

and it seems to work correctly. I think Scoobi should be fixed either with our change or with a more elegant way of ensuring that DistCache reads the local file instead of the distributed file on DFS.
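
The scheme issue can be seen with plain java.net.URI: a bare path has a null scheme, and Hadoop resolves such a path against the default file system. A minimal sketch of one way to tag only schemeless URIs as local files; `FileScheme` and `withFileScheme` are hypothetical names, not part of Scoobi:

```scala
import java.net.URI

object FileScheme {
  /**
   * Return `uri` unchanged when it already names a scheme (hdfs, file, ...),
   * otherwise rebuild it with the "file" scheme so it cannot be mistaken
   * for a path on the default file system.
   */
  def withFileScheme(uri: URI): URI =
    if (uri.getScheme == null) new URI("file", null, uri.getPath, null)
    else uri
}
```

Using the multi-argument URI constructor instead of string concatenation keeps the path quoting correct; whether a given entry really is a local path still has to be decided by the caller.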

Best,
Alex 

Eric Torreborre

Aug 13, 2013, 3:02:17 AM
to scoobi...@googlegroups.com
Thanks Alex, this helps a lot. I'm going to learn more on this path/URI tweaking and see if I can come up with something correct.

E.

Joe Beynon

Aug 13, 2013, 4:56:34 AM
to scoobi...@googlegroups.com

I think we just need to be careful to add the file scheme only when using localCacheFiles.


Joseph Beynon

Aug 13, 2013, 4:11:09 PM
to scoobi...@googlegroups.com
After looking into this a bit more, we noticed that ScoobiMetadata.scala calls DistCache.deserialise directly instead of using pullObject. Alex and I looked at changing this, but it seems that there are multiple "scoobi.metadata" files and not all of them are added to the distributed cache. If you look at the mapred.cache.files property in the final job configuration, we see at least one metadata file, but when the tasks run they request a different one. There may be a timing problem between when the metadata object is created and added to the distributed cache and when the configuration is copied for submission to the job tracker.

Joseph Beynon

Aug 13, 2013, 7:28:37 PM
to scoobi...@googlegroups.com
OK, I think I've got more information. MapReduceJob.scala:configureKeysAndValues uses implicit ScoobiConfigurations passed into the TaggedPartitioners and others. But at this point in the code, the configuration has already been passed into the Job constructor, which makes a copy (I believe?), so to edit the configuration the job actually uses we need to pass in job.getConfiguration (see the configureReducers method). As a result, the TaggedPartitioner and others are modifying a different configuration object from the one that is actually submitted.
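
To make the copy problem concrete, here is a toy model in plain Scala. It is not Hadoop code: `ToyJob` is a made-up stand-in that snapshots its configuration at construction time, which is the behaviour assumed above for the Job constructor.

```scala
import scala.collection.mutable

object ConfigCopy {
  /** Toy stand-in for a job that copies its configuration when it is constructed. */
  final class ToyJob(conf: mutable.Map[String, String]) {
    // Snapshot taken here: later writes to `conf` are not seen by the job.
    private val copy: mutable.Map[String, String] =
      mutable.Map.empty[String, String] ++= conf

    /** The configuration the job would actually be submitted with. */
    def getConfiguration: mutable.Map[String, String] = copy
  }
}
```

In this model, writing a cache entry into the original map after the job has been constructed leaves `getConfiguration` untouched, so the entry only reaches the job when it is written through `getConfiguration`.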

Joseph Beynon

Aug 13, 2013, 10:46:19 PM
to scoobi...@googlegroups.com
Alex and I have dug in even deeper. It looks like after fixing MapReduceJob to use job.getConfiguration, there are other problems. They stem from the Configured trait in TaggedKey.scala. This trait uses "new Configuration" which creates a basic configuration instead of using the current job's configuration. So when these classes are initialized in the mapper they are trying to look up distributed cache files from a configuration that doesn't think anything has been cached. This looks challenging to fix because the correct configuration needs to be passed down from somewhere higher up. We're looking into it as much as we can but it'd be helpful if you guys could as well since you have a better idea of how all of these metadata files are being used. Thanks.

Eric Torreborre

Aug 13, 2013, 11:42:10 PM
to scoobi...@googlegroups.com
Hi Joseph, 

I am currently also trying to understand what's going on. I'll let you know when I have something that looks ok.

E.

Alex Cozzi

Aug 14, 2013, 12:04:17 AM
to scoobi...@googlegroups.com
Thanks Eric,
as a reference, you can see the changes we made so far here: https://github.com/xelax/scoobi/commit/29bd5fb451612201687686dec5a1b80dd20a236e
Alex

Eric Torreborre

Aug 14, 2013, 3:53:23 AM
to scoobi...@googlegroups.com
I am sorry that you have to deal with so much ugliness. At some stage it would be nice to rethink the whole configuration business, but for now I made a few modifications to the DistCache and other classes (inspired by your own changes). Please tell me how 0.8.0-SNAPSHOT works for you.

Eric Torreborre

Aug 14, 2013, 3:57:49 AM
to scoobi...@googlegroups.com
The previous message was shot too fast...

I wanted to add that I actually implemented the Configured trait by making it extend the Hadoop "Configurable" interface. Then, when using the ReflectionUtils class, the Configuration is automatically passed to the newly created objects, which allows them to grab things from the local/distributed cache if needed.
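
The injection mechanism can be mimicked in a few lines of plain Scala. This is a toy model only: `ToyConfigurable`, `newInstance` and `ToyKey` are invented for illustration, the configuration is a simple Map, and a factory function stands in for the reflective instantiation that Hadoop's ReflectionUtils performs.

```scala
object ConfInjection {
  type Conf = Map[String, String]

  /** Toy counterpart of a configurable object: it receives its configuration after creation. */
  trait ToyConfigurable {
    private var conf: Conf = Map.empty
    def setConf(c: Conf): Unit = { conf = c }
    def getConf: Conf = conf
  }

  /** Create an instance and hand it `conf` when the instance is able to accept one. */
  def newInstance[T](create: () => T, conf: Conf): T = {
    val instance = create()
    instance match {
      case c: ToyConfigurable => c.setConf(conf)
      case _                  => ()
    }
    instance
  }

  /** Hypothetical key class that needs the job configuration to locate its metadata. */
  final class ToyKey extends ToyConfigurable {
    def metadataPath: Option[String] = getConf.get("metadata.path")
  }
}
```

An object created through `newInstance` sees the job's settings, while one created with a bare `new` sees an empty configuration, which mirrors the "new Configuration" problem described earlier in the thread.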

Then in the DistCache, when pulling objects out, I search through the possible candidate files, starting with the local ones (with "file://"), until one of them works.
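
A "try each candidate until one opens" search might look roughly like the sketch below. It is an assumption-laden illustration, not the Scoobi implementation: it reads from the local file system through java.nio only, whereas the real code would go through Hadoop's file system API, and `CacheSearch` / `firstReadable` are made-up names.

```scala
import java.net.URI
import java.nio.file.{Files, Paths}
import scala.util.Try

object CacheSearch {
  /**
   * Try the candidates in the given order and return the contents of the
   * first one that can be read. Candidates that are missing, or whose scheme
   * is not supported by java.nio, are skipped. The iterator is lazy, so
   * nothing after the first success is touched.
   */
  def firstReadable(candidates: Seq[URI]): Option[Array[Byte]] =
    candidates.iterator
      .map(uri => Try(Files.readAllBytes(Paths.get(uri))).toOption)
      .collectFirst { case Some(bytes) => bytes }
}
```

Putting the "file:" candidates first in the sequence is what keeps reads off the namenode whenever a local copy exists.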

I still need to wait for our cluster tests to validate that there's no regression but hopefully things will be ok (famous last words).

E.

Alex Cozzi

Aug 14, 2013, 1:36:35 PM
to scoobi...@googlegroups.com
Thank you so much for your help, Eric, really appreciated.

I think that we are almost there, but the job fails whenever a mapper needs to spill. I think that the job configuration is somehow not passed down after the spill: you can see that the first time the file is read from the cache it works, but after the spill it fails. Attached is the log from one such mapper. Also, I see that 0.8.0-SNAPSHOT has been updated on Maven, but not the cdh3 version.

2013-08-14 10:18:44,619 INFO scoobi.MapTask: Starting on myhost
2013-08-14 10:18:44,619 INFO scoobi.MapTask: Input is hdfs://myhost:8020/sys/edw/dw_lstg_item/snapshot/2013/08/13/00/part-r-00001:536870912+536870912 (on channel:11)
2013-08-14 10:18:44,671 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/d26cc3b1-cc53-44da-a3b1-91c62cdab253
2013-08-14 10:18:44,673 INFO scoobi.DistCache: trying to open: file:/hadoop/1/scratch/taskTracker/distcache/-8498768754780428785_1337695511_2111141465/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/d26cc3b1-cc53-44da-a3b1-91c62cdab253
2013-08-14 10:18:44,674 INFO scoobi.DistCache: successfully opened: file:/hadoop/1/scratch/taskTracker/distcache/-8498768754780428785_1337695511_2111141465/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/d26cc3b1-cc53-44da-a3b1-91c62cdab253
2013-08-14 10:18:44,676 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/086a56ca-7169-4964-8ff1-92180a37e23f
2013-08-14 10:18:44,677 INFO scoobi.DistCache: trying to open: file:/hadoop/2/scratch/taskTracker/distcache/-5392923895352759590_951072140_2111141469/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/086a56ca-7169-4964-8ff1-92180a37e23f
2013-08-14 10:18:44,679 INFO scoobi.DistCache: successfully opened: file:/hadoop/2/scratch/taskTracker/distcache/-5392923895352759590_951072140_2111141469/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/env/086a56ca-7169-4964-8ff1-92180a37e23f
2013-08-14 10:18:44,807 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TP90
2013-08-14 10:18:44,808 INFO scoobi.DistCache: trying to open: file:/hadoop/7/scratch/taskTracker/distcache/-2772926115251135789_-2069565859_2111141904/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TP90
2013-08-14 10:18:44,810 INFO scoobi.DistCache: successfully opened: file:/hadoop/7/scratch/taskTracker/distcache/-2772926115251135789_-2069565859_2111141904/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TP90
2013-08-14 10:18:44,878 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TK90
2013-08-14 10:18:44,879 INFO scoobi.DistCache: trying to open: file:/hadoop/5/scratch/taskTracker/distcache/7027002775065643939_-2069570664_2111141749/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TK90
2013-08-14 10:18:44,880 INFO scoobi.DistCache: successfully opened: file:/hadoop/5/scratch/taskTracker/distcache/7027002775065643939_-2069570664_2111141749/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TK90
2013-08-14 10:18:44,913 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TV90
2013-08-14 10:18:44,914 INFO scoobi.DistCache: trying to open: file:/hadoop/6/scratch/taskTracker/distcache/3089233342640131732_-2069560093_2111141869/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TV90
2013-08-14 10:18:44,916 INFO scoobi.DistCache: successfully opened: file:/hadoop/6/scratch/taskTracker/distcache/3089233342640131732_-2069560093_2111141869/myhost/tmp/scoobi-username/scoobi-20130814-101748-GbxTranslation$-f1b6777c-0e09-441f-b80d-9b19c6bbb8b8/dist-objs/scoobi.metadata.TV90
2013-08-14 10:20:01,885 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
2013-08-14 10:20:01,885 INFO org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 72477576; bufvoid = 456340272
2013-08-14 10:20:01,885 INFO org.apache.hadoop.mapred.MapTask: kvstart = 0; kvend = 4026532; length = 5033165
2013-08-14 10:20:01,930 INFO scoobi.DistCache: trying to pull an object from the cache at path: /tmp/scoobi-username/scoobi-20130814-102001-6a49f75c-0432-4183-bb01-ad2a7484aadb/dist-objs/scoobi.metadata.TK90
2013-08-14 10:20:01,932 INFO scoobi.DistCache: trying to open: /tmp/scoobi-username/scoobi-20130814-102001-6a49f75c-0432-4183-bb01-ad2a7484aadb/dist-objs/scoobi.metadata.TK90
2013-08-14 10:20:01,937 ERROR scoobi.DistCache: No successfully opened path. The cache files which were used are
/tmp/scoobi-username/scoobi-20130814-102001-6a49f75c-0432-4183-bb01-ad2a7484aadb/dist-objs/scoobi.metadata.TK90

2013-08-14 10:20:01,943 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2013-08-14 10:20:01,944 INFO org.apache.hadoop.mapred.MapTask: Ignoring exception during close for org.apache.hadoop.mapred.MapTask$NewOutputCollector@779a639b
java.io.IOException: Spill failed
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1296)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:697)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1792)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:778)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:313)
at scala.None$.get(Option.scala:311)
at com.nicta.scoobi.impl.rtt.ScoobiMetadata$$anonfun$metadata$1$$anonfun$apply$1.apply(ScoobiMetadata.scala:45)
at com.nicta.scoobi.impl.rtt.ScoobiMetadata$$anonfun$metadata$1$$anonfun$apply$1.apply(ScoobiMetadata.scala:43)
at scalaz.MemoFunctions$$anonfun$mutableMapMemo$1$$anonfun$apply$2$$anonfun$apply$3.apply(Memo.scala:67)
at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:189)
at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:91)
at scalaz.MemoFunctions$$anonfun$mutableMapMemo$1$$anonfun$apply$2.apply(Memo.scala:67)
at com.nicta.scoobi.impl.rtt.ScoobiMetadata$$anonfun$metadata$1.apply(ScoobiMetadata.scala:46)
at com.nicta.scoobi.impl.rtt.ScoobiMetadata$$anonfun$metadata$1.apply(ScoobiMetadata.scala:43)
at com.nicta.scoobi.impl.rtt.TaggedMetadata$class.metaDatas(ScoobiMetadata.scala:54)
at com.nicta.scoobi.impl.rtt.MetadataTaggedKey.metaDatas$lzycompute(TaggedKey.scala:47)
at com.nicta.scoobi.impl.rtt.MetadataTaggedKey.metaDatas(TaggedKey.scala:47)
at com.nicta.scoobi.impl.rtt.TaggedMetadata$class.tags(ScoobiMetadata.scala:55)
at com.nicta.scoobi.impl.rtt.MetadataTaggedKey.tags$lzycompute(TaggedKey.scala:47)
at com.nicta.scoobi.impl.rtt.MetadataTaggedKey.tags(TaggedKey.scala:47)
at com.nicta.scoobi.impl.rtt.MetadataTaggedWritable$class.readFields(Tagged.scala:70)
at com.nicta.scoobi.impl.rtt.MetadataTaggedKey.readFields(TaggedKey.scala:47)
at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:97)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1115)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:95)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:59)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1403)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:857)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1348)

Alex Cozzi

Aug 14, 2013, 5:33:06 PM
to scoobi...@googlegroups.com
I think that the problem is that we get two ScoobiConfiguration objects instantiated on the mapper: the job tries to pull the same metadata, but the temp directory does not match, hence the lookup fails.

Alex Cozzi

Aug 14, 2013, 5:34:09 PM
to scoobi...@googlegroups.com
And, by the way, the job also fails when running in local mode. For ease of testing you can just run WordCount in local mode and track down the failure; you do not need to run on the cluster.

Eric Torreborre

Aug 15, 2013, 12:19:25 AM
to scoobi...@googlegroups.com
Hi Alex,

I've been fighting Hadoop like hell to work around the fact that we need to access cached files when instantiating some classes for Scoobi. The latest 0.8.0-SNAPSHOT looks better, but I'm going to reserve my judgment until all the tests pass. If they do on our Jenkins server, a cdh3 version should also be available shortly after.

E.

Alex Cozzi

Aug 15, 2013, 12:40:39 AM
to scoobi...@googlegroups.com
Absolutely, 
what I wanted to say is that running WordCount with the latest 0.8.0-SNAPSHOT appears to be broken, so yes, please do not publish the cdh3 version yet. The latest 0.8.0-SNAPSHOT is surely an improvement: part of the metadata is now correctly pulled from local files, but it appears that the changes broke something.
Again, thank you for your help.
Alex

Alex Cozzi

Aug 15, 2013, 1:04:21 AM
to scoobi...@googlegroups.com
Hi Eric,
I tested the latest commit (https://github.com/NICTA/scoobi/commit/88eedc20e0f06d934b41de353130a3993b1ead11) and it seems to have fixed the problems. I am running a big job right now on the cluster, but it looks good so far!
I will let you know how it goes.
Thanks

Eric Torreborre

Aug 15, 2013, 3:17:35 AM
to scoobi...@googlegroups.com
Yes, things finally look better. I'm still waiting for our cluster tests to finish and hopefully everything will be fine.

Alex Cozzi

Aug 15, 2013, 8:05:14 PM
to scoobi...@googlegroups.com
I think that the problem is fixed; at least our jobs seem to be running correctly.
I wonder whether they also run more quickly now, or whether they just no longer hammer the namenode with thousands of requests for the same file...
Alex

Alex Cozzi

Aug 15, 2013, 8:07:32 PM
to scoobi...@googlegroups.com
Oh, small nitpick: the job names are now just "mscr1", "mscr2", etc., which are a little less descriptive than before, but I will take this over a non-working job any day.
Thank you again for your responsiveness and commitment, highly appreciated!
Alex

Eric Torreborre

Aug 15, 2013, 8:10:53 PM
to scoobi...@googlegroups.com
Ok, that's a consequence of reducing the size of the paths for metadata. I'll try to fix that today so that the job name gets back its old descriptive full name.

Eric Torreborre

Aug 15, 2013, 8:11:41 PM
to scoobi...@googlegroups.com
Indeed, not descriptive at all :-)

job_201303161951_42419  NORMAL  etorreborre  mscr14  SUCCEEDED  Fri Aug 16 10:07:05 EST 2013  Fri Aug 16 10:07:44 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42416  NORMAL  etorreborre  mscr47  SUCCEEDED  Fri Aug 16 10:06:33 EST 2013  Fri Aug 16 10:07:30 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42418  NORMAL  etorreborre  mscr48  SUCCEEDED  Fri Aug 16 10:06:38 EST 2013  Fri Aug 16 10:07:18 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42417  NORMAL  etorreborre  mscr51  SUCCEEDED  Fri Aug 16 10:06:35 EST 2013  Fri Aug 16 10:07:13 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42413  NORMAL  etorreborre  mscr45  SUCCEEDED  Fri Aug 16 10:06:13 EST 2013  Fri Aug 16 10:07:05 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42412  NORMAL  etorreborre  mscr44  SUCCEEDED  Fri Aug 16 10:05:59 EST 2013  Fri Aug 16 10:06:58 EST 2013  100.00%  100.00%  NA  NA
job_201303161951_42414  NORMAL  etorreborre  mscr13  SUCCEEDED  Fri Aug 16 10:06:14 EST 2013  Fri Aug 16 10:06:57 EST 2013  100.00%  100.00%  NA  NA

Eric Torreborre

Aug 15, 2013, 9:07:52 PM
to scoobi...@googlegroups.com
I did the change now. The new job names are of the form:

SimpleDListsSpec-0816-105514-1086864328-mscr1

Where: 

 - SimpleDListsSpec is the application name (which you can set on the ScoobiConfiguration object with the jobNameIs method)
 - 0816-105514 is the date/time (I removed the year)
 - 1086864328 is the hashcode of a UUID. This is shorter than the UUID we had before, and I guess it should be enough to ensure proper isolation of tests when running lots of them from the same class
 - mscr1 is the id of the exact map reduce job being executed for a given Scoobi job
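
For reference, a name of this shape could be assembled as in the sketch below. It only illustrates the format described in the list; `JobNames.jobName` and its parameters are hypothetical, and the actual Scoobi code may build the name differently.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale, UUID}

object JobNames {
  /**
   * Build "<app>-<MMdd-HHmmss>-<uuid hashcode>-mscr<n>".
   * The date is formatted without the year, in the JVM's default time zone.
   * Note that a UUID hashcode is an Int and can be negative, in which case
   * the name contains a double dash.
   */
  def jobName(app: String, now: Date, uuid: UUID, mscr: Int): String = {
    val stamp = new SimpleDateFormat("MMdd-HHmmss", Locale.ROOT).format(now)
    s"$app-$stamp-${uuid.hashCode}-mscr$mscr"
  }
}
```

For example, `JobNames.jobName("SimpleDListsSpec", new Date(), UUID.randomUUID(), 1)` yields a name with the same four parts as the one shown above.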

I'm open to any suggestions if you think that something else makes sense from an ops point of view.

E.

Alex Cozzi

Aug 15, 2013, 11:49:06 PM
to scoobi...@googlegroups.com
Looks great. The only thing I think is missing is an indication of how far along the job is, i.e. something like Step 5/13 for a 13-step job. Is mscr = step? If so, I'd just call it "step" instead.

Eric Torreborre

Aug 16, 2013, 8:22:14 AM
to scoobi...@googlegroups.com
Great ideas, I'll do that.