Hi,
I noticed that no matter what I set "overwrite" to, TextOutput always overwrites the output, whereas it works fine for sequence files and, I think, Avro files. For some reason, with TextOutput the path that gets checked is the temporary output path, whose contents are only later copied to the final output path. The check therefore never sees the existing output, and the job always "overwrites".
I added some print statements to see what is going on, but I am having trouble working out why the tmp path is passed to the sink in one case but not the other.
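
To make the suspected failure mode concrete, here is a minimal sketch in plain Scala, with no Scoobi or Hadoop involved. The `outputCheck` function is a hypothetical stand-in for a sink's check, not Scoobi's actual implementation, and the directory names are made up for illustration. It only shows that an "exists and not overwrite" check cannot fail when it is handed a freshly generated temporary path instead of the final one.

```scala
import java.nio.file.{Files, Path}
import java.util.UUID

object OverwriteCheckSketch {
  // Hypothetical stand-in for a sink's output check (an assumption, not Scoobi code):
  // refuse to run when the output already exists and overwrite is false.
  def outputCheck(output: Path, overwrite: Boolean): Either[String, Unit] =
    if (Files.exists(output) && !overwrite) Left("Output path already exists: " + output)
    else Right(())

  def main(args: Array[String]): Unit = {
    // Plays the role of the existing final output directory ("test" in the logs).
    val finalOutput = Files.createTempDirectory("test")
    // Plays the role of the temporary bridge path: unique per run, so it never exists yet.
    val bridgePath = finalOutput.resolveSibling("bridge-" + UUID.randomUUID())

    // Checking the final path rejects the run, as expected with overwrite = false.
    println(outputCheck(finalOutput, overwrite = false))
    // Checking the temporary path always passes, whatever overwrite is set to.
    println(outputCheck(bridgePath, overwrite = false))

    Files.delete(finalOutput)
  }
}
```

If this matches what happens inside Scoobi, the fix would be to run the check against the final path rather than the bridge path, but I have not confirmed that.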
Logs for sequence file output:
[INFO] ScoobiApp - the URL of Java (evidenced with the java.lang.String class) is jar:file:/usr/lib/jvm/java-6-sun-1.6.0.21/jre/lib/rt.jar!/java/lang/String.class
[INFO] ScoobiApp - the URL of Scala (evidenced with the scala.collection.immutable.Range class) is jar:file:/home/elliot/.sbt/boot/scala-2.10.1/lib/scala-library.jar!/scala/collection/immutable/Range.class
[INFO] ScoobiApp - the URL of Hadoop (evidenced with the org.apache.hadoop.io.Writable class) is jar:file:/home/elliot/.ivy2/cache/org.apache.hadoop/hadoop-common/jars/hadoop-common-2.0.0-cdh4.0.1.jar!/org/apache/hadoop/io/Writable.class
[INFO] ScoobiApp - the URL of Avro (evidenced with the org.apache.avro.Schema class) is jar:file:/home/elliot/.ivy2/cache/org.apache.avro/avro/jars/avro-1.7.4.jar!/org/apache/avro/Schema.class
[INFO] ScoobiApp - the URL of Kiama (evidenced with the org.kiama.rewriting.Rewriter class) is jar:file:/home/elliot/.ivy2/cache/com.googlecode.kiama/kiama_2.10/jars/kiama_2.10-1.5.0-SNAPSHOT.jar!/org/kiama/rewriting/Rewriter.class
[INFO] ScoobiApp - the URL of Scoobi (evidenced with the com.nicta.scoobi.core.ScoobiConfiguration class) is file:/media/data/Documents/code/git/scoobi/target/scala-2.10/classes/com/nicta/scoobi/core/ScoobiConfiguration.class
**** In ExecutionMode.checkSourceAndSinks ****
output path: Some(test)
***** In SeqSink.outputCheck *****
output: test
output exists: true
overwrite: false
...
Logs for text file output:
[info] Running com.nicta.scoobi.io.text.Foo txt 0
[INFO] ScoobiApp - the URL of Java (evidenced with the java.lang.String class) is jar:file:/usr/lib/jvm/java-6-sun-1.6.0.21/jre/lib/rt.jar!/java/lang/String.class
[INFO] ScoobiApp - the URL of Scala (evidenced with the scala.collection.immutable.Range class) is jar:file:/home/elliot/.sbt/boot/scala-2.10.1/lib/scala-library.jar!/scala/collection/immutable/Range.class
[INFO] ScoobiApp - the URL of Hadoop (evidenced with the org.apache.hadoop.io.Writable class) is jar:file:/home/elliot/.ivy2/cache/org.apache.hadoop/hadoop-common/jars/hadoop-common-2.0.0-cdh4.0.1.jar!/org/apache/hadoop/io/Writable.class
[INFO] ScoobiApp - the URL of Avro (evidenced with the org.apache.avro.Schema class) is jar:file:/home/elliot/.ivy2/cache/org.apache.avro/avro/jars/avro-1.7.4.jar!/org/apache/avro/Schema.class
[INFO] ScoobiApp - the URL of Kiama (evidenced with the org.kiama.rewriting.Rewriter class) is jar:file:/home/elliot/.ivy2/cache/com.googlecode.kiama/kiama_2.10/jars/kiama_2.10-1.5.0-SNAPSHOT.jar!/org/kiama/rewriting/Rewriter.class
[INFO] ScoobiApp - the URL of Scoobi (evidenced with the com.nicta.scoobi.core.ScoobiConfiguration class) is file:/media/data/Documents/code/git/scoobi/target/scala-2.10/classes/com/nicta/scoobi/core/ScoobiConfiguration.class
**** In ExecutionMode.checkSourceAndSinks ****
output path: Some(/tmp/scoobi-elliot/scoobi-20130423-231331-Foo$-970d7dd3-5a9f-482d-9f9d-130d8aaad7c5/bridges/4bc386fe-b390-42ad-b466-72508da6f571)
[INFO] HadoopMode - Executing layers
Layer(1
ParallelDo (9)[Traversable[Byte]->B,(Int,String),(Unit,Unit)] (bridge 73b4a) [sinks: Some(test)])
...