Handling of output Files

clw.ge...@gmail.com

Nov 26, 2014, 2:54:05 PM

to next...@googlegroups.com

Hi Everyone,

I'm just getting into Nextflow (started today), so excuse me if this is a little naive, but I have a question about the handling of output files.

When running a pipeline like this:

myDir = "./"
params.folder = file(myDir)

process read_data {
  output:
  file "${dir}/bla*" into read

  script:
  dir = params.folder

  """
  echo bla > ${dir}/bla.txt
  """

}


process print {
  input:
  file(x) from read
  output:
  stdout recieve

  script:

  """
  cat $x
  """
}

recieve.subscribe { println it }

I get the error

missing output file(s): '/path/to/dir/bla*' expected by process: read_data (1)

even though the file /path/to/dir/bla.txt was created.

If I remove the ${dir} from the code, the file gets put into the standard "work/XX/XXXXXXXXXXXXX" folder and everything works.

I suspect Nextflow needs the files in its native structure to be able to handle channels appropriately, and that's why one can't just put the files anywhere?

Is there a way to automatically link the files from the native file structure to somewhere else, so that intermediate files can be handled more easily, or does one just have to work with the file structure as it is built up natively?

Paolo Di Tommaso

Nov 26, 2014, 4:33:16 PM

to nextflow

Nextflow is designed in such a way that you do not need to organise your intermediate files in a directory structure yourself.

For three reasons:

1) Simplify your work: think in terms of files, not paths or directories;

2) Avoid race conditions when your jobs are executed in parallel;

3) Allow you to resume the pipeline execution from the last successfully executed step if it stops for any reason.

So, you are right. You don't have to use the ${dir} variable to force a process to write the files in that folder.

If you need to copy some results to a specific place, you can do that outside the process scope, for example:

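params.folder = "./"

// Resolve the target folder and create it if it does not exist
myDir = file(params.folder)
myDir.mkdirs()

process read_data {
  output:
  file "bla*" into read

  """
  echo bla > bla.txt
  """
}

// Copy each emitted file from the work directory into the target folder
read.subscribe { it.copyTo(myDir) }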
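
A possible alternative to copying files in a subscribe block is the publishDir process directive. It is not mentioned in this thread, so the following is only a minimal sketch: it assumes a Nextflow version that provides publishDir and still accepts the channel syntax used above. The process keeps writing into its own work directory, and Nextflow then places the declared outputs in the target folder. By default publishDir creates symbolic links, which is close to the "automatic linking" asked about; mode: 'copy' makes real copies instead.

```groovy
params.folder = "./"

process read_data {
  // Assumption: the publishDir directive is available in the Nextflow version in use.
  // Declared outputs are published into params.folder once the task completes.
  publishDir params.folder, mode: 'copy'

  output:
  file "bla*" into read

  """
  echo bla > bla.txt
  """
}

// The channel still carries the file located in the work directory
read.subscribe { println "Created: $it" }
```

The script block itself stays free of absolute paths, so the points about race conditions and resuming still apply.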