PublishDir warning - Failed to publish file - published regardless.


Paul

May 28, 2020, 10:17:40 AM
to Nextflow
Dear all,

I'm running Nextflow version 20.04.1 build 5335.

While running a Nextflow workflow, I get a warning on the terminal along these lines:

WARN  nextflow.processor.PublishDir - Failed to publish file

The full error:

May-27 18:03:35.984 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[jobId: 54393; id: 5; name: processHumann2Regroup (shrimpsauce_evaluation); status: COMPLETED; exit: 0; error: -; workDir: /mnt/nfs/testfolder/paul/Testing_metatranscriptomics/rundir/standard_test_shrimpsauce/work/16/5a800fd23db937e0ae8396f0a0a2d3 started: 1590595285974; exited: 2020-05-27T16:03:10.127535Z; ]
May-27 18:03:36.014 [FileTransfer-thread-22] WARN  nextflow.processor.PublishDir - Failed to publish file: /mnt/nfs/testfolder/paul/Testing_metatranscriptomics/rundir/standard_test_shrimpsauce/work/16/5a800fd23db937e0ae8396f0a0a2d3/shrimpsauce_evaluation_eggnog.txt; to: /mnt/nfs/testfolder/paul/Testing_metatranscriptomics/rundir/standard_test_shrimpsauce/metatranscriptome_analysis_270520_1608/shrimpsauce_evaluation/01_alignment/humann2/eggnog/shrimpsauce_evaluation_eggnog.txt [copy] -- See log file for details
java.nio.file.NoSuchFileException: /mnt/nfs/testfolder/paul/Testing_metatranscriptomics/rundir/standard_test_shrimpsauce/metatranscriptome_analysis_270520_1608/shrimpsauce_evaluation/01_alignment/humann2/eggnog/shrimpsauce_evaluation_eggnog.txt
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.implDelete(UnixFileSystemProvider.java:244)
        at sun.nio.fs.AbstractFileSystemProvider.delete(AbstractFileSystemProvider.java:103)
        at java.nio.file.Files.delete(Files.java:1126)
        at nextflow.file.FileHelper.deletePath(FileHelper.groovy:901)
        at nextflow.processor.PublishDir.processFile(PublishDir.groovy:279)
        at nextflow.processor.PublishDir.safeProcessFile(PublishDir.groovy:259)
        at sun.reflect.GeneratedMethodAccessor191.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
        at org.codehaus.groovy.runtime.InvokerHelper.invokePogoMethod(InvokerHelper.java:1011)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethod(InvokerHelper.java:994)
        at org.codehaus.groovy.runtime.InvokerHelper.invokeMethodSafe(InvokerHelper.java:97)
        at nextflow.processor.PublishDir$_apply1_closure1.doCall(PublishDir.groovy:232)
        at nextflow.processor.PublishDir$_apply1_closure1.call(PublishDir.groovy)
        at groovy.lang.Closure.run(Closure.java:486)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


The strange thing is that the file seems to have been published just fine: it is in the work dir and in the publish dir, and the two copies are identical. With Nextflow version 19.04.1 I don't get this warning.

Perhaps declaring the same file as output more than once causes this issue? The process that triggers the warning:

process processHumann2Regroup {
    tag "${pair_id}"
    // One publishDir per output type; each pattern selects the files copied to that folder.
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/kegg_pathway", mode: 'copy', pattern: "*_kegg*.txt"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/eggnog", mode: 'copy', pattern: "*_eggnog*.txt"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/pfam", mode: 'copy', pattern: "*_pfam*.txt"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/goterms", mode: 'copy', pattern: "*_goterms*.txt"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/infogo", mode: 'copy', pattern: "*_infogoterms*.txt"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/level4ec", mode: 'copy', pattern: "*_level4ec*.txt"
    // No space inside the braces: a space would become part of the glob alternative.
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/ko_pathway", mode: 'copy', pattern: "*_kopath*.{txt,tsv}"
    publishDir "${outdir_datetime}/${pair_id}/01_alignment/humann2/kegg_module", mode: 'copy', pattern: "*_modules*.{txt,tsv}"

    input:
    set val(pair_id), file(concat), file(input_r1), file(input_r2), file(gene), file(unaligned) from out_runHumann2

    output:
    // Note: "${pair_id}_eggnog.txt" is matched by "*" and also declared in three more outputs below.
    file "*"
    set val(pair_id), file("${pair_id}_eggnog.txt") into out_humann2_eggnog_ch2
    set val(pair_id), file("${pair_id}_eggnog.txt"), file("${pair_id}_kegg.txt") into out_humann2_eggnog_kegg_ch
    set val(pair_id), file(concat), file(input_r1), file(input_r2), file(gene), file(unaligned), file("${pair_id}_eggnog.txt"), file("${pair_id}_kegg.txt") into out_processHumann2Regroup

    script:
    """
    stuff
    """
}

Paolo Di Tommaso

May 31, 2020, 6:00:50 AM
to nextflow
It could be caused by the same file being published to the same target folder two or more times. Since the copying runs in parallel, this can cause such an error.
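To make that concrete, here is a minimal Java sketch of one interleaving that would produce the NoSuchFileException seen in the stack trace, where `Files.delete` is called from `FileHelper.deletePath`. It is only an illustration under assumptions: the two "jobs" are replayed sequentially in a fixed order rather than on real threads, the file name `sample_eggnog.txt` and the class `PublishRaceDemo` are hypothetical, and this is not Nextflow's actual publishing code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class PublishRaceDemo {

    /**
     * Replays, in a fixed order, an interleaving that two parallel copy jobs
     * could hit when both publish to the same target path.
     * Returns true if the strict delete throws NoSuchFileException.
     */
    static boolean strictDeleteFailsAfterConcurrentRemoval() {
        try {
            Path dir = Files.createTempDirectory("publish-demo");
            Path target = dir.resolve("sample_eggnog.txt"); // hypothetical file name
            Files.write(target, "stale copy".getBytes());

            // Job A sees an existing target and decides to replace it.
            boolean jobASawTarget = Files.exists(target);

            // Job B, publishing the same file, removes the target first.
            Files.delete(target);

            boolean failed = false;
            if (jobASawTarget) {
                try {
                    Files.delete(target); // Job A: the target is already gone
                } catch (NoSuchFileException e) {
                    failed = true;        // same exception type as in the log
                }
            }
            Files.delete(dir);
            return failed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Files.deleteIfExists tolerates a missing target instead of throwing. */
    static boolean tolerantDeleteSucceeds() {
        try {
            Path dir = Files.createTempDirectory("publish-demo");
            Path target = dir.resolve("sample_eggnog.txt"); // never created
            boolean deleted = Files.deleteIfExists(target); // false, no exception
            Files.delete(dir);
            return !deleted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("strict delete failed: " + strictDeleteFailsAfterConcurrentRemoval());
        System.out.println("tolerant delete ok:   " + tolerantDeleteSucceeds());
    }
}
```

If this is indeed what happens, it would also explain why the file still ends up in the publish dir: the other job completes its copy, so only the losing job logs the warning.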

Hope it helps 


--
You received this message because you are subscribed to the Google Groups "Nextflow" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nextflow+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nextflow/bf1101f3-e0ea-4aa5-974b-e78e53696066%40googlegroups.com.

MB

May 31, 2020, 6:24:26 AM
to Nextflow
Hey Paolo,

Thank you for the clarification, we were having the same issue.
However, we sometimes need the same file in separate output channels, as input for different processes. We could of course declare a single output channel and split it into separate smaller channels afterwards, but this greatly reduces the readability of the pipeline and, I think, isn't the intended use of output channels.

Could this perhaps be "fixed" in a future version of Nextflow? For example, by pooling all output files, ignoring duplicates, and only then passing them on to the publishDir directive?
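A rough sketch of the pooling idea, written in Java purely for illustration: collect every declared output path, drop the repeats, and publish each remaining path once. The class `PublishDedupDemo`, the method `uniqueTargets`, and the file names are hypothetical, and nothing here reflects how Nextflow is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PublishDedupDemo {

    /** Returns the declared output paths with repeats removed, keeping first-seen order. */
    static List<String> uniqueTargets(List<String> declaredOutputs) {
        Set<String> seen = new LinkedHashSet<>(declaredOutputs);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // The same file declared in several output lines, as in the process above.
        List<String> declared = Arrays.asList(
                "sample_eggnog.txt",  // matched by file "*"
                "sample_eggnog.txt",  // first set(...) output
                "sample_eggnog.txt",  // second set(...) output
                "sample_kegg.txt",
                "sample_eggnog.txt",  // third set(...) output
                "sample_kegg.txt");

        // Each path would then be handed to the publishing step exactly once.
        for (String path : uniqueTargets(declared)) {
            System.out.println("publish once: " + path);
        }
    }
}
```

With the pooled list, each file would be copied to its target a single time, so no two copy jobs would compete for the same destination.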

On Sunday, May 31, 2020 at 12:00:50 UTC+2, Paolo Di Tommaso wrote:

Paolo Di Tommaso

May 31, 2020, 11:01:58 AM
to nextflow
Please report an issue with a reproducible test case. Thanks!


 p
