Missing output file(s) expected by process (dsl2)


Jong Ha Shin

Jan 31, 2022, 5:31:15 PM
to Nextflow
Hi,

I am a new Nextflow user and I am having a problem running a pipeline.
I would be grateful if someone could give me some insight.

What I am doing here is running bcl2fastq in Nextflow and passing the resulting fastq.gz files to a fastqc process.
However, I keep getting an error saying that output files are missing, even though I can see that the files were generated correctly by bcl2fastq.

-------------------------------------------------------------------------------------------------------------
Error message:
[cf/0f2116] process > bcl2fastq [100%] 1 of 1, failed: 1 ✘
[-        ] process > fastqc    -
Something when wrong, look up the log
Error executing process > 'bcl2fastq'

Caused by:
  Missing output file(s) `*.fastq.gz` expected by process `bcl2fastq`
-------------------------------------------------------------------------------------------------------------
Nextflow script:

nextflow.enable.dsl=2

$params.seq_dir = some directory (/net/vol9/seqdir/BRCA1/nameofseqdir)
$params.out_dir = some directory (/net/vol1/data/nextflow_cluster_test/nobackup)
$params.sample_sheet = some file (/net/vol1/data/samplesheet.csv)
(These three parameters are defined in params.config)

process bcl2fastq {
    publishDir "$params.out_dir/bcl2fastq/", mode: 'copy'
    echo "bcl2fastq is running"

    output:
        path "*.fastq.gz", emit: bcl2
    """
    #!/usr/bin/env bash
    bcl2fastq \
        -R $params.seq_dir -o $params.out_dir/bcl2fastq \
        --sample-sheet $params.sample_sheet --interop-dir $params.out_dir/bcl2fastq \
        --no-lane-splitting --use-bases-mask Y*,I*,I*,Y* \
        --minimum-trimmed-read-length 0 --mask-short-adapter-reads 0
    """
}
process fastqc {
    publishDir "$params.out_dir/bcl2fastq/fastqc_out", mode: 'copy'
   
    input:
        path bcl2
    """
    fastqc \
        $bcl2 -o $params.out_dir/bcl2fastq/fastqc_out -t 4
    """
}
workflow {
    main:
        bcl2fastq()
        fastqc(bcl2fastq.out)
}
-------------------------------------------------------------------------------------------------------------

drhp...@gmail.com

Feb 1, 2022, 6:40:58 AM
to Nextflow

Hi Jong,

It looks like the FastQ files in the bcl2fastq process are created in a sub-directory rather than at the top level of the task's working directory, which may be why NF can't find them.

I have fixed a few things in your example and put together a prototype below. In general you shouldn't use params.out_dir in the script section of a process, because where the process runs and where the results are published can be completely independent. I have also added explicit inputs for seq_dir and the sample sheet, because these need to be staged properly by NF as inputs. You may have to play with this a little to get it working, but it should be pretty much there.

process bcl2fastq {
    publishDir "$params.out_dir/bcl2fastq/", mode: 'copy'

    input:
    path seq_dir
    path sample_sheet

    output:
    path "bcl2fastq/*.fastq.gz", emit: bcl2

    """
    bcl2fastq \\
        -R $seq_dir \\
        -o ./bcl2fastq \\
        --sample-sheet $sample_sheet \\
        --interop-dir ./bcl2fastq \\
        --no-lane-splitting \\
        --use-bases-mask Y*,I*,I*,Y* \\
        --minimum-trimmed-read-length 0 \\
        --mask-short-adapter-reads 0
    """
}
process fastqc {
    publishDir "$params.out_dir/bcl2fastq/fastqc_out", mode: 'copy'
   
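    input:
    path bcl2

    // Declare the reports as outputs so that publishDir has something to copy
    output:
    path "fastqc/*"

    """
    # FastQC does not create the output directory itself
    mkdir -p fastqc
    fastqc \\
        $bcl2 -o ./fastqc -t 4
    """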
}

workflow {
    main:
        bcl2fastq (
            params.seq_dir,
            params.sample_sheet
        )

        fastqc (
            bcl2fastq.out.bcl2
        )
}
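
If it helps to see why the original `*.fastq.gz` pattern matched nothing, the situation can be reproduced without bcl2fastq at all. This is only a rough shell sketch: the directory layout, the sample file name, and the `count_matches` helper are made up for illustration. It assumes, as in Nextflow, that output patterns are resolved relative to the task's working directory.

```shell
#!/usr/bin/env bash
# Toy reproduction; every path and file name below is hypothetical.
set -euo pipefail

scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT

work_dir="$scratch/work/cf/0f2116"    # stands in for the Nextflow task directory
out_dir="$scratch/results/bcl2fastq"  # stands in for $params.out_dir/bcl2fastq
mkdir -p "$work_dir" "$out_dir"
cd "$work_dir"

# count_matches PATTERN: count existing files matching the glob, relative to the cwd
count_matches() {
    local n=0 f
    for f in $1; do          # unquoted on purpose so the glob expands
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Case 1: the tool writes to an absolute path outside the task directory
touch "$out_dir/sample_R1.fastq.gz"
outside=$(count_matches "*.fastq.gz")

# Case 2: the tool writes to ./bcl2fastq inside the task directory
mkdir -p bcl2fastq
touch bcl2fastq/sample_R1.fastq.gz
inside=$(count_matches "bcl2fastq/*.fastq.gz")

echo "outside=$outside inside=$inside"   # prints: outside=0 inside=1
```

The first case mirrors the original script, where `-o $params.out_dir/bcl2fastq` put the files somewhere the output pattern never looks. The second mirrors the prototype, where `-o ./bcl2fastq` keeps them inside the task directory and the pattern names that sub-directory.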

Jong Ha Shin

Feb 3, 2022, 2:26:42 PM
to Nextflow
Thank you,

It helped me to solve the problem!
