merging FASTQ files


Mic L

Feb 14, 2016, 9:12:26 PM
to ruffus_discuss

Hi,
I have the following file structure with FASTQ files:


    ~/Downloads/kumaran/reads > tree
    ├── C5-B1-32237239
    │   ├── C5-B1_S5_L001_R1_001.fastq.gz
    │   ├── C5-B1_S5_L001_R2_001.fastq.gz
    │   ├── C5-B1_S5_L002_R1_001.fastq.gz
    │   ├── C5-B1_S5_L002_R2_001.fastq.gz
    │   ├── C5-B1_S5_L003_R1_001.fastq.gz
    │   ├── C5-B1_S5_L003_R2_001.fastq.gz
    │   ├── C5-B1_S5_L004_R1_001.fastq.gz
    │   └── C5-B1_S5_L004_R2_001.fastq.gz
    ├── C6-B1-32241247
    │   ├── C6-B1_S2_L001_R1_001.fastq.gz
    │   ├── C6-B1_S2_L001_R2_001.fastq.gz
    │   ├── C6-B1_S2_L002_R1_001.fastq.gz
    │   ├── C6-B1_S2_L002_R2_001.fastq.gz
    │   ├── C6-B1_S2_L003_R1_001.fastq.gz
    │   ├── C6-B1_S2_L003_R2_001.fastq.gz
    │   ├── C6-B1_S2_L004_R1_001.fastq.gz
    │   └── C6-B1_S2_L004_R2_001.fastq.gz
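
To make the layout above concrete, here is a minimal sketch in plain Python 3 (no Ruffus involved) that groups the lane-split files by sample and read direction. The regular expression and the helper name `group_by_read` are hypothetical, and the sketch assumes every file follows the `<sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz` pattern seen in the listing. Sorting keeps the lanes in the same order for R1 and R2, which matters if the groups are later concatenated for paired-end alignment.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the file names shown in the listing:
# <sample>_S<n>_L<lane>_R<1|2>_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L(?P<lane>\d{3})_(?P<read>R[12])_001\.fastq\.gz$"
)


def group_by_read(filenames):
    """Group FASTQ file names by (sample, read), lanes in sorted order."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = FASTQ_NAME.match(name)
        if match is None:
            # Skip anything that does not follow the assumed naming scheme.
            continue
        key = (match.group("sample"), match.group("read"))
        groups[key].append(name)
    return dict(groups)


example = [
    "C5-B1_S5_L002_R1_001.fastq.gz",
    "C5-B1_S5_L001_R2_001.fastq.gz",
    "C5-B1_S5_L001_R1_001.fastq.gz",
    "C5-B1_S5_L002_R2_001.fastq.gz",
]
grouped = group_by_read(example)
print(grouped[("C5-B1", "R1")])
# ['C5-B1_S5_L001_R1_001.fastq.gz', 'C5-B1_S5_L002_R1_001.fastq.gz']
```

Each group could then be concatenated in that order; gzip files joined with `cat` remain valid gzip streams.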

I wrote the following Ruffus task:


    from ruffus import *
    from collections import defaultdict
    from glob import glob
    import os

    fq_folders = glob("/home/galaxy/Downloads/kumaran/reads/*")

    sambam_dir = "/home/galaxy/Downloads/kumaran/alignments"
    fastq_metadata = defaultdict(dict)
    ref = "/home/galaxy/Downloads/kumaran/Bactrocera_tryoni"

    def make_rg(metadata):
        return " --rg-id %s --rg SM:%s --rg LB:%s --rg PL:ILLUMINA" % (metadata, metadata, metadata)

    @transform(fq_folders, filter=suffix(".fastq"), output=(".sorted.bam"))
    def run_bwa(fq_folder, output):
        combine_R1_fq = fq_folder + "/R1.fastq.gz"
        combine_R2_fq = fq_folder + "/R2.fastq.gz"
        cmd = "cat %s > %s" % (fq_folder + "/*R1*.fastq.gz", combine_R1_fq)
        print cmd
        cmd = "cat %s > %s" % (fq_folder + "/*R2*.fastq.gz", combine_R2_fq)
        print cmd
        rg = make_rg(os.path.basename(os.path.splitext(os.path.splitext(output)[0])[0]))

        cmd = "bowtie2 -x %s -1 %s -2 %s %s | samtools view -bS - | samtools sort - %s" \
          % (ref, combine_R1_fq, combine_R2_fq, rg, output)
        print cmd

        cmd = "rm %s %s" %(combine_R1_fq, combine_R2_fq)
        print cmd


However, running the pipeline produces the following warning:


    WARNING:
          'In Task
    @transform(...)
    def run_bwa(...):
        ...

    No jobs were run because no file names matched.
    Please make sure that the regular expression is correctly specified.
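
One possible reading of this warning, sketched in plain Python 3: `suffix(".fastq")` only matches inputs whose names end in `.fastq`, while the inputs passed to `@transform` are the folder paths returned by `glob`, and the files inside them end in `.fastq.gz`. The helper `matching_inputs` below is a hypothetical stand-in that uses `str.endswith`; it is not Ruffus's own implementation, only an illustration of why nothing would match.

```python
def matching_inputs(paths, suffix):
    """Stand-in for a suffix filter: keep paths that end with `suffix`."""
    return [path for path in paths if path.endswith(suffix)]


# What glob(".../reads/*") returns in the post: the sample folders.
folders = [
    "/home/galaxy/Downloads/kumaran/reads/C5-B1-32237239",
    "/home/galaxy/Downloads/kumaran/reads/C6-B1-32241247",
]

# One of the actual FASTQ files inside a sample folder.
files = [
    "/home/galaxy/Downloads/kumaran/reads/C5-B1-32237239/"
    "C5-B1_S5_L001_R1_001.fastq.gz",
]

print(matching_inputs(folders, ".fastq"))        # [] : folders have no suffix
print(matching_inputs(files, ".fastq"))          # [] : files end in .fastq.gz
print(len(matching_inputs(files, ".fastq.gz")))  # 1
```

Under that reading, the filter would need to be applied to the `.fastq.gz` files themselves (for example from a `reads/*/*.fastq.gz` glob) rather than to the folders.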

What did I do wrong?

Thank you in advance.

Mic
