Hi there,
First, I create a list of IDs for a given organism and send the results as a text file into a channel:
// Look up RNA-Seq experiments linked to the taxon and write one run accession per line
process getRNASeqIDs {
    output:
    file 'ids.txt' into RNASeqIDs

    """
    esearch -db taxonomy -query '${params.ref}' \
        | elink -target sra \
        | efetch -format docsum \
        | xtract -pattern EXPERIMENT_PACKAGE \
            -if LIBRARY_STRATEGY \
            -equals 'RNA-Seq' \
            -group RUN \
            -element @accession > ids.txt
    """
}
The resulting text file looks like this:
SRR5040508
SRR5040506
SRR5040505
SRR3999595
Each line is the ID of one RNA-Seq dataset.
I then split the channel line by line like this:
ids = RNASeqIDs.splitText()
Next, I want to feed each ID into fastq-dump to download the dataset:
process dumpfastq {
    tag { id }

    input:
    val id from ids

    output:
    set id, "*.fastq" into fastqDumpForAlignment

    """
    fastq-dump $id
    """
}
The problem, however, is that each ID is passed to fastq-dump together with its trailing newline character, and fastq-dump crashes because of it.
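One possible workaround, sketched here rather than tested against this pipeline, is to strip the whitespace while the lines are being split, in place of the plain splitText() call. It relies on splitText accepting an optional closure that transforms each chunk. The filter step is an extra assumption, added only to guard against blank lines in ids.txt.

```groovy
// Sketch only: RNASeqIDs is the channel emitted by getRNASeqIDs.
// Trim each line as it is split, so downstream processes receive
// 'SRR5040508' rather than 'SRR5040508' plus a newline.
ids = RNASeqIDs
    .splitText() { it.trim() }      // closure transforms each emitted line
    .filter { it.length() > 0 }     // assumption: skip blank lines, if any
```

With this in place the dumpfastq process could stay unchanged, since the value it receives as id would no longer carry the newline.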
Is there a better way to parse a list of values in order to feed them individually into another process?
Cheers,
Johannes