It did help, thanks!
Your guess is probably right: re-using variables that already have a special meaning/use in bpipe is not a good idea, and it was breaking things. My bad. I also initially misunderstood Marc's suggestion and could not get it to work. Anyway, I am leaving the final working script here for future reference:
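def sra_file = "SRP024271_description.tsv"
def regions = []
// Collect the run accessions (4th tab-separated column, e.g. SRR886455)
new File(sra_file).eachLine { line ->
    regions << line.split("\t")[3]
}

DownloadSRA_se = {
    // ... (the lines that set sra_id for the current branch are missing from this post)
    exec """
        fastq-dump --split-files --gzip --outdir $output.dir $sra_id
    """
}

// One parallel branch per accession
run {
    regions * [ DownloadSRA_se ]
}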
The file SRP024271_description.tsv was obtained with the Bioconductor package SRAdb and contains the following info (top entries):
SRP024271 SRS437906 SRX297174 SRR886455 (+)4SUTP_128cell_rep2 RNA-Seq GSM1155230: (+)4SUTP_128cell_rep2; Danio rerio; RNA-Seq SINGLE -
SRP024271 SRS437904 SRX297172 SRR886449 (+)4SUTP_256cell_rep1 RNA-Seq GSM1155228: (+)4SUTP_256cell_rep1; Danio rerio; RNA-Seq SINGLE -
SRP024271 SRS437904 SRX297172 SRR886451 (+)4SUTP_256cell_rep1 RNA-Seq GSM1155228: (+)4SUTP_256cell_rep1; Danio rerio; RNA-Seq SINGLE -
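For anyone reading later, here is a minimal sketch of what the parsing step picks up from those entries. It is only an illustration: `runAccessions` is a hypothetical helper name that is not part of the script above, the sample lines are shortened to their first five columns, and it assumes the file is tab-separated with no header row.

```groovy
// Hypothetical helper (not in the original script): return the run accession
// (4th tab-separated column) of each line, skipping lines with too few fields.
def runAccessions(List<String> lines) {
    lines.collect { it.split("\t") }
         .findAll { it.size() > 3 }   // guard against short or empty lines
         .collect { it[3] }
         .unique()                    // avoid launching two branches for one run
}

// Shortened versions of the top entries shown above
def sample = [
    "SRP024271\tSRS437906\tSRX297174\tSRR886455\t(+)4SUTP_128cell_rep2",
    "SRP024271\tSRS437904\tSRX297172\tSRR886449\t(+)4SUTP_256cell_rep1",
    "SRP024271\tSRS437904\tSRX297172\tSRR886451\t(+)4SUTP_256cell_rep1"
]

def ids = runAccessions(sample)
assert ids == ["SRR886455", "SRR886449", "SRR886451"]
```

The guard on the field count is there because `line.split("\t")[3]` throws an exception on a blank or truncated line, which is easy to hit with a trailing newline in the file.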
Let me add that even though bpipe is geared towards files as input/output, the ability to start from parameters read from a file, or just an SRA accession ID, is quite useful for many things. My use case is, I think, typical: run pipeline X on publicly available data. Usually I would do this in 4 semi-automated but independent steps:
- get the metadata for the relevant datasets;
- download the SRAs;
- merge and rename;
- run the pipeline.
With the above code it is now possible to go straight from step 1 to step 4. So yay :)
Thanks again,
António