Hi,
I have been using NGLess in my metagenomics pipeline for several projects. As part of my NGLess script, I write the post-processed read set to fq.gz with the following code:
input = as_reads(mapped)
write(input, ofile=sample+'/'+sample+'_reads.fq.gz')
Previously, this has resulted in 2 fq.gz files for each sample: _reads.pair.1.fq.gz, and _reads.pair.2.fq.gz.
The most recent time I ran NGLess (v1.4.2), I got only 1 processed file for each sample: _reads.fq.gz.
I don't believe I have changed my NGLess version or anything else between these runs. What do you think may be causing this discrepancy? How can I ensure that I produce 2 processed read files per sample?
My only thought is that one of my samples had only 1 of the 2 raw read files. Perhaps NGLess saw this and reverted to generating only 1 read file per sample, even for the samples which contained both? This seems unlikely to me, as I thought all runs were being conducted independently.
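For reference, a quick way to confirm which samples are missing a mate file could look like the Python sketch below. It is only an illustration: the one-directory-per-sample layout, the `_1.fq.gz` / `_2.fq.gz` suffixes, and the function name `find_unpaired_samples` are all assumptions, not my actual file naming or anything provided by NGLess.

```python
from pathlib import Path


def find_unpaired_samples(raw_dir, suffixes=("_1.fq.gz", "_2.fq.gz")):
    """Return sorted names of sample directories missing either mate file.

    Assumes (hypothetically) one sub-directory per sample under raw_dir,
    with mate files distinguished by the given filename suffixes.
    """
    unpaired = []
    for sample_dir in sorted(Path(raw_dir).iterdir()):
        if not sample_dir.is_dir():
            continue  # ignore stray files at the top level
        names = [p.name for p in sample_dir.iterdir()]
        # One boolean per expected suffix: is there a file ending with it?
        has_mate = [any(n.endswith(s) for n in names) for s in suffixes]
        if not all(has_mate):
            unpaired.append(sample_dir.name)
    return unpaired
```

Calling `find_unpaired_samples("raw_data")` (with a hypothetical `raw_data` directory) would list the samples that have only one raw read file, which would make it easy to exclude them before re-running.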
Lastly, am I fine to simply proceed with this single read file for generating SAM files and read counts? I will have to retool some of the commands in my pipeline to accommodate this, but I don't think it will pose a problem. I would prefer to re-run NGLess and generate both read files if possible, even if it means throwing out the sample with the missing 2nd raw read file - not sure how that happened.
Thanks in advance for all thoughts and help! Let me know if I can provide any additional information.
Josh