Hi there,
I am trying to use PRINSEQ in my metagenomic analysis to dereplicate raw reads, removing exact duplicates only. I tried both the web interface and the stand-alone version, but neither worked well for me. I would appreciate any tips. The details are below; thank you in advance.
Web interface:
I tried to upload 24 sets of paired-end fastq.gz files to the web interface. Some of the files could not be uploaded and returned the error "True" or "end by extension"; I don't understand what these mean or how to resolve them. Other files uploaded successfully but took a very long time to show the graph report. Here is an example:
![]()
Only one file has generated an HTML report so far (ID: 31363031393137353439), but I could not download that report; the download keeps showing "Failed". Why does this happen, and how can I fix it?
With that one successful case, I then processed the data with the filter option set to remove exact duplicates. However, when I downloaded 'BR2_1_forward_paired_balancing30_5_L.fq_good_2.fastq.gz', FastQC would not accept it (I want to compare the duplication rate before and after deduplication); the error said it was not a fastq.gz file. What is wrong, and how can I fix it?
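In case it helps with diagnosis, one check I can run on the downloaded file is whether it is really gzip-compressed or just plain text with a .gz extension. This is only a minimal Python sketch; the function name `inspect_fastq_gz` is my own and has nothing to do with PRINSEQ or FastQC.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def inspect_fastq_gz(path):
    """Return (is_gzip, first_line) for a file that claims to be fastq.gz."""
    # Check the magic bytes instead of trusting the file extension.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == GZIP_MAGIC
    # Read the first line either way; a FASTQ record should start with '@'.
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", errors="replace") as fh:
        first_line = fh.readline().rstrip("\n")
    return is_gzip, first_line
```

If `is_gzip` comes back `False` while the first line starts with `@`, the file would be uncompressed FASTQ under a misleading name, which might explain the FastQC error, but that is only a guess on my side.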
Stand-alone version:
Since I have so many sets of paired-end data, I'd like to run PRINSEQ on my institute's cluster, but I couldn't work out the command line for my data. Could I get some help with this? My files are named like "2_1_forward_paired_L.fq.gz" and "2_1_reverse_paired_R.fq.gz" for the forward and reverse reads. I'd like to generate the duplication report and remove all exact duplicates from the samples.
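To make clear what I mean by removing exact duplicates, here is a minimal Python sketch of the result I am after. It is not PRINSEQ and may not match PRINSEQ's own definition of a duplicate: I am assuming a read pair counts as an exact duplicate when both its forward and reverse sequences are identical to those of an earlier pair. The function names are my own, and the sketch keeps every distinct sequence pair in memory, so it is an illustration rather than something I would run on all 24 samples.

```python
import gzip
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as lists of 4 lines (header, sequence, '+', quality)."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:  # end of file (or a truncated final record)
            return
        yield record


def dedup_pairs(fwd_in, rev_in, fwd_out, rev_out):
    """Write the first occurrence of each (forward, reverse) sequence pair.

    Returns (total_pairs, kept_pairs) so a duplication rate can be computed.
    """
    seen = set()
    total = kept = 0
    with gzip.open(fwd_in, "rt") as f1, gzip.open(rev_in, "rt") as f2, \
         gzip.open(fwd_out, "wt") as o1, gzip.open(rev_out, "wt") as o2:
        for r1, r2 in zip(read_fastq(f1), read_fastq(f2)):
            total += 1
            key = (r1[1].strip(), r2[1].strip())  # sequences only, not headers
            if key in seen:
                continue  # exact duplicate pair: skip it in both outputs
            seen.add(key)
            kept += 1
            o1.writelines(r1)
            o2.writelines(r2)
    return total, kept
```

For one sample I would call it as `dedup_pairs("2_1_forward_paired_L.fq.gz", "2_1_reverse_paired_R.fq.gz", "fwd_dedup.fq.gz", "rev_dedup.fq.gz")`, where the two output names are placeholders; the duplication rate is then `1 - kept / total`.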
Thank you in advance.
Best,
Shuqi