Hello CH,
I have already written some explanations in the GitHub issues; I'll
repeat them here for the broader audience.
One possible problem could be that after trimming your sequences you
ended up with a FASTQ record whose sequence is of length 0. This
could cause the problem you are describing.
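A minimal sketch for checking this, assuming each FASTQ record takes
exactly 4 lines (no wrapped sequences). The function name
find_empty_reads is hypothetical and is not part of PEAR:

```shell
# Hypothetical helper, not part of PEAR.
# Assumes a strict 4-line-per-record FASTQ file.
# Prints the record number and header of every read whose
# sequence line (2nd line of the record) is empty.
find_empty_reads() {
  awk '
    NR % 4 == 1 { header = $0 }
    NR % 4 == 2 && length($0) == 0 {
      print "empty sequence in record " ((NR + 2) / 4) ": " header
    }
  ' "$1"
}
```

Running `find_empty_reads forward.fa` and `find_empty_reads reverse.fa`
should print nothing if no read was trimmed down to length 0.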
Another possible issue is that the number of reads in the two files
(forward and reverse) does not match, but you have already tested
that.
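For other readers who have not done this check yet, a rough sketch,
again assuming 4 lines per FASTQ record. The name count_reads is
hypothetical:

```shell
# Hypothetical helper, not part of PEAR.
# Counts reads as (number of lines) / 4, which is only valid
# for FASTQ files with exactly 4 lines per record.
count_reads() {
  echo $(( $(wc -l < "$1") / 4 ))
}
```

`count_reads forward.fa` and `count_reads reverse.fa` should print the
same number if the two files are properly paired.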
The last possibility is that you have a FASTQ record whose third
line contains more characters than just '+'. Unfortunately, the
parsing procedure of PEAR checks that the third line is exactly '+';
if it is not, it can return this error. I will modify the code so
that it accepts optional additional information after the '+'.
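A sketch for spotting such records, and for working around them in
the meantime, assuming 4 lines per record. Both function names are
hypothetical, and the second one rewrites every third line
unconditionally, so it is only safe on well-formed 4-line FASTQ:

```shell
# Hypothetical helpers, not part of PEAR.
# Assume a strict 4-line-per-record FASTQ file.

# Print every record whose 3rd line is not exactly "+".
# Trailing whitespace or a carriage return is flagged as well.
find_long_separators() {
  awk 'NR % 4 == 3 && $0 != "+" {
         print "record " ((NR + 1) / 4) ": " $0
       }' "$1"
}

# Write a copy of the file to stdout in which the 3rd line of
# every record is replaced by a bare "+".
strip_separator_comments() {
  awk 'NR % 4 == 3 { print "+"; next } { print }' "$1"
}
```

For example, `strip_separator_comments forward.fa > forward-clean.fa`
would produce a copy with bare '+' lines (the output file name is
only an illustration).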
Therefore, if you could try the following, it would help me find
where the problem is:
head -n 100 forward.fa > forward-25.fa
head -n 100 reverse.fa > reverse-25.fa
which will create a reduced dataset that contains only the first 25
reads (each FASTQ record spans 4 lines, so 100 lines correspond to
25 reads). Then try running PEAR on this reduced dataset and check
whether the problem persists. If it does, could you send me the
reduced dataset so that I can debug it?
Thanks and best regards,
Tomas