PEAR: Entry is missing, Aborted (core dumped)

CH

May 12, 2015, 5:25:10 AM

to pear-...@googlegroups.com

Hi, I got the following error with the latest version of PEAR.

$ ~/Downloads/PEAR/bin/pear-0.9.6-bin-64 -f ./output/s1a.trim.fastq -r ./output/s2a.trim.fastq -o ./test_pear.r1r2.fastq
____ _____    _    ____
| _ \| ____| / \ | _ \
| |_) | _|   / _ \ | |_) |
| __/| |___ / ___ \| _ <
|_|   |_____/_/   \_\_| \
PEAR v0.9.6 [January 15, 2015]

Forward reads file.................: ./output/s1a.trim.fastq
Reverse reads file.................: ./output/s2a.trim.fastq
PHRED..............................: 33
Using empirical frequencies........: YES
Statistical method.................: OES
Maximum assembly length............: 999999
Minimum assembly length............: 50
p-value............................: 0.010000
Quality score threshold (trimming).: 0
Minimum read size after trimming...: 1
Maximal ratio of uncalled bases....: 1.000000
Minimum overlap....................: 10
Scoring method.....................: Scaled score
Threads............................: 1

Allocating memory..................: 200,000,000 bytes
Computing empirical frequencies....: Entry is missing
Aborted (core dumped)

What does "Entry is missing" mean? Can anyone give me some suggestions to solve this problem? Thank you.

Tomas Flouri

May 18, 2015, 4:10:13 AM

to CH, pear-...@googlegroups.com

Hello,

Most probably you performed some kind of quality filtering on your dataset and ended up with two files (forward and reverse) that contain a different number of reads.
Could that be the case?

Tomas

CH

May 18, 2015, 10:22:35 PM

to pear-...@googlegroups.com, chn...@gmail.com

Hi, Tomas

I checked this as you suggested, using wc -l on each file, and the two files (forward and reverse) contain the same number of lines, so the same number of reads.

-Lenovo-Z40-70:~/Desktop/trim_output$ wc -l trim3424_S1_L001_R1_001.fastq
1384564 trim3424_S1_L001_R1_001.fastq

Lenovo-Z40-70:~/Desktop/trim_output$ wc -l trim3424_S1_L001_R2_001.fastq
1384564 trim3424_S1_L001_R2_001.fastq

Any other suggestions?

CH

Tomas Flouri

May 21, 2015, 7:23:51 AM

to CH, pear-...@googlegroups.com

Hello CH,

I've written an explanation in the GitHub issues; I'll repeat it here for the broader audience.

One possible problem is that after trimming your sequences you ended up with a FASTQ record whose sequence has length 0. This could cause the problem you are describing.
Another possible issue is that the number of reads in the two files (forward and reverse) does not match, but you have already tested that.

The last possibility is a FASTQ record whose third line has more characters than just '+'. Unfortunately, PEAR's parsing procedure checks that the third line is strictly '+'; if it is not, it could return this error. I will modify the code so that it accepts optional additional information after the '+'.

Therefore, if you could try the following, it would help me find where the problem is:

head -n 100 forward.fa > forward-25.fa
head -n 100 reverse.fa > reverse-25.fa

which will create a reduced dataset that contains only the first 25 reads (4 lines per read). Then try running PEAR on this reduced dataset and check whether the problem persists. If it does, could you send me the reduced dataset so that I can debug?
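
The three possible causes listed above can also be checked directly with a short script. The following is only a rough sketch, not part of PEAR: the function name check_fastq and the problem labels are made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequence lines).

```python
from collections import Counter


def check_fastq(lines):
    """Scan FASTQ lines (assumed 4 lines per record) and count suspicious records.

    Returns (number_of_records, Counter of problem labels).
    The labels are illustrative only; they are not PEAR error messages.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    problems = Counter()

    # A line count that is not a multiple of 4 means a truncated record.
    if len(lines) % 4 != 0:
        problems["truncated_record"] += 1

    records = len(lines) // 4
    for i in range(records):
        header, seq, plus, qual = lines[4 * i:4 * i + 4]
        if not header.startswith("@"):
            problems["bad_header"] += 1
        if len(seq) == 0:
            # Read trimmed down to nothing.
            problems["empty_sequence"] += 1
        if plus != "+":
            # Third line carries extra text after the '+'.
            problems["plus_line_not_bare"] += 1
        if len(seq) != len(qual):
            problems["length_mismatch"] += 1
    return records, problems


if __name__ == "__main__":
    import sys

    # Usage: python check_fastq.py forward.fastq reverse.fastq
    counts = []
    for path in sys.argv[1:]:
        with open(path) as handle:
            n, found = check_fastq(handle)
        counts.append(n)
        print(path, "records:", n, "problems:", dict(found))
    if len(set(counts)) > 1:
        print("Read counts differ between files:", counts)
```

Running it on the forward and reverse files prints the number of records in each and a count per problem type, which should show whether empty sequences, differing read counts, or non-bare '+' lines are present.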

Thanks and best regards,
Tomas

CH

May 22, 2015, 2:47:19 AM

to pear-...@googlegroups.com, chn...@gmail.com

Hi, Tomas.

We finally resolved the issue. The problem came from cutadapt: I should have run cutadapt with "--minimum-length 50". I truly appreciate your help in resolving the problem.
