Hi -
I'm assembling an invertebrate genome of ~500 Mb using SOAPdenovo2. For scaffolding, I have 3 mate-pair libraries (~3 kb, ~5 kb and ~6 kb). The scaffolding stage goes through, but the log file shows worrying messages which seem to indicate that 2 of the mate-pair libraries are rejected because of 'Too few PE links' (see the attached extract). The third one seems OK and gives decent scaffolding. However, I'm puzzled why these 2 libraries didn't pass the filter.
I independently mapped the mate pairs to the contigs and computed the insert size distribution to check what these libraries look like (see graph below). From my understanding of how SOAP works, it starts by gathering all connections on contigs larger than the indicated library average size and uses them to re-estimate the library average size. From the distribution, there should be enough such connections to satisfy this criterion, at least for the 3 kb library. I'd like to better understand which filters the PE links have to pass! I tried changing the average insert size indicated in the config file to see whether it made any difference, but it didn't.
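To make that understanding concrete, here is a minimal Python sketch of such an estimate. It is only an illustration of the idea as I described it, not SOAPdenovo2's actual code: the function name, the `(contig_length, left_start, right_end)` tuple layout, and the `min_links` cutoff are all hypothetical.

```python
from statistics import mean, pstdev


def estimate_insert_size(pairs, nominal_size, min_links=100):
    """Re-estimate a library's insert size from mates mapped on the same contig.

    pairs        -- iterable of (contig_length, left_start, right_end) tuples,
                    one per mate pair whose two reads hit the same contig
                    (hypothetical input layout)
    nominal_size -- average insert size given in the config file
    min_links    -- hypothetical cutoff below which the estimate is refused
    """
    # Only contigs longer than the nominal insert size can hold a full pair.
    spans = [
        right_end - left_start
        for contig_length, left_start, right_end in pairs
        if contig_length > nominal_size and right_end > left_start
    ]
    if len(spans) < min_links:
        return None  # analogous to a "too few links" outcome
    return mean(spans), pstdev(spans)
```

With only a handful of usable pairs the function returns `None`, which is the situation I would expect to correspond to the 'Too few PE links' message, if my understanding is right.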
I'd appreciate your help and feedback on this issue. Maybe my libraries are indeed bad, but then I'd like to understand what exactly the problem is.
Ferdi
For insert size: 3250
Total PE links 23061301
Normal PE links on same contig 96
Incorrect oriented PE links 19
PE links of too small insert size 1371194
PE links of too large insert size 0
Correct PE links 21689741
Accumulated connections 43379084
Use contigs longer than 3250 to estimate insert size:
PE links 77
Too few PE links.
21689542 new connections.
For insert size: 5000
Total PE links 24097524
Normal PE links on same contig 152439
Incorrect oriented PE links 1085
PE links of too small insert size 115289
PE links of too large insert size 0
Correct PE links 22131611
Accumulated connections 13431114
Use contigs longer than 5000 to estimate insert size:
PE links 97055
Average insert size 5322
SD 834
6715557 new connections.
For insert size: 5700
Total PE links 21348956
Normal PE links on same contig 20
Incorrect oriented PE links 22
PE links of too small insert size 181047
PE links of too large insert size 0
Correct PE links 21167615
Accumulated connections 42334634
Use contigs longer than 5700 to estimate insert size:
PE links 7
Too few PE links.
21167317 new connections.
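
For anyone who wants to compare the three libraries side by side, the extract above can be tabulated with a short script. This is a rough sketch that assumes the log looks exactly like the lines pasted here; the function name `parse_scaff_log` and the dictionary keys `insert_size`, `too_few` and `new_connections` are my own choices, not anything defined by SOAPdenovo2.

```python
import re


def parse_scaff_log(text):
    """Split a scaffolding log extract into one dict per 'For insert size' block."""
    libraries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = re.match(r"For insert size: (\d+)", line)
        if start:
            # A new library block begins here.
            current = {"insert_size": int(start.group(1)), "too_few": False}
            libraries.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first block
        if line.startswith("Too few PE links"):
            current["too_few"] = True
            continue
        new_conn = re.match(r"(\d+) new connections", line)
        if new_conn:
            current["new_connections"] = int(new_conn.group(1))
            continue
        # Generic "label <integer>" lines, e.g. "Total PE links 23061301".
        stat = re.match(r"(.+?)\s+(\d+)$", line)
        if stat:
            current[stat.group(1)] = int(stat.group(2))
    return libraries
```

Printing `lib["insert_size"]`, `lib["PE links"]` and `lib["too_few"]` for each entry makes the contrast obvious: in my extract the two flagged libraries have only 77 and 7 links available for the estimate, versus 97055 for the accepted one.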