Contigs and scaffolds are identical


Joy Ding

Apr 17, 2020, 11:53:58 AM
to ABySS
Dear all, 

I have run ABySS with different k-mer sizes (31 to 69) on Illumina paired-end (2 x 150 bp) data, but the contigs.fa and scaffolds.fa files are identical. Is this expected?

I used version 2.2.4 with this command:

nohup abyss-pe k=51 v=-v name=x32 in='../trimmed/Xylem32_R1_trimmed.fq ../trimmed/Xylem32_R2_trimmed.fq' &> a32.51.oe &


abyss-fac: 

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n        n:500  L50   min  N75  N50  N25  E-size  max   sum      name

2404872  9439   3962  500  536  585  693  683     6526  5884968  x32-unitigs.fa

2404832  9431   3957  500  536  585  694  687     6526  5888336  x32-contigs.fa

2404832  9431   3957  500  536  585  694  687     6526  5888336  x32-scaffolds.fa


I’ve attached the log: a32.51.oe

Can anyone tell me how to fix this problem?  

Thanks for your help, 
Joy

a32.51.oe

Lauren Coombe

Apr 17, 2020, 12:09:14 PM
to ABySS
Hi Joy,

If you look closely at the contig vs. unitig stats, you can see that some joins were in fact made (the sequence count drops from 2404872 to 2404832), just not enough to change the N50. Based on that and a scan of the log you attached, ABySS appears to have run without errors. You can confirm which joins were made at the contig stage by looking at the x32-5.path file. It doesn't look like any joins were made in the scaffolding stage.
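
One way to see those joins is to compare consecutive rows of the abyss-fac table. The sketch below is not part of ABySS: it embeds the table from the original post and uses awk to print the change in sequence count (n) and in sequences of at least 500 bp (n:500) from one stage to the next. The helper name fac_deltas is made up for this illustration.

```shell
#!/bin/sh
# abyss-fac table copied from the original post (header row + three stages).
stats='n n:500 L50 min N75 N50 N25 E-size max sum name
2404872 9439 3962 500 536 585 693 683 6526 5884968 x32-unitigs.fa
2404832 9431 3957 500 536 585 694 687 6526 5888336 x32-contigs.fa
2404832 9431 3957 500 536 585 694 687 6526 5888336 x32-scaffolds.fa'

# fac_deltas (hypothetical helper): read an abyss-fac table on stdin and
# print, for each stage after the first, how n and n:500 changed relative
# to the previous stage.
fac_deltas() {
    awk 'NR == 1 { next }                      # skip the header row
         NR > 2  { printf "%s -> %s: n %+d, n:500 %+d\n",
                          prev_name, $NF, $1 - prev_n, $2 - prev_n500 }
                 { prev_name = $NF; prev_n = $1; prev_n500 = $2 }'
}

printf '%s\n' "$stats" | fac_deltas
```

A negative change between unitigs and contigs means sequences were merged at that stage, and a change of zero between contigs and scaffolds means the scaffolding stage made no joins.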

To try and get some more scaffolding there are a few parameters that you could play around with:
The n/N parameters control the number of pairs required to map between sequences to give evidence for building contigs and scaffolds, so you could reduce those:
  • n: minimum number of pairs required for building contigs [10]
  • N: minimum number of pairs required for building scaffolds [n]
Your unitig N50 looks fairly low so you could try reducing the minimum size of unitig/contig that can be joined:
  • s: minimum unitig size required for building contigs (bp) [1000]
  • S: minimum contig size required for building scaffolds (bp) [1000-10000]
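
As a rough sketch of how these four parameters fit into the original command, the snippet below only prints an abyss-pe command line so it can be inspected before anything is run. The helper name relaxed_cmd and the example values are assumptions for illustration, not recommended settings; k, name and the input paths are taken from the original post.

```shell
#!/bin/sh
# relaxed_cmd (hypothetical helper): print an abyss-pe command line with
# the pair-count and size thresholds passed as arguments.
#   $1 = n  (min pairs for building contigs)
#   $2 = N  (min pairs for building scaffolds)
#   $3 = s  (min unitig size for building contigs, bp)
#   $4 = S  (min contig size for building scaffolds, bp)
relaxed_cmd() {
    printf 'abyss-pe k=51 name=x32 n=%s N=%s s=%s S=%s in="%s %s"\n' \
        "$1" "$2" "$3" "$4" \
        ../trimmed/Xylem32_R1_trimmed.fq ../trimmed/Xylem32_R2_trimmed.fq
}

# Illustrative values only: thresholds lower than the defaults listed above.
relaxed_cmd 5 5 200 200-2000
```

Copy the printed line into the terminal (or a job script) once the values look right.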
Hope that helps, and thanks for your interest in ABySS!
Lauren

Joy Ding

Apr 20, 2020, 11:57:11 AM
to ABySS
Hi Lauren,

Thanks for your suggestion. I tried reducing the parameters ‘n’ and ‘s’. The results are below:

n=default s=default

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n       n:500   L50     min     N75     N50     N25     E-size  max     sum     name

2404872 9439    3962    500     536     585     693     683     6526    5884968 x32-unitigs.fa

2404832 9431    3957    500     536     585     694     687     6526    5888336 x32-contigs.fa

2404832 9431    3957    500     536     585     694     687     6526    5888336 x32-scaffolds.fa


n=1 s=default

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n       n:500   L50     min     N75     N50     N25     E-size  max     sum     name

2404872 9439    3962    500     536     585     693     683     6526    5884968 x32-unitigs.fa

2404586 9388    3922    500     536     585     698     718     6526    5915049 x32-contigs.fa

2404574 9379    3913    500     536     585     698     727     6526    5914808 x32-scaffolds.fa


n=default s=200 S=200-2000

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n       n:500   L50     min     N75     N50     N25     E-size  max     sum     name

2404872 9439    3962    500     536     585     693     683     6526    5884968 x32-unitigs.fa

2397856 10640   4441    500     538     590     710     693     6526    6705875 x32-contigs.fa

2396969 10920   4518    500     540     593     722     701     6526    6936659 x32-scaffolds.fa



n=1 s=200 S=200-2000

abyss-fac   x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa |tee x32-stats.tab

n       n:500   L50     min     N75     N50     N25     E-size  max     sum     name

2404872 9439    3962    500     536     585     693     683     6526    5884968 x32-unitigs.fa

2297418 19408   7699    500     549     618     796     767     8664    12.93e6 x32-contigs.fa

2287931 22310   8158    500     562     669     931     877     14250   16e6    x32-scaffolds.fa


When I reduced both parameters, the scaffolds were longer than in the other runs. It seems that n=1, s=200, S=200-2000 would be the best choice. But is it OK to use such low values? This is my first time assembling data, so I hope this is not a dumb question.

Thanks again for your help.
Joy

Lauren Coombe

Apr 20, 2020, 12:47:19 PM
to ABySS
Hi Joy,

I think your minimum contig length settings look fine. They are a bit lower than I would normally go, but I think that makes sense given the N50 of your unitigs.

I wouldn't go quite as low as `n=1`, since that means that the join can be made just using 1 read pair as evidence. I'd set the lowest 'n' to at least 2. 

You can also use a sweep of 'n' values in abyss-scaffold. It isn't as straightforward when you're not using mate-pair data, but for your case (based on the sample command you specified), it should work if you add this to your `abyss-pe` command (changing the values as needed):
N=5-20 x32_de="-n5"

The N=5-20 part makes abyss-scaffold sweep 'n' values from 5 to 20, and the x32_de setting is needed so that the 'DistanceEst' step before abyss-scaffold uses an 'n' value equal to the lower bound of your sweep. You can do a dry run (add -n to abyss-pe) to check that these values are being set properly. abyss-scaffold will then choose the 'n' value that yields the highest N50.
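
A small sketch of how the sweep settings and the dry-run flag combine on one command line. It only prints the command; the helper name sweep_cmd is made up for this example, and any other parameters from the real run (such as s and S) are left out for brevity.

```shell
#!/bin/sh
# sweep_cmd (hypothetical helper): print a dry-run abyss-pe command that
# sweeps N from $1 to $2 and passes the lower bound to DistanceEst via
# x32_de, so the two values cannot drift apart.
sweep_cmd() {
    lo=$1
    hi=$2
    printf 'abyss-pe -n k=51 name=x32 N=%s-%s x32_de="-n%s" in="%s %s"\n' \
        "$lo" "$hi" "$lo" \
        ../trimmed/Xylem32_R1_trimmed.fq ../trimmed/Xylem32_R2_trimmed.fq
}

sweep_cmd 5 20
```

With -n in place, abyss-pe lists the commands it would run without executing them, which is where the DistanceEst and abyss-scaffold options can be checked. Dropping -n from the printed line gives the command for the real run.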

Good luck, and I hope that helps!
Lauren

Joy Ding

Apr 21, 2020, 6:00:39 AM
to ABySS
Hi Lauren, 

I tried the parameters you suggested, and they work. Thanks a lot.

But I have another question. When I checked the log, it showed "Best scaffold N50 is 316 at n=5 s=200."  But at the end of the log, the "abyss-fac" output shows "n:500". Why is it "n:500" and not "n:200"? This is really confusing to me. Could you help me understand how this works?

Thanks for your help.
Joy


螢幕快照 2020-04-21 下午5.48.19.png


a32.51.s200.n2.oe

Lauren Coombe

Apr 21, 2020, 11:39:40 AM
to ABySS
Hi Joy, 

abyss-fac by default uses a length threshold of 500 bp when calculating its statistics. If you want the threshold to be 200 bp for the stats, you can run abyss-fac again yourself, specifying -t200.
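
To illustrate why the threshold matters, here is a toy sketch of an N50 calculation with a minimum-length cutoff. It is not abyss-fac's implementation: it uses the common definition of N50 (the length at which the longest sequences first cover half the total), the helper name n50 is made up, and the six lengths are invented numbers rather than data from this assembly.

```shell
#!/bin/sh
# n50 THRESHOLD (hypothetical helper): read one sequence length per line
# on stdin, ignore lengths below THRESHOLD, and print "<count> <N50>" for
# the sequences that remain.
n50() {
    sort -rn | awk -v t="$1" '
        $1 >= t { len[++n] = $1; sum += $1 }    # keep lengths >= threshold
        END {
            run = 0
            for (i = 1; i <= n; i++) {          # longest first
                run += len[i]
                if (run * 2 >= sum) { print n, len[i]; exit }
            }
        }'
}

# Invented lengths, for illustration only.
printf '%s\n' 900 600 400 300 250 150 | n50 500
printf '%s\n' 900 600 400 300 250 150 | n50 200

# The equivalent step on the real assembly, using the flag mentioned above:
#   abyss-fac -t200 x32-unitigs.fa x32-contigs.fa x32-scaffolds.fa
```

The same set of sequences gives a different count and a different N50 depending on the cutoff, which is why statistics computed with a 200 bp threshold and a 500 bp threshold are not directly comparable.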

Lauren