stacks-integrate-alignment and problems in vcf


Loreleï Boyer

Dec 16, 2020, 8:40:50 AM
to Stacks

Hello,

I am running into a problem with stacks-integrate-alignment (I'm using Stacks 2.54).

I get the following error when I pass the VCF it produces to bcftools:

 > bcftools view  --output sorted_populations.snps.vcf  --output-type v  populations.snps.vcf
[E::bcf_write] Data contains 64-bit values not representable in BCF.  Please use VCF instead
[buf_flush] Error: cannot write to /tmp/bcftools-sort.JMzA9L/00001.bcf

The VCF is produced with the de novo Stacks pipeline followed by stacks-integrate-alignment (as suggested in Paris et al., 2017 and Rochette & Catchen, 2017). My analyses require bcftools, so I add the required "contigs" to the header using a custom script. I do not encounter this problem with the de novo Stacks pipeline without stacks-integrate-alignment, or with the reference-based Stacks pipeline.

I have traced the problem to these two lines:

959036    4294967296    36940:76:-    G    A    .    PASS    NS=40;AF=0.262    GT:DP:AD:GQ:GL    0/0:64:64,0:40:0.00,-19.27,-30.54    0/0:119:119,0:40:0.00,-35.82,-56.78    0/0:161:161,0:40:0.00,-48.47,-76.82    0/1:85:53,32:40:-39.72,-25.59,-49.74    0/1:100:80,20:16:-31.27,-30.10,-59.90    0/1:9:4,5:24:-5.07,-2.71,-4.59    0/0:42:42,0:40:0.00,-12.64,-20.04    0/1:64:31,33:40:-35.00,-19.27,-34.04    1/1:35:2,33:40:-19.07,-10.54,-4.28    0/1:96:24,72:40:-57.80,-28.90,-34.90    0/0:65:65,0:40:0.00,-19.57,-31.01    0/0:98:98,0:40:0.00,-29.50,-46.76    0/0:21:20,1:40:-2.22,-6.32,-11.29    ./.    0/0:36:36,0:40:0.00,-10.84,-17.18    0/0:90:75,15:29:-24.77,-27.09,-53.39    0/0:49:49,0:40:0.00,-14.75,-23.38    0/0:48:48,0:40:0.00,-14.45,-22.90    0/1:33:24,9:34:-12.69,-9.93,-19.85    0/1:49:22,27:40:-27.52,-14.75,-25.14    0/1:34:22,12:40:-15.31,-10.24,-20.08    0/1:126:63,63:40:-67.99,-37.93,-67.99    0/0:156:156,0:40:0.00,-46.96,-74.43    0/1:199:137,62:40:-83.19,-59.90,-118.98    0/0:207:207,0:40:0.00,-62.31,-98.76    0/0:133:133,0:40:0.00,-40.04,-63.46    0/1:238:114,124:40:-130.72,-71.65,-125.95    1/1:90:9,81:40:-51.35,-27.09,-17.00    0/0:129:129,0:40:0.00,-38.83,-61.55    ./.    0/0:90:90,0:40:0.00,-27.09,-42.94    0/0:182:182,0:40:0.00,-54.79,-86.84    0/1:101:71,30:40:-41.00,-30.40,-60.56    0/0:50:50,0:40:0.00,-15.05,-23.86    0/0:60:60,0:40:0.00,-18.06,-28.63    0/0:114:114,0:40:0.00,-34.32,-54.39    0/1:95:60,35:40:-43.85,-28.60,-55.78    1/1:54:3,51:40:-29.37,-16.26,-6.46    0/1:120:50,70:40:-68.79,-36.12,-59.25    0/1:156:60,96:40:-90.94,-46.96,-73.77    0/0:134:134,0:40:0.00,-40.34,-63.93    0/0:101:101,0:40:0.00,-30.40,-48.19

988483    4294967296    38002:50:-    T    G    .    PASS    NS=32;AF=0.375    GT:DP:AD:GQ:GL    0/0:13:13,0:40:0.00,-3.91,-6.20    ./.    1/1:18:0,18:40:-8.59,-5.42,0.00    0/0:31:31,0:40:0.00,-9.33,-14.79    0/1:10:6,4:24:-4.83,-3.01,-5.79    0/0:8:7,0:25:-1.79,-3.72,-3.82    0/0:6:6,0:24:0.00,-1.81,-2.86    1/1:3:0,3:13:-1.43,-0.90,0.00    0/1:11:4,7:23:-6.47,-3.31,-5.04    0/0:7:7,0:27:0.00,-2.11,-3.34    ./.    ./.    0/0:4:4,0:17:0.00,-1.20,-1.91    ./.    0/1:17:9,8:40:-8.92,-5.12,-9.40    0/0:21:21,0:40:0.00,-6.32,-10.02    1/1:4:0,4:17:-1.91,-1.20,0.00    ./.    0/1:9:4,5:24:-5.07,-2.71,-4.59    ./.    ./.    ./.    0/1:17:8,9:40:-9.40,-5.12,-8.92    1/1:27:0,27:40:-12.88,-8.13,0.00    0/0:9:9,0:33:0.00,-2.71,-4.29    0/0:15:15,0:40:0.00,-4.52,-7.16    ./.    0/0:14:14,0:40:0.00,-4.21,-6.68    0/1:41:27,14:40:-18.11,-12.34,-24.31    0/0:10:9,1:16:-1.89,-3.01,-5.71    1/1:20:0,20:40:-9.54,-6.02,0.00    0/1:13:4,9:20:-7.78,-3.91,-5.39    0/0:4:4,0:17:0.00,-1.20,-1.91    0/0:7:7,0:27:0.00,-2.11,-3.34    0/0:7:7,0:27:0.00,-2.11,-3.34    1/1:7:0,7:27:-3.34,-2.11,0.00    0/1:13:6,7:35:-7.24,-3.91,-6.76    ./.    0/1:27:8,19:34:-16.19,-8.13,-10.94    0/1:15:5,10:26:-8.92,-4.52,-6.53    0/1:31:10,21:40:-18.49,-9.33,-13.24    0/1:25:7,18:28:-15.03,-7.53,-9.78

In particular, the POS field (the second field) is 4294967296 for both SNPs, which is far larger than the length of either contig in question.

If I remove both lines, bcftools accepts the VCF.

I don't know what caused this incorrect POS value. Could there be other, undetected problems in the rest of the VCF?
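
One way to look for other affected records is to scan every data line and compare POS against both the signed 32-bit limit and the contig length declared in the header. The value 4294967296 is exactly 2^32, which would be consistent with the "64-bit values not representable in BCF" message, although that alone does not say where in the pipeline the value arises. The following is only a minimal sketch: the function name `find_suspect_positions`, the demo records, and the contig length of 50000 are made up for illustration, and the header parsing is deliberately naive (it assumes simple `##contig=<ID=...,length=...>` lines without quoted commas).

```python
import io

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def find_suspect_positions(handle):
    """Return (line_no, chrom, pos, reason) for records with an implausible POS.

    `handle` is any iterable of VCF text lines (an open file, a StringIO, ...).
    """
    lengths = {}   # contig ID -> length, taken from ##contig header lines
    suspects = []
    for line_no, line in enumerate(handle, start=1):
        if line.startswith("##contig=<"):
            # Naive parse of: ##contig=<ID=name,length=12345>
            body = line.strip()[len("##contig=<"):-1]
            fields = dict(kv.split("=", 1) for kv in body.split(",") if "=" in kv)
            if "ID" in fields and "length" in fields:
                lengths[fields["ID"]] = int(fields["length"])
            continue
        if line.startswith("#") or not line.strip():
            continue  # other header lines and blank lines
        parts = line.split(None, 2)  # CHROM, POS, rest of the record
        if len(parts) < 2:
            continue
        chrom, pos = parts[0], int(parts[1])
        if pos > INT32_MAX:
            reason = "exceeds signed 32-bit range"
        elif chrom in lengths and pos > lengths[chrom]:
            reason = "beyond declared contig length"
        else:
            continue
        suspects.append((line_no, chrom, pos, reason))
    return suspects


# Hypothetical three-record example; the contig length is invented.
demo = (
    "##fileformat=VCFv4.2\n"
    "##contig=<ID=959036,length=50000>\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "959036\t4294967296\t36940:76:-\tG\tA\n"
    "959036\t1200\tx\tC\tT\n"
    "959036\t60000\ty\tC\tT\n"
)
for hit in find_suspect_positions(io.StringIO(demo)):
    print(hit)
```

To run it on a real file, pass an open handle instead of the demo string, for example `with open("populations.snps.vcf") as fh: find_suspect_positions(fh)`. The contig-length check only fires for contigs that have a `##contig` line with a `length` attribute in the header.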

Thank you in advance,

Loreleï

Loreleï Boyer

Jan 28, 2021, 8:31:24 AM
to Stacks
Hello again,

I am still encountering this problem with stacks-integrate-alignment. Using the same data but with another version of the reference genome, the VCF again contained two SNPs with the position 4294967296.

Have a nice day,

Loreleï