Fst output different


Peter Pruisscher

May 21, 2013, 4:46:13 AM
to stacks...@googlegroups.com
Dear Julian and Stacks users,

The Fst values I get from populations differ from the ones shown in the MySQL database. I was hoping you could help me out.

My current analysis involves 23 individuals divided into 2 populations. I ran denovo_map.pl, built my PE with velvet, and then ran populations.
I call populations like this:

$ populations -P ./stacks/ -M ./popmap -r .90 -p 2 -b 1 --structure

After populations, I went to the batch_1.fst_1-2.tsv file to sort by FST values, and zoomed in on the highest outliers:

# Batch ID  Locus ID  Pop 1 ID  Pop 2 ID  Chr  BP       Column  Overall Pi  Fst
1           77811     1         2         un   7236407  0       0.0714286   0.5504587156

Above you can see tag ID 77811 with an FST of 0.55. BUT, when I go to my MySQL database, catalog ID 77811 has an FST of 0.16. The database value is what I expect for this tag, as 17 of 23 individuals have complete consensus.
Does anyone know what I am missing here?

Thank you for your help.
//Peter

Peter Pruisscher

May 21, 2013, 5:03:11 AM
to stacks...@googlegroups.com
Hi again,

As extra info, if I call populations without the -r and -p filters like this:

$ populations -P ./stacks/ -M ./popmap -b 1 --structure

I get the following for ID 77811 in batch_1.fst_1-2.tsv:
# Batch ID  Locus ID  Pop 1 ID  Pop 2 ID  Chr  BP       Column  Overall Pi  Fst
1           77811     1         2         un   7236399  0       0.176539    -0.0225653207
1           77811     1         2         un   7236407  0       0.0714286   0.5504587156
1           77811     1         2         un   7236409  0       0.107937    0.1631878558

Here the 3rd row (the SNP at BP 7236409) has the FST value that matches the database, 0.16. (77811 is a 100 bp sequence with 3 SNPs in it.)
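
To make the "one row per SNP" point concrete, here is a minimal Python sketch that groups the rows of an Fst file by locus ID. It uses only the three rows posted above and assumes the nine leading columns shown in the header; the real batch_1.fst_1-2.tsv is assumed to be tab-separated and may carry extra columns. The function name `fst_by_locus` is made up for this illustration.

```python
from collections import defaultdict

# The three rows for locus 77811 exactly as posted above.
# Assumption: the real file is tab-separated with the same leading columns.
ROWS = """\
1 77811 1 2 un 7236399 0 0.176539 -0.0225653207
1 77811 1 2 un 7236407 0 0.0714286 0.5504587156
1 77811 1 2 un 7236409 0 0.107937 0.1631878558
"""


def fst_by_locus(lines):
    """Map locus ID -> list of (bp, fst), one entry per SNP row."""
    per_locus = defaultdict(list)
    for line in lines:
        # Skip blank lines and the '#' header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Columns (0-based): 1 = Locus ID, 5 = BP, 8 = Fst.
        locus, bp, fst = int(fields[1]), int(fields[5]), float(fields[8])
        per_locus[locus].append((bp, fst))
    return dict(per_locus)


snps = fst_by_locus(ROWS.splitlines())[77811]
# Print the SNPs of this locus from highest to lowest Fst.
for bp, fst in sorted(snps, key=lambda s: s[1], reverse=True):
    print(f"locus 77811  bp {bp}  Fst {fst:.4f}")
```

Sorting the file by Fst alone shows only the single highest SNP of a locus, so grouping by locus first makes it easier to compare against a per-locus view such as the database page.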

Does the discrepancy have to do with the -r -p filters where it might exclude one or two of the SNPs? Wouldn't it exclude the whole catalog ID?

Thanks //Peter


Peter Pruisscher

May 21, 2013, 6:58:56 AM
to stacks...@googlegroups.com
Hi again,

I found out now that yes, populations looks at each individual SNP, so I understand my output above.

BUT, shouldn't I then get 3 FST values when I look up the catalog ID 77811, instead of only the FST for the 3rd SNP?

Thanks
//Peter


Julian Catchen

May 28, 2013, 6:10:56 PM
to stacks...@googlegroups.com
Hi Peter,

You should get 3 separate Fst values, one for each SNP, unless the second and
third SNPs are fixed in populations 1 and 2. Fst is only calculated if a SNP
is segregating in the two focal populations being compared. To see whether all
three SNPs were indeed processed, grep that locus out of the sumstats file. You
should see all the SNPs in that case.
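
As a rough illustration of the check described above, the Python sketch below flags which sites of a locus are segregating across two populations. It assumes the sumstats rows for the locus have already been parsed into (population ID, BP, major allele, minor allele, major allele frequency) tuples; that layout, the "-" placeholder for a missing minor allele, and the function names are assumptions made for this example, not the documented Stacks format. It also reads "segregating" as "more than one allele present across the two populations combined", which is an interpretation. The records are toy values, not data from this thread.

```python
from collections import defaultdict


def alleles_in_pop(p_nuc, q_nuc, p_freq):
    """Alleles observed in one population at one site."""
    # Assumption: a fixed site has major allele frequency 1.0 and/or "-" as minor allele.
    if p_freq >= 1.0 or q_nuc in ("-", ""):
        return {p_nuc}
    return {p_nuc, q_nuc}


def segregating_sites(records, pops=("1", "2")):
    """Return the BP positions with more than one allele across `pops`."""
    seen = defaultdict(set)
    for pop_id, bp, p_nuc, q_nuc, p_freq in records:
        if pop_id in pops:
            seen[bp] |= alleles_in_pop(p_nuc, q_nuc, p_freq)
    return {bp for bp, alleles in seen.items() if len(alleles) > 1}


# Hypothetical toy records, NOT taken from the thread.
toy = [
    ("1", 100, "A", "-", 1.0),
    ("2", 100, "A", "-", 1.0),  # same allele fixed in both: not segregating
    ("1", 105, "C", "T", 0.8),
    ("2", 105, "C", "-", 1.0),  # variable in population 1: segregating
    ("1", 110, "G", "-", 1.0),
    ("2", 110, "T", "-", 1.0),  # fixed for different alleles: segregating
]
print(sorted(segregating_sites(toy)))  # prints [105, 110]
```

Sites missing from the returned set would be the ones for which no Fst value is expected under that reading.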

Best,

julian