Missing Fst values when comparing populations


Christine Dodge

Mar 3, 2016, 5:01:56 PM
to Stacks
Hello,

I am new to Stacks and I'm trying to run populations on a 96-sample data set with 11 specified populations (for background - these are fungal isolates taken from 11 different beetles). When I try to calculate pairwise Fst values, the matrices are always empty and look like this:

      1     2     3     4     5     6     7     8     9     10    11
1           -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan
2                 -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan
3                       -nan  -nan  -nan  -nan  -nan  -nan  -nan  -nan
4                             -nan  -nan  -nan  -nan  -nan  -nan  -nan
5                                   -nan  -nan  -nan  -nan  -nan  -nan
6                                         -nan  -nan  -nan  -nan  -nan
7                                               -nan  -nan  -nan  -nan
8                                                     -nan  -nan  -nan
9                                                           -nan  -nan
10                                                                -nan


When I run the analysis grouping all 96 samples as one population, it calculates population summary statistics normally. If the samples are run as multiple populations, the summary stats also produce a lot of -nan values. Is this a computational error, or is it because there is no difference between my populations? If that is the case, shouldn't all of the Fst values be 0? I guess what I'm trying to ask is what does the -nan mean? Clearly I need to troubleshoot but I don't know where to start. 


The code I've been using is as follows:


populations -b 1 -P ~/bigdata/ddRAD_Oct2015_Fusarium/Mar03_strictmap_popbeetles -M ~/bigdata/ddRAD_Oct2015_Fusarium/populations/populations_beetles -m 5 -p 2 -f p_value --fstats --fasta_strict --vcf --vcf_haplotypes --phylip --structure --log_fst_comp --verbose


Thank you!

Chrissy

Julian Catchen

Mar 4, 2016, 7:37:57 AM
to stacks...@googlegroups.com, cdod...@ucr.edu
Hi Chrissy,

'-nan' means 'not a number', printed with a negative sign. This happens when the
code evaluates an undefined operation, like 0 divided by 0. Usually this
occurs when there is no data present to do the calculation. I would look
at how many individuals and how many loci are surviving the analysis
when you run populations on subsets of the data.
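
To make the distinction above concrete, here is a minimal sketch of why "no data" gives 'not a number' rather than 0. This is not Stacks code: `mean_stat` is a hypothetical helper invented for illustration. Python raises an exception on division by zero, so the empty case is spelled out by hand; in C++ a floating-point 0.0 / 0 yields NaN silently, and on many platforms it is printed as '-nan'.

```python
import math

def mean_stat(values):
    """Average a per-locus statistic; return NaN when no loci survived.

    Hypothetical helper, not part of Stacks. In C++ the 0.0 / 0 case
    would produce NaN silently; Python raises instead, so the empty
    case is made explicit here.
    """
    total = float(sum(values))
    n = len(values)
    if n == 0:
        return math.nan  # nothing to average: the result is undefined
    return total / n

identical = mean_stat([0.0, 0.0, 0.0])  # data present, no differentiation
no_data = mean_stat([])                 # no loci available for this comparison

print(identical)            # 0.0
print(math.isnan(no_data))  # True
```

A value of 0 would mean the statistic was computed and found no difference; NaN means there was nothing to compute it from.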

Best,

julian

Christine Dodge

Mar 10, 2016, 7:57:57 PM
to Stacks, cdod...@ucr.edu, jcat...@illinois.edu
Hi Julian,

Thank you for the reply! When I look at my populations.sh log file, it says it retained 114947 loci before it started calculating F statistics. However, it didn't write any loci to the pairwise haplotype files, although they are all being written to the haplotypes.tsv file. Does this mean that none of my populations share any loci with any of the other populations?
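
One way to check the question above directly is to count, for every pair of populations, how many loci have data in both. The following is a rough sketch only: `shared_loci_per_pair`, `locus_samples`, and `popmap` are hypothetical names, the inputs are plain in-memory dictionaries rather than any Stacks file format, and the sample and population labels are made up.

```python
from itertools import combinations

def shared_loci_per_pair(locus_samples, popmap):
    """Count loci that have data in both populations of every pair.

    locus_samples: dict of locus ID -> set of sample names with a genotype
    popmap: dict of sample name -> population label
    Both are hypothetical in-memory structures, not Stacks file formats.
    """
    pops = sorted(set(popmap.values()))
    # Start every pair at 0 so pairs with no shared loci still show up.
    counts = {pair: 0 for pair in combinations(pops, 2)}
    for samples in locus_samples.values():
        # Populations represented at this locus by at least one sample.
        present = {popmap[s] for s in samples if s in popmap}
        for pair in combinations(sorted(present), 2):
            counts[pair] += 1
    return counts

# Toy data with made-up sample and population names.
popmap = {"s1": "A", "s2": "A", "s3": "B", "s4": "C"}
locus_samples = {
    1: {"s1", "s3"},        # present in A and B
    2: {"s1", "s2"},        # present in A only
    3: {"s2", "s3", "s4"},  # present in A, B and C
}
counts = shared_loci_per_pair(locus_samples, popmap)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

A pair that comes back with a count of 0 has no loci in common, which would leave nothing to compare between those two populations.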