fastStructure meanQ file question


Sarah Flanagan

May 21, 2015, 1:24:14 PM
to structure...@googlegroups.com
Hello,

I'm trying to understand the output files from fastStructure, and I've run into a confusing issue. My understanding of the meanQ file is that it should have one row for each individual and one column for each of the groups (K number of columns). The meanP file should similarly have a row for each SNP and a column for each group.

My problem is that my files do not have the right number of rows! I've run fastStructure on a dataset with 12 populations, with a total of 524 individuals, and I get a meanQ file with only 262 rows. I then tried fastStructure on a subset (62) of the individuals, and that also had a row number equal to half the number of individuals. The meanP file also does not correspond to the number of SNPs in the analysis.
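A quick way to confirm the mismatch described above is to count the rows and columns of each output file directly. The following is a minimal sketch, assuming the meanQ and meanP files are plain whitespace-delimited text with one record per line; the function name `matrix_shape` is hypothetical and not part of fastStructure. It only reports the shape and makes no judgment about what the shape should be.

```python
def matrix_shape(path):
    """Return (rows, columns) for a whitespace-delimited text matrix.

    Assumption: one record per line, fields separated by spaces or tabs.
    Blank lines are skipped; ragged rows raise an error.
    """
    n_rows = 0
    n_cols = None
    with open(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if n_cols is None:
                n_cols = len(fields)  # first data line fixes the width
            elif len(fields) != n_cols:
                raise ValueError(
                    "line %d has %d fields, expected %d"
                    % (line_number, len(fields), n_cols)
                )
            n_rows += 1
    return n_rows, (n_cols or 0)
```

Calling `matrix_shape` on a meanQ file and comparing the row count with the number of individuals in the input makes a halved row count obvious at a glance.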

Have I misunderstood how the meanQ and meanP files should be set up? Or is something going wrong?

This is the command I'm using to run fastStructure:

python structure.py -K $i --input pop.subset --output=pop_subset_simple --full --format=str --seed=100

Any information or advice you have is much appreciated!

Thank you!

Sarah

Vikram Chhatre

May 21, 2015, 1:29:36 PM
to structure-software
Hi Sarah,

This is one of those exceedingly rare moments when I know exactly what's going on.

1. fastStructure expects ONEROWPERIND=F, i.e. two rows per sample. If your input file has one row per sample, every pair of rows is read as a single individual, which is why your results contain half the number of samples.

2. meanQ contains, as you correctly interpreted, cluster membership coefficients per cluster. It is a matrix of nrow(length(inds)) x ncol(indlabel+K).

3. meanP is a matrix of allele frequencies per locus per cluster. Thus, nrow(length(loci)) x ncol(loclabel+K).

Unfortunately, this means that the results you obtained are invalid.
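To see which layout an input file actually uses, one can count how many rows each individual ID occupies. This is only a sketch under assumptions not stated in the thread: the individual ID sits in the first whitespace-delimited column, and the file has no header line of marker names. The function names are hypothetical.

```python
from collections import Counter


def rows_per_individual(path, id_column=0):
    """Count how many rows each individual ID occupies.

    Assumptions: the ID is in column `id_column` (0-based) and the file
    has no header line; a header would be counted as an extra ID.
    """
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                counts[fields[id_column]] += 1
    return counts


def summarize(counts):
    """Return the sorted distinct rows-per-ID values found in the file."""
    return sorted(set(counts.values()))
```

A result of `[1]` from `summarize` suggests one row per individual, and `[2]` suggests two rows per individual. Anything else points to duplicated or inconsistent IDs that are worth inspecting by hand.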

PGDSpider is a very nice utility for converting between file formats. It scales to any combination of markers and individuals, and to projects involving a large number of data sets.
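For reference, the reshaping from one row per individual to two rows per individual can be sketched in a few lines. This is an illustration under stated assumptions, not a replacement for a proper converter: the data are diploid, the two alleles of each locus sit in adjacent columns, there is no header line, and `n_meta` leading columns hold IDs and other non-genotype fields. The default `n_meta=6` is a placeholder; check the number of leading columns your own file and the fastStructure documentation call for. Both function names are hypothetical.

```python
def split_one_row(fields, n_meta=6):
    """Split a one-row-per-individual record into two allele rows.

    Assumptions: `fields[:n_meta]` are non-genotype columns, and the rest
    are diploid genotypes with the two alleles of each locus adjacent.
    """
    meta = fields[:n_meta]
    alleles = fields[n_meta:]
    if len(alleles) % 2 != 0:
        raise ValueError("expected an even number of allele columns")
    first = alleles[0::2]   # first allele of every locus
    second = alleles[1::2]  # second allele of every locus
    return meta + first, meta + second


def convert(in_path, out_path, n_meta=6):
    """Write every input record as two consecutive rows in `out_path`."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            for row in split_one_row(fields, n_meta):
                dst.write(" ".join(row) + "\n")
```

After conversion, the output file should have exactly twice as many rows as the input, with the leading columns repeated on both rows of each individual.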

Good luck!
V



--
You received this message because you are subscribed to the Google Groups "structure-software" group.
To unsubscribe from this group and stop receiving emails from it, send an email to structure-softw...@googlegroups.com.
To post to this group, send email to structure...@googlegroups.com.
Visit this group at http://groups.google.com/group/structure-software.
For more options, visit https://groups.google.com/d/optout.

Sarah Flanagan

May 21, 2015, 2:10:05 PM
to structure...@googlegroups.com
That helps a lot! Thank you! 

Otyama Paul

Dec 3, 2016, 5:02:06 PM
to structure-software
Hi Vikram Chhatre (and Sarah Flanagan),

I ran into the same problem. My meanQ file has only half as many rows as there are individuals in my input file. I converted my data from FASTA to the structure format using PGDSpider but still got the same output.

How do I resolve this? Can you help?

Vikram Chhatre

Dec 3, 2016, 5:42:51 PM
to structure...@googlegroups.com
I already responded to your question earlier.

V


Otyama Paul

Dec 3, 2016, 5:46:45 PM
to structure-software
Please accept my apologies,

I can't find that particular thread (I just joined the forum yesterday). Could you please reiterate your quick response here?

I've been stuck in the office all day trying to get this to work. (Please)
