FastStructure:ChooseK


Vinod Kumar

Sep 17, 2015, 8:40:00 AM
to structure-software
I am running fastStructure on a SNP dataset. Whatever value of K I assume (K=3, K=5, K=10, K=15), the chooseK command returns the same likelihood and structure in the data. I expected 8 populations, which is what I detected through PCA. I am running the following command:
python structure.py -K 15 --input=finalpruned1 --output=genotypes_finalpruned_15

What should I do, or what am I doing wrong in estimating the final number of populations with the chooseK command?

Thanks,

Vinod Kumar

Vikram Chhatre

Sep 17, 2015, 9:07:45 AM
to structure-software
Can you post the chooseK command you are using?

V


Vinod Kumar

Sep 17, 2015, 9:50:30 AM
to structure...@googlegroups.com

I have run the default command mentioned in the tutorial without changing anything.

Vikram Chhatre

Sep 17, 2015, 9:53:59 AM
to structure-software
You are running chooseK.py on a range of K values, right?  If yes, can you post the output here?

V

Vinod Kumar

Sep 19, 2015, 4:50:31 AM
to structure...@googlegroups.com
Sorry for the very late response, Vikram.

First I ran the main script (the data are in bed, bim and fam format):
$ python structure.py -K 20 --input=genotypes --output=genotypes_finalpruned_20


After this I ran the chooseK script to choose the appropriate number of model components that explain structure in the dataset:

$ python chooseK.py --input=genotypes_finalpruned_20

The results of this command are:

Model complexity that maximizes marginal likelihood = 20
Model components used to explain structure in data = 20

I am also attaching the output of fastStructure.

Am I doing something wrong in assessing structure in my dataset, or does this mean my data don't have any structure? Or am I missing something?

Thanks,

Vinod Kumar



PostDoc Fellow,
International Wheat Genome Sequencing Consortium (IWGSC),
NRC on Plant Biotechnology,
Indian Agricultural Research Institute,
New Delhi-110012
genotypes_finalpruned_20.20.log
genotypes_finalpruned_20.20.meanP
genotypes_finalpruned_20.20.meanQ

Vikram Chhatre

Sep 19, 2015, 8:05:14 AM
to structure-software
ChooseK needs results from a range of K values.  Also, the -K 20 flag for fastStructure will only test K=20.  

1. Run fastStructure for each K from 1 through n (each run has to be started separately)
2. python chooseK.py --input=prefix_for_all_K_runs
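
The two steps above can be sketched as a small driver script. This is only an illustration, not part of fastStructure: the helper `build_commands`, the upper limit of K=10 and the output prefix `genotypes_finalpruned` are assumptions chosen for the example, while the `-K`, `--input` and `--output` flags are the ones used earlier in this thread. It also assumes that `python` on your system is the interpreter fastStructure was installed for. By default the script only prints the commands; pass `--run` to execute them.

```python
import subprocess
import sys


def build_commands(input_prefix, output_prefix, max_k):
    """Return one structure.py command per K in 1..max_k, plus the chooseK.py command.

    All runs share the same output prefix, so chooseK.py can find every K.
    """
    runs = [
        ["python", "structure.py", "-K", str(k),
         "--input=" + input_prefix, "--output=" + output_prefix]
        for k in range(1, max_k + 1)
    ]
    choose = ["python", "chooseK.py", "--input=" + output_prefix]
    return runs, choose


def main(dry_run=True):
    # Hypothetical values: adjust the prefixes and the largest K to your data.
    runs, choose = build_commands("genotypes", "genotypes_finalpruned", 10)
    for cmd in runs + [choose]:
        print(" ".join(cmd))
        if not dry_run:
            # Stop at the first failing run instead of continuing silently.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(dry_run="--run" not in sys.argv)
```

Because every run writes to the same output prefix, the files differ only by the K in their names (as in genotypes_finalpruned_20.20.meanQ above), which is what lets chooseK.py compare all K values from a single `--input` prefix.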


Vinod Kumar

Sep 19, 2015, 8:09:03 AM
to structure...@googlegroups.com
Thanks a lot, Vikram, for the valuable answer. I had run only K=20; I will now run all the K values and then run the chooseK script.
Thanks,

Vinod Kumar

Hilçana Albuquerque

Apr 17, 2017, 8:52:42 AM
to structure...@googlegroups.com


Good morning,

I'm trying to see which individuals belong to each group, but I'm not sure how to find this information.

Can anybody help me?

Example:
Grp  Genotype
1    1,2,3,4,5,6
2    10,11,12,13
