Why do the distruct plots change color at different K's for the same population?

upendra kumar Devisetty

Nov 10, 2015, 5:57:23 PM
to structure-software
I am running fastStructure with different values of K and I noticed the following:

K = 2:
population 1 = red and population 2 = red + light blue 

K = 3:
population 1 = red, population 2 = green and population 3 = dark blue 

I definitely know that Population 1 has not split into 2 clusters, so I was expecting Population 3 in K = 3 to be red (because it is not split), but instead it is dark blue.

How can I fix this?

Thanks
Upendra

Naama Kopelman

Nov 11, 2015, 3:03:16 AM
to structure-software
Hi Upendra,

You can bypass this problem by using CLUMPAK. You can upload fastStructure's results to CLUMPAK's server,
and CLUMPAK will coordinate the colors across different K values.

Best,

Naama

Vikram Chhatre

Nov 11, 2015, 8:34:30 AM
to structure-software
Upendra -

By definition, K=2 will show admixture in two clusters using two colors, K=3 using three, and so on.  Also, distruct.py and its variant distruct2.1.py will not, by default, assign the same color to a given cluster in successive K plots.  You have to arrange those manually.
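
One way to do that arranging is to reorder the columns of the higher-K membership matrix so that each cluster sits in the same column position as the lower-K cluster it overlaps most. The following is only a minimal sketch of that idea, not what distruct.py or CLUMPAK does internally: the function names and toy Q values are made up for illustration, the greedy matching is a simple stand-in for a proper alignment method, and it assumes the plotting script assigns colors by column position.

```python
def column(q, j):
    """Extract column j (one cluster) from a Q matrix stored as a list of rows."""
    return [row[j] for row in q]


def overlap(a, b):
    """Dot product of two membership columns; larger means the clusters share individuals."""
    return sum(x * y for x, y in zip(a, b))


def match_clusters(q_ref, q_new):
    """Greedy matching of q_new columns to q_ref columns (assumes K_new >= K_ref).

    Returns a column order for q_new: position i holds the q_new column that best
    matches reference column i; unmatched (newly split) columns follow at the end.
    """
    k_ref, k_new = len(q_ref[0]), len(q_new[0])
    scored = sorted(
        ((overlap(column(q_ref, i), column(q_new, j)), i, j)
         for i in range(k_ref) for j in range(k_new)),
        reverse=True,
    )
    assigned, used = {}, set()
    for _, i, j in scored:
        if i not in assigned and j not in used:
            assigned[i] = j
            used.add(j)
    order = [assigned[i] for i in range(k_ref)]
    order += [j for j in range(k_new) if j not in used]
    return order


def reorder_columns(q, order):
    """Return a copy of q with its columns permuted into the given order."""
    return [[row[j] for j in order] for row in q]


# Toy data (hypothetical): individuals 1-2 form one group, 3-4 the other.
q2 = [[0.90, 0.10], [0.95, 0.05], [0.10, 0.90], [0.05, 0.95]]
# At K=3 the first group landed in column 2, and the second group split.
q3 = [[0.0, 0.1, 0.9], [0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]

order = match_clusters(q2, q3)
aligned = reorder_columns(q3, order)
print(order)  # [2, 0, 1]
```

In practice the rows would come from the meanQ files that fastStructure writes for each K (one whitespace-separated row per individual), and the reordered matrix would be written back out before plotting.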

Posting your barplot here will likely get you more specific answers.

V

upendra kumar Devisetty

Nov 11, 2015, 1:57:30 PM
to structure-software
Dear Vikram,

Thanks for your reply.  I have now uploaded the picture


upendra kumar Devisetty

Nov 11, 2015, 2:11:33 PM
to structure-software
Hi Naama,

I am aware of CLUMPAK, but I thought it only accepts STRUCTURE output and not fastStructure output. Please comment.

Thanks
Upendra

Vikram Chhatre

Nov 11, 2015, 2:18:10 PM
to structure-software
Your output appears to have been sorted by Q, not by population identifiers.  That said, does your input file contain population definitions?

upendra kumar Devisetty

Nov 11, 2015, 2:34:59 PM
to structure-software
Hi Vikram,

I am pasting a few lines of the .fam and pop files here (since I can't attach them here).

.fam file - 
-9 1KS_Keizer_F_OR -9 -9 -9 -9
-9 2861_Lassen_NF_CA -9 -9 -9 -9
-9 2862_Lassen_NF_CA -9 -9 -9 -9
-9 2A_BlackButte_NF_OR -9 -9 -9 -9
-9 2KS_Keizer_F_OR -9 -9 -9 -9
-9 3KS_Keizer_F_OR -9 -9 -9 -9
-9 4KS_Keizer_F_OR -9 -9 -9 -9
-9 5B_BlackButte_NF_OR -9 -9 -9 -9
-9 5C_BlackButte_NF_OR -9 -9 -9 -9
-9 A1_Albany_F_OR -9 -9 -9 -9
-9 A3_Albany_F_OR -9 -9 -9 -9
-9 A4_Albany_F_OR -9 -9 -9 -9
-9 A5_Albany_F_OR -9 -9 -9 -9

pop file - 
KS_Keizer_F_OR
Lassen_NF_CA
Lassen_NF_CA
BlackButte_NF_OR
Keizer_F_OR
Keizer_F_OR
Keizer_F_OR
BlackButte_NF_OR
BlackButte_NF_OR
Albany_F_OR
Albany_F_OR
Albany_F_OR
Albany_F_OR

Also, I am including the commands that I used to generate the distruct plots that I attached earlier.

python structure.py -K 2 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --full --seed=100
python structure.py -K 3 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --full --seed=100

python distruct.py -K 2 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K2_distruct_nolable.pdf --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 2"
python distruct.py -K 3 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K3_distruct_nolable.pdf --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 3"



Thanks again,
Upendra


Vikram Chhatre

Nov 11, 2015, 2:40:49 PM
to structure-software
You need to explicitly set the --popfile= flag in distruct.  Otherwise your barplot is getting sorted by Q.
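
Before passing a pop file to --popfile=, it may be worth checking that it lines up with the samples. The sketch below assumes the pop file holds one label per line in the same order as the .fam file; the function name is hypothetical, and the sample lines are the first five from the excerpts posted above. It verifies the line counts match and tallies how many samples carry each label, which makes a stray label easy to spot.

```python
from collections import Counter


def check_popfile(fam_lines, pop_lines):
    """Check that there is one pop label per sample, then count samples per label."""
    fam = [line for line in fam_lines if line.strip()]
    pops = [line.strip() for line in pop_lines if line.strip()]
    if len(fam) != len(pops):
        raise ValueError(
            "%d samples in .fam but %d labels in pop file" % (len(fam), len(pops))
        )
    return Counter(pops)


# First five lines of the .fam and pop file excerpts from this thread.
fam_lines = [
    "-9 1KS_Keizer_F_OR -9 -9 -9 -9",
    "-9 2861_Lassen_NF_CA -9 -9 -9 -9",
    "-9 2862_Lassen_NF_CA -9 -9 -9 -9",
    "-9 2A_BlackButte_NF_OR -9 -9 -9 -9",
    "-9 2KS_Keizer_F_OR -9 -9 -9 -9",
]
pop_lines = [
    "KS_Keizer_F_OR",
    "Lassen_NF_CA",
    "Lassen_NF_CA",
    "BlackButte_NF_OR",
    "Keizer_F_OR",
]

counts = check_popfile(fam_lines, pop_lines)
for label, n in counts.items():
    print(label, n)
```

On these lines the tally lists KS_Keizer_F_OR and Keizer_F_OR as two separate labels, so any plotting script that groups by label would treat them as two populations.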

V

upendra kumar Devisetty

Nov 11, 2015, 2:59:48 PM
to structure-software

Thanks Vikram. But even using the --popfile argument, the coloring is not consistent among different K's. 

As you can see in the attached distruct plot for K = 2, all the Canadian samples (population 1) are colored red and the non-Canadian samples (population 2) are blue. For K = 3, however, all the Canadian samples are now blue and the non-Canadian samples are split into two clusters, one red and one blue. I want to keep the color for a particular population consistent across different K's: red for the Canadian samples at both K = 2 and K = 3.


Vikram Chhatre

Nov 11, 2015, 3:04:30 PM
to structure-software
There are several things at play here.  Before we beat our heads against the wall on this any further, I recommend that you get distruct2.1 from my GitHub page.  It allows you to explicitly set population order.  This will make sure that regardless of what K you plot (2 vs 3 for example), your population order remains the same. Note that in the plots you attached, the poporder is different for K2 and K3.

When you run distruct2.1 you will need an additional file called 'poporder'.  This file contains one pop per line, in the desired order.  No flag needs to be set.  Just place the file in the same directory.  Then post your plots here.  We can tackle it further at that point.
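
If typing the 'poporder' file by hand is tedious, a starting point can be derived from the pop file and then reordered as desired. This is only a sketch: the helper names and the example path are hypothetical, and it assumes the labels in 'poporder' are expected to match the pop file labels exactly, which is worth confirming against distruct2.1 itself.

```python
def unique_in_order(labels):
    """Return each label once, in order of first appearance."""
    seen = set()
    ordered = []
    for label in labels:
        if label not in seen:
            seen.add(label)
            ordered.append(label)
    return ordered


def write_poporder(popfile_path, poporder_path="poporder"):
    """Write one population per line to poporder_path, taken from the pop file."""
    with open(popfile_path) as fh:
        labels = [line.strip() for line in fh if line.strip()]
    order = unique_in_order(labels)
    with open(poporder_path, "w") as out:
        out.write("\n".join(order) + "\n")
    return order


def missing_from_poporder(pop_labels, poporder_labels):
    """Labels used in the pop file that do not appear in the poporder list."""
    wanted = set(poporder_labels)
    return [label for label in unique_in_order(pop_labels) if label not in wanted]


# Example usage (the path is a placeholder):
# write_poporder("my_popfile.txt")
```

The missing_from_poporder helper is a quick way to catch a population that appears in the pop file but was left out of a hand-edited 'poporder'.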

V

upendra kumar Devisetty

Nov 11, 2015, 3:46:03 PM
to structure-software
Vikram,

Unfortunately, it still didn't work. Here is my poporder file (one population per line):

KS_Keizer_F_OR
Lassen_NF_CA
BlackButte_NF_OR
Albany_F_OR
Angelsrest_F_OR
Brooks_F_OR
BTG_ClarkCounty_F_WA
Calapooia_F_OR
Camassia_F_OR
LakeTahoe_NF_CA
Selzer11_Cheney_F_WA
DRP_LewisCounty_NF_WA
LewisCounty_F_WA
GladeCreek_NF_OR
WallaWalla_F_WA
KlamathFalls_NF_OR
ThurstonCounty_NF_WA
LewisCounty_NF_WA
LakeOswego_F_OR
Peoria_F_OR
PanakaniMeadows_Unknown_WA
OreilleCounty_F_WA
PierceCounty_NF_WA
KittitasCounty_NF_WA
SteamboatSprings_NF_CO
BoiseNationalForest_NF_ID
FlinFlon_Canada
GrandPrairie_NF_Canada
Hinton_NF_Canada
HavreStPierre_NF_Canada
OntonagonCounty_NF_MI
Montana_F_MT
YellowstonePark_NF_MT
SaintFelicien_NF_Canada
SFRGPoly40_SwanFlat_NF_UT

Here are my commands:

python distruct2.1.py -K 2 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K2_distruct2.1_nolable.pdf --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 2"
python distruct2.1.py -K 3 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K3_distruct2.1_nolable.pdf --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 3"



As you can see in the attached plot, I am having the same issue as before.

Any ideas where I am going wrong?

Vikram Chhatre

Nov 11, 2015, 4:00:03 PM
to structure-software
As I mentioned earlier, you need to explicitly set the --popfile= flag.  I don't see that flag in your last round of plotting.

V

upendra kumar Devisetty

Nov 11, 2015, 5:06:59 PM
to structure-software
Sorry Vikram, but I am having the same issue even after setting the --popfile= flag:

python distruct2.1.py -K 2 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K2_distruct2.1_nolable.pdf --popfile=test/plink_popfile_st_mo_4b_mMAF.2 --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 2"
python distruct2.1.py -K 3 --input=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.plk_simple --output=test/steve_mock_4b_mMA/HapMap.hmp.4base_filtered_mod_final_mMAF0.01_mod.renamed.simple_K3_distruct2.1_nolable.pdf --popfile=test/plink_popfile_st_mo_4b_mMAF.2 --title="Aspen GBS Population Structure 4b & mMAF0.01 K = 3"




Vikram Chhatre

Nov 11, 2015, 5:13:39 PM
to structure-software
Let's take this off the list for now.

V
