Estimated K is weird.

Jerry William Kattawar, III

Oct 30, 2019, 10:51:02 AM
to structure-software
I expect my population to be panmictic, and a PCA based on ~6k SNPs shows exactly that. However, the results for K=1 and K=2 are almost identical. I thought this was odd, so I took a look at the CLUMPP indfiles generated by Harvester for K=2. What I notice is that nearly every individual is at least 90% (most above 95%) from the SAME cluster. However, across the 10 runs I did per K, which cluster they were assigned to was basically random from run to run (sometimes all were in cluster 1, and sometimes all were in cluster 2). I couldn't find anyone having this problem after a quick scan through this forum. Let me know if this is a common occurrence.

Below I have provided a chunk of the CLUMPP indfile. You can see that in the top run nearly all individuals were 90+% in cluster 2, and in the bottom run nearly all were 90+% in cluster 1. This makes K=2 have a higher likelihood score than it should.

  1   1 (3) 1 : 0.017 0.983
  2   2 (5) 1 : 0.040 0.960
  3   3 (7) 1 : 0.046 0.954
  4   4 (6) 1 : 0.011 0.989
  5   5 (7) 1 : 0.047 0.953
  6   6 (10) 1 : 0.052 0.948
  7   7 (9) 1 : 0.033 0.967
  8   8 (6) 1 : 0.027 0.973
  9   9 (7) 1 : 0.019 0.981
 10  10 (5) 1 : 0.017 0.983
 11  11 (5) 1 : 0.040 0.960
 12  12 (36) 1 : 0.059 0.941
 13  13 (5) 1 : 0.021 0.979
 14  14 (5) 1 : 0.048 0.952
 15  15 (6) 1 : 0.035 0.965
 16  16 (4) 1 : 0.023 0.977
 17  17 (4) 1 : 0.032 0.968
 18  18 (23) 1 : 0.073 0.927
 19  19 (7) 1 : 0.031 0.969
 20  20 (6) 1 : 0.016 0.984
 21  21 (4) 1 : 0.021 0.979
 22  22 (8) 1 : 0.020 0.980
 23  23 (5) 1 : 0.027 0.973
 24  24 (6) 1 : 0.027 0.973
 25  25 (6) 1 : 0.029 0.971
 26  26 (6) 1 : 0.016 0.984
 27  27 (5) 1 : 0.037 0.963
 28  28 (4) 1 : 0.022 0.978
 29  29 (5) 1 : 0.036 0.964
 30  30 (4) 1 : 0.019 0.981
 31  31 (4) 1 : 0.033 0.967
 32  32 (8) 1 : 0.040 0.960
 33  33 (36) 1 : 0.065 0.935
 34  34 (5) 1 : 0.023 0.977
 35  35 (4) 1 : 0.040 0.960
 36  36 (6) 1 : 0.040 0.960
 37  37 (4) 1 : 0.043 0.957
 38  38 (5) 1 : 0.038 0.962
 39  39 (5) 1 : 0.033 0.967
 40  40 (8) 1 : 0.049 0.951
 41  41 (5) 1 : 0.022 0.978
 42  42 (4) 1 : 0.034 0.966
 43  43 (4) 1 : 0.034 0.966
 44  44 (46) 1 : 0.104 0.896
 45  45 (4) 1 : 0.023 0.977
 46  46 (6) 1 : 0.056 0.944
 47  47 (6) 1 : 0.059 0.941
 48  48 (26) 1 : 0.051 0.949
 49  49 (5) 1 : 0.031 0.969
 50  50 (6) 1 : 0.013 0.987
 51  51 (35) 1 : 0.069 0.931
 52  52 (19) 1 : 0.077 0.923
 53  53 (4) 1 : 0.033 0.967
 54  54 (9) 1 : 0.043 0.957
 55  55 (4) 1 : 0.019 0.981
 56  56 (4) 1 : 0.004 0.996
 57  57 (3) 1 : 0.029 0.971
 58  58 (32) 1 : 0.069 0.931
 59  59 (3) 1 : 0.030 0.970
 60  60 (19) 1 : 0.059 0.941
 61  61 (5) 1 : 0.042 0.958
 62  62 (7) 1 : 0.023 0.977
 63  63 (4) 1 : 0.024 0.976
 64  64 (64) 1 : 0.078 0.922
 65  65 (4) 1 : 0.029 0.971
 66  66 (3) 1 : 0.022 0.978
 67  67 (5) 1 : 0.016 0.984
 68  68 (10) 1 : 0.051 0.949
 69  69 (5) 1 : 0.026 0.974
 70  70 (5) 1 : 0.039 0.961
 71  71 (24) 1 : 0.068 0.932
 72  72 (5) 1 : 0.028 0.972
 73  73 (5) 1 : 0.039 0.961
 74  74 (5) 1 : 0.014 0.986
 75  75 (3) 1 : 0.021 0.979
 76  76 (4) 1 : 0.051 0.949
 77  77 (4) 1 : 0.029 0.971
 78  78 (41) 1 : 0.080 0.920
 79  79 (84) 1 : 0.154 0.846
 80  80 (5) 1 : 0.050 0.950
 81  81 (4) 1 : 0.053 0.947
 82  82 (7) 1 : 0.006 0.994
 83  83 (5) 1 : 0.023 0.977
 84  84 (6) 1 : 0.036 0.964
 85  85 (53) 1 : 0.092 0.908
 86  86 (4) 1 : 0.033 0.967
 87  87 (27) 1 : 0.076 0.924
 88  88 (5) 1 : 0.017 0.983
 89  89 (5) 1 : 0.009 0.991
 90  90 (7) 1 : 0.022 0.978

  

  1   1 (3) 1 : 0.983 0.017
  2   2 (5) 1 : 0.961 0.039
  3   3 (7) 1 : 0.954 0.046
  4   4 (6) 1 : 0.989 0.011
  5   5 (7) 1 : 0.953 0.047
  6   6 (10) 1 : 0.948 0.052
  7   7 (9) 1 : 0.967 0.033
  8   8 (6) 1 : 0.973 0.027
  9   9 (7) 1 : 0.981 0.019
 10  10 (5) 1 : 0.983 0.017
 11  11 (5) 1 : 0.960 0.040
 12  12 (36) 1 : 0.942 0.058
 13  13 (5) 1 : 0.979 0.021
 14  14 (5) 1 : 0.953 0.047
 15  15 (6) 1 : 0.966 0.034
 16  16 (4) 1 : 0.978 0.022
 17  17 (4) 1 : 0.968 0.032
 18  18 (23) 1 : 0.927 0.073
 19  19 (7) 1 : 0.969 0.031
 20  20 (6) 1 : 0.984 0.016
 21  21 (4) 1 : 0.979 0.021
 22  22 (8) 1 : 0.980 0.020
 23  23 (5) 1 : 0.973 0.027
 24  24 (6) 1 : 0.973 0.027
 25  25 (6) 1 : 0.971 0.029
 26  26 (6) 1 : 0.984 0.016
 27  27 (5) 1 : 0.963 0.037
 28  28 (4) 1 : 0.978 0.022
 29  29 (5) 1 : 0.964 0.036
 30  30 (4) 1 : 0.981 0.019
 31  31 (4) 1 : 0.967 0.033
 32  32 (8) 1 : 0.960 0.040
 33  33 (36) 1 : 0.935 0.065
 34  34 (5) 1 : 0.977 0.023
 35  35 (4) 1 : 0.960 0.040
 36  36 (6) 1 : 0.959 0.041
 37  37 (4) 1 : 0.957 0.043
 38  38 (5) 1 : 0.962 0.038
 39  39 (5) 1 : 0.967 0.033
 40  40 (8) 1 : 0.951 0.049
 41  41 (5) 1 : 0.978 0.022
 42  42 (4) 1 : 0.966 0.034
 43  43 (4) 1 : 0.966 0.034
 44  44 (46) 1 : 0.896 0.104
 45  45 (4) 1 : 0.977 0.023
 46  46 (6) 1 : 0.945 0.055
 47  47 (6) 1 : 0.941 0.059
 48  48 (26) 1 : 0.949 0.051
 49  49 (5) 1 : 0.969 0.031
 50  50 (6) 1 : 0.987 0.013
 51  51 (35) 1 : 0.931 0.069
 52  52 (19) 1 : 0.923 0.077
 53  53 (4) 1 : 0.967 0.033
 54  54 (9) 1 : 0.957 0.043
 55  55 (4) 1 : 0.981 0.019
 56  56 (4) 1 : 0.996 0.004
 57  57 (3) 1 : 0.971 0.029
 58  58 (32) 1 : 0.931 0.069
 59  59 (3) 1 : 0.970 0.030
 60  60 (19) 1 : 0.941 0.059
 61  61 (5) 1 : 0.958 0.042
 62  62 (7) 1 : 0.977 0.023
 63  63 (4) 1 : 0.976 0.024
 64  64 (64) 1 : 0.923 0.077
 65  65 (4) 1 : 0.971 0.029
 66  66 (3) 1 : 0.978 0.022
 67  67 (5) 1 : 0.984 0.016
 68  68 (10) 1 : 0.949 0.051
 69  69 (5) 1 : 0.974 0.026
 70  70 (5) 1 : 0.961 0.039
 71  71 (24) 1 : 0.933 0.067
 72  72 (5) 1 : 0.972 0.028
 73  73 (5) 1 : 0.961 0.039
 74  74 (5) 1 : 0.986 0.014
 75  75 (3) 1 : 0.979 0.021
 76  76 (4) 1 : 0.950 0.050
 77  77 (4) 1 : 0.971 0.029
 78  78 (41) 1 : 0.920 0.080
 79  79 (84) 1 : 0.846 0.154
 80  80 (5) 1 : 0.950 0.050
 81  81 (4) 1 : 0.947 0.053
 82  82 (7) 1 : 0.994 0.006
 83  83 (5) 1 : 0.977 0.023
 84  84 (6) 1 : 0.965 0.035
 85  85 (53) 1 : 0.908 0.092
 86  86 (4) 1 : 0.967 0.033
 87  87 (27) 1 : 0.924 0.076
 88  88 (5) 1 : 0.983 0.017
 89  89 (5) 1 : 0.991 0.009
 90  90 (7) 1 : 0.978 0.022
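
For what it's worth, the two runs above look like the same solution with the cluster labels swapped (label switching). As far as I understand, the Harvester indfile is the input to CLUMPP, so the runs in it have not been aligned yet. Below is a minimal Python sketch of that alignment, using the first three individuals from each run. The function name `align_run` and the squared-difference score are my own illustrative choices, not CLUMPP's actual implementation.

```python
from itertools import permutations

def align_run(reference, run):
    """Return `run` with its cluster columns permuted to best match `reference`.

    Brute-force search over all column permutations; fine for small K,
    but not how CLUMPP itself is implemented.
    """
    k = len(reference[0])
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(k)):
        # Score = negative sum of squared differences after permuting columns.
        score = -sum(
            (ref_row[c] - row[perm[c]]) ** 2
            for ref_row, row in zip(reference, run)
            for c in range(k)
        )
        if score > best_score:
            best_perm, best_score = perm, score
    return [[row[p] for p in best_perm] for row in run]

# First three individuals from the two runs posted above.
run_a = [[0.017, 0.983], [0.040, 0.960], [0.046, 0.954]]
run_b = [[0.983, 0.017], [0.961, 0.039], [0.954, 0.046]]

aligned_b = align_run(run_a, run_b)
print(aligned_b)
```

After alignment the two runs agree to within rounding, which suggests they describe one cluster holding essentially everyone rather than two different answers.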

Vikram Chhatre

Oct 30, 2019, 10:51:44 AM
to structure-software
Can you post a barplot to illustrate your problem?

Jerry William Kattawar, III

Oct 30, 2019, 11:00:27 AM
to structure-software
I haven't made a bar plot. The issue here is that I have sound data showing that my study species is panmictic, and the STRUCTURE output basically agrees with me. However, across the 10 runs I did for K=2, sometimes all of my individuals are in cluster 1 and sometimes they are all in cluster 2. The point is that they cluster together in each run, but within any single STRUCTURE run ALL individuals are assigned either to cluster 1 or to cluster 2. This results in a high DeltaK for K=2 even though all individuals cluster together in each run.

Does that make it any clearer?

Mario Ernst

Nov 7, 2023, 1:12:04 PM
to structure-software

Interesting, I think I have a similar situation. The optimal K identified by Structure Harvester is 7. However, at K=7 there is an "empty" cluster, meaning that not a single individual is clearly assigned to it; many individuals are assigned to it only at a low admixture proportion. At K=6 the assignment is virtually the same, except that this "empty" cluster does not appear. So intuitively I would say it makes more sense to claim that 6 is the optimal K, but this is contradicted by the Structure Harvester results. How did you proceed, @Jerry? Would you say Structure Harvester is identifying the wrong K, and how did you work around this issue? Thanks heaps!
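
One way to make the "empty" cluster check less subjective is to flag clusters that are never the top assignment of any individual at or above some cutoff. Here is a rough Python sketch. The function name `empty_clusters`, the 0.5 cutoff, and the Q-matrix values are all hypothetical and only illustrate the idea; they are not from STRUCTURE or Structure Harvester.

```python
def empty_clusters(q_matrix, threshold=0.5):
    """Return indices of clusters to which no individual is clearly assigned.

    q_matrix: one row per individual, one membership coefficient per cluster.
    threshold: assumed cutoff for a "clear" assignment (illustrative only).
    """
    k = len(q_matrix[0])
    occupied = set()
    for row in q_matrix:
        # Cluster with the largest membership coefficient for this individual.
        top = max(range(k), key=lambda c: row[c])
        if row[top] >= threshold:
            occupied.add(top)
    return [c for c in range(k) if c not in occupied]

# Hypothetical K=3 membership coefficients for four individuals.
q = [
    [0.90, 0.05, 0.05],
    [0.10, 0.85, 0.05],
    [0.48, 0.40, 0.12],
    [0.07, 0.80, 0.13],
]

ghost = empty_clusters(q)
print(ghost)
```

In this made-up example the third cluster (index 2) only ever receives small fractions, so it is reported as empty. The cutoff is a judgment call and should be chosen to suit the data.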

NURUL AFIA ABD. MAJID

May 21, 2024, 9:55:32 AM
to structure-software

I think I have a similar issue too. Have you found a solution? When I ran the analysis to identify the optimal K, I got K = 2, but in the Inferred Clusters values it seems that all of my individuals are assigned to cluster 2. Could you share the solution if you have found one?

Alberto Fameli

May 15, 2025, 1:34:04 PM
to structure-software
A high Evanno DeltaK for K=2 does not necessarily mean the optimal solution is K=2. DeltaK measures "jumps" in likelihood between consecutive K models, so it needs the models on both sides of a given K and therefore has no value for K=1.
A peak in DeltaK at K=2 could indicate either no structure (K=1) or two clusters.
That's why DeltaK should be used in combination with the traditional method proposed by Pritchard, which compares the likelihood, lnP(D), across values of K.
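
To illustrate why DeltaK has no value at K=1, here is a small Python sketch. It is a simplified version that works from the mean lnP(D) per K, which I am assuming is close to how it is usually computed in practice. The function name `delta_k` and all lnP(D) values are hypothetical, not taken from anyone's data in this thread.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Simplified Evanno DeltaK from replicate lnP(D) values.

    lnp_by_k maps K -> list of lnP(D), one value per replicate run.
    DeltaK(K) = |L(K+1) - 2*L(K) + L(K-1)| / sd(L(K)), with L the mean lnP(D).
    Raises ZeroDivisionError if all replicates at some K are identical.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # needs both neighbours, so the smallest K gets no value
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result

# Hypothetical lnP(D) values, three replicate runs per K.
lnp = {
    1: [-1000.0, -1001.0, -999.0],
    2: [-1002.0, -1004.0, -1000.0],
    3: [-1050.0, -1060.0, -1040.0],
    4: [-1100.0, -1120.0, -1080.0],
}

dk = delta_k(lnp)
best_by_lnp = max(lnp, key=lambda k: mean(lnp[k]))
print(dk, best_by_lnp)
```

In this made-up example the mean lnP(D) is highest at K=1, yet DeltaK is only defined for K=2 and K=3 and peaks at K=2. That is the situation where looking at lnP(D) alongside DeltaK matters.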