Interpretation results for Structure Harvester

belinda

Jul 27, 2012, 2:12:53 AM
to structure...@googlegroups.com
Hi,
I have run a STRUCTURE analysis for 6 "populations" and 252 individuals with the default settings, a burn-in of 10,000 and 50,000 MCMC replications. When I analyse my results with Structure Harvester, K=4 has the highest probability (see below). However, I cannot think of a biological reason that could explain these 4 clusters. In addition, when I look at one plot for K=4 (see below), it seems to me that there is no structure at all. What do you think about my results? How would you interpret them?
Thanks for your help.


structure without H.jpg

belinda

Jul 27, 2012, 2:15:57 AM
to structure...@googlegroups.com
Unbenannt.png
Unbenannt2.png

Catherine Johnston

Jul 27, 2012, 4:07:49 AM
to structure...@googlegroups.com
Are some of the populations closer to each other than to others? i.e. high gene flow between the "populations" so the structuring is less distinctive. Or are there temporal samples included?
Was 10000 burn-in enough to achieve convergence? (check the alpha plot to see that it has levelled off) I would maybe try increasing it.

Cathy


--
You received this message because you are subscribed to the Google Groups "structure-software" group.
To view this discussion on the web visit https://groups.google.com/d/msg/structure-software/-/Rt7piNZmZUcJ.
To post to this group, send email to structure...@googlegroups.com.
To unsubscribe from this group, send email to structure-softw...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/structure-software?hl=en.

belinda

Jul 30, 2012, 2:37:48 AM
to structure...@googlegroups.com
Dear Cathy,
thanks for your reply. Yes, some of the populations are closer to each other, but the distance between all populations is generally low. In addition, I have done an isolation-by-distance (IBD) analysis, which showed that there is no IBD. Thus, I don't think that this is the reason why there is some structure. Do you really think the results for K=4 look like there is some real structure? I think all populations look quite mixed and there is no structure at all. However, I don't understand why K=4, and not K=1, is the most likely. What do you think about my alpha? Do you think it has already converged? Thanks for your help again.



1.jpg

Cecilia Carrea

Jul 30, 2012, 2:59:19 AM
to structure...@googlegroups.com
Dear Belinda,

According to the manual, alpha should be relatively constant (with a range of 0.2 or less). Don't forget that you can look at the Q values for each individual, and you can also calculate the posterior probabilities of K; page 13 of the manual (I have the one for version 2.2) explains how to compute them (equation 4).
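
The posterior-probability calculation mentioned above can be sketched in a few lines of Python. This is only a rough sketch, assuming a uniform prior on K, so that Pr(K | data) is proportional to exp(ln P(D | K)). The function name `posterior_prob_of_k` and the ln P(D) values are made up for illustration and are not taken from Belinda's runs.

```python
import math

def posterior_prob_of_k(ln_pd_by_k):
    """Approximate Pr(K | data) from estimated ln P(D | K) values.

    Assumes a uniform prior on K (an assumption of this sketch).
    ln_pd_by_k maps each K to its estimated ln P(D | K).
    """
    max_ln = max(ln_pd_by_k.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {k: math.exp(v - max_ln) for k, v in ln_pd_by_k.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Illustrative values only (hypothetical, not from this thread's data).
ln_pd = {1: -4356.0, 2: -3983.0, 3: -3982.0, 4: -3983.0, 5: -4006.0}
probs = posterior_prob_of_k(ln_pd)
for k in sorted(probs):
    print(f"K={k}: {probs[k]:.3f}")
```

Because the values are exponentiated, a difference of only a few log units between two K values already shifts almost all of the posterior mass, so these probabilities should be read as a rough guide rather than a precise result.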

In addition, you can try calculating pairwise Fst (many software packages can do this) or Dst (you can use http://www.ngcrawford.com/django/jost/) as a measure of genetic differentiation between your 'populations'.

I hope this helps, Good luck!
Cecilia.


K.K.Vinod

Jul 30, 2012, 3:04:16 AM
to structure...@googlegroups.com
Here is a recent publication on STRUCTURE HARVESTER. Good Luck!!

Vikram Chhatre

Jul 30, 2012, 9:30:18 AM
to structure...@googlegroups.com
Belinda:

Your alpha looks fine to me. It seems to have converged. Alpha will
converge to a distribution of values, so some fluctuation is normal.
There is a pointer on this matter by Dr. Pritchard which I posted to
the list a while ago. Please take a look at that.
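
One simple way to put a number on "some fluctuation is normal" is to look at the spread of alpha after burn-in. The sketch below assumes you have already copied the alpha values from the run output into a Python list; the function name `alpha_range` and the trace values are hypothetical. The 0.2 threshold is the rule of thumb from the manual cited earlier in this thread.

```python
def alpha_range(alpha_trace, burnin_samples=0):
    """Return max - min of alpha after discarding the first burnin_samples values."""
    kept = alpha_trace[burnin_samples:]
    if not kept:
        raise ValueError("no alpha values left after discarding burn-in")
    return max(kept) - min(kept)

# Hypothetical trace: alpha drops during burn-in, then settles.
trace = [0.90, 0.55, 0.31, 0.12, 0.10, 0.11, 0.09, 0.12, 0.10]
spread = alpha_range(trace, burnin_samples=3)
print(f"alpha range after burn-in: {spread:.2f}")
print("looks settled" if spread <= 0.2 else "still fluctuating; consider a longer burn-in")
```

This is only a coarse check; looking at the plot of alpha over the whole run is still the better guide.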

Delta K is based on the second-order rate of change of the log
probability of the data between successive K values, scaled by its
standard deviation across runs. So it is correctly reflecting the
drastic change you're observing in log probability between K=4 and
K=5. But also note that your ln P(D) values for K=5 and K=6 have a
lot of noise (variance between independent runs). If this is not
caused by non-convergence, you may have to run more MCMC steps during
the data collection phase.
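
To make the Delta K reasoning concrete, here is a rough sketch in Python, assuming the usual definition from Evanno et al. (2005): the absolute second difference of the mean ln P(D) across K, divided by the standard deviation of ln P(D) among replicate runs at that K. The function name `delta_k` and all numbers are made up for illustration; they only mimic the pattern of a small change from K=3 to K=4 followed by a large, noisy drop at K=5.

```python
from statistics import mean, stdev

def delta_k(ln_pd_runs):
    """Compute Delta K for each K that has both neighbours K-1 and K+1.

    ln_pd_runs maps K to a list of ln P(D) values from replicate runs
    (at least two runs per K are needed for a standard deviation).
    """
    means = {k: mean(v) for k, v in ln_pd_runs.items()}
    result = {}
    for k in sorted(ln_pd_runs):
        if k - 1 not in means or k + 1 not in means:
            continue  # Delta K is undefined for the smallest and largest K
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(ln_pd_runs[k])
        result[k] = second_diff / sd if sd > 0 else float("inf")
    return result

# Hypothetical replicate runs (three per K), not data from this thread.
runs = {
    3: [-5010.0, -5012.0, -5008.0],
    4: [-5000.0, -5002.0, -4998.0],
    5: [-5300.0, -5200.0, -5400.0],
    6: [-5350.0, -5500.0, -5250.0],
}
dk = delta_k(runs)
for k in sorted(dk):
    print(f"K={k}: Delta K = {dk[k]:.2f}")
```

With these invented numbers, K=4 gets by far the largest Delta K: the second difference around it is large and its runs agree closely, while the noisy runs at K=5 inflate the denominator there. Note also that Delta K cannot be computed for K=1, so this method can never select K=1.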

Plotting the membership assignment probabilities using CLUMPP and
DISTRUCT will shed more light on the matter.

All the best
V

belinda

Aug 1, 2012, 2:18:54 AM
to structure...@googlegroups.com
Thanks to all of you for the replies. I have already calculated Fst as Cathy suggested, and nearly all of my populations are significantly differentiated, although Fst is relatively small. If I understand you correctly, my alpha looks okay, right? So non-convergence is relatively unlikely. However, I still don't understand why STRUCTURE or STRUCTURE HARVESTER does not find K=1 to be the most likely, because when I look at the STRUCTURE plots I can see no structure. Can you see some structure in them? If yes, which populations (numbers below the graph) do you think STRUCTURE clusters together?


Cathy

Aug 1, 2012, 3:55:02 AM
to structure...@googlegroups.com
Have you looked at the bar plots for K=2 and K=3? You can see how the structuring develops, and it may be clearer or make more sense.
Alternatively (and someone please correct me, as I'm likely wrong!), could K=4 be chosen because of the small difference in L(K) between K=3 and K=4 and the large difference between K=4 and K=5?

Adii_ (Institute of Genetics and Animal Breeding)

Aug 1, 2012, 4:09:00 AM
to structure...@googlegroups.com
One more question: have you used CLUMPP to make the plots more 'organized'? On the graph you attached, all I can see is chaos :))

Have you run the analyses with a larger number of MCMC iterations?

Adii_

Julie Hebert

Aug 6, 2012, 11:35:16 AM
to structure...@googlegroups.com
Belinda,
Your convergence looks okay. How many runs did you do of each K? If you haven't run CLUMPP and distruct yet, I highly recommend it. (If that is what your histogram is from, I apologize for not noticing.) If that is your histogram from CLUMPP, then yes, it looks to me like there is no real structure to your data. I find it unusual, however, that your best delta K would be four. The one time I've seen structure like this was when the best delta K was 1.
Cathy is correct that you should try looking at your histograms from K=2 up to K=6 (I suggest 6 because that is your a priori expectation) to see if any structure emerges from the data. I've actually written about this before, so I'm attaching a simple example figure I came up with to illustrate my point. The scenario is that the best delta K is 3, but the real K appears to be 2 because the third cluster does not actually add structure to the data.
I hope that helps,
Julie
demo of choosing k.jpeg

john

Nov 17, 2012, 12:02:40 PM
to structure...@googlegroups.com
I have determined the number of population clusters, but how can I list the individual membership for each cluster, i.e. partition 151 individuals into K=4 groups?