Interpreting a common barplot - when there is a fixed q value across all populations, but population structure is evident within the remaining q assignments


Olly Bolly

May 31, 2016, 11:32:17 PM
to structure-software
Hello,

I'm interested in how others interpret this pattern (see image). I think it may reflect that Structure is having a problem dealing with the volume of data and/or that I need to tweak the prior settings. If so, I'd be grateful for advice.

All individuals (no matter what population of origin) have a reasonably high q assignment to a single cluster, often q > 0.5. Typically this takes the form of a fairly even stripe across the bottom of the barplot. However, in the "remaining" q of each individual there is clear evidence for population subdivision. It's as though the expected barplot is just shifted into the upper part of the plot, above the "across the board" stripe/cluster. That is, apart from the across-the-board cluster, individuals have quite distinct q values to different clusters, and this has a geographic basis.
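
To make the pattern described above concrete, here is a minimal Python sketch that looks for a cluster shared by every individual and then rescales the remaining q values. The function names, the thresholds (`min_mean`, `max_sd`) and the q values are all made up for illustration; none of this is part of STRUCTURE itself.

```python
from statistics import mean, pstdev

def find_shared_cluster(q_rows, min_mean=0.5, max_sd=0.1):
    """Return the index of a cluster that every individual shares, or None.

    q_rows is a list of per-individual membership lists (each sums to ~1).
    The thresholds are illustrative assumptions, not STRUCTURE settings.
    """
    k = len(q_rows[0])
    for c in range(k):
        column = [row[c] for row in q_rows]
        # A shared stripe has a high mean and little variation across individuals.
        if mean(column) >= min_mean and pstdev(column) <= max_sd:
            return c
    return None

def rescale_without(q_rows, cluster):
    """Drop one cluster and renormalise the remaining q so each row sums to 1."""
    rescaled = []
    for row in q_rows:
        rest = [q for i, q in enumerate(row) if i != cluster]
        total = sum(rest)
        rescaled.append([q / total for q in rest] if total > 0 else rest)
    return rescaled

# Hypothetical q values for four individuals at K = 3 (made up for illustration).
q = [
    [0.60, 0.36, 0.04],
    [0.58, 0.38, 0.04],
    [0.62, 0.04, 0.34],
    [0.60, 0.02, 0.38],
]
shared = find_shared_cluster(q)
remaining = rescale_without(q, shared)
```

In this toy input the first cluster is the stripe, and the rescaled rows separate the first two individuals from the last two. This is only a way of looking at the barplot differently; it does not explain why the stripe appears.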

I have other evidence (e.g. PCA, DAPC etc) that there is geographic structure evident in the data, so it seems like a problem with the clustering.


The dataset is c. 6000 SNPs, run with a 200,000 burn-in, 500,000 iterations and 20 reps per K, using the admixture model with correlated allele frequencies. The remaining priors are set at default values.

Cheers,

Olly

Structure example.png

Olly Bolly

Jun 1, 2016, 12:16:37 AM
to structure-software
As a follow-up to this post, I don't get this effect if I exclude populations with < 20 individuals. Instead I get the same patterns of geographic structuring that I could previously see above the "across the board" cluster/stripe. This could indicate a problem with small samples, or with large numbers of individuals, though I get the same result whether or not I use the location prior.
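
The filtering step mentioned above could be sketched in Python as follows. The `(individual_id, population_label)` layout, the function name and the sample sheet are assumptions made for illustration, not the format of the actual input file.

```python
from collections import Counter

def filter_small_populations(samples, min_size=20):
    """Keep only individuals whose population has at least min_size members.

    samples is a list of (individual_id, population_label) pairs.
    """
    sizes = Counter(pop for _, pop in samples)
    return [(ind, pop) for ind, pop in samples if sizes[pop] >= min_size]

# Hypothetical sample sheet: site "A" has 25 individuals, site "B" has 5.
samples = [(f"A{i}", "A") for i in range(25)] + [(f"B{i}", "B") for i in range(5)]
kept = filter_small_populations(samples)
```

With the default `min_size=20`, only the 25 individuals from site "A" are retained.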
Structure example2.png

zeamne T

Jun 1, 2016, 9:55:47 PM
to structure...@googlegroups.com
Dear Olly Bolly,

What about the barplots at K=2? If I'm not wrong, the "across the board stripe" often comes up when there is no structure across your samples, or if K is not optimal (usually too high).

Cheers,
yujie

--
You received this message because you are subscribed to the Google Groups "structure-software" group.
To unsubscribe from this group and stop receiving emails from it, send an email to structure-softw...@googlegroups.com.
To post to this group, send email to structure...@googlegroups.com.
Visit this group at https://groups.google.com/group/structure-software.
For more options, visit https://groups.google.com/d/optout.

Olly Bolly

Jun 2, 2016, 8:18:40 PM
to structure-software
Hi Yujie,

Yes, I know what you mean: often when there is no structure evident in the data you see K even-sized stripes across the barplot for all values of K that you run. However, this is a different situation since: a) I have other evidence for geographic structuring (PCA, and a Structure run including only the larger sample sizes); and b) that geographic structuring is reflected in the upper portion of the barplot.

But I also attach barplots for K = 2 and K = 3 for your info.  Not much to see in K = 2, but K = 3 reverts to the problematic pattern.

It's not that easy to troubleshoot this sort of issue because of the long run time required for Structure, so I was hoping somebody else had stumbled on it too ;-/.


Cheers,

Olly


Picturetest.emf

zeamne T

Jun 2, 2016, 9:40:17 PM
to structure...@googlegroups.com
Hi Olly, can you please attach the barplot as a JPEG, PDF or another more common format?


Olly Bolly

Jun 6, 2016, 10:08:47 PM
to structure-software
Hi,

Sure - sorry about that.  Here is a plot of K between 2 and 4 including location priors. The 3 groups that emerge in the upper portion of the barplot at K = 4 are evident in a PCoA of the data as well as a STRUCTURE analysis that is identical to this except that I only include sample sites with > 20 individuals.

Cheers,

Olly


Location Prior K2-4.pdf

Vikram Chhatre

Jun 6, 2016, 11:17:02 PM
to structure-software
OB:

What do your lnLK and deltaK suggest?  Can you post those plots?  Also, can you post your DAPC plot?

V


Olly Bolly

Jun 7, 2016, 9:32:34 PM
to structure-software
Hi Vikram,

I've posted the Delta K (location prior) plot and also a barplot of the analysis that included only sampling sites with > 20 individuals (also with the location prior). These are what I have at hand right now. I will find the other plots.

As you can see DeltaK is maximised at K = 2, though it's not a particularly high value. It also doesn't tell the full story.
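
For readers unfamiliar with the statistic, here is a minimal Python sketch of one common way of computing Evanno's Delta K from replicate Ln P(D) values. It assumes the second difference is taken on the replicate means and divided by the standard deviation at that K. The function name and the likelihood values are made up for illustration and are not from this dataset.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Compute Delta K from replicate log-likelihoods for consecutive K values.

    lnp_by_k maps K -> list of Ln P(D) values, one per replicate run.
    Delta K(K) = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)),
    so it is undefined for the smallest and largest K tested.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_by_k[k])
            # Guard against identical replicates, which give a zero sd.
            result[k] = second_diff / sd if sd > 0 else float("inf")
    return result

# Made-up Ln P(D) values for three replicates per K (illustration only).
lnp = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-885.0, -895.0, -875.0],
}
dk = delta_k(lnp)
best_k = max(dk, key=dk.get)
```

Because the statistic needs K-1 and K+1, it can never select the smallest or largest K tested, and a small standard deviation across replicates can inflate it. That is one reason a Delta K peak may not tell the full story.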

The barplot also illustrates that there is credible population structure in the data - though also likely a region of admixture and isolation by distance. 
These samples follow a latitudinal gradient from north-south (left to right).

Thanks for your assistance with this. These runs take some time, so it's not straightforward to run a sensitivity analysis on the STRUCTURE settings, and I was hoping that others might have had similar experiences.
Cheers,
Olly
popGT20.pdf
Best_K_By_Evanno-DeltaKByKGraph (1).png

Olly Bolly

Jun 19, 2016, 9:38:27 PM
to structure-software
Hello again,

As a minor update, I have re-run this analysis with the minor modification of adding about 20 additional individuals from a single population.  Previously this dataset (without the 20 additional) had provided a regular barplot (shown above) with obviously discrete clusters that had a geographic basis. 

However, this time it has recreated the original unusual pattern, with all individuals assigned ~0.75 to a single cluster, while their remaining ~0.25 assignment shows the obvious, geographically sensible clustering. I attach the barplot. The best Delta K is at K = 3, but I suspect that the odd clustering is playing havoc with that analysis.

I'd previously obtained this result from a dataset that included many small populations, and I'd thought that perhaps that contributed. Now it seems more likely that it's the sheer size of the dataset (about 6000 SNPs, around 900 individuals).

This situation seems like the opposite of the so-called "ghost clusters", where clusters have zero individuals assigned to them. Here we have a cluster that all individuals are assigned to, but which does not convey the geographic information that clearly exists in the data. It's curious.
All Lcarpo Best_K_By_Evanno-DeltaKByKGraph_Pooled Nt K1-8.png
Lcarpo_pastels.jpg

Olly Bolly

Oct 19, 2016, 5:59:05 AM
to structure-software
Hello, 

I'm following up in case you've had any ideas about what is driving this barplot pattern. I also attach a DAPC result that shows quite clearly that there is some structure in the data, as well as an isolation-by-distance signal. The sites are coloured according to a rainbow scheme running north to south.

I have revised my thoughts that this relates to sample sizes - that was an error on my part. The same result arises no matter the number of samples per site.  Interestingly, if I repeat this analysis based on 66 "outlier" loci identified through outlier analysis and that presumably represent markers under selection, I get an identical pattern but without the large blue cluster across all sites - i.e. it is a more conventional barplot that reveals spatial structure.

Cheers,

Olly


Rainbow PCA from DAPC.png