Estimation of K is difficult in mixture models in general, and in
Structure in particular. The approach that Structure takes (i.e.,
computing Pr[K|X]) is theoretically justified, although the method that
Structure uses to compute this is approximate. In my experience this
method works well for simulated data with distinct populations.
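
As a rough illustration of the Pr[K|X] criterion: assuming a uniform
prior on K, Bayes' rule turns the estimated ln Pr(X|K) for each K into
approximate posterior probabilities. The Python sketch below shows the
arithmetic only; the function name and the log-likelihood values are
made up for illustration and do not come from any real data set.

```python
import math

def posterior_over_k(log_lik):
    """Approximate Pr[K|X] from estimated ln Pr(X|K), assuming a uniform prior on K.

    log_lik: dict mapping each K to its estimated ln Pr(X|K).
    """
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    m = max(log_lik.values())
    weights = {k: math.exp(v - m) for k, v in log_lik.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# Hypothetical ln Pr(X|K) estimates for K = 1..4 (illustrative only).
example = {1: -4350.0, 2: -3990.0, 3: -3988.0, 4: -3989.0}
for k, p in sorted(posterior_over_k(example).items()):
    print(k, round(p, 3))
```

Because the log-likelihoods are exponentiated, differences of only a few
log units between values of K translate into large differences in
posterior probability, which is one reason to treat these numbers as a
rough guide.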

A more serious problem may be that in reality K is often not a very well
defined quantity. For example, human population structure across Europe
seems to fit an isolation-by-distance model pretty well, so if we had a
set of samples uniformly spread across Europe there would not be a
natural "correct" value of K. I think that this type of problem is
quite common.

Given these issues, I suggested in the Structure manual that people take
a relatively informal approach to evaluating K (in addition to the
Pr[K|X] criterion). I would consider higher values of K to be
justified if (i) the population assignments make biological sense, for
example if all K clusters include different proportions of individuals
from each sampling location, and (ii) all K clusters include at least
some individuals who are strongly assigned to that cluster. You
indicate below that your data meet both criteria for K = 5-7, which
(although I have not seen your data) seems like an excellent reason to
report those results. In this case it seems likely that the Evanno
criterion is being overly conservative.
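
Criterion (ii) can be checked roughly from the matrix of estimated
membership coefficients (Q). The sketch below assumes Q is available as
a list of rows, one per individual, with one column per cluster. The
function names and the 0.8 cutoff for "strongly assigned" are arbitrary
assumptions for illustration, not Structure defaults.

```python
def strong_member_counts(q_matrix, threshold=0.8):
    """Count, for each cluster, the individuals whose membership is >= threshold.

    q_matrix: list of rows, one per individual; each row holds K membership
    coefficients. The threshold is an arbitrary, user-chosen cutoff.
    """
    k = len(q_matrix[0])
    counts = [0] * k
    for row in q_matrix:
        for j, q in enumerate(row):
            if q >= threshold:
                counts[j] += 1
    return counts

def unsupported_clusters(q_matrix, threshold=0.8):
    """Return 0-based indices of clusters with no strongly assigned individual."""
    counts = strong_member_counts(q_matrix, threshold)
    return [j for j, c in enumerate(counts) if c == 0]

# Toy Q matrix for K = 3 (made-up values, four individuals).
q = [
    [0.95, 0.03, 0.02],
    [0.10, 0.85, 0.05],
    [0.40, 0.35, 0.25],
    [0.90, 0.05, 0.05],
]
print(strong_member_counts(q))
print(unsupported_clusters(q))
```

A cluster flagged by a check like this has no individual strongly
assigned to it at the chosen cutoff, which would argue against that
value of K under criterion (ii).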

Incidentally, the fact that K is often not very well defined (for
example due to isolation by distance or hierarchical structure) is one
major reason why I have generally preferred to plot results for a series
of values of K (see, e.g., Rosenberg et al. 2002), and in your case it
would seem natural to go up to ~7.

Jonathan

> I've read in several papers (e.g. Waples & Gaggiotti, Mol Ecol 2006,