Let me check that I understand this correctly. I am using Structure to identify populations in a set of 100 individuals. I performed ten replicate runs at each value of K from 1 to 10, then used the Evanno method in StructureHarvester to find the best K (i.e., the likely number of populations from which my samples were drawn). Now I want to run CLUMPP to infer the membership of each of my samples in the populations inferred by the Evanno method. To do this I need to provide a popfile which, as far as I can tell, summarizes the membership proportions of (presumably) "populations" in each of the best-K clusters. However, after a few hours of searching, I have found nowhere online that explains the content of the popfile.
The StructureHarvester documentation does not help: it suggests that the popfile has the same structure and content as the indfile, except for an extra column whose contents are not explained. This makes no sense to me. Why is one file called an indfile and the other a popfile when they both contain "ind" information? Also, please don't tell me that I don't need to understand the popfile structure because StructureHarvester outputs a popfile; in my case it doesn't. From what I gather, StructureHarvester WILL output a popfile if one provides LOCDATA/POPDATA (it's not clear which) when Structure is invoked. In my situation, however, there is no a priori POPDATA or LOCDATA, and I can imagine that in many other situations there would be no such data either. Why, then, would CLUMPP require a popfile as input when in many (most?) cases there is no a priori population data, so that a popfile would serve no purpose?