
I have attached the files I generated with fastStructure and the plot produced by distruct2.2.py. I found one minor issue: for some weaker signals, the color may not be consistent with the other populations. Take PEL as an example. From the data, its second-strongest signal is from EUR (Europe: TSI, IBS, GBR, CEU, FIN), which is shown in blue. But in the plot, the second-strongest signal appears to come from EAS (East Asian: CDX, KHV, CHS, CHB, JPT). Please see below for the details. Overall, for PEL we found that 83% of the signal is from one cluster (shown in purple), 10% is from EUR, and 2% is from EAS (shown in green). We wonder whether this comes from a bug in the code, or whether distruct2.2.py simply does not keep colors consistent for weak population signals. Thanks.
## Take FIN as an example; we know cluster 1 ("V1") is shown in blue.
# AG_data holds the contents of "cluster.5.meanQ"
FIN_AG_data=AG_data[pops=="FIN",]
> apply(FIN_AG_data,2,sum) # this gets the sum of signals for each cluster
V1 V2 V3 V4 V5
90.085479 0.043275 1.810187 2.281317 4.779762
# Take JPT as an example; we know cluster 5 (i.e., "V5") is shown in green.
JPT_AG_data=AG_data[pops=="JPT",]
> apply(JPT_AG_data,2,sum)
V1 V2 V3 V4 V5
0.302159 0.049349 0.917564 0.637524 102.093420
# For PEL, the sum of signals for cluster 1 ("V1") is the second strongest, so it should be shown in blue, but in the plot it is shown in green.
PEL_AG_data=AG_data[pops=="PEL",]
> apply(PEL_AG_data,2,sum)
V1 V2 V3 V4 V5
8.764965 1.803427 0.525974 71.488103 2.417567
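
The ranking described above can be double-checked outside R. Below is a minimal Python sketch that only uses the column sums pasted above; the names `cluster_sums` and `rank_clusters` are made up for this illustration and are not part of fastStructure or distruct2.2.py. It sorts the clusters of each population from strongest to weakest and prints each cluster's share of the total. It says nothing about how distruct2.2.py assigns colors.

```python
# Column sums per population, copied from the R output above
# (order of values: V1, V2, V3, V4, V5).
cluster_sums = {
    "FIN": [90.085479, 0.043275, 1.810187, 2.281317, 4.779762],
    "JPT": [0.302159, 0.049349, 0.917564, 0.637524, 102.093420],
    "PEL": [8.764965, 1.803427, 0.525974, 71.488103, 2.417567],
}


def rank_clusters(sums):
    """Return (label, fraction) pairs sorted from strongest to weakest.

    Hypothetical helper: labels follow the V1..V5 column names used above,
    and each fraction is the cluster's share of the population total.
    """
    total = sum(sums)
    labelled = [(f"V{i + 1}", value / total) for i, value in enumerate(sums)]
    return sorted(labelled, key=lambda pair: pair[1], reverse=True)


rankings = {pop: rank_clusters(sums) for pop, sums in cluster_sums.items()}

for pop, ranked in rankings.items():
    summary = ", ".join(f"{label}={fraction:.1%}" for label, fraction in ranked)
    print(f"{pop}: {summary}")
```

For PEL this ranks V4 first, V1 second, and V5 third, which matches the sums above: the second-strongest cluster is the one that dominates FIN, not the one that dominates JPT.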