Hi everyone,
I have a question about inferring the number of genetic clusters present in my dataset using STRUCTURE in R. I sampled individuals across 4 habitats and want to see how the populations from those habitats are related based on ancestry. After filtering, my dataset contains 1930 SNPs. I am using the dartR package, since my samples were sequenced with DArTseq.
I have run my analysis using the following code:
out_struc <- gl.run.structure(
  gl5,
  k.range = 1:5,
  num.k.rep = 10,
  burnin = 1000,   # default
  numreps = 1000,  # default
  exec = "./structure/structure",
  noadmix = FALSE)
Using gl.evanno as shown below, I get the plots of meanLnP(K) and deltaK that I want to use to determine the number of clusters present (see attached graphs):
out_evanno <- gl.evanno(out_struc, plot.out=TRUE)
However, I am not sure whether these graphs are informative or whether I need to increase the burn-in: meanLnP(K) is highest around K = 3 or 4, but deltaK is highest around K = 2, which confuses me.
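For reference, here is a minimal sketch of how deltaK is commonly derived from the replicate LnP(D) values (the Evanno approach, using the mean-based form that many tools apply). The numbers are made up purely for illustration and are not from my runs; the names lnp, mean_lnp, sd_lnp and delta_k are hypothetical. I am not claiming this is exactly what gl.evanno does internally.

```r
# Hypothetical LnP(D) values: one vector per K, one entry per replicate run.
# These are illustrative numbers only, not results from any real dataset.
lnp <- list(
  "1" = c(-30000, -30002, -29998),
  "2" = c(-28000, -28010, -27990),
  "3" = c(-27500, -27550, -27450),
  "4" = c(-27400, -27600, -27200),
  "5" = c(-27500, -27900, -27100)
)

K        <- as.integer(names(lnp))
mean_lnp <- unname(vapply(lnp, mean, numeric(1)))  # L(K): mean over replicates
sd_lnp   <- unname(vapply(lnp, sd, numeric(1)))    # spread across replicates

# deltaK = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)).
# It needs both neighbours, so it is undefined for the smallest and largest K.
n       <- length(K)
inner   <- 2:(n - 1)
delta_k <- rep(NA_real_, n)
delta_k[inner] <- abs(mean_lnp[inner + 1] - 2 * mean_lnp[inner] +
                        mean_lnp[inner - 1]) / sd_lnp[inner]

evanno <- data.frame(K = K, mean_lnp = mean_lnp,
                     sd_lnp = sd_lnp, delta_k = delta_k)
print(evanno)
```

In this toy example the mean LnP(K) is highest at K = 4 while deltaK peaks at K = 2, because deltaK rewards the sharpest change in slope and penalises K values whose replicates disagree. It also shows that with k.range = 1:5 deltaK can only be evaluated for K = 2 to 4, and can never point to K = 1.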
I am currently running the same code with a burnin of 20000 and numreps of 10000, but this takes a very long time on my computer and I am still waiting for the results. Are there any recommendations on how to determine the best number of MCMC burn-in iterations (burnin) and MCMC replicates (numreps) for my dataset? I chose K = 1:5 since I have 4 different habitats (four potentially different populations), but I read that it is best to go to a higher K than 4 in that case?
Thank you very much for your help!
Kind regards,
Chiara

--
You received this message because you are subscribed to a topic in the Google Groups "structure-software" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/structure-software/J4PYKhTpVo0/unsubscribe.
To unsubscribe from this group and all its topics, send an email to structure-softw...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/structure-software/0c560e43-0e13-4f76-9cef-cf232bebe147n%40googlegroups.com.
MCMC completed
Inferred ancestry of individuals:
Label (%Miss) Pop: Inferred clusters
1 19.1 (0) 1 : 0.997 0.001 0.001 0.001 0.001
2 20 (0) 1 : 0.002 0.002 0.994 0.001 0.001
3 21 (0) 1 : 0.002 0.002 0.994 0.001 0.001
4 22.1 (0) 1 : 0.997 0.001 0.001 0.001 0.001
5 23 (0) 1 : 0.997 0.001 0.001 0.001 0.001
6 24 (0) 1 : 0.997 0.001 0.001 0.001 0.001
7 25.1 (0) 3 : 0.997 0.001 0.001 0.001 0.001
8 26 (0) 3 : 0.997 0.001 0.001 0.001 0.001
9 26.1 (0) 3 : 0.997 0.001 0.001 0.001 0.001
10 28 (0) 3 : 0.997 0.001 0.001 0.001 0.001
11 29 (0) 3 : 0.997 0.001 0.001 0.001 0.001
12 30 (0) 2 : 0.001 0.001 0.001 0.001 0.997
13 30.1 (0) 3 : 0.997 0.001 0.001 0.001 0.001
14 31.1 (0) 2 : 0.001 0.001 0.001 0.001 0.997
15 32.1 (0) 2 : 0.001 0.001 0.001 0.001 0.997
16 34 (0) 2 : 0.001 0.001 0.001 0.001 0.997
17 36 (0) 2 : 0.013 0.351 0.012 0.005 0.618
18 36.1 (0) 4 : 0.062 0.719 0.057 0.006 0.156
19 37.1 (0) 4 : 0.144 0.726 0.088 0.004 0.039
20 38.1 (0) 4 : 0.021 0.824 0.139 0.005 0.012
21 39 (0) 4 : 0.001 0.001 0.001 0.001 0.997
22 40.1 (0) 4 : 0.098 0.681 0.033 0.006 0.182
23 41 (0) 4 : 0.001 0.001 0.001 0.001 0.996
Command line arguments: C:\Installers\structure_windows_console\console\structure.exe -m gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_mainparams -e gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_extraparams -i gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_data -o gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_out
Input File: gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_data
Output File: gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_out_f
Run parameters:
23 individuals
1930 loci
5 populations assumed
20000 Burn-in period
100000 Reps
--------------------------------------------
Estimated Ln Prob of Data = -27501.2
Mean value of ln likelihood = -24743.4
Variance of ln likelihood = 5515.7
Mean value of alpha = 1.0000
Allele frequencies uncorrelated
--------------------------------------------
Proportion of membership of each pre-defined
population in each of the 5 clusters
Given    Inferred Clusters                      Number of
 Pop       1      2      3      4      5       Individuals
  1:     0.665  0.001  0.332  0.001  0.001          6
  2:     0.003  0.071  0.003  0.001  0.921          5
  3:     0.997  0.001  0.001  0.001  0.001          6
  4:     0.055  0.492  0.053  0.004  0.397          6
--------------------------------------------
Final results printed to file gtypes.created.on.2023-06-15.11.38.57.structureRun/gtypes.created.on.2023-06-15.11.38.57.structureRun.k5.r10_out_f
Completed: gl.run.structure
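
As a quick sanity check on the summary above, the sketch below recomputes two of the reported numbers by hand. It assumes the usual STRUCTURE conventions: the estimated Ln Prob of Data is the mean ln likelihood minus half its variance, and the per-population membership is the average of the individual membership coefficients in that population. All values are copied from the output above; the variable names are only illustrative.

```r
# Values copied from the run summary above (K = 5, replicate 10).
mean_lnl <- -24743.4   # Mean value of ln likelihood
var_lnl  <- 5515.7     # Variance of ln likelihood

# Estimated Ln Prob of Data is approximated as mean - variance / 2.
est_lnp <- mean_lnl - var_lnl / 2
print(est_lnp)  # should be close to the reported -27501.2

# Individual membership coefficients for the six individuals labelled Pop 1
# (rows: individuals 1 to 6; columns: clusters 1 to 5).
q_pop1 <- rbind(
  c(0.997, 0.001, 0.001, 0.001, 0.001),
  c(0.002, 0.002, 0.994, 0.001, 0.001),
  c(0.002, 0.002, 0.994, 0.001, 0.001),
  c(0.997, 0.001, 0.001, 0.001, 0.001),
  c(0.997, 0.001, 0.001, 0.001, 0.001),
  c(0.997, 0.001, 0.001, 0.001, 0.001)
)

# Averaging over individuals should reproduce the Pop 1 row of the
# "Proportion of membership" table, up to rounding.
pop1_membership <- colMeans(q_pop1)
print(round(pop1_membership, 3))
```

Both checks agree with the printed summary to within rounding, so the Pop 1 split between clusters 1 and 3 comes from two individuals (20 and 21) being assigned almost entirely to cluster 3.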

Chiara