Dear Vaibhav,
Let me try to address your questions in order:
1 What is the significance of replicates (-R) in structure threader, and does it have any dependencies with NUMREPS? How many replicates are absolutely necessary? Will a replicates value of, let's say, 10000 be the same as NUMREPS 10000? Can I skip it (or put -R 1) when I have a fixed K number for every analysis, or should I use --no_tests 1?
The variable NUMREPS in the mainparams file refers to the number of MCMC iterations in each STRUCTURE run. The number of required iterations varies with the dataset, so it’s hard to provide a straightforward answer, but I recommend taking a close look at section 3.3 of STRUCTURE’s manual.
The -R option in Structure_threader refers to how many runs you want to perform for each value of K (you need multiple runs to estimate the “best” K using Evanno’s method).
2 See the picture attached SNAP.png, when using replicates -R = 1 and K's = 16,17 and 18, the structure threader only runs in a single thread for each 16,17,18 K's for 10000 NUMREPS. so in total only using 3 threads. What do I do so that the threader can use all the threads available?
Structure_threader is a wrapper program for STRUCTURE. As such, it cannot make single runs work with multiple cores (only STRUCTURE would be able to do that). What you can do to leverage your multicore system is run more values of K and/or add replicates with -R.
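To make the arithmetic concrete, here is a minimal sketch (not part of Structure_threader) of why only 3 threads were busy. It assumes one independent STRUCTURE run per (K, replicate) pair, as described above; the function names and the figure of 32 available threads are hypothetical.

```python
from itertools import product


def list_jobs(k_values, replicates):
    """Return one (K, replicate) pair per independent STRUCTURE run."""
    return list(product(k_values, range(1, replicates + 1)))


def busy_threads(k_values, replicates, threads):
    """Each run occupies a single thread, so the number of busy threads
    can never exceed the number of runs."""
    return min(threads, len(list_jobs(k_values, replicates)))


# K = 16, 17, 18 with -R 1 on a hypothetical 32-thread machine.
print(busy_threads([16, 17, 18], replicates=1, threads=32))   # 3
# The same K values with -R 10 give 30 runs to spread over the threads.
print(busy_threads([16, 17, 18], replicates=10, threads=32))  # 30
```

NUMREPS does not appear in this calculation: it only changes how long each individual run takes, not how many runs there are.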
3 I understand that each line shown in the Python console (in picture SNAP.png), is running 10000 NUMREPS in the background (Meaning that the structure threader is using structure in the background ). Is it in any way possible to see each NUMREPS in the Python console, as we were able to see via the structure program? ( attached picture structure.png)
Although you can’t look at the NUMREPS information “live” as in STRUCTURE, you can run Structure_threader with the --log 1 option, which will create a log file with that information for each replicate that gets run.
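If you want a rough view of progress while the runs are going, a small script can print the last few lines of each log file. This is only a sketch: the ".stlog" extension is taken from the log file name mentioned in this thread (K137_rep1.stlog), while the function name and the location of the logs are assumptions you would need to adapt to your own output directory.

```python
from pathlib import Path


def tail_logs(output_dir, pattern="*.stlog", n_lines=5):
    """Return the last n_lines of every log file matching pattern.

    output_dir and pattern are assumptions about where the logs live;
    adjust them to match your own results directory.
    """
    tails = {}
    for log_path in sorted(Path(output_dir).glob(pattern)):
        with open(log_path, errors="replace") as handle:
            lines = handle.read().splitlines()
        tails[log_path.name] = lines[-n_lines:]
    return tails


# Example usage (the directory name is hypothetical):
# for name, lines in tail_logs("results").items():
#     print(name)
#     print("\n".join(lines))
```

Re-running it every so often shows how far each replicate has advanced.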
Hope this helps,
Best,
Francisco
--
You received this message because you are subscribed to a topic in the Google Groups "structure-software" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/structure-software/E_zevPCM-a8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to structure-softw...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/structure-software/981cc395-32e0-429e-9a94-99eb0f833825n%40googlegroups.com.
Dear Vaibhav,
Thank you for your detailed answers, they have been very helpful. Along with these current answers, I appreciate you fixing the previous issues, as in the new analysis I have been able to run the program smoothly now. The combination of parameters is working very well.
Happy to know you have made some progress.
With this, I am successfully able to continue my analysis, except some errors I received. I am attaching those errors here. Can you have a look at them?
I had never seen Structure_threader fail only on specific replicates of a repeated analysis…
From the screenshots it seems you were able to perform all runs simultaneously on your machine. K18 replicates are likely to have been the last to finish. Since they errored out and no log was written, I have only one suspicion. Perhaps you have run out of storage space on the machine? If you are 100% sure you did not, can you please share the command you used to run Structure_threader?
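A quick way to rule storage out is to check the free space on the drive that holds your output directory. A minimal sketch using Python's standard library (the function name is mine, and the path "." is a placeholder for your actual output directory):

```python
import shutil


def free_space_gib(path="."):
    """Return the free storage (not RAM) on the drive holding path, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


# Replace "." with the directory Structure_threader writes its results to.
print(f"{free_space_gib('.'):.1f} GiB free")
```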
Also, can you explain more about how only STRUCTURE would be able to make a single run use multiple cores? I tried to use only STRUCTURE and it is using only a single core.
I am not aware of any way to do this. All the wrappers I know (strauto, parallelstructure) do the same as Structure_threader - launch each separate, independent run in a different OS thread. Allowing STRUCTURE to parallelize independent runs would probably require re-writing STRUCTURE to support it.
Best,
Francisco
Hi again Vaibhav,
1) I am 100% sure that I did not run out of storage. I have 128 GB RAM with 1 TB of storage, and in my experience, until now structure_threader has not used any more than 10-12% of RAM ever in my analysis yet. I am attaching the command I have used in the file ST_CMD.
STRUCTURE won’t typically stress RAM usage, so what you are seeing is all within expectations. What I was suggesting was running out of storage space, not RAM. But let’s take a deeper look.
2) With all the fixes, I went ahead and started my full analysis, with 6874 individuals from 137 populations, and 68436 loci, where 24 animals have POPDATA 0 and POPFLAG 0. I received the error "Error in assigning memory (not enough space?)" (see picture Error 4 4; also the attached log file, including commands, in K137_rep1.stlog). I tried to check the memory usage at the time of the structure_threader run, and I saw that it is only taking ~2 GB of space.
Despite STRUCTURE complaining of memory problems, browsing the list suggests that when that message comes up it is related to unusual input file issues. Can you try to replace POPDATA 0 and POPFLAG 0 with a negative integer, and see if that solves it?
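With a file that size, editing by hand is impractical, so here is a rough sketch of how the replacement could be scripted. It makes several assumptions you should check against your own file: the input is whitespace-delimited, each individual's line starts with a label followed by POPDATA and then POPFLAG (columns 0, 1 and 2), and -1 is an acceptable replacement (an arbitrary choice on my part). The function and file names are hypothetical.

```python
def patch_pop_columns(line, popdata_col=1, popflag_col=2, replacement="-1"):
    """Replace POPDATA 0 / POPFLAG 0 with a negative integer on one line.

    The column positions are assumptions about the file layout.
    Lines that are too short (e.g. blank lines) are returned unchanged.
    Note that fields are re-joined with single spaces.
    """
    fields = line.split()
    if len(fields) <= max(popdata_col, popflag_col):
        return line.rstrip("\n")
    if fields[popdata_col] == "0" and fields[popflag_col] == "0":
        fields[popdata_col] = replacement
        fields[popflag_col] = replacement
    return " ".join(fields)


def patch_file(src, dst):
    """Stream src line by line into dst so a large file is never held in memory."""
    with open(src) as infile, open(dst, "w") as outfile:
        for line in infile:
            outfile.write(patch_pop_columns(line) + "\n")


# Example usage (file names are hypothetical); always write to a new file
# so the original input is left untouched:
# patch_file("input.str", "input_patched.str")
```

If your file has a header row of marker names, or each individual spans two lines, make sure those lines are handled the way you expect before running STRUCTURE on the result.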
3) When running structure_threader, the Python console displays the file locations and other parameters. In the console, I found a parameter -D, which I have not provided. Can you explain what is it? (See picture Error 4 5)
The -D parameter is the seed. If you don’t provide one, Structure_threader will automatically create one for you to make sure your work is reproducible.
Just to make sure that I understand the issue, I ran structure standalone, and I received the same error, so I believe that the issue is with structure only (see picture Error 4 2). I also checked that it is not an issue with the Python host I am using; even the Windows command prompt output is the same (see picture Error 4 3).
Yes, it looks that way. Hopefully setting a POPFLAG/POPDATA value < 0 solves it. Otherwise I'm kind of at a loss.
I know about fastStructure, for large datasets, but I am not sure if that is the solution to this issue.
It is a workaround, at best, since fastStructure does not have the same kind of features as STRUCTURE (such as providing POPDATA/POPFLAG information).
Best,
Francisco