Hi Austin,
How much memory does your system have, or how much are you requesting from the scheduler? The populations program attempts to load one chromosome's worth of loci into memory at a time, and gstacks reports that it built ~3 million loci, which is a lot to hold in memory. Your mean coverage is also quite low (~5x), with a large fraction of PCR duplicates removed (91%), so this will reduce the power of the analysis. If you can't add more memory, a modest filter (e.g. --min-mac 3 or -r 0.5) will likely remove many of those loci and may allow the run to complete.
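A filtered run might look like the sketch below. The flags are standard populations options, but the output directory, population map, and thread count are placeholders for your own setup:

```shell
# Re-run populations with modest filters to reduce the number of loci kept:
#   --min-mac 3 : keep only sites with a minor allele count of at least 3
#   -r 0.5      : keep loci present in at least 50% of samples in a population
# Paths, the population map, and -t are placeholders for your own setup.
populations -P ./stacks_output -M ./popmap.tsv -t 8 --min-mac 3 -r 0.5 --vcf
```

Either filter alone may be enough; --min-mac mainly removes rare-variant loci, while -r removes loci with too much missing data.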
Best,
julian
That should be more than enough memory. I would monitor the job with top, or run it under the time program:
/usr/bin/time -v populations …
which will report the job's maximum memory usage when it exits.
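For example (the populations arguments and file paths are placeholders; the "Maximum resident set size" line is the one to look for in GNU time's -v report, which goes to stderr):

```shell
# Wrap the run in GNU time; -v prints a detailed resource report to stderr.
/usr/bin/time -v populations -P ./stacks_output -M ./popmap.tsv -t 8 2> time.log

# After the job finishes (or is killed), check peak memory use (in kbytes):
grep "Maximum resident set size" time.log
```

Note that /usr/bin/time is the standalone GNU time binary, not the shell builtin `time`, which does not support -v.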
--
Stacks website: http://catchenlab.life.illinois.edu/stacks/
Hi Austin,
Very interesting, I would not have expected that. You could try the --batch-size [int] option to populations, which limits the number of loci it processes per batch. By default it loads one chromosome at a time into memory, but you can limit it to, say, 5,000 or 10,000 loci at a time, which will hopefully address the memory issue. This option is not thoroughly tested with reference-aligned data (it is used regularly with de novo data), but it should work.
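Concretely, a batched run might look like this (paths and the population map are again placeholders):

```shell
# Process at most 10,000 loci per batch rather than a whole chromosome,
# capping peak memory at the cost of some extra I/O between batches.
populations -P ./stacks_output -M ./popmap.tsv -t 8 --batch-size 10000
```

If 10,000 still runs out of memory, halve the batch size and retry; the results should be identical regardless of batch size.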