simulations

Hilary Parker

Oct 24, 2012, 1:27:23 AM
to project...@googlegroups.com
I have an idea for how to modify ProjectTemplate to make it more simulation-friendly. Please correct me if I have misunderstood the architecture, but I think this could be relatively easy to add!

The problem I ran into in my analysis is that I simulate a large dataset at least 100 times. To avoid memory problems, I run an analysis on each simulated dataset, store only the results (p-values, etc.), then clear out the simulated data and start again. Once I've run all 100, I go on to create graphs of the p-values, etc. As a result, I don't think the simulation code should live in the "src" directory, since that directory holds code that runs on the results of the simulations and since, as I understand it, everything in the src directory should be able to run in parallel.
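To make the loop concrete, here is a minimal sketch in base R. The names `simulate_dataset` and `run_simulations`, the two-group design, and the t-test are illustrative assumptions, not part of ProjectTemplate or of my actual analysis; the point is only that each large dataset is discarded once its p-value has been stored.

```r
# Hypothetical stand-in for an expensive simulation: two groups of n draws each.
simulate_dataset <- function(n, effect = 0.5) {
  data.frame(
    group = rep(c("a", "b"), each = n),
    value = c(rnorm(n), rnorm(n, mean = effect))
  )
}

# Simulate, analyze, keep only the summary, then discard the data.
run_simulations <- function(n_sims = 100, n = 1000, seed = 1) {
  set.seed(seed)
  p_values <- numeric(n_sims)
  for (i in seq_len(n_sims)) {
    sim <- simulate_dataset(n)
    p_values[i] <- t.test(value ~ group, data = sim)$p.value
    rm(sim)  # drop the large dataset before the next iteration
  }
  data.frame(sim = seq_len(n_sims), p_value = p_values)
}

# Only this small data frame needs to be kept (and cached) for plotting.
sim_results <- run_simulations(n_sims = 100, n = 1000)
```

The graphing code then only ever touches `sim_results`, which is small enough to cache and reload cheaply.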

My workaround for the time being is to put my simulation code into the "munge" directory, and this (I think!) will work well. If someone wants to go back and rerun the time-intensive simulations they can do so, but usually loading the project will just load the cached results. However, if there were a "simulation" directory (with a corresponding "simulations: on/off" option in the config file), I think it'd be a bit clearer for both users and readers of the code. This infrastructure would also be good for people who run MCMCs, bootstraps, or other really time-intensive validation steps that don't quite fit into the "src" directory because of the parallel issues.
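The behaviour I have in mind for the on/off switch could look roughly like the sketch below. This is plain base R rather than ProjectTemplate's own caching, and `load_or_simulate`, its `rerun` argument, and the file name are all hypothetical names made up for illustration: with the switch off, results come from the cached file if one exists; with it on, the simulation is rerun and the cache is overwritten.

```r
# Hypothetical helper: reuse cached results unless a rerun is requested.
load_or_simulate <- function(cache_file, simulate_fn, rerun = FALSE) {
  if (!rerun && file.exists(cache_file)) {
    return(readRDS(cache_file))  # cheap path: load stored results
  }
  results <- simulate_fn()       # expensive path: run the simulation
  saveRDS(results, cache_file)   # store results for the next load
  results
}

# Toy usage with a temporary file and a trivial "simulation".
cache_file <- file.path(tempdir(), "sim_results.rds")
toy_simulation <- function() data.frame(sim = 1:3, p_value = runif(3))

first  <- load_or_simulate(cache_file, toy_simulation)
second <- load_or_simulate(cache_file, toy_simulation)  # read from the cache
```

Here `second` comes from the file written by the first call, so the toy simulation is not run again unless `rerun = TRUE` is passed.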

Let me know what you think! I would fork the project myself, but I have no idea how to code the config file stuff.