Dear SpaDES users,
As our case studies expand, we are running into growing pains. The question now is:
How do we organize a project whose modules come from multiple sources, while maintaining reproducibility and version control as we expand to new cases?
One solution has been to use "links" locally on a machine. I have not liked this method because it does not transfer between machines: whatever scripts I put together on my Windows machine will not work on my Linux machine.
Another suggestion is to use this folder structure:
MyProject
  - modules
      - module1
      - module2
      - module3
      - module4
      - ...
  - inputs
  - outputs
      - output1
      - output2
      - ...
  - cache
      - cache1
      - cache2
      - ...
When all of the project's modules are self-contained, this works as is: use a single GitHub.com repository for the whole project.
When the project has modules from other people or sources, we keep this structure but use a combination of .gitignore and "nested" git repositories, as in the example below:
MyProject
  - modules
      - module1
      - module2 (.gitignored in MyProject, git clone fork of PredictiveEcology/module2)
      - module3
      - module4 (.gitignored in MyProject, git clone fork of amc/module4)
      - ...
  - inputs
  - outputs
      - output1
      - output2
      - ...
  - cache
      - cache1
      - cache2
      - ...
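
To make the nested layout above concrete, here is a minimal shell sketch of how it might be set up. It is only an illustration under assumptions: the fork URLs are placeholders (the `<your-account>` part is hypothetical and must be replaced), the clone commands are left commented out for that reason, and the .gitignore patterns simply mirror the two externally sourced modules in the example.

```shell
#!/bin/sh
# Sketch only: builds the MyProject skeleton and ignores the nested module clones.
set -eu

project="MyProject"

# Create the top-level folders from the layout above.
mkdir -p "$project/modules" "$project/inputs" "$project/outputs" "$project/cache"

# Make MyProject its own repository (skipped if git is not installed).
if command -v git >/dev/null 2>&1; then
  git init -q "$project"
fi

# Tell the parent repository to ignore the externally sourced modules.
# The leading slash anchors each pattern to the directory holding .gitignore.
cat > "$project/.gitignore" <<'EOF'
/modules/module2/
/modules/module4/
EOF

# Clone each fork into its ignored folder. The URLs are placeholders
# (assumptions, not real locations), so these lines are commented out:
# git clone https://github.com/<your-account>/module2.git "$project/modules/module2"
# git clone https://github.com/<your-account>/module4.git "$project/modules/module4"
```

With this arrangement, changes to module2 or module4 would be committed and pushed from inside their own folders, while MyProject tracks everything else.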
Any thoughts?