--
You received this message because you are subscribed to the Google Groups "Antiquist" group.
---
B. Lee Drake
Department of Anthropology
University of New Mexico
(505) 510.1518
b.lee...@gmail.com
# Compatibility: on Windows, make quartz() open the native windows() graphics device
if (.Platform$OS.type == "windows") {
  quartz <- function(...) windows(...)
}
--
Hi Ben,
Thanks for the extensive comments! The code was uploaded just for reference and partial reproducibility (for the ABC analysis you need a cluster), so it is not currently on GitHub or in any other version-controlled repository (I should do this eventually). As for your comments:
* I'm not keen on rm(list=ls()) at the top of a script; it seems a bit unkind to the user. I think literate programming offers a better approach, since it creates a new environment when the code is executed (i.e. Sweave or knitr), so it's free of contaminating data objects but doesn't require the user to remove everything.
* Supplementary materials are not a very convenient way to share code and data... the journal renames many of your files so the internal references in the code need to be edited, your R script files are supplied as zip files which adds a few extra steps to get to the code, and your CSV files (as they are referred to in the code) are Excel xls/x files in the supplementary materials. I see you have a comment in the code that says 'the excel worksheets should be individually exported as .csv files', but this is a bit tedious for the poor user and an unnecessary obstacle to using the data. Better would be to have the CSV files as the actual files in the supplementary materials, since Excel files are a poor choice for data longevity and portability. Those are all pretty small details, but they add up to an obstacle to sharing the really important parts of the paper, and burden the reader with a lot of tedium to access the code and data. There are some nice guidelines on sharing code and data here: http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003542
* My suggestion to lower the burden in working with code and data that accompanies a paper is to put the code and data in an online repository that gives a persistent URL, and cite that URL in the paper. As I noted to Lee, many researchers are putting code and data on github, then archiving a specific commit of their github repository in a repository like figshare.com or zenodo.org and getting a DOI from there to cite in the paper. Those repositories are not going to mess around with file names or formats in the same way that the journal does, and it's easy for you to make corrections after publication, but still have a reference to the version current at the time of publication. You could have the usual supplementary material also, but the online repositories I'm referring to are also free to access, so people can get to them even if they don't subscribe to the journal (though I see your JAS papers are OA, so that's not a problem for those specific papers).
* You don't have any kind of license on your code, which makes it difficult for others to know how they can reuse it. I see that Lee used GPL, but I prefer MIT (http://opensource.org/licenses/MIT) because it's not viral like GPL. Most people think of licenses as something that's only relevant where there are commercial applications, but I think they're useful ways to formally communicate to your users about your intentions for reuse (do you want attribution? are you ok with commercial reuse, or do you want to limit it to non-commercial use?). Licenses are also handy to absolve you of any responsibility for how others use your code (e.g. they misuse your code, publish, then it turns out they made a mistake and drama ensues; you can point to the license and say 'these are the conditions that you accept when you use my code, no warranty, no liability, etc.'). So I think all publicly available code should have a license, ideally one that is widely used (rather than one you make up yourself!).
* For your simulation code (which I haven't studied carefully, so I might be on the wrong track here), include a random seed value. If I understand this correctly, given the same initial seed, the analysis draws the same sequence of random numbers on every run, thus giving identical results each time.
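A minimal sketch of the seed suggestion, for illustration only: the helper name run_sim and the seed value 42 are hypothetical and not taken from the paper's code. Calling set.seed() before the random draws makes a run repeatable:

```r
# Hypothetical stand-in for one simulation step; not the paper's code.
run_sim <- function(seed, n = 5) {
  set.seed(seed)   # fix the state of R's random number generator
  runif(n)         # these draws now depend only on the seed
}

first  <- run_sim(42)
second <- run_sim(42)
identical(first, second)   # TRUE: same seed, same draws
```

Results can still differ between R versions (the default behaviour of sample() changed in R 3.6.0, for example), so it probably helps to record sessionInfo() alongside the seed.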
I've now finished running the code from the abc paper and got some errors at step 5. First some warnings, e.g.:

> UBpost1<-abc(target=observed,sumstat=simresUB,param=simparamUB,tol=0.01,method="rejection")
Warning message:
In abc(target = observed, sumstat = simresUB, param = simparamUB, :
  No summary statistics names are given, using S1, S2, ...

then an error:

> hist(UBpost1$unadj.values[,1])
Error in UBpost1$unadj.values[, 1] : incorrect number of dimensions

Any thoughts about that?
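For what it's worth, base R raises that exact error whenever a plain vector is indexed with two subscripts. The sketch below reproduces it with a made-up vector. Whether unadj.values really is a vector here is an assumption, not something the output above confirms (it might happen if only one parameter was simulated); running str(UBpost1$unadj.values) would settle it.

```r
# Made-up values standing in for UBpost1$unadj.values (ASSUMED to be a vector)
vals <- c(0.12, 0.48, 0.33)

# Two-subscript indexing on a vector fails; capture the message instead of stopping
msg <- tryCatch(vals[, 1], error = function(e) conditionMessage(e))
print(msg)

# Coercing to a one-column matrix makes [, 1] indexing valid again
m <- as.matrix(vals)
dim(m)                 # 3 rows, 1 column
first_col <- m[, 1]    # hist(first_col) would now run
```

If the object turns out to be a matrix already, the cause lies elsewhere and this workaround does not apply.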