model choice

547 views
Skip to first unread message

Li

unread,
Dec 9, 2014, 12:51:15 PM12/9/14
to pop...@googlegroups.com
Hi everyone,

I performed the model_choice.r. But the R reported the errors as below:

> res.topol<-calmod(target,abc.rej[,7],abc.rej[,12:20],1,rej=F)
error: vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2,  : 
  80 parameters but only 72 observations

My analysis includes 11 parameters and 9 summary statistics. How can I define the "abc.rej[,12:20]"?
Thanks!
Li
test.rej
test.trg

Joao Sollari Lopes

unread,
Dec 10, 2014, 6:28:06 AM12/10/14
to pop...@googlegroups.com
Hi Li,

The error you mention means that the regression model you are using has 80 parameters, but you only have 72 observations (i.e. data). You will need to use much more simulations to perform model-choice.
The columns of the summary statistics are well defined (abc.rej[,12:20]), but the column of the parameter to estimate is not correct (abc.rej[,7]). You should choose column 1, this is the column where the marker of the model is stored, and it is expect to have more than one value. Your .rej file only identifies one model (value = 1).
Furthermore, I plotted an histogram of your summary statistics and superimposed the summary statistics of your target, and they are not even close the histogram distribution. The regression performed using model_choice.r script only works (and should only be used) if the simulated data is somewhat similar to the target.

Hope this helps,
Joao

Li

unread,
Dec 10, 2014, 1:41:58 PM12/10/14
to pop...@googlegroups.com
Hi Joao,

Thank you very much for your help. Your suggestions are very helpful. I performed migration and no migration models (please see attachments) and the R reported new errors. 

Frankly, I do not know what is the meaning of  "abc.rej[,1]==1". My understanding is the script will input the two different models. So, should I combine the two .rej files into a single one? Then, I named the topology as 1 and 2?

top1<-length(which(abc.rej[,1]==1))/length(abc.rej[,1])

top2<-length(which(abc.rej[,1]==2))/length(abc.rej[,1])

Thanks a lot! I am look forward to receiving your responses!
Best,
Li
popabc test.zip

Joao Sollari Lopes

unread,
Dec 11, 2014, 6:47:33 AM12/11/14
to pop...@googlegroups.com
Hi Li,

Here comes a long answer.

1. First, I think you need to learn a bit of R to understand what you're doing. I assure you that it will be very helpful in the long run.

2. You should have used option 3 for the topologies so that you could have chosen a different marker for the two different topologies. Then you should join the .dat files (it can be with popabc tools) and perform  the rejection step on this joint file.

3. There is a problem with "calmod.r", a file that is used by "model_choice.r". Similar to the problem with "loc2plot_d.r" (group thread), you should change "predict.vglm" to "predict" since the previous format is no longer used.

4. Here is what I did.
replaced the topology marker from 1 to 2 (in the terminal):
sed 's/^1/2/g' migration.dat > migration.mod.dat

joint both files (in the terminal):
cat nomigration.dat migration.mod.dat > jointdata.dat

performed rejection-step (tol = 5000):
rejection.exe jointdata.dat target.trg output 11 9 0.025

run a modified version of "model_choice.r" and "calmod.r" to perform model choice.

5. I am sending you the modified files, the results and the plots of the results.

Hope this helps. Here is a useful links for future analysis:
http://cran.r-project.org/
Check out the Manuals in the Documentation section.

Best,
Joao
model_choice.mod.r
calmod.mod.r
regression.pdf
rejection.pdf
modelprob_reg.txt
modelprob_rej.txt

Li

unread,
Dec 11, 2014, 4:29:47 PM12/11/14
to pop...@googlegroups.com
Dear Joao,

Thank you so much for all you have done. I performed the analyses using the old data according to your suggestion. It works well. Thanks!

Then, I reanalyzed the data as you suggested:
1) use option 3 for the topologies, then I got "1" and "2" as the markers;
2) simulate the two models separately and got "migration.dat" and "no migration.dat";
3) I combined the two ".data" files and then performed the rejection.exe with the command: rejection.exe jointdata.dat target.trg output 12 9 0.025

However, the R reported errors when I performed the "model_choice.r". I also tried the "model_choice.mod.r" that you have modified. It still does not work. I am sorry for bothering you again. 

Best,
Li

> #import Mark Beaumont's scripts
> source("calmod.mod.r") #commented JSL 2014-12-11
>
> #import the .rej file
> abc.rej <- data.matrix(read.table("test.rej"))
> #import the .trg files
> target <- data.matrix(read.table("test.trg"))
> #Bar plot for model-choice (rejection)
> top1<-length(which(abc.rej[,2]==1))/length(abc.rej[,2]) #modified JSL 2014-12-11
> top2<-length(which(abc.rej[,2]==2))/length(abc.rej[,2]) #modified JSL 2014-12-11
> print(c(top1,top2))
[1] 0.556 0.444
> write(c(top1,top2),"modelprob_rej.txt")
> barplot(c(top1,top2),names=c("Model1","Model2"))
> abline(h=1.0/2,col="red")
> #Bar plot for model-choice (regression)
> res.topol<-calmod(target,abc.rej[,2],abc.rej[,13:21],1,rej=F) #modified JSL 2014-12-11
error: vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2,  : 
  vglm only handles full-rank models (currently)
 
> print(res.topol$x2)
error: print(res.topol$x2) : 
can not find 'res.topol'
> write(res.topol$x2,"modelprob_reg.txt")
error: cat(x, file = file, sep = c(rep.int(sep, ncolumns - 1), "\n"),  : 
  can not find 'res.topol'
> barplot(res.topol$x2,names=c("Model1","Model2"))
error: barplot(res.topol$x2, names = c("Model1", "Model2")) : 
  can not find 'res.topol'
> abline(h=1.0/2,col="red")
> print("model-choice done.")
[1] "model-choice done."
calmod.mod.r
calmod.r
migration.dat
model_choice.mod.r
model_choice.r
nomigration.dat
test.rej
test.trg
Reply all
Reply to author
Forward
0 new messages