Difference between PC-stable in R and Tetrad

Mandar Chaudhary

Oct 30, 2014, 10:27:14 AM
to tetrad-us...@googlegroups.com
Hello Dr. Ramsey and Dr. Glymour,

I am trying to build a causal model in Tetrad (alpha=0.05, D=-1) on a binary data set.
When I select the PC-stable algorithm to build the model, it keeps running and never finishes.
However, when I build a causal model on the same data set in R using the following command, it runs successfully.

pc.result <- pc(suffStat.data, indepTest, labels=climIndices, alpha=0.05, skel.method="stable")

where:
suffStat.data <- list(dm=binaryData, adaptDF=FALSE); binaryData is the input data, and adaptDF is a logical specifying whether the degrees of freedom should be lowered by 1 for each zero count.
indepTest <- binCItest; binCItest is a wrapper for the G-square test, which tests conditional independence between two binary variables.
labels=climIndices gives the variable names.
skel.method="stable" builds the skeleton using the stable version of the algorithm.
alpha=0.05 is the significance level.
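
For reference, the call described above can be written out as a self-contained sketch. The poster's data is not available, so this assumes the example binary data set gmB that ships with pcalg as a stand-in for binaryData; the variable names are reused from the post purely for illustration.

```r
# Sketch only: gmB is an example binary data set shipped with pcalg,
# used here as a stand-in for the poster's binaryData.
library(pcalg)

data(gmB)
binaryData  <- gmB$x                  # 0/1 matrix, one column per variable
climIndices <- colnames(binaryData)   # variable names used as node labels

# binCItest reads the data from the list element named "dm".
suffStat.data <- list(dm = binaryData, adaptDF = FALSE)

# PC with the order-independent ("stable") skeleton phase.
pc.result <- pc(suffStat.data, indepTest = binCItest,
                labels = climIndices, alpha = 0.05,
                skel.method = "stable")

pc.result   # prints a short summary of the estimated graph
```

Naming the data element dm matters here because binCItest looks the data up by that name.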

I haven't been able to find the reason for this. I believe I am running the same algorithm in R and Tetrad, yet R returns an output while Tetrad never completes. Am I missing something?

Thank you for your time.
Regards,
Mandar

Scott

Jan 12, 2015, 5:37:31 PM
to tetrad-us...@googlegroups.com
Just curious how large the dataset is. I have used the PC algorithm in both the pcalg R package and in Tetrad, and Tetrad isn't set up to handle very large datasets. If it just doesn't finish, you may be running out of memory, or it may simply be taking a long time.

Also, are you using raw data or a covariance matrix in Tetrad?  Again, if you have a large dataset, you will get much more mileage out of Tetrad using a covariance matrix.

Also, although they are both the PC algorithm, I wouldn't assume they are identically implemented.  I have noticed differences between the way the two packages handle constraints.  I also assume there are some small differences in the orientation phase of the algorithm between the two packages just based on what I've read in both sets of documentation, but I haven't actually compared the code in enough detail to confirm.  Been meaning to look at it more carefully, but haven't had the time.

Mandar Chaudhary

May 27, 2015, 4:00:33 PM
to tetrad-us...@googlegroups.com
Scott,

I have a dataset of dimensions 57 x 211, and Tetrad's run time is not bad on it. I am using the entire dataset, not the covariance matrix. I found one difference between the implementations: when considering variables for conditional independence tests, R follows the order of the columns, whereas Tetrad considers the variables in lexicographical order. After sorting the columns, I ran the PC-stable algorithm in Tetrad and in R (with the solve.confl=TRUE and skel.method="stable" options set) and found that the outputs match almost exactly: they differ by a single edge.
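
The R side of the column-sorting step described above might look roughly like the sketch below. It again assumes pcalg's bundled gmB data as a stand-in for the real data set, and it assumes that a C-locale (byte-wise) sort of the column names is close to the ordering Tetrad uses, which has not been checked against Tetrad's code. Reading the Tetrad graph back into R is left out.

```r
# Sketch only: gmB stands in for the poster's data; its columns are shuffled
# first so that the sorting step has something to do.
library(pcalg)

data(gmB)
binaryData <- gmB$x[, c(3, 1, 5, 2, 4)]

# Assumption: byte-wise (C-locale) ordering of the names approximates the
# lexicographical order used by Tetrad. method = "radix" sorts in C-locale.
sortedData <- binaryData[, order(colnames(binaryData), method = "radix")]

suffStat.sorted <- list(dm = sortedData, adaptDF = FALSE)
pc.sorted <- pc(suffStat.sorted, indepTest = binCItest,
                labels = colnames(sortedData), alpha = 0.05,
                skel.method = "stable", solve.confl = TRUE)

# Adjacency matrix of the result, convenient for an edge-by-edge comparison
# with a graph exported from Tetrad.
amat.sorted <- as(pc.sorted, "amat")
```

Sorting before the run means both tools see the variables in the same order, so any remaining difference in edges is more likely to come from the implementations than from the input.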