Using the functions sim(), getPower() and findPower()


Hyeseung Koh

May 16, 2022, 10:33:38 PM
to lavaan

Hello, I have been reading earlier posts and replies, which helped me adjust my code. I have been running sim() on simulated data so that I can use the functions getPower() and findPower().

While running sim(), I received the following error message:

Error in lav_data_full(data = data, group = group, cluster = cluster,  :
  lavaan ERROR: grouping variable ‘re_edu’ not found;
  variable names found in data frame are:
  bi pin pdn dnc dn_n group
Error in vnames(lav, type = "ov.ord", group = group.values[g]) :
  lavaan ERROR: group column does not contain value `NA'
In addition: Warning message:
In (function (model = NULL, model.type = "sem", meanstructure = FALSE,  :
  lavaan WARNING: some regression coefficients are unspecified and will be set to zero

It is a multi-group path analysis with 1 continuous dependent variable, 3 continuous mediators and 1 nominal independent variable. The grouping variable is "re_edu", which has two categories in the dataset.

Here is the code I was using. Please let me know how to correct it so that it runs.

library(readxl)
nfscm <- read_excel("~/Desktop/numeric.xls")
View(nfscm)
nfscm <- data.frame(nfscm)

#population model

pop.model.numeric.low <- "

# regression part

dv ~ .482*med3 + .437*med2 + -.417*med1 + .085*iv

med3 ~ .443*med2

med2 ~ .778*med1

med1 ~ .606*iv

# variance part 

dv ~~ 0.622*dv

med3 ~~ 0.804*med3

med2 ~~ 0.395*med2

med1 ~~ 0.633*med1"


pop.model.numeric.high <- "

# regression part

dv ~ .04*med3 + .693*med2 + -.346*med1 + .278*iv

med3 ~ .523*med2

med2 ~ .701*med1

med1 ~ .403*iv

# variance part 

dv ~~ 0.603*dv

med3 ~~ 0.726*med3

med2 ~~ 0.509*med2

med1 ~~ 0.837*med1"

 

##Simulate population models for each group and combine into dataframe

pop.data.low <- simulateData(pop.model.numeric.low, sample.nobs = 100000 )

pop.data.high <- simulateData(pop.model.numeric.high, sample.nobs = 100000)

pop.data <- rdvnd(pop.data.low, pop.data.high)

pop.data <- data.frame(pop.data, group = rep(c(0,1), each = 100000))

 

#analysismodel

model.numeric <- "

group:0 #group low

#regression part

dv ~ med3 + med2 + med1 + iv

med3 ~ med2

med2 ~ med1

med1 ~ iv

# variance part 

dv ~~ dv

med3 ~~ med3

med2 ~~ med2

med1 ~~ med1

 

group:1 #group high

#regression part

dv ~ med3 + med2 + med1 + iv

med3 ~ med2

med2 ~ med1

med1 ~ iv

# variance part 

dv ~~ dv

med3 ~~ med3

med2 ~~ med2

med1 ~~ med1"

Output <- sim(2, model = model.numeric, n = list(48, 48), rawData = pop.data, data = nfscm, group = "re_edu", std.lv = TRUE, lavaanfun = "lavaan") #testing with small replications 

 Output <- sim(10000, model = model.numeric, n = list(48, 48), rawData = pop.data, data = nfscm, group = "re_edu", std.lv = TRUE, lavaanfun = "lavaan") 

summary(Output)


power.edu1 <- getPower(Output, alpha = 0.05)

power.edu2 <- getPower(Output, nVal = c(48, 59))

 

summary(power.edu1)

summary(power.edu2)

 

power.edu1 <- data.frame(power.edu1)

power.edu2 <- data.frame(power.edu2)

             

findPower(power.edu1, "N", 0.80)

findPower(power.edu2, "N", 0.80)

Hyeseung Koh

May 17, 2022, 12:20:29 AM
to lavaan
I need to correct one thing in the error message so that it matches the code. Please note the list of variable names, which was underlined in my original message.

Error in lav_data_full(data = data, group = group, cluster = cluster,  : 
  lavaan ERROR: grouping variable ‘re_edu’ not found;
  variable names found in data frame are:
  dv med3 med2 med1 iv group

Error in vnames(lav, type = "ov.ord", group = group.values[g]) : 
  lavaan ERROR: group column does not contain value `NA'
In addition: Warning message:
In (function (model = NULL, model.type = "sem", meanstructure = FALSE,  :
  lavaan WARNING: some regression coefficients are unspecified and will be set to zero


On Monday, May 16, 2022 at 9:33:38 PM UTC-5, Hyeseung Koh wrote:

balal izanloo

May 17, 2022, 2:49:41 AM
to lav...@googlegroups.com
Hi
The column names in your data should match the names you use in your commands. Maybe you need to select some columns from the original data. Your problem is related to basic R commands.
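To illustrate the point about column names, here is a minimal base-R sketch. The small data frame is a made-up stand-in whose column names are taken from the error message above, and `has.group.var` is a hypothetical helper written only for this example; it is not part of lavaan or simsem.

```r
# Made-up stand-in for the data passed to sim(); only the column names matter here.
pop.data <- data.frame(dv = rnorm(4), med3 = rnorm(4), med2 = rnorm(4),
                       med1 = rnorm(4), iv = rnorm(4),
                       group = rep(c(0, 1), each = 2))

# Hypothetical helper: is the candidate grouping variable a column of the data?
has.group.var <- function(data, group.var) group.var %in% names(data)

has.group.var(pop.data, "re_edu")  # FALSE: the name lavaan could not find
has.group.var(pop.data, "group")   # TRUE: the column that actually exists
```

If the check fails, one option is to rename the column, for example with names(pop.data)[names(pop.data) == "group"] <- "re_edu"; another is to pass the existing column name as the group argument.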

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/e0104f4b-6895-406d-96ad-27c2815bb278n%40googlegroups.com.

Hyeseung Koh

May 17, 2022, 2:56:01 AM
to lav...@googlegroups.com
Hello. I used the exact name of each column in the dataset and the data.frame when writing the code. I do not clearly understand whether, to use lavaan and/or simsem, the dataset or data.frame must contain only the columns that the R code uses for the analysis. If so, I can remove every column except the ones needed. Would that be the way to correct the column selection for the analysis?

balal izanloo

May 17, 2022, 3:00:39 AM
to lav...@googlegroups.com
Could you send me your data along with your commands?

Hyeseung Koh

May 17, 2022, 3:03:01 AM
to lav...@googlegroups.com
I can post a screenshot of the variable names in my original data and send the code. Would that work for you?

balal izanloo

May 17, 2022, 3:06:35 AM
to lav...@googlegroups.com
It is better to send me the first 10 rows of your data (not all of it) along with your commands.

balal izanloo

May 17, 2022, 3:07:54 AM
to lav...@googlegroups.com
It is better to send me the first 10 rows of your data (not all of it) along with your commands.

Hyeseung Koh

May 17, 2022, 3:16:29 AM
to lav...@googlegroups.com
I can generate random numbers to make a dataset for this case. I removed all variables from the file except the ones that are needed.

2022-05-17 numeric.xlsx

Hyeseung Koh

May 17, 2022, 3:39:35 AM
to lav...@googlegroups.com
Earlier, when I copied and pasted my R code from RStudio, the line "pop.data <- rbind(pop.data.low, pop.data.high)" was changed to "pop.data <- rdvnd(pop.data.low, pop.data.high)".

Please use the original, correct line instead: pop.data <- rbind(pop.data.low, pop.data.high)
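As a minimal sketch of what the corrected line does, the example below stacks two small data frames with rbind() and then adds a group label. The two data frames hold made-up values and only stand in for the simulateData() output; the real ones would have 100000 rows each and all five model variables.

```r
# Made-up stand-ins for the two simulated populations (values are arbitrary).
pop.data.low  <- data.frame(dv = c(0.1, -0.4), iv = c(1.2, 0.3))
pop.data.high <- data.frame(dv = c(0.8, 0.5), iv = c(-0.7, 0.9))

# Stack the rows of both groups; the column names must match.
pop.data <- rbind(pop.data.low, pop.data.high)

# Label each row with its group: 0 for the low group, 1 for the high group.
pop.data$group <- rep(c(0, 1), times = c(nrow(pop.data.low), nrow(pop.data.high)))

table(pop.data$group)
```

Using nrow() for the repetition counts keeps the labels aligned with the rows even if the two groups are simulated with different sizes.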



balal izanloo

May 17, 2022, 4:11:14 AM
to lav...@googlegroups.com
The first problem is related to the part of your command highlighted in green. First, which package did you use for the sim function?

Output <- sim(2, model = model.numeric, n = list(48, 48), rawData = pop.data, data = nfscm, group = "re_edu", std.lv = TRUE, lavaanfun = "lavaan") #testing with small replications

Hyeseung Koh

May 17, 2022, 4:14:35 AM
to lav...@googlegroups.com
library(lavaan)
library(simsem)

were loaded in RStudio.

Hyeseung Koh

May 17, 2022, 4:23:35 AM
to lav...@googlegroups.com
Do you mean that rawData should be replaced with generate? Please explain how the two arguments work differently in the code.

The data = nfscm argument could be removed, as I have seen in several code examples.

What happened with group = "re_edu"?

Hyeseung Koh

May 17, 2022, 5:37:09 AM
to lavaan
To be clear, I sent my data along with my code for this case. 

On Tuesday, May 17, 2022 at 2:00:39 AM UTC-5, b.ez...@gmail.com wrote:

Hyeseung Koh

May 20, 2022, 4:49:41 AM
to lavaan

In this thread, I received the corrected code below, and I got an error message for each group.

When I ran the findPower() calls, R showed the following error messages:

> findPower(power.edu1, "N", 0.80)

Error in powerTable[, ivCol] : incorrect number of dimensions

> findPower(power.edu2, "N", 0.80)

Error in powerTable[, ivCol] : incorrect number of dimensions
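For what it is worth, "incorrect number of dimensions" is the message base R raises when an object without dimensions, such as a plain named vector, is indexed with two subscripts. The sketch below reproduces it with made-up power values. Whether getPower() actually returned such a vector in this case is an assumption that can be checked with str(power.edu1).

```r
# Made-up power values, labelled only for illustration.
power.vec <- c("dv~med3" = 0.91, "dv~med2" = 0.85)
is.null(dim(power.vec))  # TRUE: a named vector has no rows or columns

# Two-subscript indexing on a vector fails; capture the message instead of stopping.
msg <- tryCatch(power.vec[, 1], error = function(e) conditionMessage(e))
msg

# A table with a sample-size column, by contrast, can be indexed by column name.
power.tab <- cbind(N = c(100, 200), "dv~med3" = c(0.62, 0.91))
power.tab[, "N"]
```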

When I converted power.edu1 to a data frame with power.edu1 <- data.frame(power.edu1), as I had seen suggested in this Google group, and then ran findPower(power.edu1, "N", 0.80), R showed:

> power.edu1 <- data.frame(power.edu1)

> findPower(power.edu1, "N", 0.80)

Error in findPower(power.edu1, "N", 0.8) : 

  Cannot find the specified target column

> power.edu2 <- data.frame(power.edu2)

> findPower(power.edu2, "N", 0.80)

Error in findPower(power.edu2, "N", 0.8) : 

  Cannot find the specified target column

findPower works when I use Output1 <- sim(nRep = NULL, model = pop.model, n=100:500, generate = pop.model.numeric.low, std.lv=TRUE, lavaanfun = "lavaan",seed="free number"), but it does not work when I change nRep = NULL to nRep = "free number" and n = 100:500 to n = "free number" (where "free number" stands for any number I choose). To change the number of replications, I need to change the format of n.

Is Output11 <- sim(nRep = "free number", model = pop.model, n="free number", generate = pop.model.numeric.low, std.lv=TRUE, lavaanfun = "lavaan",seed="free number") valid code for this analysis in R?

If it is valid, would you let me know how to make findPower work with Output11?
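The contrast between the two calls can be sketched as follows, for a single group only. This is a hedged illustration, not a verified answer: it assumes that getPower() returns a table with an N column when n varies across replications and a plain vector when n is fixed. It also swaps lavaanfun = "sem" in for the thread's "lavaan" so that residual variances are added automatically, and the numeric seed, the object names ending in .vary and .fixed, and the choice of 20 replications at n = 150 are arbitrary choices for the example.

```r
library(lavaan)
library(simsem)

# Population model: coefficients copied from the low group in this thread.
pop.model.numeric.low <- "
dv ~ .482*med3 + .437*med2 + -.417*med1 + .085*iv
med3 ~ .443*med2
med2 ~ .778*med1
med1 ~ .606*iv
dv ~~ 0.622*dv
med3 ~~ 0.804*med3
med2 ~~ 0.395*med2
med1 ~~ 0.633*med1
"

# Analysis model: same paths, all parameters free (variances added by sem()).
pop.model <- "
dv ~ med3 + med2 + med1 + iv
med3 ~ med2
med2 ~ med1
med1 ~ iv
"

# Varying n: nRep = NULL gives one replication per sample size in 100:500.
Output.vary <- sim(nRep = NULL, model = pop.model, n = 100:500,
                   generate = pop.model.numeric.low, lavaanfun = "sem", seed = 123)
pow.vary <- getPower(Output.vary)   # assumed: a table that includes an N column
findPower(pow.vary, "N", 0.80)      # sample size at which power reaches 0.80

# Fixed n: 20 replications, all at n = 150.
Output.fixed <- sim(nRep = 20, model = pop.model, n = 150,
                    generate = pop.model.numeric.low, lavaanfun = "sem", seed = 123)
pow.fixed <- getPower(Output.fixed) # assumed: one power value per parameter
str(pow.fixed)                      # inspect before passing anything to findPower()
```

Under these assumptions, findPower() only has something to search over in the first case, because a fixed n leaves no sample-size column in the power output.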

Here is the code I used. 

library(lavaan)

library(simsem)

#data 

random_numeric

#################### population model

pop.model <-"

#regression part

dv ~ med3 + med2 + med1 + iv

med3 ~ med2

med2 ~ med1

med1 ~ iv

# variance part 

dv ~~ dv

med3 ~~ med3

med2 ~~ med2

med1 ~~ med1

"

################# data generation based on the parameters for low group

#(I think the parameters are from your real data, so your data contribute to the random sample generation)

pop.model.numeric.low <- 

"# regression part

dv ~ .482*med3 + .437*med2 + -.417*med1 + .085*iv

med3 ~ .443*med2

med2 ~ .778*med1

med1 ~ .606*iv

# variance part 

dv ~~ 0.622*dv

med3 ~~ 0.804*med3

med2 ~~ 0.395*med2

med1 ~~ 0.633*med1"

Output1 <- sim(nRep = NULL, model = pop.model, n=100:500, generate = pop.model.numeric.low,

std.lv=TRUE, lavaanfun = "lavaan",seed="free number")

Output11<- sim(nRep = "free number", model = pop.model, n="free number", generate = pop.model.numeric.low, std.lv=TRUE, lavaanfun = "lavaan",seed="free number")

summary(Output1)

summary(Output11)

############################### power for different parameters and sample size in Output1 

power.edu1 <- getPower(Output1)

power.edu11 <- getPower(Output11)

power.edu1

power.edu11

getPower(Output1, nVal = 150)

getPower(Output11, nVal = 150)

findPower(power.edu1, "N", 0.80)

findPower(power.edu11, "N", 0.80)

##############################  data generation based on the parameters for high group

#(I think the parameters are from your real data in the high group, so your data contribute to the random sample generation)

pop.model.numeric.high <-"     

# regression part

dv ~ .04*med3 + .693*med2 + -.346*med1 + .278*iv

med3 ~ .523*med2

med2 ~ .701*med1

med1 ~ .403*iv

# variance part 

dv ~~ 0.603*dv

med3 ~~ 0.726*med3

med2 ~~ 0.509*med2

med1 ~~ 0.837*med1"

Output2 <- sim(nRep = NULL, model = pop.model, n =100:500, generate = pop.model.numeric.high,

std.lv=TRUE, lavaanfun = "lavaan",seed="free number")

Output22<- sim(nRep = "free number", model = pop.model, n="free number", generate = pop.model.numeric.high, std.lv=TRUE, lavaanfun = "lavaan",seed="free number")

summary(Output2)

summary(Output22)

######################power for different parameters and sample size in Output2

power.edu2 <- getPower(Output2)

power.edu22 <- getPower(Output22)

power.edu2

power.edu22

getPower(Output2, nVal = 150)

getPower(Output22, nVal = 150)

findPower(power.edu2, "N", 0.80)

findPower(power.edu22, "N", 0.80)

In addition, I would like to confirm whether this code is appropriate for a power analysis of a multi-group model. If not, may I ask which parts need to be corrected?


On Tuesday, May 17, 2022 at 4:37:09 AM UTC-5, Hyeseung Koh wrote:

Terrence Jorgensen

May 23, 2022, 1:43:11 PM
to lavaan
There is a preprint with supplemental materials (including syntax examples using simsem) here: https://osf.io/mpd74/

It may be helpful because it demonstrates different approaches for mediation models (single- vs. multiple-group models) in which an exogenous variable is categorical.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Hyeseung Koh

May 26, 2022, 5:29:19 AM
to lav...@googlegroups.com
I downloaded it to study the technique. Like the template for Model 7, perhaps there is a template for a multi-group mediation analysis with multiple mediators and a categorical independent variable. I have been going through the techtutorial file. From the first template, for a mediation model with binary X and with M and Y as outcomes, I copied and pasted the code and ran the following commands:

fit1 <- sem(mod1, data = dat) # lavaan function
parameterEstimates(fit1, output = "pretty")

The output in my RStudio does not match the results in the techtutorial file.

Next, I ran the following commands:

library(semTools)
set.seed(123)
monteCarloCI(fit1)

Again, the output in my RStudio does not match the results in the techtutorial file.

I have encountered further mismatches when running other commands copied and pasted from the techtutorial file.

I was not sure whether something was wrong with my RStudio, so I restarted it and reran the commands. However, the output still does not match. Would you check whether the output in the document corresponds to the commands?
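One thing that may be worth checking, offered only as a possibility: results that depend on random draws reproduce only when the seed is set immediately before the same sequence of calls. The base-R sketch below shows that behaviour with rnorm(); it does not use lavaan or semTools, and it says nothing about whether package versions also play a role.

```r
# Same seed set immediately before the same call: identical draws.
set.seed(123)
draws.a <- rnorm(3)
set.seed(123)
draws.b <- rnorm(3)
identical(draws.a, draws.b)  # TRUE

# An extra random call in between shifts the stream: the draws now differ.
set.seed(123)
invisible(runif(1))
draws.c <- rnorm(3)
identical(draws.a, draws.c)  # FALSE
```

If the data object in the tutorial was itself simulated, any difference in the data would also carry through to every estimate computed from it.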

