userGWAS error: unequal length of vector

247 views
Skip to first unread message

Austin Park

unread,
Jun 7, 2022, 9:21:49 AM6/7/22
to Genomic SEM Users
Hello all,

While running userGWAS function for a factor in the hierarchical model, I have encountered the error below. Would there be a solution to this error?

Loading required package: GenomicSEM
There were 24 warnings (use warnings() to see them)
[1] "Please note that an update was made to userGWAS on 11/21/19 so that it combines addSNPs and userGWAS."
[1] "Starting GWAS Estimation"
Error in .userGWAS_main(i, int, n_phenotypes, n, I_LD, V_LD, S_LD, std.lv,  :
  task 1 failed - "'names' attribute [23] must be the same length as the vector [21]"
Calls: userGWAS -> %dopar% -> <Anonymous>
Execution halted


Below is the code that I ran and I have attached ldsc output and sumstat output RData to reproduce the error. Sumstat output contains only the first 1,000 SNPs.

require(GenomicSEM)

ldsc.out <- "M_ldscout.RData"
sumstat.out <- "M_sumstat_subset.RData"

load(ldsc.out)
load(sumstat.out)

model <- "F1 =~ NA*V8 + V7 + V6 + V3
        F2 =~ NA*V1 + V2 + V6 + V5 + V3
        F3 =~ NA*V4 + V5 + V6
        M =~ NA*F1 + F2 + F3
        M ~~ 1*M
        V5 ~~ a*V5
        a > 0.001
        M ~ SNP
        "


MetS.GWAS <- userGWAS(covstruc=M_ldscout,
                          SNPs=M_sumstat.subset,
                          estimation="DWLS",
                          model=model,
                          cores=20,
                          toler=1e-50,
                          SNPSE=FALSE,
                          sub=c('M ~ SNP'),
                          parallel=TRUE,
                          GC="standard",
                          MPI=FALSE,
                          smooth_check=TRUE,
                          printwarn=FALSE)

Best,

Austin

M_sumstat_subset.RData
M_ldscout.RData

Austin Park

unread,
Jun 7, 2022, 9:21:06 PM6/7/22
to Genomic SEM Users
Additionally, I have checked that the same error appears when running GWAS for F1, F2, and F3!

Austin Park

unread,
Jun 8, 2022, 3:59:01 AM6/8/22
to Genomic SEM Users
 I have figured out that the error seems to rise while running below code at line 185 in userGWAS.R

LavModel1 <- .userGWAS_main(i=1, cores=1, n_phenotypes, n=1, I_LD, V_LD, S_LD, std.lv, varSNPSE2, order, SNPs2, beta_SNP, SE_SNP, varSNP, GC,
      coords, smooth_check, TWAS, printwarn, toler, estimation, sub, Model1, df, npar, returnlavmodel=TRUE)


then the code stops at line 102 in  userGWAS_main.R.

    if (returnlavmodel) {
        return(Model1_Results)
    }

And I'm stuck from here.

Elliot Tucker-Drob

unread,
Jun 9, 2022, 11:32:55 PM6/9/22
to Austin Park, Genomic SEM Users
Hi Austin,
It looks like you don’t have minimal identification constraints for your lower order factors. Try removing their NAs. 

--
You received this message because you are subscribed to the Google Groups "Genomic SEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genomic-sem-us...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/genomic-sem-users/4e51cb00-fd46-4b7e-a101-ada64b91fabdn%40googlegroups.com.

Austin Park

unread,
Jun 10, 2022, 2:40:40 AM6/10/22
to Genomic SEM Users
Hello,

Thank you very much for the suggestion!
But could you elaborate on what you meant by removing NAs for the lower order factors?

I understood in a way that I should include constraints for the lower order factors like below, but it still gives the same error.


model <- "F1 =~ NA*V8 + V7 + V6 + V3
        F2 =~ NA*V1 + V2 + V6 + V5 + V3
        F3 =~ NA*V4 + V5 + V6
        M =~ NA*F1 + F2 + F3
        M ~~ 1*M
        V5 ~~ a*V5
        a > 0.001
        M ~ SNP
        F1 ~~ b*F1
        F2 ~~ c*F2
        F3 ~~ d*F3
        b > 0.001
        c > 0.001
        d > 0.001
        "

Thank you!

Austin.

Elliot Tucker-Drob

unread,
Jun 10, 2022, 9:32:58 AM6/10/22
to Austin Park, Genomic SEM Users
Remove the NA* for F1-F3. You need the first loadings set to 1 for identifying the scale of those factors. 

We always recommend verifying that your model works without snps first. 

Elliot M. Tucker-Drob

Sent from my mobile device. 

On Jun 10, 2022, at 12:40 AM, Austin Park <sh.aust...@gmail.com> wrote:

Hello,

Austin Park

unread,
Jun 18, 2022, 6:33:59 AM6/18/22
to Genomic SEM Users
Hi,

Thank you for the remark!
By using unit loading identification, the userGWAS seems to work!

But I have another question regarding unit variance and unit loading identification for the model without SNPs. Shouldn't the model fit generated have identical values whether I use unit variance or unit loading identification? But I'm getting slightly different results.

This is how I specified the hierarchical model for unit variance identification.
model <- "F1 =~ NA*V7 + V8 + V6 + V3
F2 =~ NA*V2 + V1 + V6 + V5 + V3
F3 =~ NA*V5 + V4 + V6

M =~ NA*F1 + F2 + F3
M ~~ 1*M
F1 ~~ 1*F1
F2 ~~ 1*F2
F3 ~~ 1*F3

V5 ~~ a*V5
a > 0.001
"

Below is how I specified for unit loading identification.
model <- "F1 =~ 1*V7 + V8 + V6 + V3
F2 =~ 1*V2 + V1 + V6 + V5 + V3
F3 =~ 1*V5 + V4 + V6
M =~ 1*F1 + F2 + F3

V5 ~~ a*V5
a > 0.001
"
And the result I obtained is as below, where the first one is unit variance identification. Did I specify unit variance specification wrongly or such a difference is neglectable?

$modelfit
      chisq df      p_chisq      AIC       CFI       SRMR
df 189.4469 13 1.991134e-33 235.4469 0.9813111 0.03769927

$modelfit
      chisq df      p_chisq      AIC       CFI       SRMR
df 189.4485 13 1.989676e-33 235.4485 0.9813109 0.03769932

Thanks,
Austin

Elliot Tucker-Drob

unread,
Jun 18, 2022, 9:03:26 AM6/18/22
to Austin Park, Genomic SEM Users
That looks to be exactly the same fit from the standpoint of numerical approximation. The optimizer is iterative so it may reach the convergence  criterion at very very slightly different values for different parameterizations of equivalent models. 

Austin Park

unread,
Jun 22, 2022, 9:38:04 PM6/22/22
to Genomic SEM Users
Hello,

Thanks for clarifying!

I have another question regarding the output of userGWAS.

I ran a common factor model (specifying unit loading identification) using userGWAS, and I could obtain a summary statistic for that factor. Although there were 6,291,076 SNPs being tested (SNPs from sumstat function), I could obtain results for only 393,181 SNPs. For the remaining SNPs, the estimates and p-values were calculated but their SNP, BP, MAF, A1, A2 information is missing. The captured result is as below,

화면 캡처 2022-06-23 103711.png

What could be the reason for this?

Best,
Ausin
Reply all
Reply to author
Forward
0 new messages