Question regarding the unit of beta for binary phenotypes

Chen Lou

8:05 AM
to Genomic SEM Users
Dear genomicSEM developers,

Thank you very much for your excellent paper on GenomicSEM. I have a question regarding the unit of beta for binary phenotypes.  



In "Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits" (Nature Human Behaviour 2019), the logistic regression coefficient (log-odds ratio) was standardized as follows in GenomicSEM.

The original logistic regression beta (log OR scale) was converted to a standardized SNP effect:

σ²_SNP = 2 · maf · (1 − maf)

v = N_cases / N_total

b_logit = Z / sqrt( v(1 − v) · N_total · σ²_SNP )

se(b_logit) = 1 / sqrt( v(1 − v) · N_total · σ²_SNP )

Then the SNP effect was further scaled to the unit-variance liability scale by dividing by the square root of total liability variance:

b_std = b_logit / sqrt( σ²_SNP · b_logit² + π²/3 )          (Eq. 1)

se(b_std) = se(b_logit) / sqrt( σ²_SNP · b_logit² + π²/3 )
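
To make sure I am reading the steps leading up to Eq. 1 correctly, here is a minimal Python sketch of them exactly as written above. It is not the Genomic SEM implementation: the function names are my own and the input values (Z, MAF, sample sizes) are hypothetical, chosen only for illustration.

```python
import math

def snp_variance(maf):
    """Genotype variance under Hardy-Weinberg equilibrium: 2 * maf * (1 - maf)."""
    return 2.0 * maf * (1.0 - maf)

def z_to_logit(z, maf, n_cases, n_total):
    """Back out b_logit and se(b_logit) from a Z statistic."""
    v = n_cases / n_total  # sample prevalence (case proportion)
    se = 1.0 / math.sqrt(v * (1.0 - v) * n_total * snp_variance(maf))
    return z * se, se

def logit_to_std(b_logit, se_logit, maf):
    """Eq. 1: divide by sqrt(var_snp * b_logit^2 + pi^2 / 3)."""
    scale = math.sqrt(snp_variance(maf) * b_logit ** 2 + math.pi ** 2 / 3.0)
    return b_logit / scale, se_logit / scale

# Hypothetical inputs, for illustration only.
b_logit, se_logit = z_to_logit(z=5.0, maf=0.3, n_cases=20_000, n_total=50_000)
b_std, se_std = logit_to_std(b_logit, se_logit, maf=0.3)
print(b_logit, se_logit, b_std, se_std)
```

Because the coefficient and its standard error are divided by the same factor, the Z statistic is unchanged by the rescaling.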




In "Pervasive Downward Bias in Estimates of Liability-Scale Heritability in GWAS Meta-analysis: A Simple Solution" (Biological Psychiatry; the authors are members of the Genomic SEM research team), b* is instead defined as the linear regression coefficient of a standardized binary phenotype (assuming a balanced case-control design, with the 0/1 phenotype standardized to have mean 0 and variance 1):

b* = Z / sqrt( 4 v(1 − v) · N_total · σ²_SNP )           (Eq. 2)

se(b*) = 1 / sqrt( 4 v(1 − v) · N_total · σ²_SNP )

and approximately:

b_logit ≈ 2 b* → b* ≈ 0.5*b_logit
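
As a quick numerical check of how the two scalings relate, here is a minimal sketch that takes the formulas exactly as written above. For a small effect, the SNP-variance term under the square root in Eq. 1 is negligible, so b_std ≈ b_logit / (π/√3) ≈ 0.551 · b_logit, whereas Eq. 2 gives b* = 0.5 · b_logit, a ratio of about 1.10. The function names and input values are hypothetical.

```python
import math

def snp_variance(maf):
    return 2.0 * maf * (1.0 - maf)

def b_logit_from_z(z, maf, v, n_total):
    """Logit-scale effect backed out from Z, as in the first set of equations."""
    return z / math.sqrt(v * (1.0 - v) * n_total * snp_variance(maf))

def b_std_eq1(b_logit, maf):
    """Eq. 1: rescale b_logit by sqrt(var_snp * b_logit^2 + pi^2 / 3)."""
    return b_logit / math.sqrt(snp_variance(maf) * b_logit ** 2 + math.pi ** 2 / 3.0)

def b_star_eq2(z, maf, v, n_total):
    """Eq. 2: effect on the standardized 0/1 phenotype, using 4 v (1 - v) N."""
    return z / math.sqrt(4.0 * v * (1.0 - v) * n_total * snp_variance(maf))

# Hypothetical balanced study (v = 0.5), for illustration only.
z, maf, v, n_total = 5.0, 0.3, 0.5, 50_000
b_logit = b_logit_from_z(z, maf, v, n_total)
b_std = b_std_eq1(b_logit, maf)
b_star = b_star_eq2(z, maf, v, n_total)

ratio = b_std / b_star
small_effect_limit = 2.0 * math.sqrt(3.0) / math.pi  # 2 / (pi / sqrt(3))
print(b_logit, b_std, b_star, ratio, small_effect_limit)
```

Under these formulas the two quantities differ by a nearly constant factor rather than coinciding, which is why they look numerically close without being identical.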

I am not certain whether this b* is on the unit-variance liability scale.




I have two questions:

1. MTAG appears to standardize the binary phenotype to mean 0 and variance 1 and to adopt Eq. (2), with 4v(1−v)n supplied to MTAG as Neff, whereas Genomic SEM uses Eq. (1). Could you clarify the practical distinction between these two formulations?
2. Conceptually, how do Eqs. (1) and (2) differ? They appear numerically close, although I have not yet conducted a systematic empirical comparison. In Eq. (2), is b* on the unit-variance liability scale?

Many thanks for your clarification and time!

Best regards,

Lou Chen
Wenzhou Medical University

Elliot Tucker-Drob

9:18 AM
to Chen Lou, Genomic SEM Users
In the 2019 NHB paper we explain that it is necessary to convert the logistic regression coefficients to the standardized liability scale (using the (pi^2)/3 term) in order to place them on the same scale as the elements of the genetic covariance matrix, which are also scaled relative to phenotypic variances of 1.0. The sumstats function performs this conversion for use with userGWAS.

In the 2023 BP paper, we derive the appropriate N for a GWAS meta-analysis of case-control cohorts. We use the relationship between the logistic and linear probability models as an element within our proof. This does not require converting the logistic coefficient to the standardized liability scale, and hence we do not use the (pi^2)/3 term.
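
As a rough illustration of the logistic/linear-probability relationship mentioned above (a sketch under stated assumptions, not the derivation from the paper): to first order, the slope of a linear probability model is approximately v(1−v) times the logistic coefficient, where v is the case proportion. Dividing by the standard deviation of the 0/1 phenotype, sqrt(v(1−v)), then gives roughly 0.5 times the logistic coefficient when v = 0.5. The snippet computes the exact population slope for a hypothetical SNP under Hardy-Weinberg equilibrium; the intercept, effect size, and allele frequency are made-up values.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def lpm_slope(intercept, b_logit, maf):
    """Exact population slope Cov(G, Y) / Var(G) for genotype G in {0, 1, 2},
    when P(Y = 1 | G) follows a logistic model. Also returns the case proportion."""
    q = 1.0 - maf
    geno_probs = {0: q * q, 1: 2.0 * maf * q, 2: maf * maf}  # Hardy-Weinberg
    mean_g = sum(g * p for g, p in geno_probs.items())
    var_g = sum((g - mean_g) ** 2 * p for g, p in geno_probs.items())
    risk = {g: expit(intercept + b_logit * g) for g in geno_probs}  # E[Y | G]
    v = sum(risk[g] * p for g, p in geno_probs.items())
    cov_gy = sum((g - mean_g) * (risk[g] - v) * p for g, p in geno_probs.items())
    return cov_gy / var_g, v

# Hypothetical values; the intercept is chosen so that v is close to 0.5.
b_logit = 0.05
slope, v = lpm_slope(intercept=-0.03, b_logit=b_logit, maf=0.3)

first_order = v * (1.0 - v) * b_logit            # approximation to the LPM slope
b_standardized = slope / math.sqrt(v * (1.0 - v))  # 0/1 phenotype scaled to variance 1
print(slope, first_order, b_standardized, 0.5 * b_logit)
```

No (pi^2)/3 term appears anywhere in this calculation, since nothing is being placed on the scale of the logistic latent variable.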

Our explanation of how Genomic SEM relates to MTAG is in the supplement of the 2019 paper. The notation and formulations are very different even though the models themselves are very close. Therefore, providing a simple answer to how and why two specific terms differ between the two frameworks would be difficult.

