Finding Neff with only P-value, Odds Ratio, and individual frequencies in a meta-analysis


Royce Clifford

Feb 14, 2025, 1:20:05 PM
to Genomic SEM Users
Hi!
I have a meta-analysis of four cohorts, with the total cases and controls as well as the MAFs for each cohort.
For the meta-analysis I have an odds ratio and P value, but no Z and no standard error.
My formula for overall Z is qnorm(skula$P/2, lower.tail = FALSE).
My formula for the standard error is log(OR)/Z.
When I try to calculate the cohort Neffs, I get negative numbers. What am I doing wrong? Here's a little sample of the data:

rsid          ICE_freq   UK_freq   US_freq   Odds Ratio   P-value
rs113196488   0.1223     0.1061    0.1077    0.95075      0.000100

Is it possible to derive Neff for each cohort from this data, or should I give this up and just derive it from the totals for all the cohorts?
Thanks!
Royce    

Elliot Tucker-Drob

Feb 18, 2025, 12:27:59 PM
to Royce Clifford, Genomic SEM Users
and the supplement to https://doi.org/10.1016/j.biopsych.2022.05.029: we show that you can estimate Neff from the logistic regression SE as 4/(2pq x logistic_SE^2), where p is the allele frequency and q = 1 - p.

Looks like you have the p value and Odds ratio. So in this case, I would 1) use qnorm to determine the Z statistic corresponding to the p value (taking care to obtain the p value for the two-sided test), 2) take the log of the odds ratio to get the logistic beta, and 3) solve for the SE given that Z=B/SE.
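
The three steps above can be sketched in R as follows. This is only a minimal illustration, not the GenomicSEM implementation: the helper name neff_from_p_or and its argument names are hypothetical, the input values are the example SNP posted earlier in this thread, and P is assumed to be a two-sided p value.

```r
# Hypothetical helper: recover the logistic SE from a two-sided p value and an
# odds ratio, then plug it into Neff = 4 / (2 * p * q * SE^2).
neff_from_p_or <- function(P, OR, freq) {
  z    <- qnorm(P / 2, lower.tail = FALSE)  # |Z| for a two-sided test
  beta <- log(OR)                           # logistic beta (log odds ratio)
  se   <- abs(beta) / z                     # from Z = B / SE; SE is positive
  neff <- 4 / (2 * freq * (1 - freq) * se^2)
  data.frame(z = z, beta = beta, se = se, neff = neff)
}

# Example SNP from the thread (rs113196488), using the ICE_freq column.
res <- neff_from_p_or(P = 0.0001, OR = 0.95075, freq = 0.1223)
print(res)
```

Note that when P and OR come from the meta-analysis, the recovered SE (and therefore the Neff) describes the meta-analysis as a whole, whichever cohort's allele frequency is plugged in.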





Royce Clifford

Feb 18, 2025, 9:17:41 PM
to Genomic SEM Users
Thanks for this. I calculated 
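Z <- qnorm(P/2, lower.tail = FALSE)   # |Z| for the two-sided P value
beta <- log(OR)
SE <- abs(beta)/Z                     # SE is positive by definition

Neff <- 4/(2*p*(1 - p)*SE^2)          # for each cohort, with p = that cohort's allele frequency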

Most of the values were reasonable - a few were infinite, some were not a number (NaN) and some were crazy, like 380 million. I guess that's why there's a cap.
Thanks for your help!
Royce
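
A short sketch of where the infinite, NaN, and very large values can come from when SE is back-calculated from P and OR. The inputs below are hypothetical, chosen only to trigger each case, and the allele frequency is an assumed value. An OR that is (or has been rounded to) 1 is one plausible source in real data.

```r
# Hypothetical inputs chosen to trigger each degenerate case.
P  <- c(1e-4, 0.5, 1, 0.04)
OR <- c(0.95, 1,   1, 1.0001)
p  <- 0.12                                   # assumed allele frequency

z    <- qnorm(P / 2, lower.tail = FALSE)
se   <- abs(log(OR)) / z
neff <- 4 / (2 * p * (1 - p) * se^2)

# OR == 1 with P < 1  -> beta = 0, SE = 0, Neff = Inf
# OR == 1 with P == 1 -> beta = 0, Z = 0, SE = 0/0 = NaN, Neff = NaN
# OR barely off 1     -> tiny SE, so Neff becomes implausibly large
print(data.frame(P, OR, z, se, neff))

# Flag values that are at least finite before any capping or summing.
usable <- is.finite(neff)
```

Finite but implausibly large values, like the fourth row, pass the is.finite check and still need a cap or a filter.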

Royce Clifford

Mar 1, 2025, 2:36:34 PM
to Genomic SEM Users
For my LDSC calculations, I'm getting crazy Total h2 liabilities for individual cohorts, crazy Total Observed Scale Genetic Covariance (g_cov), Total Observed Scale h2 of >>> 10000, and crazy Total liability scale genetic covariance between studies. I'm reasonably sure it is because of Neff, but I can't figure out what I'm doing wrong. Here are the stats for one of the studies:
cohort     cases    controls   v      EffN
ICELAND    30802    278502     0.10   110938
UKB         9715    421332     0.02    37984
US          1888     24961     0.07     7021
FINLAND     5667    169746     0.03    21935

I had allele frequencies for each study, so I calculated Neff for ICELAND, for instance, as 4/((2 * ICE_freq * (1 - ICE_freq)) * se^2), capped it at the recommended 1.1 and 0.5 times EffN, and then summed the Neffs for the LDSC part. For the standard error, I used the standard error of beta, which should be the log_se.
Correlations between the studies look very reasonable and Lambda GC and intercepts look great (maybe too good?). I just can't figure out what I'm doing wrong. Any input you might have would be very helpful.
Thanks!
Royce
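
For reference, the v and EffN columns in the table above are consistent, to within rounding, with the sample prevalence v = cases/(cases + controls) and EffN = 4 * v * (1 - v) * (cases + controls). The sketch below reproduces them and applies a cap. The cap_neff helper and the per-SNP values are hypothetical, and reading "1.1 and .5" as bounds of 0.5 and 1.1 times the cohort EffN is an assumption.

```r
# Cohort counts copied from the table in the post above.
cohorts <- data.frame(
  cohort   = c("ICELAND", "UKB", "US", "FINLAND"),
  cases    = c(30802, 9715, 1888, 5667),
  controls = c(278502, 421332, 24961, 169746)
)

# Sample prevalence and effective N from the counts alone.
cohorts$v    <- cohorts$cases / (cohorts$cases + cohorts$controls)
cohorts$EffN <- 4 * cohorts$v * (1 - cohorts$v) *
  (cohorts$cases + cohorts$controls)
print(cohorts)

# Assumed reading of the cap: bound per-SNP Neff to [0.5, 1.1] x cohort EffN.
cap_neff <- function(neff, eff_n) pmin(pmax(neff, 0.5 * eff_n), 1.1 * eff_n)

# Hypothetical per-SNP Neff values for ICELAND, including degenerate ones.
snp_neff <- c(105000, 3.8e8, 12000, Inf)
capped   <- cap_neff(snp_neff, cohorts$EffN[1])
print(capped)
```

NaN values pass through pmin and pmax unchanged, so they would need to be dropped or replaced separately.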
MVP_bal.sumstats.gz_skula_bal.sumstats.gz_UKB_bal.sumstats.gz_meta_bal.sumstats.gz_ldsc.log