Finding Neff with only P-value, Odds Ratio, and individual frequencies in a meta-analysis

Royce Clifford

Feb 14, 2025, 1:20:05 PM

to Genomic SEM Users

Hi!

I have a meta-analysis of four cohorts. For each cohort I have the total number of cases and controls as well as the MAF.

For the meta-analysis I have an odds ratio and P value, but no Z and no standard error.

My formula for overall Z is qnorm(skula$P/2, lower.tail = FALSE).

My formula for standard error is log(OR)/z

When I try to calculate the cohort Neffs, I get negative numbers. What am I doing wrong? Here's a little sample of the data:

rsid          ICE_freq   UK_freq   US_freq   Odds Ratio   P-value
rs113196488   0.1223     0.1061    0.1077    0.95075      0.000100

Is it possible to derive Neff for each cohort from this data, or should I give this up and just derive it from the totals for all the cohorts?

Thanks!

Royce

Elliot Tucker-Drob

Feb 18, 2025, 12:27:59 PM

to Royce Clifford, Genomic SEM Users

See the examples in https://github.com/GenomicSEM/GenomicSEM/wiki/2.1-Calculating-Sum-of-Effective-Sample-Size-and-Preparing-GWAS-Summary-Statistics

and the supplement to https://doi.org/10.1016/j.biopsych.2022.05.029. We show that you can estimate Neff from the logistic regression SE as 4/(2pq x logistic_SE^2), where p is the allele frequency and q = 1 - p.

Looks like you have the p value and Odds ratio. So in this case, I would 1) use qnorm to determine the Z statistic corresponding to the p value (taking care to obtain the p value for the two-sided test), 2) take the log of the odds ratio to get the logistic beta, and 3) solve for the SE given that Z=B/SE.
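The three steps above can be sketched in R roughly as follows. This is a minimal sketch, not a definitive implementation: the data frame `sumstats` and its column names (`OR`, `P`, `FREQ`, and so on) are hypothetical, and the single row is the example SNP from earlier in the thread with its ICE_freq value used as the allele frequency.

```r
# Hypothetical summary statistics; the row is the example SNP from this thread
sumstats <- data.frame(
  rsid = "rs113196488",
  OR   = 0.95075,
  P    = 0.0001,
  FREQ = 0.1223   # allele frequency in one cohort (ICE_freq)
)

# 1) |Z| corresponding to a two-sided P value
sumstats$Z <- qnorm(sumstats$P / 2, lower.tail = FALSE)

# 2) the logistic beta is the log of the odds ratio
sumstats$BETA <- log(sumstats$OR)

# 3) Z = B/SE, so SE = |B|/|Z| (absolute values keep the SE positive)
sumstats$SE <- abs(sumstats$BETA) / sumstats$Z

# Neff = 4/(2pq * SE^2), with p the allele frequency and q = 1 - p
sumstats$Neff <- 4 / (2 * sumstats$FREQ * (1 - sumstats$FREQ) * sumstats$SE^2)
```

Taking absolute values in step 3 matters because `qnorm(P/2, lower.tail = FALSE)` always returns a positive Z, while the log odds ratio is negative whenever OR < 1. Note also that the SE recovered this way belongs to whichever analysis produced the OR and P value.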

Royce Clifford

Feb 18, 2025, 9:17:41 PM

to Genomic SEM Users

Thanks for this. I calculated

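Z <- qnorm(P/2, lower.tail = FALSE)  # |Z| for a two-sided P value
beta <- log(OR)

SE <- abs(beta)/Z  # absolute value keeps SE positive when OR < 1

Neff <- 4/(2*p*(1 - p)*SE^2)  # per cohort, with p that cohort's allele frequency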

Most of the values were reasonable, but a few were infinite, some were not a number (NaN), and some were implausibly large, like 380 million. I guess that's why there's a cap.
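For readers wondering where the infinite and NaN values come from, the arithmetic can be checked directly. The sketch below uses a hypothetical helper, `neff_from_or_p`, and made-up inputs; it only illustrates how the formula behaves at the edges, and is not a diagnosis of this particular dataset.

```r
# Hypothetical helper: back out Neff from OR, two-sided P, and allele frequency p
neff_from_or_p <- function(OR, P, p) {
  Z  <- qnorm(P / 2, lower.tail = FALSE)
  SE <- abs(log(OR)) / Z
  4 / (2 * p * (1 - p) * SE^2)
}

# OR = 1 gives beta = 0, hence SE = 0 and Neff = 4/0 = Inf
inf_case <- neff_from_or_p(OR = 1, P = 0.01, p = 0.1)

# OR = 1 and P = 1 give beta = 0 and Z = 0, hence SE = 0/0 = NaN
nan_case <- neff_from_or_p(OR = 1, P = 1, p = 0.1)
```

Very large finite values arise the same way: an odds ratio close to 1 paired with a small P value implies a tiny SE, and Neff grows with 1/SE^2.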

Thanks for your help!

Royce

Royce Clifford

Mar 1, 2025, 2:36:34 PM

to Genomic SEM Users

In my LDSC calculations I'm getting implausible values for the total liability-scale h2 of individual cohorts, the total observed-scale genetic covariance (g_cov), the total observed-scale h2 (far greater than 10000), and the total liability-scale genetic covariance between studies. I'm reasonably sure it is because of Neff, but I can't figure out what I'm doing wrong. Here are the stats for one of the studies:

cohorts   cases   controls   v      EffN
ICELAND   30802   278502     0.10   110938
UKB       9715    421332     0.02   37984
US        1888    24961      0.07   7021
FINLAND   5667    169746     0.03   21935
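The v and EffN columns appear consistent with v being the sample prevalence, cases/(cases + controls), and EffN being 4v(1 - v)(cases + controls), the total effective N used in the Genomic SEM wiki page linked earlier in the thread. A minimal sketch that recomputes the table under that assumption (the data frame name `cohorts` is hypothetical):

```r
# Case and control counts copied from the table above
cohorts <- data.frame(
  cohort   = c("ICELAND", "UKB", "US", "FINLAND"),
  cases    = c(30802, 9715, 1888, 5667),
  controls = c(278502, 421332, 24961, 169746)
)

# Sample prevalence and total effective N per cohort
n_total      <- cohorts$cases + cohorts$controls
cohorts$v    <- cohorts$cases / n_total
cohorts$EffN <- 4 * cohorts$v * (1 - cohorts$v) * n_total

# Sum of effective sample sizes across the four cohorts
sum_effn <- sum(cohorts$EffN)
```

The recomputed values agree with the posted EffN column to within one unit, the difference being rounding.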

I had allele frequencies for each study, so I calculated Neff for ICELAND, for instance, as 4/((2 * ICE_freq * (1 - ICE_freq)) * se^2), capped it at the recommended 1.1 and 0.5 times the EffN, and then summed the Neffs for the LDSC part. For the standard error, I used the standard error of beta, which should be the log_se.
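The capping step described here can be sketched as follows. The per-SNP values in `neff` are made up for illustration (including the kinds of problem values mentioned earlier in the thread), and the variable names are hypothetical; the ICELAND case and control counts are from the table above.

```r
# Hypothetical per-SNP Neff values for one cohort, including problem cases
neff <- c(95000, 380e6, Inf, NaN, 20000)

# Total effective N for ICELAND from its case and control counts
n_cases    <- 30802
n_controls <- 278502
v          <- n_cases / (n_cases + n_controls)
total_neff <- 4 * v * (1 - v) * (n_cases + n_controls)

# Cap at 1.1 times and floor at 0.5 times the total effective N.
# pmin/pmax leave NaN entries missing, so those SNPs still need handling.
neff_capped <- pmin(pmax(neff, 0.5 * total_neff), 1.1 * total_neff)
```

In this sketch the infinite and 380 million entries are pulled down to the upper cap, the 20000 entry is raised to the floor, and the NaN entry remains missing.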

Correlations between the studies look very reasonable and Lambda GC and intercepts look great (maybe too good?). I just can't figure out what I'm doing wrong. Any input you might have would be very helpful.

Thanks!

Royce

MVP_bal.sumstats.gz_skula_bal.sumstats.gz_UKB_bal.sumstats.gz_meta_bal.sumstats.gz_ldsc.log
