Regarding these issues, I have a few questions about methods and performance, and I thought I would include more information about my task.
I have 5 traits; SNP counts range from about 10 million to 13.8 million across them. I've run munge, ldsc, and sumstats preparation on these traits together with no issues, and as far as I can tell the output from LDSC looks fine.
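For reference, the prep calls looked roughly like this; file names, trait names, sample sizes, and the se.logit settings below are placeholders rather than my actual values, and only two of the five traits are shown:

# Sketch of the prep pipeline with placeholder inputs
# (two of the five traits shown for brevity)
library(GenomicSEM)
munge(files       = c("trait1.txt", "trait2.txt"),
      hm3         = "w_hm3.snplist",
      trait.names = c("T1", "T2"),
      N           = c(100000, 120000))
LDSCoutput <- ldsc(traits          = c("T1.sumstats.gz", "T2.sumstats.gz"),
                   sample.prev     = c(NA, NA),
                   population.prev = c(NA, NA),
                   ld              = "eur_w_ld_chr/",
                   wld             = "eur_w_ld_chr/",
                   trait.names     = c("T1", "T2"))
sumstats <- sumstats(files       = c("trait1.txt", "trait2.txt"),
                     ref         = "reference.1000G.maf.0.005.txt",
                     trait.names = c("T1", "T2"),
                     se.logit    = c(TRUE, TRUE))  # per-trait; depends on how SEs are reported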
I've tried running userGWAS with both a single-factor and a 2-factor model on a Linux HPC cluster. Initially, the single-factor model timed out after 72 hours on 24 cores. To improve runtime, I tried running a separate job for each chromosome, using the full-genome LDSC results and subsetting my full-genome sumstats object to the current chromosome as follows:
sumstats_chr <- sumstats[sumstats$CHR == chr, ]

# Run userGWAS on the chromosome-specific subset
gwas_output <- userGWAS(
  covstruc   = LDSCoutput,   # full-genome LDSC output loaded earlier
  estimation = "DWLS",
  SNPs       = sumstats_chr,
  model      = model_snp,
  parallel   = FALSE
)
This approach led to the errors above, with infinite or missing values in V_full.
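In case it helps diagnose this, here is a minimal sketch of a sanity check I could add before the userGWAS call; it assumes the effect and SE columns follow the usual beta.<trait> / se.<trait> naming in the GenomicSEM sumstats object, which should be checked against the actual object:

# Minimal sketch, assuming beta.<trait> / se.<trait> column naming.
# Rows with non-finite estimates can propagate into non-finite
# entries of V_full, so drop them before calling userGWAS.
est_cols  <- grep("^(beta|se)\\.", names(sumstats_chr), value = TRUE)
finite_ok <- Reduce(`&`, lapply(sumstats_chr[est_cols], is.finite))
sumstats_chr <- sumstats_chr[finite_ok, ]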
My most recent attempt was to return to the full-genome analysis and apply the Linux performance fix noted in the documentation:
export OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 NUMEXPR_NUM_THREADS=1 VECLIB_MAXIMUM_THREADS=1
~/anaconda3/envs/r-env/bin/R --no-echo --no-restore --file=gwas_template_full.R
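With library-level threading pinned to a single thread as above, my understanding is that userGWAS's own parallelism can then use the cores allocated to the job. The call in gwas_template_full.R looks roughly like this (cores = 22 is my own setting to match the allocation, not a documented default):

# Rough sketch of the full-genome call with built-in parallelism
gwas_output <- userGWAS(
  covstruc   = LDSCoutput,
  SNPs       = sumstats,
  estimation = "DWLS",
  model      = model_snp,
  parallel   = TRUE,
  cores      = 22
)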
These have been running overnight. What would the expected runtime be for my 5 traits with ~10 million SNPs on 22 cores?
Best,
John