I keep getting a Step 4 error: "number of columns of matrices must match". I think it's because the SNP universe in my ancestry-specific summary data differs across the ancestry groups. I use the HapMap RSID list provided via your Dropbox.
Thanks,
Elvis Akwo, MD, MS, PhD
Research Scientist
Vanderbilt University Medical Center
Division of Nephrology & Hypertension
Vanderbilt Center for Kidney Disease
1161 21st Ave South, S-3119 MCN
Nashville, TN 37232-2372
P. 336-918-6972
Email: elvis...@vumc.org
Hi Elvis,
Thanks for contacting us! Here is the link to the files you requested: https://www.dropbox.com/scl/fo/21xrx1z3iinsp5vl2j6ew/AEp1BavM3ti8_kRyHu-DP5o?rlkey=87h1wml96a6q53q6u8ybeusgy&st=qkywe2y5&dl=0
Please let us know if you need help debugging the issue. We recommend using a consistent format for the GWAS summary data files across ancestries. If you are using the online platform, please refer to our tutorial at https://pennprs.gitbook.io/pennprs; if you are using the offline pipeline, please refer to our tutorial at https://github.com/PennPRS/Pipeline/wiki. If you run into more issues, please feel free to contact us again.
Thanks!
Jin
Hi Elvis,
I don't think the issue is caused by the format of the input GWAS summary data; in fact, the files can have different numbers of SNPs across ancestry groups. I just tested the online multi-ancestry training pipeline using public GWAS summary data from the GWAS Catalog, and it ran successfully.
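To see how much the SNP sets actually differ between ancestries, a quick check along these lines may help. This is only a sketch: it assumes tab-delimited summary files with an ID column named `SNP` (adjust to your own column name), and the toy in-memory files and function names below are made up for illustration, not part of PennPRS.

```python
import csv
import io


def read_snp_ids(handle, snp_column="SNP"):
    """Return the set of SNP IDs in one tab-delimited summary-statistics file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[snp_column] for row in reader}


def summarize_overlap(snp_sets):
    """Count SNPs per ancestry and collect the SNPs shared by all ancestries."""
    counts = {ancestry: len(ids) for ancestry, ids in snp_sets.items()}
    shared = set.intersection(*snp_sets.values())
    return counts, shared


# Toy in-memory files standing in for real summary statistics (made-up values).
eur_file = io.StringIO("SNP\tBETA\nrs1\t0.01\nrs2\t0.02\nrs3\t-0.01\n")
afr_file = io.StringIO("SNP\tBETA\nrs2\t0.03\nrs3\t0.00\nrs4\t0.01\n")

snp_sets = {"EUR": read_snp_ids(eur_file), "AFR": read_snp_ids(afr_file)}
counts, shared = summarize_overlap(snp_sets)
print(counts)
print(sorted(shared))
```

With real files, replace the `io.StringIO` objects with `open(path, newline="")` handles.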
I noticed that the estimated heritability is 0.0021. Such a small heritability may cause problems in the original PROSPER algorithm, and it may also indicate that a PRS model for this trait would have very little power. I can offer some suggestions for this analysis if you can share more details about the application.
Thanks,
Jin
Hi Elvis,
Thanks for the information. Could you give me a rough range for the GWAS sample sizes across these ancestry groups? Multi-ancestry methods may run into issues when GWAS sample sizes differ a lot between ancestry groups.
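One simple way to summarize how uneven the sample sizes are is the ratio of the largest to the smallest GWAS sample size. The sketch below uses made-up sample sizes and a hypothetical helper name; it only reports the ratio and does not imply any particular cutoff.

```python
def sample_size_ratio(sample_sizes):
    """Return the largest GWAS sample size divided by the smallest."""
    return max(sample_sizes.values()) / min(sample_sizes.values())


# Made-up sample sizes, for illustration only.
sample_sizes = {"EUR": 200000, "AFR": 20000, "EAS": 50000}
ratio = sample_size_ratio(sample_sizes)
print(f"largest/smallest sample size ratio: {ratio:.1f}")
```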
And about your question: “Also, I wasn't sure I understood correctly. Could you clarify whether it is a feature of the PROSPER algorithm that, if the trait's heritability is low (as in this case), we might get an error about the number of columns in the matrices? And if that is the case, would you suggest we run LD score regression to estimate the trait's SNP heritability (h^2) in each ancestry, and then proceed with PRS training only for traits with heritability estimates above a certain threshold, say >=0.05 in each ancestry?”
Technically, a low heritability does not necessarily lead to an error from PROSPER. But in some cases, either the single-ancestry analysis step using lassosum2 (an intermediate step in PROSPER that fits a single-ancestry PRS model for each ancestry) or the multi-ancestry analysis step in PROSPER could produce a PRS model with minimal power, which can lead to errors. This is likely an issue with the original PROSPER algorithm, not the PennPRS pipeline.
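If you do want to pre-screen traits by heritability as you proposed, a minimal sketch could look like the following. The 0.05 threshold is the value from your question, not a validated cutoff, and the trait names, the heritability values other than 0.0021, and the function name are all made up for illustration. The per-ancestry estimates would come from a separate tool such as LD score regression.

```python
def traits_passing_h2_screen(h2_by_trait, threshold=0.05):
    """Keep traits whose heritability estimate meets the threshold in every ancestry."""
    return [
        trait
        for trait, h2_by_ancestry in h2_by_trait.items()
        if all(h2 >= threshold for h2 in h2_by_ancestry.values())
    ]


# Hypothetical per-ancestry SNP-heritability estimates.
h2_by_trait = {
    "trait_A": {"EUR": 0.12, "AFR": 0.08},
    "trait_B": {"EUR": 0.0021, "AFR": 0.03},
}
kept = traits_passing_h2_screen(h2_by_trait)
print(kept)
```

Since a low heritability does not always trigger an error, a screen like this is a conservative filter rather than a guarantee that the run will succeed.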
If it’s possible for you to share the GWAS summary data with me, I’m happy to take a closer look at the issue and give you some suggestions on alternative solutions.
Thanks,
Jin