Hi Haky,
I was wondering if I could follow up regarding how to interpret the output I get from S-PrediXcan when I run it on my GWAS summary statistics.
There are different components I would like to understand for interpreting results. I'm hoping that you or someone in your group can guide me through this.
1. On the number of genes available in the predictDB.
I was expecting to see many more gene hits in some tissues after I ran S-PrediXcan. When I didn’t see some genes, I explored the list of genes available in the models stored in predictDB and noticed that, for brain tissues for example, amygdala has 2369 genes and hippocampus has 2824 genes. So if I expect to see some genes that are not in those lists, they won’t show up in my results, correct? Is this because the only genes included in PrediXcan are those with better predictive performance? Are these the only ones that can be found from GTEx?
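A minimal sketch of how such a gene list can be checked, assuming the model file is a SQLite database with an `extra` table that has one row per modeled gene and a `genename` column (my understanding of the predictDB layout, so worth verifying against the actual file). The in-memory database, the gene IDs, and `GENE_X` are hypothetical stand-ins for illustration only:

```python
import sqlite3

def genes_in_model(con):
    """Return the set of gene names that have a prediction model in this database."""
    rows = con.execute("SELECT genename FROM extra").fetchall()
    return {name for (name,) in rows}

def missing_genes(con, genes_of_interest):
    """Return the genes of interest that have no model, in sorted order."""
    available = genes_in_model(con)
    return sorted(g for g in genes_of_interest if g not in available)

# Toy stand-in for a real model file (assumed schema, placeholder gene IDs).
# With a real file, this would be sqlite3.connect("<path to the tissue .db file>").
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE extra (gene TEXT, genename TEXT)")
con.executemany(
    "INSERT INTO extra VALUES (?, ?)",
    [("ENSG_PLACEHOLDER_1", "HLA-C"), ("ENSG_PLACEHOLDER_2", "MMP15")],
)

print(len(genes_in_model(con)))                 # number of genes with a model
print(missing_genes(con, ["HLA-C", "GENE_X"]))  # genes that cannot appear in the results
```

Any gene returned by `missing_genes` has no model for that tissue, so it cannot show up in the S-PrediXcan output for it.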
2. On the extract I’m attaching.
This is part of the results from an MDD GWAS summary statistics run in amygdala tissue.
Of these three genes, take HLA-C as an example: it seems to have a significant risk association with MDD (P-val 0017), and its prediction performance p-value is also significant (4.41E-08), but only 2 of the model's 24 SNPs were found. Would it be reliable or fair to interpret this gene on that basis? In contrast, MMP15 had no significant risk association with MDD (P-val 0.4089), yet many more of its model SNPs were found (52 out of 53).
I’m curious how to look at these opposite trends, and whether this is a reasonable way to interpret these particular genes.
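To make the comparison concrete, here is a small sketch that computes the fraction of model SNPs actually used for each gene, taking the counts quoted above. It assumes the output columns are named `n_snps_used` and `n_snps_in_model`; the 0.8 cutoff is an arbitrary illustrative threshold, not a recommended value:

```python
def snp_coverage(n_snps_used, n_snps_in_model):
    """Fraction of a gene model's SNPs that were present in the GWAS data."""
    if n_snps_in_model <= 0:
        raise ValueError("n_snps_in_model must be positive")
    return n_snps_used / n_snps_in_model

def flag_low_coverage(rows, min_coverage=0.8):
    """Return gene names whose SNP coverage falls below min_coverage (illustrative cutoff)."""
    return [
        r["gene_name"]
        for r in rows
        if snp_coverage(r["n_snps_used"], r["n_snps_in_model"]) < min_coverage
    ]

# SNP counts as quoted in the message; only these two columns are needed here.
rows = [
    {"gene_name": "HLA-C", "n_snps_used": 2, "n_snps_in_model": 24},
    {"gene_name": "MMP15", "n_snps_used": 52, "n_snps_in_model": 53},
]

for r in rows:
    cov = snp_coverage(r["n_snps_used"], r["n_snps_in_model"])
    print(f'{r["gene_name"]}: {cov:.3f}')

print(flag_low_coverage(rows))
```

With these counts, HLA-C has roughly 8% of its model SNPs represented while MMP15 has about 98%, which is the contrast the question is about.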
3. What are “var_g” and “effect_size” in the S-PrediXcan output columns, and how are they interpreted?
Thank you, Haky. Any suggestions or comments will be appreciated.
Best,
Juan