Dear Chris:
I previously tried GCTA for LMM and contacted Yang Jian. Unfortunately, GCTA could not run LMM on the UK Biobank data because it failed to calculate the GRM on such a large dataset.
Regarding the REF/ALT issue, I did notice that PLINK2 outputs the following columns: "REF ALT PROVISIONAL_REF A1 OMITTED".
However, whether it is called REF or PROVISIONAL_REF, my understanding is that it still reflects binary thinking.
In the future, any of the four bases (A/C/G/T) could be observed at any position. Therefore, the representation should be quaternary.
You mentioned that there are preexisting Cox and Lasso implementations. Could you please let me know which software you recommend?
BTW, my UK Biobank data in pfile format takes up 2.7 TB on the server, and storing it costs me some money.
Is there a way to further compress the data?
Thank you very much!
Best regards,
jie