writing code for haploid QT assoc


William Gilks

Jul 16, 2015, 8:50:04 PM
to plink...@googlegroups.com
Hi,

I'd like to test genotypes against a quantitative trait for a large haploid genome, where every genotype is either '0' or '1'.

The simplest statistical test for this, i.e. comparing the mean trait value for '0' vs. '1', is a t-test (or another two-sample test).

I can do this in R but it takes a long time with a million SNPs. So I was thinking that modification of PLINK code would be effective, considering this is such a simple test.

I don't have any experience in C/C++ but would happily give it a try if someone could point me in the right direction.

Sincerely,

William Gilks

Christopher Chang

Jul 17, 2015, 6:35:06 PM
to plink...@googlegroups.com, wpg...@gmail.com
What other programming languages are you familiar with?

I ask because:

(i) if you have *no* previous experience with C/C++, it will probably take more than a month for you to become familiar enough with the language to effectively edit the PLINK 1.07 or 1.9 source code.  (1.07 requires moderate familiarity with both C and C++, while 1.9 doesn't require C++ but does require additional experience with low-level C.)
(ii) this sounds like a job that is doable with a more scientist-friendly language like Python.
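
To give a rough idea of what (ii) could look like, here is a minimal NumPy sketch of a per-SNP two-sample (Welch) t-test. It assumes the genotypes are already loaded into a SNPs x samples array of 0/1 calls with no missing values; the function name haploid_ttest and that array layout are hypothetical choices for illustration, not anything PLINK provides. All SNPs are handled in a few matrix operations rather than one test call per SNP.

```python
import numpy as np

def haploid_ttest(genotypes, trait):
    """Welch t statistic and degrees of freedom for each SNP.

    genotypes: (n_snps, n_samples) array of 0/1 calls (assumed layout,
               no missing calls)
    trait:     (n_samples,) array of quantitative trait values
    Returns (t, df), each of length n_snps.
    """
    g = np.asarray(genotypes, dtype=np.float64)
    y = np.asarray(trait, dtype=np.float64)

    n1 = g.sum(axis=1)            # samples with genotype '1', per SNP
    n0 = g.shape[1] - n1          # samples with genotype '0', per SNP
    s1 = g @ y                    # trait sum in the '1' group
    s0 = y.sum() - s1             # trait sum in the '0' group
    q1 = g @ (y * y)              # trait sum of squares in the '1' group
    q0 = (y * y).sum() - q1       # trait sum of squares in the '0' group

    # SNPs with fewer than two samples in either group give non-finite
    # results; suppress the warnings and let the caller filter them.
    with np.errstate(divide="ignore", invalid="ignore"):
        m1 = s1 / n1
        m0 = s0 / n0
        v1 = (q1 - n1 * m1**2) / (n1 - 1)   # sample variance, '1' group
        v0 = (q0 - n0 * m0**2) / (n0 - 1)   # sample variance, '0' group
        se2 = v1 / n1 + v0 / n0             # squared standard error
        t = (m1 - m0) / np.sqrt(se2)
        # Welch-Satterthwaite degrees of freedom
        df = se2**2 / ((v1 / n1)**2 / (n1 - 1) + (v0 / n0)**2 / (n0 - 1))
    return t, df
```

Two-sided p-values can then be obtained from the t distribution, e.g. 2 * scipy.stats.t.sf(abs(t), df) if SciPy is available. For a million SNPs it would probably make sense to process the matrix in chunks of rows so the float64 copy fits in memory.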

If you have PLINK-formatted data, I can describe the relevant file formats in enough detail for you to handle them efficiently in your language of choice; almost anything will offer a speedup over R.