Now that public testing has started, I'm distinguishing between stable
and development builds. All unfinished flags should be disabled in the
stable build, and not mentioned in command-line help. Except for
bugfixes, the stable build will usually only be updated in large jumps.
Here are some of the features that have been added to WDIST over the last five months:
- Association analysis max(T) permutation tests, for both case/control
and quantitative traits. If you were using a less accurate approach
because PLINK's permutation tests were too slow, that should no longer
be necessary: our implementation is often over a thousand times faster.
- Very fast Fisher's exact tests. This incorporates a genuine
algorithmic advance that is not yet present in other software as of this
writing; see
https://www.cog-genomics.org/software/stats for reference
code and an in-browser demo. Chi-square approximations, which can
perform poorly on low-MAF markers, are no longer necessary. Our fast
Hardy-Weinberg exact test uses the same idea.
- I/O speed improvements. When you're routinely producing multi-gigabyte text
files, stuff like multithreaded gzip and custom number-to-string
conversion routines actually pay off.
- Windows support.
- Proper support for 4GB+ files, even in 32-bit builds.
- Conversion to and from 23andMe format. (This is mostly here to make life easier for our GWAS volunteers, but hopefully it comes in handy elsewhere too.)
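
For anyone who hasn't run into max(T) permutation testing before, here is a minimal Python sketch of the general idea. It is not WDIST's implementation, which is heavily optimized. The function names, the choice of a plain allelic chi-square statistic, and the toy data layout (minor-allele counts per sample) are all assumptions made for illustration. For each shuffle of the case/control labels, it records the largest statistic across all markers, and a marker's adjusted p-value is the fraction of shuffles whose maximum matches or beats that marker's observed statistic.

```python
import random


def allelic_chisq(genotypes, is_case):
    """1-df allelic chi-square statistic for a single marker.

    genotypes: minor-allele counts (0, 1 or 2), one per sample.
    is_case:   parallel list of booleans (True = case, False = control).
    """
    case_minor = sum(g for g, c in zip(genotypes, is_case) if c)
    ctrl_minor = sum(g for g, c in zip(genotypes, is_case) if not c)
    case_total = 2 * sum(1 for c in is_case if c)
    ctrl_total = 2 * len(is_case) - case_total
    table = [[case_minor, case_total - case_minor],
             [ctrl_minor, ctrl_total - ctrl_minor]]
    n = case_total + ctrl_total
    stat = 0.0
    for i in range(2):
        for j in range(2):
            row = table[i][0] + table[i][1]
            col = table[0][j] + table[1][j]
            expected = row * col / n
            if expected > 0:  # skip empty margins (e.g. monomorphic markers)
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def max_t_permutation(markers, is_case, n_perm=1000, seed=0):
    """Family-wise adjusted empirical p-values via max(T) permutation.

    markers: list of per-marker genotype lists (hypothetical layout).
    Returns one adjusted p-value per marker, computed as (R + 1) / (N + 1),
    where R counts permutations whose maximum statistic over all markers
    is at least the marker's observed statistic.
    """
    rng = random.Random(seed)
    observed = [allelic_chisq(g, is_case) for g in markers]
    exceed = [0] * len(markers)
    labels = list(is_case)
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute phenotype labels, keep genotypes fixed
        perm_max = max(allelic_chisq(g, labels) for g in markers)
        for j, t in enumerate(observed):
            if perm_max >= t:
                exceed[j] += 1
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Toy data: 10 cases followed by 10 controls.
is_case = [True] * 10 + [False] * 10
markers = [
    [2] * 10 + [0] * 10,  # perfectly associated with case status
    [0, 1] * 10,          # identical allele counts in cases and controls
]
adjusted_p = max_t_permutation(markers, is_case, n_perm=200, seed=1)
```

Because every shuffle recomputes the statistic for every marker, the cost of this naive version grows as permutations times markers times samples, which is why a fast implementation matters so much here.
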
I will try to get most or all of the following major features into the next stable build:
- Run-of-homozygosity analysis (--homozyg).
- Hierarchical clustering and multidimensional scaling analysis
(--cluster, --mds-plot). With these done, volunteers will be able to
start walking through Razib's introduction to PLINK
(http://blogs.discovermagazine.com/gnxp/2013/01/using-your-23andme-data-in-plink)
without waiting hours between commands.
- Logistic regression and CNV analysis, because we need them for our own research.