bug with --fast-epistasis


alinoe lavillaureix

Oct 12, 2015, 9:21:02 PM
to plink2-users

Hello!

I use --fast-epistasis case-only set-by-all for my analysis, and then compare the results with another test (an R script, not PLINK).

 

I found important differences between the results of the two tests (some PLINK results look like false positives). After some troubleshooting, I found that the differences only involve SNP1xSNP2 pairs where SNP2 is on chromosomes 10 to 22; there is no problem when SNP2 is on chromosomes 1 to 9.

 

Then I compared the results of PLINK1 --fast-epistasis with those of PLINK2 --fast-epistasis no-ueki (which is supposed to give exactly the same results as PLINK1). The results are the same for SNP1xSNP2 combinations with SNP2 on chromosomes 1 to 9, but again very different when SNP2 is on chromosomes 10 to 22!

In addition, I calculated the chi-square statistic for some combinations of SNPs and found that PLINK1 gave the right result. The other test I'm using is also very well correlated with the PLINK1 --fast-epistasis --set-by-all results. You will find figures attached.
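A minimal sketch of the kind of check described above, for anyone who wants to reproduce it: it reads two result tables and reports, per chromosome of the second SNP, the largest difference in the test statistic. It assumes both files are whitespace-delimited with a header row containing SNP1, CHR2, SNP2 and STAT columns; check the headers of your own .epi.co files before relying on it. The function names and the toy values in the demo are made up for illustration.

```python
import io
from collections import defaultdict


def read_epi(handle):
    """Map (SNP1, SNP2) -> (CHR2, STAT) from a whitespace-delimited table.

    Assumption: the header row names at least SNP1, CHR2, SNP2 and STAT.
    """
    header = handle.readline().split()
    idx = {name: i for i, name in enumerate(header)}
    results = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key = (fields[idx["SNP1"]], fields[idx["SNP2"]])
        results[key] = (fields[idx["CHR2"]], float(fields[idx["STAT"]]))
    return results


def max_abs_diff_by_chr2(res_a, res_b):
    """Largest |STAT_a - STAT_b| per CHR2, over pairs present in both tables."""
    worst = defaultdict(float)
    for key, (chr2, stat_a) in res_a.items():
        if key in res_b:
            diff = abs(stat_a - res_b[key][1])
            worst[chr2] = max(worst[chr2], diff)
    return dict(worst)


if __name__ == "__main__":
    # Toy tables with invented values, standing in for two real result files.
    table_a = io.StringIO(
        "CHR1 SNP1 CHR2 SNP2 STAT P\n"
        "1 rsA 2 rsB 4.0 0.04\n"
        "1 rsA 10 rsC 9.0 0.003\n"
    )
    table_b = io.StringIO(
        "CHR1 SNP1 CHR2 SNP2 STAT P\n"
        "1 rsA 2 rsB 4.0 0.04\n"
        "1 rsA 10 rsC 1.5 0.2\n"
    )
    print(max_abs_diff_by_chr2(read_epi(table_a), read_epi(table_b)))
```

Only pairs reported in both files are compared, so pairs filtered out of one run by its reporting threshold are silently ignored.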

I would like to be able to use the PLINK2 results because it is much faster and seems a bit more powerful than PLINK1.

 

Did you already know about this problem? Can you fix it?


Thank you!

plot_setbyall_plink1_vs_setbyall_plink2_no_ueki_allres_4109ASD_merged_nonSS_al_QCMAFset2+_oct_col_chr10_22.jpeg
plot_setbyall_plink1_vs_setbyall_plink2_no_ueki_allres_4109ASD_merged_nonSS_al_QCMAFset2+_oct_col_chr1_9.jpeg
plot_setbyall_plink1_vs_setbyall_plink2_no_ueki_allres_4109ASD_merged_nonSS_al_QCMAFset2+_oct.jpeg

Christopher Chang

Oct 13, 2015, 1:34:41 AM
to plink2-users
The last --fast-epistasis bugfix was on March 5th.  If your build is older than that, try rerunning with the latest build.

If the bug is still present, can you send me a dataset I can replicate it with?  Thanks!

Christopher Chang

Oct 13, 2015, 1:35:49 AM
to plink2-users
(also, if the bug is still present, send or post the .log file from your run)

alinoe lavillaureix

Oct 13, 2015, 3:53:16 PM
to plink2-users
Hello,
Thank you for your fast answer!
I just tested with the newer PLINK2 version (PLINK v1.90b3w 64-bit (3 Sep 2015)), because I had been using an older one (PLINK v1.90b2n 64-bit (2 Nov 2014)). Unfortunately, I get the same results, with the problem for SNP combinations where the second SNP is on chromosomes 10 to 22.
My dataset is pretty large; do you want a subset? Do you have a secure way for me to send it?

Here is the log file:
PLINK v1.90b3w 64-bit (3 Sep 2015)         https://www.cog-genomics.org/plink2
(C) 2005-2015 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to /mnt/speed/Alinoe/epi_co_merged_nonSS_al_updated_RasSNPlist/plink2_no_ueki/epi_co_set2_4109ASD_with2parents_topresults_merged_nonSS_al_071615_QCMAFset2+_plink2_beta3w_setbyall_no_ueki.log.
Options in effect:
  --bfile /mnt/speed/Alinoe/merged_nonSS_NEW_CHOP_added/pseudo-controls/pseudo_controls_set2_SNPS_hwe10e-6_difMAFdatasets_updated_MAFinf2p_all_MAFinf1pCauc
  --exclude /mnt/speed/Alinoe/merged_nonSS_NEW_CHOP_added/SNPs_to_remove_for_QC/SNPs_to_remove_hwe10e-6_difMAFdatasets_updated_MAFinf2p_all_MAFinf1pCauc_QCMAFset2+_072915
  --fast-epistasis case-only set-by-all no-ueki
  --gap 2000000000000
  --out /mnt/speed/Alinoe/epi_co_merged_nonSS_al_updated_RasSNPlist/plink2_no_ueki/epi_co_set2_4109ASD_with2parents_topresults_merged_nonSS_al_071615_QCMAFset2+_plink2_beta3w_setbyall_no_ueki
  --set /mnt/speed/Alinoe/merged_nonSS_NEW_CHOP_added/SNPs_Ras_10kb/RAS_20genes_10kb_SNPslist_merged_nonSS_al_072915.set

129150 MB RAM detected; reserving 64575 MB for main workspace.
4480265 variants loaded from .bim file.
8218 people (7034 males, 1184 females) loaded from .fam.
8218 phenotype values loaded from .fam.
--exclude: 4471807 variants remaining.
Using up to 15 threads (change this with --threads).
Before main variant filters, 8218 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.968659.
--set: 1 set defined.
4471807 variants and 8218 people pass filters and QC.
Among remaining phenotypes, 4109 are cases and 4109 are controls.
--fast-epistasis: Skipping 110299 monomorphic/non-autosomal sites.
--fast-epistasis no-ueki case-only to
/mnt/speed/Alinoe/epi_co_merged_nonSS_al_updated_RasSNPlist/plink2_no_ueki/epi_co_set2_4109ASD_with2parents_topresults_merged_nonSS_al_071615_QCMAFset2+_plink2_beta3w_setbyall_no_ueki.epi.co
... 0% [processing] writing]             38% [processing] writing]             76% [processing] writing]             done.
10505347209 valid tests performed, summary written to
/mnt/speed/Alinoe/epi_co_merged_nonSS_al_updated_RasSNPlist/plink2_no_ueki/epi_co_set2_4109ASD_with2parents_topresults_merged_nonSS_al_071615_QCMAFset2+_plink2_beta3w_setbyall_no_ueki.epi.co.summary
.

Thanks!

Christopher Chang

Oct 13, 2015, 4:42:22 PM
to plink2-users
If the bug still appears when you take a subset of the data, then yes, it would be better to just send me that subset.  (It looks like the bug only shows up when the dataset is too large to process in a single pass, so you'll probably need to add the --memory flag, which limits the amount of memory available to PLINK, to cause the bug to still appear.)
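As a rough sketch of what such a memory-capped rerun could look like, the helper below assembles the argument list using only the flags that appear in the log above plus --memory. The executable name `plink`, the helper name, the file prefixes and the 2000 MB cap are all placeholders chosen for illustration, not recommended values; the cap would need to be small enough that the subset no longer fits in a single pass.

```python
import shlex


def plink_epistasis_args(bfile, set_file, out_prefix, memory_mb):
    """Build an argument list for a memory-capped --fast-epistasis rerun.

    Assumption: the PLINK 1.9 binary is on PATH as "plink".
    memory_mb is the workspace limit passed to --memory, in megabytes.
    """
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    return [
        "plink",
        "--bfile", bfile,
        "--set", set_file,
        "--fast-epistasis", "case-only", "set-by-all", "no-ueki",
        "--memory", str(memory_mb),
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical file prefixes; print the command instead of running it.
    args = plink_epistasis_args("subset", "genes.set", "subset_rerun", 2000)
    print(shlex.join(args))
```

Pass the list to `subprocess.run(args, check=True)` to actually launch the run, then compare the output against PLINK1 on the same subset to see whether the discrepancy is still there.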

You can use Dropbox to give me access to the dataset.

Zacryan Li

Nov 27, 2018, 5:12:22 PM
to plink2-users
Hello! I have a related question. I tried to run --fast-epistasis between two sets of markers using the --set tmp1 option, where the tmp1 file specifies two sets of SNP markers. PLINK1 reports the fast-epistasis results between the two sets of markers, as expected, but PLINK2 (v1.90b3w 64-bit (3 Sep 2015)) just ran an exhaustive pairwise analysis, ignoring the set analysis I specified. Has anyone else seen the same problem?