Individuals missing after PLINK1.9 merge

2181...@zju.edu.cn

Jul 14, 2021, 9:01:06 AM
to plink2-users
Dear developers,
  I am currently working with four independent datasets of imputed genotypes, and I am trying to merge the SNPs into a single genotype matrix. However, I found that one individual was missing after running the PLINK 1.9 --merge-list command, and the log gives no relevant message about it. Any suggestions for this situation?

  Here is the log.
############################################
PLINK v1.90b6.9 64-bit (4 Mar 2019)            www.cog-genomics.org/plink/1.9/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to MergeNoFlip.log.
Options in effect:
  --merge-list MergeNoflip.list
  --out MergeNoFlip

773821 MB RAM detected; reserving 386910 MB for main workspace.
Allocated 122420 MB successfully, after larger attempt(s) failed.
Warning: Variants '1:62777:A:T' and '1:62777' have the same position.
Warning: Variants '1:95440:C:G' and '1:95440' have the same position.
Warning: Variants '1:173052:C:A' and '1:173052' have the same position.
35128 more same-position warnings: see log file.
Performing single-pass merge (6380 people, 6261375 variants).
Merged fileset written to MergeNoFlip.bed + MergeNoFlip.bim + MergeNoFlip.fam .
############################################

The four datasets in the merge list contain 6,381 samples in total, but only 6,380 samples remain after merging.

  I would appreciate any help.

Christopher Chang

Jul 14, 2021, 9:42:43 AM
to plink2-users
Have you double- and triple-checked that there wasn't a single duplicate sample ID between the four datasets?  If so, can you post or send me a set of files (ok to reduce the number of variants to 1 per dataset) that let me replicate what you're seeing?
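
One way to run the duplicate check suggested above is sketched below in Python. It assumes the standard .fam layout (whitespace-delimited, family ID in column 1, within-family ID in column 2) and treats a sample as the FID/IID pair. The function name `find_duplicate_samples` and the toy IDs are made up for illustration; they are not part of PLINK.

```python
from collections import defaultdict


def find_duplicate_samples(fam_tables):
    """Return {(FID, IID): [dataset names]} for IDs that occur on more than one .fam row.

    fam_tables maps a dataset name to the text content of its .fam file.
    """
    seen = defaultdict(list)
    for name, text in fam_tables.items():
        for line in text.splitlines():
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            seen[(fields[0], fields[1])].append(name)
    return {key: names for key, names in seen.items() if len(names) > 1}


# Toy input with hypothetical IDs; for real data, read each file,
# e.g. open("setA.fam").read(), and pass the text in the same way.
example = {
    "setA": "F1 S1 0 0 1 -9\nF2 S2 0 0 2 -9\n",
    "setB": "F3 S3 0 0 1 -9\nF2 S2 0 0 1 -9\n",
}
duplicates = find_duplicate_samples(example)
total_rows = sum(len(text.splitlines()) for text in example.values())
print(duplicates)
print(total_rows, "rows in total,", total_rows - len(duplicates), "distinct sample IDs")
```

If the number of distinct IDs equals the sample count PLINK reports for the merge (6380 here) while the row total is one higher, a shared FID/IID pair is the likely explanation.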

2181...@zju.edu.cn

Jul 14, 2021, 11:42:01 PM
to plink2-users
Thank you for the quick reply.
I checked the sample IDs in my datasets, and indeed one individual has a duplicated ID. It seems that PLINK simply drops one of the samples in a duplicated pair? I checked their relatedness score and confirmed that they are not genetically identical. So I suppose the right approach is to give them distinct IDs and merge again?

Thanks again.
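
For the renaming step, a minimal Python sketch follows. It builds the four-column text that PLINK 1.9's --update-ids expects (old FID, old IID, new FID, new IID) for the clashing sample in one dataset. The helper name `make_update_ids_lines`, the suffix, and the IDs are hypothetical, and appending a suffix to the IID is only one possible way to make the pair unique.

```python
def make_update_ids_lines(fam_text, clashing_ids, suffix):
    """Build --update-ids lines for every .fam row whose (FID, IID) is in clashing_ids.

    Each output line is: old FID, old IID, new FID, new IID.
    The new IID is the old IID with `suffix` appended (an arbitrary choice).
    """
    lines = []
    for row in fam_text.splitlines():
        fields = row.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        fid, iid = fields[0], fields[1]
        if (fid, iid) in clashing_ids:
            lines.append(f"{fid} {iid} {fid} {iid}{suffix}")
    return lines


# Toy .fam content with hypothetical IDs; ("F2", "S2") is the clashing pair.
fam_b = "F3 S3 0 0 1 -9\nF2 S2 0 0 1 -9\n"
update_lines = make_update_ids_lines(fam_b, {("F2", "S2")}, "_setB")
print("\n".join(update_lines))
```

The printed lines can be saved to a file (say rename.txt, a placeholder name) and applied to the affected dataset with something like `plink --bfile setB --update-ids rename.txt --make-bed --out setB_renamed`, after which the merge list would point at the renamed fileset.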