Difficulties with --sample-diff command


Rosemarie Wilton

Mar 14, 2022, 1:04:54 PM
to plink2-users

Hello,
I'm trying to check sample concordance between replicate genotyped samples. I've tried to use the --sample-diff command in various ways (with input from a file, with samples given directly on the command line, with the samples in VCF or PLINK file format, etc.), but I can't get it to work. Usually I get a "sample not found" error. Thanks for any advice!

Here's an excerpt from my .fam file, which seems to be ok:
32 HG03767_R 0 0 1 -9
33 NA21108 0 0 2 -9
34 NA19042 0 0 2 -9
35 HG01798 0 0 2 -9
36 HG03598 0 0 2 -9
37 HG03767 0 0 1 -9
38 HG03600 0 0 1 -9

Command I'm using:
--bfile GDA002_IDs_STRANDcorr --sample-diff include-missing ids=HG03767_R HG03767

PLINK v2.00a3LM 64-bit Intel (28 Oct 2020)     www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to plink2.log.
Options in effect:
  --bfile GDA002_IDs_STRANDcorr
  --sample-diff include-missing ids=HG03767_R HG03767

Start time: Mon Mar 14 11:44:18 2022
1031745 MiB RAM detected; reserving 515872 MiB for main workspace.
Using up to 80 threads (change this with --threads).
40 samples (6 females, 33 males, 1 ambiguous; 40 founders) loaded from
GDA002_IDs_STRANDcorr.fam.
1903895 variants loaded from GDA002_IDs_STRANDcorr.bim.
Note: No phenotype data present.
Error: --sample-diff sample ID 'HG03767_R' not found.


Christopher Chang

Mar 14, 2022, 1:28:33 PM
to plink2-users
The issue is that you have two-part sample IDs (FID+IID).  Two options:
1. (recommended) Set all FIDs (in the first column of your .fam file) to "0".  Internally, plink2 treats single-part IDs as FID="0", IID=the ID you provided.
2. Provide two-part IDs to --sample-diff.  You'll need to use the id-delim= modifier to specify the delimiter between the two parts.
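
For option 1, one way to rewrite the FID column is a short awk pass over the .fam file. This is only a sketch: the function name zero_fids is made up for illustration, and the two demo rows are copied from the .fam excerpt earlier in the thread. Note that awk re-joins the fields with single spaces, so a tab-delimited .fam comes out space-delimited.

```shell
# Hypothetical helper (not part of plink2): set column 1 (FID) of a
# .fam file to "0" and print the result to stdout.
# Usage: zero_fids input.fam > output.fam
zero_fids() {
  awk '{ $1 = "0"; print }' "$1"
}

# Demo on two rows taken from the .fam excerpt above.
demo=$(mktemp)
printf '32 HG03767_R 0 0 1 -9\n37 HG03767 0 0 1 -9\n' > "$demo"
zero_fids "$demo"
rm -f "$demo"
```

With the FIDs set to "0" in the .fam that --bfile loads (keep a backup of the original), the single-part IDs in the original command, ids=HG03767_R HG03767, should then match.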