dupcheck testpairs option

Mark

Apr 22, 2014, 7:11:27 AM
to glu-...@googlegroups.com
Hi 

I am trying to use dupcheck with the --testpairs=FILE option. 

I have set up a data file with two samples I know are duplicates, and dupcheck runs fine. When I add the option above, the command still runs, but no longer finds the duplicates. 
I was wondering what format and layout the file containing the pairs needs to have. 
So far I have tried a .txt file with two columns, tab separated, comma separated, with and without headers. 

Thanks
Mark

Kevin Jacobs <jacobs@bioinformed.com>

Apr 23, 2014, 9:59:33 AM
to glu-...@googlegroups.com
Hi Mark,
  
The pair file should be tab delimited with two sample names per line with no headers.  Please let me know if you're still having problems.
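
For illustration, here is a minimal Python sketch of producing and sanity-checking a file with that layout (one pair per line, tab delimited, no header row). The file name `testpairs.txt`, the sample names, and both helper functions are hypothetical and are not part of GLU; the names in the file would need to match the sample names in the genotype data exactly.

```python
import csv


def write_pair_file(path, pairs):
    """Write one tab-separated pair of sample names per line, with no header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for first, second in pairs:
            writer.writerow([first, second])


def read_pair_file(path):
    """Read the pairs back, rejecting any line without exactly two non-empty names."""
    pairs = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_number, row in enumerate(reader, start=1):
            if len(row) != 2 or not all(name.strip() for name in row):
                raise ValueError(
                    f"line {line_number}: expected two tab-separated sample names"
                )
            pairs.append((row[0], row[1]))
    return pairs


# Hypothetical sample names; replace them with the names used in the data file.
write_pair_file("testpairs.txt", [("SAMPLE_A", "SAMPLE_B")])
```

Reading the file back with `read_pair_file("testpairs.txt")` is a quick way to catch a stray header row, a comma where a tab should be, or a blank line before passing the file to `--testpairs`.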

-Kevin



Mark

Apr 24, 2014, 5:46:43 AM
to glu-...@googlegroups.com, jac...@bioinformed.com
Hi 

I did some more tests, and it seems that the --testpairs option only works when the SNP data file is in sbat or sdat format
(the user manual says that sbat/sdat is the preferred input for qc.dupcheck, but not the only accepted one).

So, for example,

glu qc.dupcheck input.X -o dupcheck_out.txt

works fine regardless of the input format (X=sbat/sdat/lbat/ldat), but

glu qc.dupcheck input.X --testpairs=FILE -o dupcheck_out.txt

only works for sdat/sbat formats. 

Sorry if that's already in the help pages somewhere. 
Mark