Analyzing a subset of the data


Tyto Lin

Aug 21, 2018, 4:31:45 AM
to plink2-users
Is it possible to analyze a subset of the data in PLINK 1.9?

I have 40 individuals in one VCF file, divided into four groups. Two groups are from the same location but under different environmental conditions; the other two are from a second location, also under different environmental conditions.
 
             location1    location2
condition1   group1       group3
condition2   group2       group4
 
I have already run --assoc with the commands below:


vcftools --gzvcf all_HC_SV_SNP.vcf.gz --plink --chrom-map scaffold_chrom-map.txt --out all_HC_SV_SNP&


plink --file all_HC_SV_SNP --aec --allow-no-sex --make-bed --noweb --out all_HC_SV_SNP.binary &



plink --bfile all_HC_SV_SNP.binary --make-pheno condition.pheno "condition1" --aec --assoc --reference-allele babblers_alt_alleles --allow-no-sex --adjust --noweb --out babblers_all_low &



The output shows the association with condition1, calculated as a chi-square test between (group1, group3) and (group2, group4).


But I also want to analyze the SNP associations with condition1 in location1 (groups 1 and 2) and in location2 (groups 3 and 4) separately, without generating separate VCF files for the two locations.


Is there any way to tell PLINK 1.9 to analyze only a subset of the samples in a dataset?


I have tried


plink --bfile all_HC_SV_SNP.binary --within location.txt "location1" --make-pheno condition.pheno "condition1" --aec --assoc --reference-allele babblers_alt_alleles --allow-no-sex --adjust --noweb --out location1_condition1 &

           or

plink --bfile all_HC_SV_SNP.binary --keep location1.txt --make-pheno condition.pheno "condition1" --aec --assoc --reference-allele babblers_alt_alleles --allow-no-sex --adjust --noweb --out babblers_east_high &

But both commands failed to analyze the subset of the data.

The first reports: Error: Invalid --within parameter sequence.

The second reports: Error: Line 1 of --keep file has fewer tokens than expected.

Thanks for your help.
 



Christopher Chang

Aug 21, 2018, 11:28:19 AM
to plink2-users
You are using both --within and --keep incorrectly.

Correct usage of --within to analyze just one location at a time, assuming location.txt is formatted correctly: "--within location.txt --keep-cluster-names location1".
Correct usage of --keep in PLINK 1.9: location1.txt must have TWO columns, one for the family IDs and one for the individual IDs.  (The --make-pheno file must also have two ID columns.)
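
For anyone following along, here is a minimal shell sketch of the two fixes described above. The sample IDs (ind01 and so on) are hypothetical, and the sketch assumes the family ID equals the individual ID, which should be checked against the .fam file. The plink commands are only printed, not run, so the sketch works without the dataset; the remaining options are copied from the original post.

```shell
#!/bin/sh
# Cluster file for --within: three columns (FID, IID, cluster name).
# The IDs below are hypothetical placeholders; use the ones in your .fam file.
cat > location.txt <<'EOF'
ind01 ind01 location1
ind02 ind02 location1
ind21 ind21 location2
ind22 ind22 location2
EOF

# Keep file for --keep: two columns (FID, IID), derived here from location.txt.
awk '$3 == "location1" { print $1, $2 }' location.txt > location1.txt

# Options shared by both alternatives, copied from the original post.
common='--bfile all_HC_SV_SNP.binary --make-pheno condition.pheno condition1 --aec --assoc --reference-allele babblers_alt_alleles --allow-no-sex --adjust'

# Alternative 1: load all clusters, then keep only the location1 cluster.
echo "plink $common --within location.txt --keep-cluster-names location1 --out location1_condition1"

# Alternative 2: keep only the samples listed in the two-column file.
echo "plink $common --keep location1.txt --out location1_condition1"
```

Replacing location1 with location2 in either alternative should give the analysis for the second location. Note that condition.pheno also needs the FID and IID columns ahead of the condition column.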

Tyto Lin

Aug 22, 2018, 4:53:13 AM
to plink2-users

Thanks for pointing out the incorrect usage.