Dear Author,
I have generated a VCF file with lobSTR and used vcftools to get STR allele frequencies (see below):
CHROM POS N_ALLELES N_CHR {ALLELE:FREQ}
chr14 16035037 3 130 TTTTCTTTTCTTTTCTTTTTTTTTTTTCTTTTCTTTCTTTTCTTTTCTTT:0.938462 TTTTCTTTTCTTTTCTTTTTTTTTTTTCTTTTCTTTCTTTTCTTTTCTTTT:0.0461538 TTTTCTTTTCTTTTCTTTTTTTTTTTTCTTTTCTTTCTTTTCTTTTCTTTTT:0.0153846
chr14 16036677 2 2 TTTCTTTTCTTTTCTTTTCTTCTTTTCTCTCTTCTTTTTTCTTCTCTCTTCTTCCTTCCTTTCTTTCTTCTTTCTTTCTCTCACTCTCTTTCTTTCTTTCTCTCTTTCTTTCTTTCTCTTCTTTCTTTCTTTCCCTTTCTTTCTTTCTTTCTTTATTTTTTTCTTTCTTTC:0 TTTCTTTTCTTTTCTTTTCTTCTTTTCTCTCTTCTTTTTTCTTCTCTCTTCTTCCTTCCTTTCTTTCTTCTTTCTTTCTCTCACTCTCTTTCTTTCTTTCTCTCTTTCTTTCTTTCTCTTCTTTCTTTCTTTCCCTTTCTTTCTTTCTTTCTTTATTTTTTTCTTTC:1
chr14 16037656 6 212 TTCTTTCTTTCTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTC:0.34434 TTCTTTCTTTCTTCTTTCTTTCTTTC:0.45283 TTCTTTCTTTCTTCTTTCTTTCTTTCTTTC:0.0471698 TTCTTTCTTTCTTCTTTCTTTCTTTCTTTCTTTCTTTC:0.0660377 TTCTTTCTTTCTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCCTTT:0.0801887 TTCTTTCTTTCTTCTTTCTTTCTTTCTTTCTTTCTTTCTTTCCTTTCTTT:0.00943396
chr14 16037717 6 336 ATTGATTGATTGATTGATTGATTGATT:0.75 ATTGATTGATTGATTGATTGATT:0.0595238 ATTGATTGATTGATTGATTGATTGATTGAT:0.00297619 ATTGATTGATTGATTGATTGATTGATTGATT:0.125 ATTGATTGATTGATTGATTGATTGATTGATTGATT:0.0505952 ATTGATTGATTGATTGATTGATTGATTGATTGATTGATT:0.0119048
...
I see that I can use these frequencies in the equation from your paper. However, I wonder whether I could use your "analyze_heterozygosity_new.py" script in the str_catalog_supplemental_scripts directory to get heterozygosity per locus. How do I run it? Do you have other scripts for computing heterozygosity? I would like to find the most polymorphic loci.
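In case it helps clarify what I am after, here is a minimal sketch of the calculation I would otherwise do by hand from the vcftools output. It assumes the equation is the usual expected heterozygosity, H = 1 - sum(p_i^2), which may not be exactly the one in your paper. The function names and the min_chr filter are my own placeholders and are not part of lobSTR or vcftools.

```python
# Sketch only: expected heterozygosity per locus from vcftools allele-frequency rows.
# Assumption: H = 1 - sum(p_i^2); the paper's equation may differ.

def parse_frq_line(line):
    """Split one row: CHROM POS N_ALLELES N_CHR ALLELE:FREQ ..."""
    fields = line.split()  # handles tabs or spaces
    chrom, pos, n_chr = fields[0], int(fields[1]), int(fields[3])
    # Take the text after the last ':' of each ALLELE:FREQ pair.
    freqs = [float(f.rsplit(":", 1)[1]) for f in fields[4:]]
    return chrom, pos, n_chr, freqs


def expected_heterozygosity(freqs):
    """Return 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def rank_loci(lines, min_chr=0):
    """Return (chrom, pos, n_chr, H) tuples, most polymorphic first.

    min_chr is a hypothetical filter that drops loci genotyped on
    fewer than min_chr chromosomes.
    """
    results = []
    for line in lines:
        fields = line.split()
        # Skip the header, blank lines, and anything too short to be a data row.
        if len(fields) < 5 or fields[0] == "CHROM":
            continue
        chrom, pos, n_chr, freqs = parse_frq_line(line)
        if n_chr < min_chr:
            continue
        results.append((chrom, pos, n_chr, expected_heterozygosity(freqs)))
    return sorted(results, key=lambda r: r[3], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python rank_het.py my_output.frq
    with open(sys.argv[1]) as handle:
        for chrom, pos, n_chr, het in rank_loci(handle, min_chr=20):
            print(chrom, pos, n_chr, round(het, 4), sep="\t")
```

A locus such as chr14 16036677 above, with only 2 chromosomes called, would come out as H = 0 under this formula, which is why I would filter on N_CHR before ranking.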
Thank you in advance, and thanks for a nice tool.
James