Extracting CODIS loci


Anastassiya Zidkova

Dec 5, 2016, 5:12:37 AM
to lobSTR user group
Hi,
I just started using lobSTR for STR extraction from VCF files, and I am amazed by your work, guys!

I wanted to post a modification of your best-practice pipeline for CODIS loci extraction, since I found that the original pipeline does not give the expected results.

Here is my modified pipe:
intersectBed -a "$VCF_FILE" -b "$HOME/Software/lobSTR-bin-Linux-x86_64-4.0.6/hg19_v3.0.2/lobSTR_codis_hg19.bed" -wa -wb \
| cut -f 1,2,10,14- \
| sed 's/:/\t/g' \
| cut -f 1,2,3,9,15- \
| sed 's/\//\t/g' \
| awk '{print $0 "\t" ($5+($7*$8))/$7 "\t" ($6+($7*$8))/$7}'
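
A small sketch of why the `g` flag matters on the sed steps: without it, sed replaces only the first separator on each line, so the second allele stays glued to its neighbour. The sample string is put together from the TH01 genotype (1/1) and bp differences (11/11) in the results of this post, and the sketch assumes GNU sed, which understands `\t` in the replacement.

```shell
# Genotype and bp-difference fields, each still written as allele1/allele2.
row=$(printf '1/1\t11/11')

# Without g: only the first "/" on the line becomes a tab.
first_only=$(printf '%s\n' "$row" | sed 's/\//\t/')

# With g: every "/" becomes a tab, so each allele gets its own column.
all_slashes=$(printf '%s\n' "$row" | sed 's/\//\t/g')

# Count the tab-separated columns in each variant.
printf '%s\n' "$first_only"  | awk -F'\t' '{ print NF " columns without g" }'
printf '%s\n' "$all_slashes" | awk -F'\t' '{ print NF " columns with g" }'
```

The column numbers used later in the awk step ($5 to $8) only line up when every separator has been split.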

The columns I selected for the final table are: Chromosome, Position, Genotype, Genotype given as bp difference from the reference, Repeat period, Reference allele, Repeat name.
I slightly corrected the sed commands by adding `g` at the end so that every separator on a line is replaced; these commands split the Genotype and bp-difference columns into one column per allele.
I then computed the number of repeats with the following formula: (Genotype given as bp difference from the reference + (Repeat period * Reference allele)) / Repeat period.
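
The same arithmetic can be tried on its own, outside the pipeline. This is only a sketch that follows the awk step above, which multiplies the repeat period by the reference allele before adding the bp difference; the function name `repeat_count` is my own, and the input values are taken from the results in this post.

```shell
# repeat_count BP_DIFF PERIOD REF_ALLELE
# Prints (bp_diff + period * ref_allele) / period, i.e. the allele length in repeat units.
repeat_count() {
    awk -v d="$1" -v p="$2" -v r="$3" 'BEGIN { print (d + p * r) / p }'
}

repeat_count 11 4 7     # TH01:            (11 + 4*7)  / 4 = 9.75
repeat_count -8 4 11    # D13S317, first:  (-8 + 4*11) / 4 = 9
repeat_count 45 5 5     # PentaE, second:  (45 + 5*5)  / 5 = 14
```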

Here are the results that I get:
chr11 2192318 1 1 11 11 4 7 TH01 9.75 9.75
chr13 82722160 1 0 -8 0 4 11 D13S317 9 11
chr15 97374245 1 2 5 45 5 5 PentaE 6 14
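
One thing to keep in mind when reading the TH01 row: 9.75 is a decimal fraction of a repeat (39 bp divided by a 4 bp period). As far as I know, forensic allele names usually write a partial repeat as the number of full repeats, a dot, and the number of leftover bases, which would make this allele 9.3. A rough sketch of that conversion, taking the same three inputs as the formula; the function name `allele_label` is my own and this step is not part of the pipeline above.

```shell
# allele_label BP_DIFF PERIOD REF_ALLELE
# Prints the allele as "<full repeats>.<leftover bases>", or just the
# repeat count when the length is a whole number of repeats.
allele_label() {
    awk -v d="$1" -v p="$2" -v r="$3" 'BEGIN {
        total = d + p * r          # allele length in bp
        full  = int(total / p)     # complete repeat units
        extra = total % p          # bases left over in a partial repeat
        print (extra ? full "." extra : full)
    }'
}

allele_label 11 4 7     # TH01: 39 bp = 9 full repeats + 3 bases
allele_label -8 4 11    # D13S317, first allele: 36 bp = 9 full repeats
```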




Cheers,
Anastassiya

nageen....@iiu.edu.pk

Jul 26, 2018, 12:37:50 AM
to lobSTR user group
Thanks for posting this command. The resulting table has 11 columns. Could you please tell me the name of each column?

Anastassiya Zidkova

Aug 20, 2018, 8:26:15 AM
to lobstr-u...@googlegroups.com
Hello,
the column names are as follows: Chromosome, Position, Genotype x2 (one for
each chromosome copy), Genotype given as bp difference from the reference x2
(one for each chromosome copy), Repeat period, Reference allele, Repeat name,
Number of repeats x2 (one for each chromosome copy).
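
If it is useful, a header line can be put on top of the table so the 11 columns are labelled in the file itself. This is just a sketch: the short labels are my own abbreviations of the names listed above, `add_header` is a made-up helper name, and the example row is the TH01 line from my first post.

```shell
# add_header: print a tab-separated header, then pass the table through unchanged.
# Labels are hypothetical abbreviations: a1/a2 stand for the two chromosome copies.
add_header() {
    printf 'chrom\tpos\tgt_a1\tgt_a2\tbp_diff_a1\tbp_diff_a2\tperiod\tref_allele\tname\trepeats_a1\trepeats_a2\n'
    cat
}

# Usage: pipe the output of the extraction pipeline into add_header.
printf 'chr11\t2192318\t1\t1\t11\t11\t4\t7\tTH01\t9.75\t9.75\n' | add_header
```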

Hope it helps,
Anastassiya


tfr...@gmail.com

May 2, 2019, 1:05:51 PM
to lobSTR user group
Hello Anastassiya! I need to extract the genotypes of the autosomal STRs from the 1000 Genomes Project. Can you help me? I am new to this. Please!

