Blat and in silico PCR Tools


Qasim Ayub

Feb 11, 2014, 10:08:42 AM
to gen...@soe.ucsc.edu

Hi

 

When using the in silico PCR or Blat tools, I use the GRCh37/hg19 human reference assembly. Is there an option to select the assembly used by the 1000 Genomes Project instead, to avoid issues with CNVs and segmental duplications?

 

ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz?

 

With best wishes

Qasim
---
Qasim Ayub
Team 19: Human Evolution
Room Number N3-12
Morgan Building
The Wellcome Trust Sanger Institute
Wellcome Trust Genome Campus
Hinxton, Cambridgeshire CB10 1SA
United Kingdom

E-mail: q...@sanger.ac.uk
Tel:      +44-(0)-1223-834244 Ext:7353

Fax:     +44-(0)-1223-494919

 

http://www.sanger.ac.uk/research/projects/humanevolution/

 


-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

Brian Lee

Feb 11, 2014, 3:30:01 PM
to Qasim Ayub, gen...@soe.ucsc.edu

Dear Qasim,

Thank you for using the UCSC Genome Browser and for your question about whether the phase2 1000 Genomes Project assembly is available for blat searches.

The most direct approach would be to download our blat utility and run it on the command line against the downloaded unzipped phase2 assembly. Please see this FAQ and related mailing list responses about using blat:
http://genome.ucsc.edu/FAQ/FAQblat#blat3
http://genome.ucsc.edu/goldenPath/help/blatSpec.html
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/standalone$20blat

You can obtain the appropriate compiled version of blat and related utilities here: http://hgdownload.soe.ucsc.edu/admin/exe/

Here is an example of usage and output, with hs37d5.fa as the assembly and exampleSearch.fa, built from the last five lines of the assembly, as the query. It results in two matches, one on the hs37d5 sequence and one on chromosome 7:
$ echo ">ex search from last five lines of assembly" > exampleSearch.fa ; tail -n 5 hs37d5.fa >> exampleSearch.fa
$ blat hs37d5.fa exampleSearch.fa output.psl
$ cat output.psl


match    mis-     rep.     N's    Q gap    Q gap    T gap    T gap    strand    Q            Q       Q        Q      T            T       T        T      block    blockSizes     qStarts     tStarts
         match    match           count    bases    count    bases              name         size    start    end    name         size    start    end    count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
243    0    0    0    0    0    0    0    +    ex    243    0    243    hs37d5    35477943    35477700    35477943    1    243,    0,    35477700,
32    2    0    0    0    0    0    0    +    ex    243    195    229    7    159138663    99405792    99405826    1    34,    195,    99405792,
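
If it helps to read the PSL columns, the output above can be condensed with a short awk filter. This is only a rough sketch and not part of blat: the function name psl_summary is made up for illustration, and it assumes the standard PSL column order shown in the header above and query names without spaces. It prints the query name, the target range as raw PSL coordinates (0-based start, end not included), and the percentage of the query covered by the hit.

```shell
# Hypothetical helper (not a UCSC utility): summarize hits in a PSL file.
# Header lines are skipped by requiring a purely numeric first column.
psl_summary() {
    awk '$1 ~ /^[0-9]+$/ {
        # PSL columns used here:
        #   10 = Q name, 11 = Q size, 12 = Q start, 13 = Q end,
        #   14 = T name, 16 = T start, 17 = T end
        cov = 100 * ($13 - $12) / $11
        printf "%s\t%s:%d-%d\t%.1f%%\n", $10, $14, $16, $17, cov
    }' "$1"
}
```

Calling psl_summary output.psl on the example should list the full-length hit on hs37d5 and the short partial hit on chromosome 7, which makes it easy to see which match covers the whole query.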

Again, please review the above resources and related mailing list archives about running blat standalone. You may want to convert the .fa file to a .2bit file with the faToTwoBit utility.
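
A minimal sketch of that conversion step, assuming faToTwoBit and blat are both on your PATH and the FASTA file name ends in .fa. The wrapper names run and blat_2bit and the DRY_RUN switch are made up for illustration; only the faToTwoBit and blat calls themselves are UCSC utilities.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Convert the FASTA to .2bit, then run blat against the .2bit file.
# Arguments: assembly FASTA, query FASTA, output PSL.
blat_2bit() {
    fasta=$1
    query=$2
    out=$3
    twobit="${fasta%.fa}.2bit"   # hs37d5.fa -> hs37d5.2bit
    run faToTwoBit "$fasta" "$twobit" &&
    run blat "$twobit" "$query" "$out"
}
```

For example, blat_2bit hs37d5.fa exampleSearch.fa output.psl would write hs37d5.2bit next to the FASTA and then search against it. Setting DRY_RUN=1 first lets you check the two commands before running them.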

A much more complicated approach would be to create an assembly hub and set up a blat server. You can read more at the wiki page and related mailing list archives below:
http://genomewiki.ucsc.edu/index.php/Assembly_Hubs
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/%22assembly$20hub%22
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/%22blat$20server%22$20setup

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group


