Hi
While using the in silico PCR or BLAT tool I use the GRCh37/hg19 human reference assembly. Is there an option to select the assembly used by the 1000 Genomes Project, to avoid issues with CNVs and segmental duplications?
With best wishes
Qasim
---
Qasim Ayub
Team 19: Human Evolution
Room Number N3-12
Morgan Building
The Wellcome Trust Sanger Institute
Wellcome Trust Genome Campus
Hinxton, Cambridgeshire CB10 1SA
United Kingdom
E-mail: q...@sanger.ac.uk
Tel: +44-(0)-1223-834244 Ext:7353
Fax: +44-(0)-1223-494919
http://www.sanger.ac.uk/research/projects/humanevolution/
Dear Qasim,
Thank you for using the UCSC Genome Browser and for your question about whether a phase 2 1000 Genomes Project assembly is available for blat searches.
The most direct approach would be to download our blat utility and run it on the command line against the downloaded, unzipped phase 2 assembly. Please see this FAQ and the related mailing list responses about using blat:
http://genome.ucsc.edu/FAQ/FAQblat#blat3
http://genome.ucsc.edu/goldenPath/help/blatSpec.html
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/standalone$20blat
You can obtain the appropriate compiled version of blat and related utilities here: http://hgdownload.soe.ucsc.edu/admin/exe/
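As a rough sketch, fetching the tools from that directory might look like the following. The base URL is the one given above, but the platform subdirectory name (linux.x86_64) and the paths below it (blat/blat, faToTwoBit) are assumptions about the directory layout, so please check them against the listing for your machine. fetch_ucsc_tool is only an illustrative helper name.

```shell
# Base URL taken from the message above.
UCSC_EXE_BASE="http://hgdownload.soe.ucsc.edu/admin/exe"

# fetch_ucsc_tool REL_PATH [PLATFORM]  (hypothetical helper)
#   REL_PATH: path below the platform directory, e.g. "blat/blat" (assumed layout)
#   PLATFORM: platform directory name, default "linux.x86_64" (assumed name)
fetch_ucsc_tool() {
    local rel="$1"
    local platform="${2:-linux.x86_64}"
    local name="${rel##*/}"          # file name without the directory part
    # Download into the current directory, then mark the binary executable.
    curl -fsSL -o "$name" "$UCSC_EXE_BASE/$platform/$rel" && chmod +x "$name"
}
```

For example, "fetch_ucsc_tool blat/blat" followed by "fetch_ucsc_tool faToTwoBit" would place both binaries in the current directory, provided the assumed paths match the listing.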
Here is an example that uses hs37d5.fa as the assembly and exampleSearch.fa, built from the last five lines of the assembly, as the query. It yields two matches: the full 243-base query on the hs37d5 sequence and a partial 34-base match on chromosome 7:
$ echo ">ex search from last five lines of assembly" > exampleSearch.fa
$ tail -n 5 hs37d5.fa >> exampleSearch.fa
$ blat hs37d5.fa exampleSearch.fa output.psl
$ cat output.psl
match  mis-   rep.   N's  Q gap  Q gap  T gap  T gap  strand  Q     Q     Q      Q    T       T          T         T         block  blockSizes  qStarts  tStarts
       match  match       count  bases  count  bases          name  size  start  end  name    size       start     end       count
------------------------------------------------------------------------------------------------------------------------------------------------------------------
243    0      0      0    0      0      0      0      +       ex    243   0      243  hs37d5  35477943   35477700  35477943  1      243,        0,       35477700,
32     2      0      0    0      0      0      0      +       ex    243   195    229  7       159138663  99405792  99405826  1      34,         195,     99405792,
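If it helps when reading larger result files, the PSL output above can be condensed to one line per hit. This is only a sketch: summarize_psl is a hypothetical helper name, and it assumes the file is the tab-separated PSL that blat writes, with the usual 21 columns. PSL start positions are 0-based, so the helper adds 1 to the target start to report a 1-based range.

```shell
# summarize_psl FILE.psl  (hypothetical helper)
# Prints: query name, target:start-end (1-based), strand, matches/query size.
summarize_psl() {
    awk -F '\t' '
        # Alignment rows start with a number; header lines do not.
        $1 ~ /^[0-9]+$/ && NF >= 21 {
            printf "%s\t%s:%d-%d\t%s\t%d/%d\n", $10, $14, $16 + 1, $17, $9, $1, $11
        }' "$1"
}
```

Running "summarize_psl output.psl" on the two alignments shown above would report hs37d5:35477701-35477943 with 243/243 matching bases and 7:99405793-99405826 with 32/243.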
Again, please review the above resources and the related mailing list archives about running blat as a standalone tool. You may want to convert the .fa to a .2bit file with the faToTwoBit utility, since blat also accepts .2bit files as the database.
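A minimal sketch of that conversion step, assuming faToTwoBit and blat are on your PATH and using the file names from the example above (blat_via_2bit is a hypothetical helper name, not part of the UCSC tools):

```shell
# blat_via_2bit FASTA QUERY OUT.psl  (hypothetical helper)
blat_via_2bit() {
    local fasta="$1"
    local query="$2"
    local out="$3"
    local twobit="${fasta%.fa}.2bit"   # hs37d5.fa -> hs37d5.2bit
    # Convert only when the .2bit file does not exist yet.
    if [ ! -e "$twobit" ]; then
        faToTwoBit "$fasta" "$twobit" || return 1
    fi
    # blat takes the database, the query and the output file, as in the example above.
    blat "$twobit" "$query" "$out"
}
```

For example, "blat_via_2bit hs37d5.fa exampleSearch.fa output.psl" converts the assembly on the first call and reuses hs37d5.2bit on later calls.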
A much more complicated approach would be to create an assembly hub and set up a blat server. You can read more at the wiki page and related mailing list archives below:
http://genomewiki.ucsc.edu/index.php/Assembly_Hubs
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/%22assembly$20hub%22
https://groups.google.com/a/soe.ucsc.edu/forum/#!searchin/genome/%22blat$20server%22$20setup
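To give a feel for the blat server part only, here is a hedged sketch using gfServer and gfClient, the server and client programs that accompany blat. The host (localhost), the port (17779) and the helper names are arbitrary choices for illustration, and connecting the server to an assembly hub is not shown; the resources above cover that.

```shell
# start_gfserver TWOBIT [PORT]  (hypothetical helper; port 17779 is an arbitrary choice)
start_gfserver() {
    local twobit="$1"
    local port="${2:-17779}"
    # Run the server in the background, keeping its messages in a log file.
    gfServer start localhost "$port" "$twobit" > gfServer.log 2>&1 &
    # Poll once per second until the server answers a status request;
    # give up after 600 attempts, since indexing a large genome takes a while.
    local tries=0
    until gfServer status localhost "$port" > /dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge 600 ] && return 1
        sleep 1
    done
}

# query_gfserver SEQ_DIR QUERY OUT.psl [PORT]  (hypothetical helper)
#   SEQ_DIR is the directory that holds the .2bit file the server was started with.
query_gfserver() {
    local port="${4:-17779}"
    gfClient localhost "$port" "$1" "$2" "$3"
}
```

With hs37d5.2bit in the current directory, "start_gfserver hs37d5.2bit" followed by "query_gfserver . exampleSearch.fa output.psl" would write PSL output in the same format as the standalone run. "gfServer stop localhost 17779" shuts the server down.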
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group