Hi Ben,
Sorry for the delay in responding to your question.
Part 1:
You can use the Table Browser:
http://genome.ucsc.edu/cgi-bin/hgTables, to retrieve coordinates for exons annotated on rheMac2. First select a gene track/table, then choose the "BED - browser extensible data" output format and click "get output". On the following page, choose to create one BED record per exon (the "Exons plus" option) to limit the coordinates to only exons. Keep in mind that our table coordinates are zero-based and half-open (the start is counted from 0 and the end position is not included); see these links for more info:
http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1
http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms
BED format is described here:
http://genome.ucsc.edu/FAQ/FAQformat.html#format1
The Table Browser BED output includes the name of the gene and an exon number in the fourth field.
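If it helps, here is a rough sketch (not an official UCSC tool) of how the BED exon output could be turned into a gene/exon/chromosome/start/stop table like the one you describe. The function name bed_exons_to_table is made up for illustration, and the sketch assumes the fourth field looks like "<gene>_exon_<number>_..."; please check that against your own output before relying on it. It also converts the zero-based, half-open BED coordinates to one-based, inclusive coordinates, which is the convention VCF positions use.

```python
def bed_exons_to_table(lines):
    """Convert BED exon lines into (gene, exon, chrom, start, stop) rows.

    Assumption: the BED name field (4th column) has the form
    "<gene>_exon_<number>_...". If it does not, the whole name is kept
    as the gene and the exon field is left empty.
    """
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and browser/track/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        chrom, start, end, name = fields[0], int(fields[1]), int(fields[2]), fields[3]
        gene, sep, rest = name.partition("_exon_")
        exon = rest.split("_")[0] if sep else ""
        # BED is zero-based, half-open: add 1 to the start to get a
        # one-based, inclusive interval; the end stays the same.
        rows.append((gene, exon, chrom, start + 1, end))
    return rows


if __name__ == "__main__":
    example = ["chr1\t20201\t20602\tNM_000001_exon_0_0_chr1_20202_f\t0\t+"]
    for row in bed_exons_to_table(example):
        print("\t".join(str(x) for x in row))
```

The example line is invented; the exon number is passed through exactly as it appears in the name field, so check whether your output numbers exons from 0 or from 1.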
To help you select the gene track you wish to use, you can read a description of each track by clicking on the blue track names on the main Genome Browser page (
http://genome.ucsc.edu/cgi-bin/hgTracks). The tracks in the Genes and Gene Prediction Tracks group vary by source of the data, frequency of updates, and level of manual annotation. The "Other RefSeq" and "TransMap..." tracks show sequence alignments from other species to the rheMac2 genome.
There is not a track in the rheMac2 genome browser that explicitly annotates conserved regions on rhesus as there is on some of our other assemblies such as human and mouse, but you may want to at least look at the Chain/Net tracks in the Comparative Genomics track group to help determine whether a region is conserved.
Part 2:
Check out the Human Chain/Net track on rheMac2. Here is a link to the track description:
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=rheMac2&g=chainNetHg19.
Downloads for the Chain/Net tracks are available by going to our downloads server:
http://hgdownload.cse.ucsc.edu/downloads.html and clicking "Rhesus", then scrolling to "Pairwise Alignments". The rhesus/human downloads are here:
http://hgdownload.cse.ucsc.edu/goldenPath/rheMac2/vsHg19/. In addition to the track description, you may find this page helpful in understanding chains and nets:
http://genomewiki.ucsc.edu/index.php/Chains_Nets.
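To get from the alignment downloads to a per-base table like the one you describe, something along these lines might work. This is only a sketch under assumptions: it assumes you are working from alignments in axt format (a nine-field header line, then the rheMac2 sequence line, then the hg19 sequence line, then a blank line), which is not covered above, so please confirm what the download directory actually provides and read its README. The function name axt_per_base is made up for illustration.

```python
def axt_per_base(lines):
    """Yield one row per alignment column from axt-format text.

    Each row is (t_chrom, t_pos, t_base, q_chrom, q_pos, q_base, q_strand),
    where "t" is the primary assembly (rheMac2 here) and "q" is the
    aligning assembly (hg19 here). Positions are one-based, as in axt.
    A position is None where that sequence has a gap ("-").
    """
    it = (ln.rstrip("\n") for ln in lines)
    for header in it:
        if not header.strip() or header.startswith("#"):
            continue
        f = header.split()
        # Header: number tName tStart tEnd qName qStart qEnd qStrand score
        t_chrom, t_pos = f[1], int(f[2])
        q_chrom, q_pos, q_strand = f[4], int(f[5]), f[7]
        t_seq = next(it, "")
        q_seq = next(it, "")
        for t_base, q_base in zip(t_seq, q_seq):
            yield (
                t_chrom, t_pos if t_base != "-" else None, t_base,
                q_chrom, q_pos if q_base != "-" else None, q_base,
                q_strand,
            )
            # Advance a position counter only when a real base was consumed.
            if t_base != "-":
                t_pos += 1
            if q_base != "-":
                q_pos += 1


if __name__ == "__main__":
    block = ["0 chr1 1 2 chr1 1 2 + 100", "AT", "AG", ""]
    for row in axt_per_base(block):
        print("\t".join(str(x) for x in row))
```

The example block is invented. One caveat: when the strand field is "-", the aligning-assembly coordinates in axt refer to the reverse-complemented chromosome, so they would need to be converted using the chromosome size before being compared with forward-strand hg19 positions. The sketch reports the strand but does not do that conversion.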
I hope this information is helpful. If you have further questions, please reply to
gen...@soe.ucsc.edu.
--
Brooke Rhead
UCSC Genome Bioinformatics Group
On 12/21/12 9:10 AM, Ben Evans wrote:
Hello,
I have two questions:
(1) I am interested in filtering a vcf file to include (or not include)
data near genes and conserved regions. Can someone please tell me a
simple way to download the RheMac2 coordinates for all exons and
conserved regions? Basically I need information like this:
Gene   Exon   chromosome  start  stop
Gene1  exon1  1           20202  20602
Gene1  exon2  1           21202  21602
Gene2  exon1  1           24202  24602
etc.
(2) I'd like to access the alignment information for rheMac2 and humans
(or another primate). I'd like the following information:
RheMac2_chromosome  RheMac2_position  RheMac2_sequence  Human_hg19_chromosome  Human_hg19_position  Human_hg19_sequence
1                   1                 A                 1                      1                    A
1                   2                 T                 1                      2                    G
etc.
Can someone please provide me with some simple directions to download or
generate this information?
Thanks,
Ben
Ben Evans
Biology Department
McMaster University
Life Sciences Building room 328
1280 Main Street West
Hamilton, Ontario L8S4K1
Canada
phone (office/lab) : 905-525-9140 x 26973/27261
fax: 905-522-6066
Lab Website: http://www.biology.mcmaster.ca/faculty/evans/EvansLab/