Hi all. I'm trying to retrieve the 3'UTR coordinates for all human transcripts (hg19). I followed these steps:
Select species and assembly; Group: Genes and Gene predictions; Track: GENCODE Genes V19; Output Format: BED; Output file: myutr.bed
-> Get Output
-> Create one BED record per: 3' UTR Exons
-> Get BED
but instead, the output BED file contains one record per exon even if they are annotated as 3'UTRs, as in this example:
chr1 11868 12227 ENST00000456328.2_utr3_0_0_chr1_11869_f 0 +
chr1 12612 12721 ENST00000456328.2_utr3_1_0_chr1_12613_f 0 +
chr1 13220 14409 ENST00000456328.2_utr3_2_0_chr1_13221_f 0 +
Is there a way to get the Ensembl Transcript ID, 3'UTR start and 3'UTR end from the Table Browser? Thank you in advance,
Danny
Hi Danny,
Thank you for your question about obtaining the Ensembl Transcript ID, 3'UTR start
and 3'UTR end from the Table Browser. Your Table Browser query includes non-coding
genes in the output, and since non-coding genes are by definition untranslated, all exons
of non-coding genes will be returned by your query, in addition to normal 3' UTRs.
If you instead filter your results to exclude all gene types except coding, you will
retrieve the 3' UTR positions of protein coding genes.
Here is an example session that illustrates a Table Browser query returning all 3' UTR
exons vs coding-only 3' UTR exons:
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=chmalee&hgS_otherUserSessionName=hg19CodingVsNonCodingUTRs
Notice how the top track, which contains all 3' UTR exons, has items corresponding not
only to the 3' UTR exons of the SOD1 gene, but also to every exon of the green
non-coding genes. In contrast, the bottom track comes from a Table Browser query
filtered to include only coding genes, and its items correspond exactly to the
3' UTR exons of the SOD1 gene.
To obtain the 3' UTR positions of only coding genes, follow the steps below:
1. Navigate to the Table Browser (http://genome.ucsc.edu/cgi-bin/hgTables)
2. Select the hg19 assembly and the Genes and Gene Predictions group
3. Select the GENCODE Genes V19 track and choose the Basic table (should be the default)
4. If you have a particular region you are interested in, select that region using
the define regions box, otherwise choose "genome"
5. Click the button "create" next to "filter"
6. Allow filtering from the linked table wgEncodeGencodeAttrsV19
7. In the transcriptClass field under the "hg19.wgEncodeGencodeAttrsV19 based filters" section,
enter "coding" into the text box, so "transcriptClass does match coding", then click "submit"
8. Under output format choose "BED - browser extensible data", enter a name for your file, and
click "get output"
9. On the "Output wgEncodeGencodeBasicV19 as BED" page, choose "3' UTR Exons", and click "get BED"
to download your file
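If you need a single start and end per transcript rather than one record per 3' UTR exon, the downloaded BED file can be collapsed afterwards. The following is only a minimal Python sketch, not part of the Table Browser: the function name collapse_utr3 is hypothetical, and it assumes the name column follows the pattern seen in the example records above (transcript ID, then "_utr3_", then an exon index). The three lines from the original question are reused purely to illustrate the format. Note that the resulting span runs from the first 3' UTR exon start to the last 3' UTR exon end, so it includes any introns in between, and that BED start coordinates are 0-based.

```python
import re

# Assumed name layout, taken from the example records in this thread:
#   <transcriptId>_utr3_<exonIndex>_0_<chrom>_<start+1>_<f|r>
NAME_RE = re.compile(r"^(?P<tx>.+?)_utr3_\d+_")


def collapse_utr3(bed_lines):
    """Collapse per-exon 3' UTR BED records into one span per transcript.

    Returns a list of (transcript_id, chrom, start, end, strand) tuples,
    in order of first appearance. Coordinates stay in BED convention
    (0-based start, end exclusive).
    """
    spans = {}
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, name, _score, strand = line.split()[:6]
        match = NAME_RE.match(name)
        if match is None:
            continue  # name does not follow the assumed pattern
        key = (match.group("tx"), chrom, strand)
        start, end = int(start), int(end)
        if key in spans:
            old_start, old_end = spans[key]
            spans[key] = (min(old_start, start), max(old_end, end))
        else:
            spans[key] = (start, end)
    return [(tx, chrom, s, e, strand)
            for (tx, chrom, strand), (s, e) in spans.items()]


if __name__ == "__main__":
    example = [
        "chr1 11868 12227 ENST00000456328.2_utr3_0_0_chr1_11869_f 0 +",
        "chr1 12612 12721 ENST00000456328.2_utr3_1_0_chr1_12613_f 0 +",
        "chr1 13220 14409 ENST00000456328.2_utr3_2_0_chr1_13221_f 0 +",
    ]
    for row in collapse_utr3(example):
        print("\t".join(str(field) for field in row))
```

To run it on a real download, pass an open file handle instead of the example list, e.g. collapse_utr3(open("myutr.bed")).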
Thank you again for your inquiry and using the UCSC Genome Browser. If you have any
further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address
are archived on a publicly-accessible forum. If your question includes sensitive data, you
may send it instead to genom...@soe.ucsc.edu.
Hi Danny
Hope UCSC don't mind me answering this one from Ensembl. Might
have been better directed to us at help...@ensembl.org. UCSC may
also have a Table Browser answer for you.
The APIs are matched to the databases, so to get the e75 database you'll need the e75 API. You can clone these from our GitHub:
The API modules are found in the sections ensembl, ensembl-variation, ensembl-compara and ensembl-funcgen. When you go into them you'll see the Branch on the left above the file list – choose release/75 from the list then download.
All the best
Emily
Ensembl Outreach
-- Dr Emily Perry (Pritchard) Ensembl Outreach Project Leader European Bioinformatics Institute (EMBL-EBI) European Molecular Biology Laboratory Wellcome Trust Genome Campus Hinxton Cambridge CB10 1SD UK