Fetching rRNA and tRNA data from Table browser

Koustav Pal

Feb 19, 2014, 7:30:35 AM
to gen...@soe.ucsc.edu
Hi,
I want to get tRNA and rRNA data in GTF format, but these classes seem to be represented only in the rmsk table, from which I downloaded the data using the query builder. On reading the table description, however, I understand that rmsk is a repeat-masking table that catalogs repeats found by RepeatMasker. Please clarify.

If that is the case, how can I get tRNA and rRNA sequences in GTF format from the UCSC Table Browser using one of the reference gene tables?

--
Regards,
Koustav Pal,
Junior Project Fellow
Vinod Scaria Labs,
Open Source Drug Discovery Project,
CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB),
New Delhi, India.

Matthew Speir

Feb 19, 2014, 2:05:28 PM
to Koustav Pal, gen...@soe.ucsc.edu
Hi Koustav,

Thank you for your question about obtaining tRNA and rRNA coordinates from the Table Browser. You are correct: the rmsk table contains coordinates for various repeat families. Some of these repeat families are derived from specific RNA families, such as rRNA or tRNA, but they differ from the actual rRNA and tRNA genes, whose coordinates are found in different tracks. The first section of the following previously answered mailing list question contains steps to get the rRNA coordinates from the GENCODE v19 track: https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/jSAY8w1JVVo/P6lk4OJzDNEJ. There is also a tRNA Genes track on hg19, http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=tRNAs, that contains coordinates for tRNA genes. In the Table Browser, you can use the following steps to get the coordinates for tRNA genes, with all of the tRNA pseudogenes filtered out:

1. Select your assembly and tracks

    clade: Mammal
    genome: Human
    assembly: Feb. 2009 (GRCh37/hg19)
    group: Genes and Gene Predictions Tracks
    track: tRNA Genes
    table: tRNAs
    output: GTF - gene transfer format
    output file: enter a file name to save your results to a file, or leave blank to display results in the browser

2. Click 'Filter'.

3. On the "aa" line, choose "doesn't" for the match option and enter 'Pseudo' into the aa field.
    The "aa" line should read: aa doesn't match Pseudo

4. Click 'Submit'.
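
If you would rather apply the same pseudogene filter to a tab-separated download of the table yourself, the steps above can be sketched in Python roughly as follows. This is only a minimal sketch under several assumptions: the column names (chrom, chromStart, chromEnd, name, score, strand, aa) are assumed to match the tRNAs table, the sample rows are invented for illustration, the function name trna_rows_to_gtf is hypothetical, and the GTF attributes written here are not necessarily the ones the Table Browser produces.

```python
import csv
import io

# Hypothetical sample rows standing in for a tab-separated download of the
# tRNAs table. Column names are an assumption; the values are invented.
SAMPLE = """chrom\tchromStart\tchromEnd\tname\tscore\tstrand\taa
chr1\t100\t172\ttRNA-example-1\t1000\t+\tGly
chr1\t500\t573\ttRNA-example-2\t1000\t-\tPseudo
chr2\t900\t982\ttRNA-example-3\t1000\t+\tLeu
"""


def trna_rows_to_gtf(handle, source="tRNAs"):
    """Return GTF-style lines for every row whose aa value is not 'Pseudo'."""
    reader = csv.DictReader(handle, delimiter="\t")
    lines = []
    for row in reader:
        # Same idea as the filter "aa doesn't match Pseudo".
        if row["aa"] == "Pseudo":
            continue
        # Table coordinates are 0-based, half-open; GTF is 1-based, inclusive.
        start = int(row["chromStart"]) + 1
        end = int(row["chromEnd"])
        attributes = 'gene_id "{0}"; transcript_id "{0}";'.format(row["name"])
        fields = [
            row["chrom"],
            source,
            "exon",
            str(start),
            str(end),
            row["score"],
            row["strand"],
            ".",
            attributes,
        ]
        lines.append("\t".join(fields))
    return lines


gtf_lines = trna_rows_to_gtf(io.StringIO(SAMPLE))
for line in gtf_lines:
    print(line)
```

To use it on a real download, pass an open file handle in place of io.StringIO(SAMPLE) and adjust the column names if your file's header differs.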

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group