UCSC Table Browser rRNA tRNA tracks


Zhang Bingjie

Sep 1, 2016, 12:10:26 PM
to gen...@soe.ucsc.edu
Hi,

I have recently been working with RNA-seq data. To account for ribosomal RNA contamination, I would like to mask those reads using Cufflinks with the "-M" option. (As the manual says: "Tells Cufflinks to ignore all reads that could have come from transcripts in this GTF file. We recommend including any annotated rRNA, mitochondrial transcripts other abundant transcripts you wish to ignore in your analysis in this file. Due to variable efficiency of mRNA enrichment methods and rRNA depletion kits, masking these transcripts often improves the overall robustness of transcript abundance estimates.")

The problem is that when I tried to download the corresponding rRNA/tRNA file (zebrafish, danRer7) from the UCSC Table Browser, I got confused. As the post said: "the rmsk table contains coordinates for various repeat families. Some of these repeat families are derived from specific RNA families, such as rRNA or tRNA. These differ from the actual rRNA and tRNAs, the coordinates for which are found in different tracks." (Matthew Speir) So I think rmsk is not a good option for getting rRNA/tRNA files. Then I found this post, which seems to be right, but there is no GENCODE database for zebrafish. My question is: how can I get a correct and complete zebrafish rRNA/tRNA GTF file from UCSC?

Thanks for your help in advance!
 

Luvina Guruvadoo

Sep 8, 2016, 6:31:11 PM
to Zhang Bingjie, gen...@soe.ucsc.edu
Hello Zhang,

Thank you for your email. For danRer7, there is a tRNA Genes track, which you can download in GTF format from the Table Browser by selecting "GTF" as the output format. We do not have a similar track for rRNAs; however, you can use the Table Browser to join the ensGene and ensemblSource tables like so:

1. Navigate to the Table Browser.
2. Select the appropriate clade, genome, and assembly for danRer7.
3. Make the following selections: group: Genes and Gene Predictions, track: Ensembl Genes, table: ensemblSource.
4. Create a filter: "source does match rRNA", then click "submit".
5. Select "selected fields from primary and related tables" as the output format.
6. Click "get output".
7. On the following page, select "ensGene" under Linked Tables, then click "allow selection from checked tables".
8. Next, select the fields you wish to have in your output, then click "get output".
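
The output of the steps above is tab-separated text rather than GTF, so it needs a conversion step before it can be used as a mask file. The following is only a rough Python sketch of one way to do that, not an official UCSC tool. It assumes you selected at least the ensGene fields name, chrom, strand, exonStarts and exonEnds in step 8, and that the first line of the output is a "#"-prefixed header whose column names end in the field name (for example danRer7.ensGene.chrom). The function names and the "ensGene_rRNA" source label are made up for illustration. It relies on the genePred convention that exon starts are 0-based while GTF coordinates are 1-based.

```python
import csv
import io


def read_table_browser_tsv(text):
    """Parse tab-separated Table Browser output into a list of dicts.

    Assumption: the first line is a '#'-prefixed header, and each column
    name ends in the field name (e.g. 'danRer7.ensGene.chrom' -> 'chrom').
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = [h.lstrip("#").split(".")[-1] for h in next(reader)]
    # Skip blank lines; map each remaining row onto the short field names.
    return [dict(zip(header, fields)) for fields in reader if fields]


def genepred_rows_to_gtf(rows, source="ensGene_rRNA"):
    """Convert genePred-style rows into GTF 'exon' lines.

    The 'source' label is a hypothetical placeholder. The transcript name
    is reused as gene_id for simplicity.
    """
    lines = []
    for row in rows:
        # genePred stores exon coordinates as comma-separated lists
        # with a trailing comma, e.g. "100,300,".
        starts = [int(s) for s in row["exonStarts"].split(",") if s]
        ends = [int(e) for e in row["exonEnds"].split(",") if e]
        attrs = 'gene_id "{0}"; transcript_id "{0}";'.format(row["name"])
        for start, end in zip(starts, ends):
            lines.append("\t".join([
                row["chrom"],
                source,
                "exon",
                str(start + 1),  # 0-based start -> 1-based GTF start
                str(end),        # end is already 1-based inclusive
                ".",
                row["strand"],
                ".",
                attrs,
            ]))
    return lines
```

To use it, read the saved Table Browser output into a string, pass it through read_table_browser_tsv and then genepred_rows_to_gtf, and write the returned lines to a file. That file could then be concatenated with the tRNA Genes GTF to form the file given to the Cufflinks "-M" option.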

If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Regards,
Luvina

--
Luvina Guruvadoo
UCSC Genome Browser

http://genome.ucsc.edu



