retrieve RNA exon/intron sequence info from coordinates

Stephen Meltzer

Oct 20, 2017, 1:15:36 PM
to gen...@soe.ucsc.edu, John Abraham, Yulan Cheng, Steffie Pitts, akiyuki....@dc.tohoku.ac.jp

Dear Genome:

 

I have the attached set of coordinates but am unable to retrieve their corresponding exact RNA (exon and intron) sequences.

I need these coordinates to be carried into the RNA sequence data.

I do not just want to pull the entire gene or cDNA.

How can I do this?

 

Stephen J. Meltzer, M.D.

American Cancer Society Clinical Research Professor

The Harry & Betty Myerberg/Thomas R. Hendrix Professor of Gastroenterology

Departments of Medicine and Oncology

The Johns Hopkins University School of Medicine & Sidney Kimmel Comprehensive Cancer Center

1503 E. Jefferson Street, Room 112

Baltimore, MD 21287

410-502-6071

smel...@jhmi.edu

410-502-6761 Lisa Gaines-Thomas

lgai...@jhmi.edu

 

Attachment: rna_strands.csv

Matthew Speir

Oct 24, 2017, 1:26:06 PM
to Stephen Meltzer, gen...@soe.ucsc.edu, John Abraham, Yulan Cheng, Steffie Pitts, akiyuki....@dc.tohoku.ac.jp
Hi Stephen,

Thank you for your question about obtaining sequence using the UCSC Genome Browser.

Could you provide more details as to what information you're looking to obtain? Are you looking for the genomic sequence of the coordinates you provided? Or something else?

Additionally, could you provide the assembly that you're working with (e.g. hg19, hg38, or mm10)?

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Matthew Speir
UCSC Genome Bioinformatics Group

Stephen Meltzer

Oct 24, 2017, 2:07:39 PM
to Matthew Speir, gen...@soe.ucsc.edu, John Abraham, Yulan Cheng, Steffie Pitts, akiyuki....@dc.tohoku.ac.jp

Dear Matthew,

 

Thank you for your reply! We are trying to generate putative circular RNA sequences.

All we have are the start and end coordinates of each potential circular RNA.

We would like to actually pull the RNA sequence between the two coordinates for each circular RNA.

That way, we can predict the complete sequences and sizes of the circular RNAs.

We also (crucially) need exon and intron information.

We are assuming the circular RNAs contain only exonic RNA.

 

Steve

Matthew Speir

Oct 24, 2017, 5:18:57 PM
to Stephen Meltzer, gen...@soe.ucsc.edu, John Abraham, Yulan Cheng, Steffie Pitts, akiyuki....@dc.tohoku.ac.jp
Hi Stephen,

Thank you for providing more details.

You can get the sequences of the regions from the file you attached to your original message using a BED6 custom track and the Table Browser.

A. Create a BED6 file
   
    You can find more information on the BED format here: https://genome.ucsc.edu/FAQ/FAQformat.html#format1

    The first six columns are:
        1. Chromosome name
        2. Chromosome start position (0-based, more here: http://genome.ucsc.edu/blog/the-ucsc-genome-browser-coordinate-counting-systems/)
        3. Chromosome end position (exclusive, which makes it equal to the 1-based position of the last base)
        4. Item name
        5. Score (a number from 0 to 1000; not important for your case, but it must be filled in)
        6. Strand

        Example of creating a BED6 line out of the first region in your file:
            chr6    108663453    108664889    region1    1000    +

        You will need to create a BED line like this for every region whose sequence you want.
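
If the file has many regions, the BED lines can be generated with a short script rather than by hand. The following is only a minimal sketch: the column names (chrom, start, end, strand), the function name rows_to_bed6, and the example coordinates are all made up for illustration, since the layout of the attached CSV is not shown here. Whether the start coordinates in your file are 1-based is also an assumption you need to check, so it is passed in explicitly.

```python
import csv
import io


def rows_to_bed6(rows, one_based_start):
    """Turn dict rows (chrom, start, end, strand) into BED6 lines.

    The column names are hypothetical; rename them to match your file.
    """
    bed_lines = []
    for i, row in enumerate(rows, start=1):
        start = int(row["start"])
        end = int(row["end"])
        if one_based_start:
            start -= 1  # BED starts are 0-based; the end needs no change
        if start < 0 or end <= start:
            raise ValueError(f"bad interval in row {i}: {row}")
        strand = row["strand"].strip()
        if strand not in ("+", "-"):
            raise ValueError(f"bad strand in row {i}: {strand!r}")
        fields = [row["chrom"].strip(), str(start), str(end),
                  f"region{i}", "1000", strand]
        bed_lines.append("\t".join(fields))
    return bed_lines


# Made-up input standing in for the real CSV file.
example_csv = "chrom,start,end,strand\nchr1,101,200,+\nchr2,5001,5600,-\n"
rows = csv.DictReader(io.StringIO(example_csv))
for line in rows_to_bed6(rows, one_based_start=True):
    print(line)
```

To run it on a real file, replace the io.StringIO wrapper with an open file handle and save the printed lines to a text file for upload in the next step.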

B. Upload BED6 file as a custom track
    After you've created your BED6 file of regions, you will need to upload it as a custom track:
    https://genome.ucsc.edu/cgi-bin/hgCustom
   
    You can paste the contents of your BED6 file into the box below "Paste URLs or data", or you can upload the file
    from your computer using the "Choose File" button. Then click "Submit".

    After upload, select "Table Browser" from the drop-down and click "Go".

C. Use the Table Browser to get sequence for your custom track
   
    Make the following selections on the Table Browser, https://genome.ucsc.edu/cgi-bin/hgTables:

        clade: Mammal
        genome: Human
        assembly: Dec. 2013 (GRCh38/hg38)
        group: Custom Tracks
        track: Your_Track_Name
        table: primary table for your track
        region: genome
        output format: sequence
        output file: Enter a file name or leave blank to view results in the web browser

    After making these selections, click "get output". On the next screen, select any sequence formatting options
    you would like and then click "get sequence".
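
Since you also want the sizes of the regions, the FASTA output saved from the Table Browser can be summarized with a few lines of Python. This is a sketch only: the function name fasta_lengths and the example records are made up, and it assumes nothing about the header beyond the standard ">" FASTA convention. Note that the lengths reported are for the full genomic span between the two coordinates, introns included.

```python
def fasta_lengths(text):
    """Return {record name: sequence length} for FASTA-formatted text.

    The record name is the first whitespace-delimited word after '>'.
    """
    lengths = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequence may wrap over many lines
    return lengths


# Made-up records standing in for the downloaded output file.
example_fasta = ">region1 some description\nACGTACGT\nACG\n>region2\nGGCC\n"
for name, length in fasta_lengths(example_fasta).items():
    print(name, length)
```

For a real download, read the file contents into a string (for example with open(...).read()) and pass that to the function.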

Matthew Speir
UCSC Genome Bioinformatics Group

