how to get first exon in bed file

yujin kim

Oct 18, 2023, 11:37:39 AM
to UCSC Genome Browser Public Support
Dear UCSC Genome Browser team,

Thank you for a great browser.

I have one question.

How can I get the first exon of a gene (e.g., PSMB9) through the Table Browser?

I want to know its genomic range.

Many thanks,

Yujin Kim

Jairo Navarro Gonzalez

Oct 24, 2023, 8:07:21 PM
to yujin kim, UCSC Genome Browser Public Support

Hello,

Thank you for using the UCSC Genome Browser and sending your inquiry.

If you are unfamiliar with using the Table Browser, you may find the following user guides useful:

Genome Browser training page: https://genome.ucsc.edu/training/
Table Browser user guide: https://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html

There is also a video tutorial for obtaining coordinates and sequences of gene exons:

https://genome.ucsc.edu/goldenPath/help/hgTablesHelp.html#codonSeq

You can follow these steps to get the coordinates of the first exon in PSMB9 using the NCBI RefSeq track for hg19:

Example Table Browser Steps:

1. Configure the Table Browser

clade: Mammal
organism: Human
assembly: Feb. 2009 (GRCh37/hg19)
group: Genes and Gene Predictions
track: NCBI RefSeq
table: RefSeq All (ncbiRefSeq)
region: genome
output format: BED - browser extensible data

2. Filter for the gene symbol

Next to the identifiers (names/accessions): setting, click the paste list button. On the new page that opens, enter the gene symbol "PSMB9".

Then click submit.

3. Get output

After creating the filter, check that the output format drop-down menu is set to BED. To download the data to a file instead of viewing it in your web browser, enter an output filename.

Once you have configured the output settings, click the get output button. You will be taken to a new page where you can select "Exons plus" to get one BED line for each exon of the transcript. Then, click get BED. Depending on your output choices, you will see the data in your web browser or begin downloading a file.

4. Process the output

You should see the output like the following:

chr6    32821968    32822066    NM_002800.5_exon_0_0_chr6_32821969_f    0    +
chr6    32823914    32823982    NM_002800.5_exon_1_0_chr6_32823915_f    0    +
chr6    32825039    32825171    NM_002800.5_exon_2_0_chr6_32825040_f    0    +
...
chr6_ssto_hap7    4256554    4256684    NM_002800.5_exon_3_0_chr6_ssto_hap7_4256555_f    0    +
chr6_ssto_hap7    4256913    4257055    NM_002800.5_exon_4_0_chr6_ssto_hap7_4256914_f    0    +
chr6_ssto_hap7    4257954    4258398    NM_002800.5_exon_5_0_chr6_ssto_hap7_4257955_f    0    +

For most purposes, you can ignore the haplotype sequences (_hap). In the fourth column, the name field contains the exon number, counted from 0, so the first exon has the item name "NM_002800.5_exon_0_0_chr6_32821969_f". BED start coordinates are 0-based, which is why the name shows 32821969 while the start column shows 32821968. So the first exon coordinates for the PSMB9 gene are:
"chr6 32821968 32822066"

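If you would rather script this last step, the following is a minimal Python sketch, assuming the Table Browser output has been saved as plain six-column BED text. The helper names (parse_exon_bed, first_exons, to_one_based) are made up for illustration and are not UCSC tools. Instead of relying on the exon number in the name field, the sketch takes the 5'-most exon of each transcript: the lowest start on the + strand, the highest end on the - strand. It also skips _hap sequences by default.

```python
from collections import defaultdict

# Sample lines copied from the output shown above (whitespace-separated).
BED_TEXT = """\
chr6    32821968    32822066    NM_002800.5_exon_0_0_chr6_32821969_f    0    +
chr6    32823914    32823982    NM_002800.5_exon_1_0_chr6_32823915_f    0    +
chr6    32825039    32825171    NM_002800.5_exon_2_0_chr6_32825040_f    0    +
chr6_ssto_hap7    4256554    4256684    NM_002800.5_exon_3_0_chr6_ssto_hap7_4256555_f    0    +
"""


def parse_exon_bed(lines):
    """Turn six-column BED lines into dicts; skip headers and short lines."""
    exons = []
    for line in lines:
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # blank lines, "..." placeholders, truncated rows
        chrom, start, end, name, _score, strand = fields[:6]
        exons.append({"chrom": chrom, "start": int(start), "end": int(end),
                      "name": name, "strand": strand})
    return exons


def first_exons(exons, skip_haplotypes=True):
    """Return {(transcript, chrom): exon} holding the 5'-most exon of each."""
    groups = defaultdict(list)
    for exon in exons:
        if skip_haplotypes and "_hap" in exon["chrom"]:
            continue
        # Assumes Table Browser style names: "<transcript>_exon_<n>_..."
        transcript = exon["name"].split("_exon_")[0]
        groups[(transcript, exon["chrom"])].append(exon)

    result = {}
    for key, group in groups.items():
        if group[0]["strand"] == "+":
            result[key] = min(group, key=lambda e: e["start"])
        else:
            # On the minus strand the first exon is the right-most one.
            result[key] = max(group, key=lambda e: e["end"])
    return result


def to_one_based(exon):
    """Format a 0-based, half-open BED interval as a 1-based position string."""
    return f"{exon['chrom']}:{exon['start'] + 1}-{exon['end']}"


for (transcript, chrom), exon in first_exons(
        parse_exon_bed(BED_TEXT.splitlines())).items():
    print(transcript, to_one_based(exon), f"({exon['end'] - exon['start']} bp)")
# prints: NM_002800.5 chr6:32821969-32822066 (98 bp)
```

To run it on a downloaded file, pass the open file object to parse_exon_bed in place of BED_TEXT.splitlines(). Selecting by strand rather than by exon number means the result does not depend on how the exons happen to be numbered in the name field.
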
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro
UCSC Genome Browser

