Question


DANIEL DE MATTOS CORREA

Jun 13, 2025, 4:03:32 PM
to gen...@soe.ucsc.edu
Dear UCSC fellow, 

I have a question regarding the Table Browser tool. I have been trying to create BED files with annotated genomic coordinates for the exons, plus 15 bp of padding, of a given list of genes, using the settings below:

Track: NCBI RefSeq
Table: RefSeq Select and MANE (ncbiRefSeqSelect)

However, for most genes the output includes an unexpected exon 0 and all exon numbers are shifted, usually by -1 relative to the RefSeq Select annotation.

I could find neither a reason for this nor a way to work around it and obtain the proper RefSeq exon numbering.

Hope you can help me with this. 

Best regards,

Dr. Daniel Mattos

Supervisor of the NGS Sequencing Platform, Molecular Diagnostics Laboratory (DIMOL)

Diagnostic Support Unit (UNADIG-RJ)
Fiocruz | Vice-Presidency of Production and Innovation in Health (VPPIS)

Gerardo Perez

Jun 19, 2025, 5:42:16 PM
to DANIEL DE MATTOS CORREA, gen...@soe.ucsc.edu

Hello,

Thank you for your interest in the UCSC Genome Browser and for sending your inquiry.

The -1 offset in the exon numbers comes from the zero-based numbering used in our BED output. We use zero-based numbering (http://en.wikipedia.org/wiki/Zero-based_numbering) for internal representation because it simplifies coordinate arithmetic, and the same numbering is used for the coordinates in BED files. The following blog post explains in more detail how coordinates work in BED format and in position format: https://genome-blog.gi.ucsc.edu/blog/2016/12/12/the-ucsc-genome-browser-coordinate-counting-systems/.
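As a small illustration of the two coordinate systems, here is a minimal Python sketch. It assumes the usual convention that a BED start is zero-based and a BED end is exclusive (half-open), while position format is one-based and includes both ends. The function names are only illustrative and are not part of any UCSC tool.

```python
def bed_to_position(chrom, chrom_start, chrom_end):
    """Convert a zero-based, half-open BED interval to a one-based position string."""
    # Only the start shifts by one; the exclusive BED end already equals
    # the inclusive one-based end.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def bed_length(chrom_start, chrom_end):
    """Length of a BED interval: no +1 is needed with half-open coordinates."""
    return chrom_end - chrom_start


if __name__ == "__main__":
    # The first 100 bases of chr1 are 0-100 in BED and chr1:1-100 in position format.
    print(bed_to_position("chr1", 0, 100))
    print(bed_length(0, 100))
```

The length calculation shows why half-open, zero-based coordinates simplify arithmetic: the size of an interval is simply end minus start.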

Unfortunately, as you are experiencing, this affects the exon number output when using BED format in the Table Browser. You can use an awk command to update the exon numbers. For example, if you save the output to a file, then from the command line where the file is located, run the following command:

awk 'BEGIN{OFS="\t"} {split($4, a, "_exon_"); tid=a[1]; exon[tid]++; sub(/_exon_[0-9]+_/, "_exon_" exon[tid] "_", $4); print}' table_browser_output.bed > table_browser_updatedExons.bed

The table_browser_updatedExons.bed file will have the exon numbers updated.
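If you prefer Python to awk, the same renumbering can be sketched as follows. This is only a rough equivalent, and it assumes the file is tab-separated and that the name column (column 4) has the form <transcript>_exon_<number>_..., which is the pattern the awk command matches. The function name and the sample rows are hypothetical. Like the awk command, it numbers exons in the order the rows appear in the file, so it is worth checking a few genes, especially on the minus strand, against the RefSeq numbering you expect.

```python
import re
from collections import defaultdict

# Matches the exon-number part of the name column, e.g. "_exon_0_".
EXON_FIELD = re.compile(r"_exon_\d+_")


def renumber_exons(lines):
    """Renumber exons per transcript, starting at 1, in file order."""
    counts = defaultdict(int)
    out = []
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        # Pass through anything that does not look like an exon row
        # (for example a "track" header line).
        if len(fields) < 4 or "_exon_" not in fields[3]:
            out.append(line)
            continue
        transcript_id = fields[3].split("_exon_", 1)[0]
        counts[transcript_id] += 1
        fields[3] = EXON_FIELD.sub(
            f"_exon_{counts[transcript_id]}_", fields[3], count=1
        )
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    sample = [
        "chr1\t100\t200\tNM_000001.1_exon_0_15_chr1_101_f\t0\t+",
        "chr1\t300\t400\tNM_000001.1_exon_1_15_chr1_301_f\t0\t+",
    ]
    for row in renumber_exons(sample):
        print(row)
```

To apply it to a real file, read the lines from table_browser_output.bed, pass them to renumber_exons, and write the result to a new file.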

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Gerardo Perez
UCSC Genomics Institute



DANIEL DE MATTOS CORREA

Jul 1, 2025, 2:46:10 PM
to gen...@soe.ucsc.edu, Gerardo Perez

 
Dear Gerardo! Hello!

Thank you for your explanation! I will try running this code to update the exon numbering.

If I could make a suggestion, I would add an option to the Table Browser to retrieve exon numbering as it appears in RefSeq (or in whatever the original annotation is). The current behavior creates a lot of confusion, since we try to build a table with RefSeq coordinates but get a different annotation.

Best wishes,

Dr. Daniel Mattos

Supervisor of the NGS Sequencing Platform, Molecular Diagnostics Laboratory (DIMOL)

Diagnostic Support Unit (UNADIG-RJ)
Fiocruz | Vice-Presidency of Production and Innovation in Health (VPPIS)

