Dear Max,
Thank you for using the UCSC Genome Browser and for your question about why the first exon is always Exon 0.
You may also be interested to know that BED files use a zero-based numbering system, in which the start coordinate of a gene is offset by one compared to what is displayed in the browser. Please read more about this on our FAQ page:
http://genome.ucsc.edu/FAQ/FAQtracks#tracks1
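As a rough illustration of that offset, here is a minimal Python sketch. The function names are hypothetical and not part of any UCSC tool. It assumes that BED intervals have a zero-based start with the end position not included, and that browser positions are one-based with both ends included.

```python
def bed_to_browser(chrom, chrom_start, chrom_end):
    """Turn a BED interval (zero-based start, end excluded) into a
    browser-style position string (one-based, both ends included)."""
    # Only the start shifts by one; the end value stays the same number.
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def browser_to_bed(position):
    """Turn a browser-style position string such as 'chr1:1-100'
    back into a (chrom, chrom_start, chrom_end) BED triple."""
    chrom, span = position.split(":")
    # Drop any thousands separators before parsing the numbers.
    start, end = span.replace(",", "").split("-")
    return chrom, int(start) - 1, int(end)


# The first 100 bases of chr1 are 0..100 in BED and 1-100 in the browser.
print(bed_to_browser("chr1", 0, 100))
print(browser_to_bed("chr1:1-100"))
```

Under these assumptions, the length of a feature is simply chrom_end minus chrom_start in BED terms, with no extra plus one.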
You may also want to search our mailing list archives for discussions about exon numbering:
https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!searchin/genome/exon$20numbering
There are many discussion topics, and it is important not to interpret our numbering as the final say on how a gene is annotated. For example, in rare situations exon and intron numbers can be thrown off by small indels in the alignment of mRNAs to the genome. There are also historical gene annotation situations where scientists in a gene community started their codon and exon numbering at a place other than the beginning of the coding sequence, for example with the collagen gene. In summary, our exon numbers should always come with caveats.
Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to
gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to
genom...@soe.ucsc.edu.
All the best,
Brian Lee
UCSC Genome Bioinformatics Group