Hi,
I downloaded the “RefSeq Curated (ncbiRefSeqCurated)” table as a txt file, and I see that the mitochondrial gene names there don’t match the HGNC standard. This happens for all mitochondrial genes. These are the coding genes’ names in the table:
ND1
ND2
COX1
COX2
ATP8
ATP6
COX3
ND3
ND4L
ND4
ND5
ND6
CYTB
And these are their names in HGNC, in the same order:
MT-ND1
MT-ND2
MT-CO1
MT-CO2
MT-ATP8
MT-ATP6
MT-CO3
MT-ND3
MT-ND4L
MT-ND4
MT-ND5
MT-ND6
MT-CYB
Is this done on purpose?
Thank you,
Yaara
Hello Yaara,
Thank you for contacting the Genome Browser support team with your question about mitochondrial gene symbols.
I understand your surprise at the difference in gene symbols between our RefSeq Curated dataset and the official HGNC nomenclature. The data we display comes directly from the NCBI RefSeq annotation downloads, and the gene symbols we show are exactly what appears in the "gene_id" field. That field does not include the "MT-" prefix that you see on the HGNC and NCBI Gene pages. For example, compare the corresponding line in the RefSeq GTF file with the NCBI Gene page:
https://www.ncbi.nlm.nih.gov/gene?cmd=retrieve&list_uids=4541
In other words, we report the data exactly as we receive it, and in these cases the gene symbols do not contain the "MT-" prefix. There may be a reason for this within NCBI RefSeq, but I could not find any information about it. I have contacted NCBI directly to inquire, but I would not expect any changes to their systems. Your best option may be to create a workaround for mitochondrial genes.
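One possible shape for such a workaround is a small lookup table built from the two lists earlier in this thread. This is only a sketch: the function name `to_hgnc_symbol` and the set of mitochondrial chromosome names are assumptions made for illustration, and the mapping covers only the 13 coding genes listed above (not tRNA or rRNA genes).

```python
# RefSeq symbol -> HGNC symbol, taken from the two lists in this thread.
REFSEQ_TO_HGNC_MT = {
    "ND1": "MT-ND1",
    "ND2": "MT-ND2",
    "COX1": "MT-CO1",
    "COX2": "MT-CO2",
    "ATP8": "MT-ATP8",
    "ATP6": "MT-ATP6",
    "COX3": "MT-CO3",
    "ND3": "MT-ND3",
    "ND4L": "MT-ND4L",
    "ND4": "MT-ND4",
    "ND5": "MT-ND5",
    "ND6": "MT-ND6",
    "CYTB": "MT-CYB",
}

# Assumption: names under which the mitochondrial sequence may appear.
# Adjust to match the assembly and file you are working with.
MITO_CHROMS = {"chrM", "chrMT"}


def to_hgnc_symbol(symbol, chrom):
    """Return the HGNC symbol for a mitochondrial coding gene.

    Symbols on other chromosomes, and mitochondrial symbols that are
    not in the table, are returned unchanged.
    """
    if chrom in MITO_CHROMS:
        return REFSEQ_TO_HGNC_MT.get(symbol, symbol)
    return symbol
```

Restricting the rename to mitochondrial records is a deliberate choice here, so that a nuclear gene that happens to share one of these short symbols would not be renamed by accident.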
For further communication, please reply-all to gen...@soe.ucsc.edu. Those emails are archived in a public help forum. For private questions, you may send emails instead to genom...@soe.ucsc.edu.
All the best,
Daniel Schmelter
UCSC Genome Browser
--
---
You received this message because you are subscribed to the Google Groups "UCSC Genome Browser Public Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genome+un...@soe.ucsc.edu.
To view this discussion on the web visit https://groups.google.com/a/soe.ucsc.edu/d/msgid/genome/VI1PR03MB380834B4C4481FC479C55376D70A9%40VI1PR03MB3808.eurprd03.prod.outlook.com.
Hello Daniel,
Thank you for your answer.
I have already created the workaround you mentioned, but I would nonetheless be happy to hear NCBI’s answer.
Thank you,
Yaara
Hi Daniel,
I have a question regarding RefSeq genes as represented in the UCSC downloads.
We noticed that the mitochondrial genes were recently removed from the hg38 RefSeq GTF file (they still exist in hg19).
This is the file I’m downloading that doesn’t have the mitochondrial genes:
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/ncbiRefSeq.txt.gz
I didn’t manage to find any documentation for this in NCBI or UCSC. Do you know why this happened?
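For anyone wanting to confirm this on their own copy of the file, a rough check is to count rows per chromosome. The sketch below assumes the table dump is tab-separated with a leading `bin` column, so that the chromosome name is the third field (index 2); that layout is an assumption and should be checked against the table's schema. The function names are made up for this example.

```python
import gzip
from collections import Counter


def count_rows_per_chrom(lines, chrom_col=2):
    """Count rows per chromosome in tab-separated table lines.

    chrom_col is the zero-based index of the chromosome field
    (assumed to be 2, i.e. after 'bin' and 'name').
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > chrom_col:  # skip blank or truncated lines
            counts[fields[chrom_col]] += 1
    return counts


def count_rows_per_chrom_gz(path, chrom_col=2):
    """Same count, reading a gzip-compressed table file from disk."""
    with gzip.open(path, "rt") as handle:
        return count_rows_per_chrom(handle, chrom_col)
```

With a local download, `count_rows_per_chrom_gz("ncbiRefSeq.txt.gz")["chrM"]` returning 0 would indicate that no mitochondrial records are present in that copy.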
Thank you,
Yaara Unger, Geneyx
From: Daniel Schmelter <dsch...@ucsc.edu>
Sent: Wednesday, 23 June 2021 2:16
To: Yaara Unger <ya...@geneyx.com>
Cc: gen...@soe.ucsc.edu
Subject: Re: [genome] Mitochondria gene names in RefSeq tables
Hello Yaara,
Hello, Yaara.
Thank you for your interest in the Genome Browser and for your follow-up question.
The data we display comes directly from the NCBI RefSeq annotation downloads. We reached out to NCBI to ask why chrM annotations are not making it into the ncbiRefSeq dataset for hg38.
NCBI noted that 15-20 years ago, they established a standard set of gene names that they use across all vertebrate mitochondrial annotations, and that these names may differ from the HGNC names. I would encourage you to reach out to NCBI and the RefSeq team about using HGNC names in their mitochondrial annotations: https://www.ncbi.nlm.nih.gov/projects/RefSeq/update.cgi
We do offer the GENCODE Genes V38 track for hg38, which has the HGNC gene names.
I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.
Gerardo Perez
UCSC Genomics Institute