Good Morning:
I believe the examples you mention exist in both hg18 and hg19.
This shell procedure obtains results from each database:
# For each Ensembl gene ID, print one matching row (kgXref columns followed
# by ensGene columns) from each assembly.
for G in ENSG00000026103 ENSG00000030110 ENSG00000104725 ENSG00000104774
do
  for DB in hg19 hg18
  do
    printf '%s %s: ' "${DB}" "${G}"
    hgsql -N -e "select X.*, G.* from ensGene as G, knownToEnsembl as KE,
      kgXref as X where G.name=KE.value and KE.name=X.kgID and
      G.name2=\"${G}\" limit 1;" "${DB}"
  done
done
However, please keep in mind that the Ensembl gene track has more
annotations than the UCSC gene track.
Not all Ensembl gene annotations have a corresponding UCSC gene.
Not all UCSC genes have a corresponding Ensembl gene.
The counts are:
hg18 Ensembl genes v54 May 2009: 63,280, UCSC genes Aug 2009: 66,803, knownToEnsembl: 60,456
hg19 Ensembl genes v63 Jun 2011: 173,742, UCSC genes Oct 2009: 77,614, knownToEnsembl: 75,160
The knownToEnsembl counts are the number of UCSC genes that correspond to
an Ensembl transcript ID. A single UCSC gene can correspond to a number of
different Ensembl transcript IDs. The numbers of unique Ensembl transcript
IDs in the knownToEnsembl tables are: hg18: 30,209, hg19: 46,319
and the numbers of Ensembl transcripts in the ensPep table are: hg18: 47,509, hg19: 90,720
Therefore, the coverage of Ensembl genes via knownToEnsembl is:
hg18: 30,209/47,509 = 63.6%, hg19: 46,319/90,720 = 51.1%
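If you want to reproduce this arithmetic yourself, here is a minimal sketch. The helper name "coverage" is made up for illustration. The hgsql queries are an assumption about how such counts could be obtained (distinct knownToEnsembl.value entries, and the row count of ensPep), not necessarily how the numbers above were produced. They only run where hgsql is installed and configured.

```shell
#!/bin/sh
# coverage NUM DEN: print NUM/DEN as a percentage with one decimal place.
# ("coverage" is a hypothetical helper name, not a UCSC tool.)
coverage() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f%%\n", 100 * n / d }'
}

# Ratios computed from the counts quoted in this message.
printf 'hg18: '; coverage 30209 47509
printf 'hg19: '; coverage 46319 90720

# Optional: fetch live counts when hgsql is available.
# Assumption: unique Ensembl IDs = distinct knownToEnsembl.value,
# and Ensembl transcripts = number of rows in ensPep.
if command -v hgsql >/dev/null 2>&1; then
    for DB in hg18 hg19
    do
        U=$(hgsql -N -e "select count(distinct value) from knownToEnsembl;" "${DB}")
        T=$(hgsql -N -e "select count(*) from ensPep;" "${DB}")
        printf '%s (live): ' "${DB}"; coverage "${U}" "${T}"
    done
fi
```

Current tables may have been rebuilt since these counts were taken, so live results can differ from the figures above.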
You will not always find UCSC genes for Ensembl transcripts.
--Hiram