Hello Archana,
Thank you for your question about the knownToEnsembl table and for reporting the unexpected behavior.
The duplication you saw is intentional: we kept the table as it is so that existing pipelines continue to work. The knownToEnsembl table maps our primary knownGene gene IDs to Ensembl gene IDs. About 7 months ago, UCSC switched its primary gene IDs to the Ensembl identifiers, so both columns should now be the same. If you are using hg38's knownGene, you may not need to convert gene IDs at all, since knownGene.txt contains both the Ensembl and the UCSC identifiers. You can download the knownGene file for hg38 from our download site here:
http://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/knownGene.txt.gz
If you would like a file with just those two ID columns, you can decompress the download (for example with gunzip) and then extract the columns with an awk command like the following:
awk 'BEGIN {FS="\t"} {print $1, $12}' knownGene.txt
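If you would rather not decompress the file first, the same extraction can be sketched as a small shell function that streams the gzipped download. This is only a sketch: the function name extract_ids and the output file name are hypothetical, and it assumes, as the awk command above does, that the two identifiers are in columns 1 and 12. It also sets the output separator to a tab so the result stays tab-delimited.

```shell
# Sketch: stream the gzipped table and print the two ID columns, tab-separated.
# extract_ids is a hypothetical helper name, not a UCSC tool.
extract_ids() {
    # $1 = path to the gzipped knownGene file
    gzip -dc "$1" | awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $1, $12}'
}

# Usage (assumes the file was downloaded to the current directory):
# extract_ids knownGene.txt.gz > knownGene.ids.tsv
```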
For hg19, the knownToEnsembl table still contains the UCSC ("uc" ID) to Ensembl conversion columns you may have expected. The hg19 data can be accessed here: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/database/knownToEnsembl.txt.gz
I hope that was helpful. Thank you for writing in!
Kindly,
Daniel Schmelter
UCSC Genomics Institute
If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.