Hi
I downloaded the refGene file (hg19-based coordinates) from the UCSC Table Browser two days ago.
However, I see that it contains multiple entries with the same transcript ID and gene symbol but different chromosomal locations, and in some cases even different strands. I am not sure which entry is correct or most up to date, or which one I should use for my annotations. Some examples are below (columns: transcript ID, chromosome, strand, start, end, gene symbol). There are approximately 600 such transcripts.
NM_001005277 chr1 + 367658 368597 OR4F16
NM_001005277 chr1 - 621095 622034 OR4F16
NM_001005277 chr5 + 180794287 180795226 OR4F16
NM_001001722 chrY - 19990139 19992099 CDY2B
NM_001001722 chrY + 20137667 20139627 CDY2B
NM_000513 chrX + 153485202 153499470 OPN1MW
NM_000513 chrX + 153448084 153462352 OPN1MW
NM_001001435 chr17 + 34538467 34540274 CCL4L1
NM_001001435 chr17 + 34640033 34641840 CCL4L1
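
For reference, transcripts like the ones above can be listed with a short script. This is only a minimal sketch, assuming whitespace-separated rows with the six columns in the same order as my examples (transcript ID, chromosome, strand, start, end, gene symbol); the function name `find_multi_location_transcripts` is made up for illustration, and the column order of an actual Table Browser export may differ. It only reports the duplicates and does not decide which location is the right one.

```python
from collections import defaultdict


def find_multi_location_transcripts(lines):
    """Group rows by transcript ID and return the IDs seen at more than one location.

    Assumed row layout (hypothetical, matching the examples in this message):
    transcript_id chrom strand start end gene_symbol
    """
    locations = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete row under the assumed layout
        name, chrom, strand, start, end, gene = fields[:6]
        locations[name].append((chrom, strand, int(start), int(end), gene))
    # Keep only transcript IDs that occur at more than one location.
    return {name: locs for name, locs in locations.items() if len(locs) > 1}


# The example rows from this message.
rows = [
    "NM_001005277 chr1 + 367658 368597 OR4F16",
    "NM_001005277 chr1 - 621095 622034 OR4F16",
    "NM_001005277 chr5 + 180794287 180795226 OR4F16",
    "NM_001001722 chrY - 19990139 19992099 CDY2B",
    "NM_001001722 chrY + 20137667 20139627 CDY2B",
    "NM_000513 chrX + 153485202 153499470 OPN1MW",
    "NM_000513 chrX + 153448084 153462352 OPN1MW",
    "NM_001001435 chr17 + 34538467 34540274 CCL4L1",
    "NM_001001435 chr17 + 34640033 34641840 CCL4L1",
]

dupes = find_multi_location_transcripts(rows)
for name, locs in sorted(dupes.items()):
    print(name, len(locs), "locations")
```

To run this on a downloaded file, the list `rows` could be replaced by an open file handle, since the function only iterates over lines.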
Could you please suggest how to obtain a file with a single, unique chromosomal location for each RefSeq transcript ID?
Thanks
Regards
--
Rahul Nahar, PhD
Scientist
Ocimum Biosolutions
Hyderabad, India.