However, when I query the UniProt Proteins API at https://www.ebi.ac.uk/proteins/api/coordinates/location/P15056:100 I get:
{
  "locations": [
    {
      "accession": "P15056",
      "taxid": 9606,
      "chromosome": "7",
      "ensemblTranslationId": "ENSP00000493543",
      "proteinStart": 100,
      "geneStart": 140834550,
      "proteinEnd": 100,
      "geneEnd": 140834548
    }
  ]
}
We believe that UniProt may be mishandling strand in this case. The transcript you're looking at is on the reverse strand of the genome. Position 100 of this protein maps to exon 3 of this transcript (genomic coordinates 140,834,872-140,834,609): http://www.ensembl.org/Homo_sapiens/Transcript/Exons?g=ENSG00000157764;r=7:140730665-140924928;t=ENST00000646891
I've highlighted the relevant codon in red. Each line is 60 bases long, so that codon is bases 58-60 of that exon. Since the exon is on the reverse strand, its genomic coordinates are the start of the exon (140,834,872) minus 58 or 60, plus 1, i.e. 140,834,813-140,834,815.
UniProt seem to have counted from the other end (140,834,609), minus 58 or 60 + 1, giving 548-550. These coordinates are found in intron 3-4 and cannot be the amino acid coordinates.
I suggest contacting UniProt, as it seems likely that this error is common to all reverse strand mapped proteins.
All the best,
Emily
Ensembl helpdesk
I think Ensembl is correct. Could you please check this and fix it? I would appreciate your support.
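
To make the arithmetic in the Ensembl reply easier to check, here is a minimal Python sketch of the exon-offset-to-genome mapping it describes. The function name codon_genomic_range is hypothetical and this is not UniProt's or Ensembl's actual code. It assumes 1-based inclusive coordinates and takes the exon bounds and the codon offset (bases 58-60 of exon 3) from the reply above.

```python
def codon_genomic_range(exon_low, exon_high, strand, first_base, last_base):
    """Map 1-based base offsets within an exon to genomic coordinates.

    exon_low <= exon_high are the exon's genomic bounds (1-based, inclusive).
    strand is +1 (forward) or -1 (reverse).
    Returns (low, high) genomic coordinates of the requested bases.
    """
    if strand == 1:
        # Forward strand: the exon's first base is its lowest coordinate.
        return exon_low + first_base - 1, exon_low + last_base - 1
    if strand == -1:
        # Reverse strand: the exon's first base is its highest coordinate.
        return exon_high - last_base + 1, exon_high - first_base + 1
    raise ValueError("strand must be +1 or -1")


# Exon 3 of ENST00000646891, bounds as quoted in the Ensembl reply.
EXON3 = (140_834_609, 140_834_872)

# Codon for protein position 100: bases 58-60 of the exon, reverse strand.
low, high = codon_genomic_range(*EXON3, strand=-1, first_base=58, last_base=60)
print(low, high)  # 140834813 140834815

# Coordinates returned by the UniProt API response (geneEnd, geneStart).
uniprot = (140_834_548, 140_834_550)
inside = EXON3[0] <= uniprot[0] and uniprot[1] <= EXON3[1]
print(inside)  # False: the UniProt range lies outside exon 3
```

With the reverse strand handled this way, the result matches the 813-815 range given by Ensembl, while the range returned by UniProt falls below the exon's lower bound.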