Dear cBioPortal Team,
I would like to use your database and the UniProt database to create a list of relevant domains in certain genes.
I know that cBioPortal is hg19 annotated and UniProt is hg38, but that shouldn't matter for the amino acid sequence, should it?
For example, for ZRSR2, UniProt gives positions 198-304 for the RRM domain, while cBioPortal gives 246-302 (and this with the same RefSeq and Ensembl annotation).
Why is there a difference, and how can I convert between the two sets of coordinates?
Thank you :-) Larissa
Hi Larissa,
Thanks so much for reporting this! Apologies for the delay. We are still trying to work out what the issue is here. Interestingly, for hg38 studies in cBioPortal the domain is at 241-297. I have filed a ticket:
https://github.com/genome-nexus/genome-nexus/issues/747
Will keep you posted
Best wishes,
Ino
--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
cbioportal+...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/cbioportal/4f0ceaf74c904755820e9c520a747182%40uk-koeln.de.
Hi Larissa,
Xiang found the reason (see more details here). The RRM domain listed on UniProt is based on PROSITE, whereas the cBioPortal one is based on Pfam:
See Browse Tab on this page: https://www.ebi.ac.uk/interpro/protein/UniProt/Q15696/
These resources identify domains in different ways, which is probably why the positions differ. We are not entirely sure yet which one is more accurate; it might depend on your use case.
We are thinking that one improvement could be for our visualization to also show UniProt's PROSITE domains in addition to the Pfam ones.
Best wishes,
Ino
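
Since the two ranges come from different domain-detection methods rather than from a coordinate shift, there is no conversion between them. One option for a domain list is to keep both annotations and record how they relate. Below is a minimal sketch of that comparison, using only the ZRSR2 RRM coordinates quoted in this thread. The `Domain` type and the helper names are hypothetical, and the positions are assumed to be 1-based, inclusive amino acid positions on the same protein sequence.

```python
from typing import NamedTuple, Optional, Tuple


class Domain(NamedTuple):
    """A domain annotation; positions are assumed 1-based and inclusive."""
    source: str
    start: int
    end: int


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


def overlap(a: Domain, b: Domain) -> Optional[Tuple[int, int]]:
    """Return the shared inclusive range of two domains, or None if disjoint."""
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return (start, end) if start <= end else None


def contains(outer: Domain, inner: Domain) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


# Coordinates as quoted in the thread for the ZRSR2 RRM domain.
uniprot_prosite = Domain("UniProt (PROSITE)", 198, 304)
cbioportal_pfam = Domain("cBioPortal hg19 (Pfam)", 246, 302)

shared = overlap(uniprot_prosite, cbioportal_pfam)
if shared is not None:
    print("shared range:", shared, "length:", span_length(*shared))
print("Pfam range inside PROSITE range:",
      contains(uniprot_prosite, cbioportal_pfam))
```

With these numbers the Pfam range falls entirely inside the PROSITE range, so a conservative list could use the shared range and a permissive one the wider PROSITE range, depending on the use case.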