Hi Ramakrishnan,
Please note that the protein changes (HGVSp, or column 36) in the mutation data you downloaded all carry a "p." prefix, while OncoKB does not use it.
For instance,
$ grep E17K msk_impact_2017/data_mutations_extended.txt |head -1
AKT1 GRCh37 14 105246551 105246551 + missense_variant,splice_region_variant Missense_Mutation SNP C C T P-0000004-T01-IM3 202 244 ENST00000349310.3:c.49G>A p.Glu17Lys p.E17K ENST00000349310 NM_001014432.1 17 Gag/Aag 0
$ grep E17K ~/Downloads/allAnnotatedVariants.txt |head -1
ENST00000349310 NM_001014431.1 207 AKT1 E17K E17K Oncogenic Gain-of-function 23134728, 20440266, 23741320, 18256540, 21793738, 17611497, 9843996
Could that be the cause of the mismatch?
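If so, stripping the prefix before comparing should help. Here is a minimal sketch in Python, assuming you compare the short protein change from the mutation file against the alteration value in the OncoKB file, keyed together with the gene symbol. The function names are only for illustration and the sample values are copied from the two grep outputs above.

```python
def normalize_protein_change(value):
    """Drop a leading 'p.' so that 'p.E17K' compares equal to 'E17K'."""
    value = value.strip()
    return value[2:] if value.startswith("p.") else value


def match_key(gene, protein_change):
    """Build a (gene, alteration) key usable for both datasets (illustrative helper)."""
    return (gene.strip(), normalize_protein_change(protein_change))


# Sample values taken from the grep outputs above.
cbioportal_key = match_key("AKT1", "p.E17K")  # from data_mutations_extended.txt
oncokb_key = match_key("AKT1", "E17K")        # from allAnnotatedVariants.txt

print(cbioportal_key == oncokb_key)
```

Values that already lack the prefix pass through unchanged, so the same function can be applied to both files.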
Best!
Kelsey
Hi Kelsey,
Thank you for getting back to me so quickly.
I am currently mapping the cBioPortal data to the OncoKB data, which I downloaded from datahub. When I try to map the two datasets, I find no matches between them. Could you please help me?