Which column, Tumor_Seq_Allele1 or Tumor_Seq_Allele2, should be used as the "alteration allele" when re-annotating TCGA MAF files from cBioPortal?

Vlachavas, Efstathios-Iason

Dec 14, 2021, 2:39:49 PM
to cbiop...@googlegroups.com

Dear cBioPortal community,

 

I have a rather general but tricky question about using mutation MAF files from different cancer types for re-annotation and downstream analysis with cBioPortal. As part of a current project testing a variant scoring pipeline, I fetched MAF files from several resources (here, the "extended" MAF file from the CPTAC CRC dataset):

 

# after loading the MAF file and keeping specific columns:

head(xx2)
   Hugo_Symbol Chromosome Start_Position End_Position Strand Variant_Type Reference_Allele
1:      PTPN22          1      114380884    114380884      +          SNP                A
2:        CD1D          1      158151458    158151458      +          SNP                G
3:       ZMYM1          1       35580705     35580705      +          SNP                C
4:       ITIH5         10        7679197      7679197      +          SNP                C
5:      LRRTM3         10       68687135     68687135      +          SNP                G
6:       CDH23         10       73553339     73553339      +          SNP                C
   Tumor_Seq_Allele1 Tumor_Seq_Allele2
1:                 A                 T
2:                 G                 A
3:                 C                 T
4:                 C                 T
5:                 G                 A
6:                 C                 A

 

My key question is this: to re-annotate the available mutations for specific patients, I need to extract specific columns, two of which are the reference (normal) allele and the tumor (alteration) allele. As shown above, there are two tumor allele columns, Tumor_Seq_Allele1 and Tumor_Seq_Allele2. Which one should I use as the "alteration allele", given the Reference_Allele column?

 

I also found some related posts and documentation:

https://www.biostars.org/p/424846/
https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#mutation-data
https://docs.gdc.cancer.gov/Data/File_Formats/MAF_Format/

However, I did not find a clear answer on how to proceed. Is there a direct approach? For example, since Tumor_Seq_Allele1 is the same as Reference_Allele, should I choose Tumor_Seq_Allele2?

 

Cheers,

 

Efstathios

 

Efstathios-Iason Vlachavas

Post-doc/Guest Scientist

 

German Cancer Research Center (DKFZ)

Foundation under Public Law

Im Neuenheimer Feld 280

69120 Heidelberg

Germany

phone: +49 6221 42-5123

fax:      +49 6221 42-5109

 

Efstathios-Ia...@dkfz-heidelberg.de  

www.dkfz.de

 

 


 

Ritika Kundra

Dec 28, 2021, 7:09:34 PM
to Vlachavas, Efstathios-Iason, cbiop...@googlegroups.com
Hi Efstathios,

Apologies for the delay.

For annotation, please use Reference_Allele as the reference allele and Tumor_Seq_Allele2 as the variant allele. As you noticed, Tumor_Seq_Allele1 is the same as Reference_Allele.

Thanks,
Ritika
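
For anyone who wants to apply this rule programmatically, here is a minimal sketch in Python (standard library only). It assumes a tab-separated MAF with the standard column names and optional comment lines starting with "#". The function names `pick_variant_allele` and `ref_alt_pairs` are hypothetical, and the fallback to Tumor_Seq_Allele1 when Tumor_Seq_Allele2 matches the reference is an added assumption, not part of the advice above. The same logic carries over to R with little change.

```python
import csv
import io


def pick_variant_allele(row):
    """Return the allele to treat as the alteration for one MAF row (a dict)."""
    ref = row["Reference_Allele"]
    allele2 = row["Tumor_Seq_Allele2"]
    # Rule from the thread: Tumor_Seq_Allele2 is the variant allele.
    if allele2 != ref:
        return allele2
    # Assumed fallback (not from the thread): if Allele2 equals the
    # reference, use Allele1 when it differs.
    allele1 = row["Tumor_Seq_Allele1"]
    if allele1 != ref:
        return allele1
    # Neither tumor allele differs from the reference: flag for review.
    return None


def ref_alt_pairs(maf_text):
    """Parse MAF text and return (gene, reference, variant) tuples."""
    # Skip comment lines such as a leading "#version ..." header.
    data_lines = (
        line for line in io.StringIO(maf_text) if not line.startswith("#")
    )
    reader = csv.DictReader(data_lines, delimiter="\t")
    return [
        (r["Hugo_Symbol"], r["Reference_Allele"], pick_variant_allele(r))
        for r in reader
    ]


# First two rows from the question, reduced to the relevant columns.
example_maf = (
    "#version 2.4\n"
    "Hugo_Symbol\tReference_Allele\tTumor_Seq_Allele1\tTumor_Seq_Allele2\n"
    "PTPN22\tA\tA\tT\n"
    "CD1D\tG\tG\tA\n"
)

print(ref_alt_pairs(example_maf))
# prints [('PTPN22', 'A', 'T'), ('CD1D', 'G', 'A')]
```

Rows where the function returns None have no tumor allele that differs from the reference and are worth inspecting by hand before re-annotation.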
