Uploading germline VCF data into cBioPortal


Abhijeet Malatpure

Feb 4, 2021, 4:08:54 PM
to cBioPortal for Cancer Genomics Discussion Group
Hello cBioPortal gurus,
I have a cache of VCF files for germline samples that I would like to upload into a study on my local cBioPortal instance. They all have the following format:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  858444493-fy5hF6Dg
chr1    51972   .       GGAC    G       18      PASS    .       GT:AD:DP        1/1:0,6:6
chr1    121269  .       CT      C       70      PASS    OLD_VARIANT=1:121269:CTT/CT     GT:AD:DP        0/1:7,6:13
chr1    275583  .       TTTTA   T       99      PASS    OLD_VARIANT=1:275583:TTTTATTTATTTA/TTTTATTTA    GT:AD:DP        0/1:7,18:25
chr1    609440  .       C       T       99      PASS    .       GT:AD:DP        0/1:25,7:32
chr1    630555  .       C       T       99      PASS    .       GT:AD:DP        0/1:21,21:42
chr1    638771  .       A       T       99      PASS    .       GT:AD:DP        0/1:21,15:36
chr1    739602  .       A       AT      65      PASS    .       GT:AD:DP        0/1:15,5:20
chr1    749482  .       G       A       99      PASS    .       GT:AD:DP        0/1:17,7:24
chr1    757414  .       CA      C       13      PASS    .       GT:AD:DP        0/1:4,2:6
chr1    779968  .       T       G       99      PASS    .       GT:AD:DP        1/1:0,33:33
chr1    782207  .       G       A       99      PASS    .       GT:AD:DP        0/1:8,13:21
chr1    788418  .       CAG     C       87      PASS    .       GT:AD:DP        1/1:0,29:29
....

where 858444493-fy5hF6Dg is the germline sample ID.
I used VCF2MAF to convert these VCF files to MAF with the sample ID as --vcf-normal-id, and then merged them together in the end. However, the metaImport process shows the following error:

ERROR: data_mutations.maf: lines [3, 5, 11, (11209 more)]: column 16: Value in column 'Tumor_Sample_Barcode' is invalid; value encountered: ''

How should I correctly process germline data to be uploadable into cBioPortal?

Regards,
Abhi

JJ Gao

Feb 8, 2021, 4:00:15 PM
to Abhijeet Malatpure, cBioPortal for Cancer Genomics Discussion Group
Hi Abhi,

Would you please include a few lines of your MAF file for us to take a look at?

Thanks,
-JJ

Abhijeet Malatpure

Feb 8, 2021, 5:41:25 PM
to cBioPortal for Cancer Genomics Discussion Group
Hi JJ,
Please find a germline MAF file attached here. Thank you for looking into this.

Best,
Abhi Malatpure
mutations_germline.maf.txt

JJ Gao

Feb 8, 2021, 6:40:16 PM
to Abhijeet Malatpure, cBioPortal for Cancer Genomics Discussion Group
Hi Abhi,

It looks like the "Tumor_Sample_Barcode" column is empty. This can be addressed by following the notes on sample IDs in the vcf2maf README: https://github.com/mskcc/vcf2maf. Please let us know if you have additional questions.

Best,
-JJ
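
For readers following along, here is a minimal sketch of what setting the sample IDs at conversion time could look like. It only builds the vcf2maf command line; it does not run it. The file names are hypothetical, and options such as the VEP paths and reference FASTA are left out, so treat it as an illustration rather than a complete invocation. `--tumor-id` sets the value written to Tumor_Sample_Barcode, and `--vcf-tumor-id` names the genotype column in the VCF to read from.

```python
import shlex


def vcf2maf_command(vcf_path, maf_path, sample_id):
    """Build a vcf2maf argument list that writes sample_id to Tumor_Sample_Barcode.

    Assumption: the sample column in the VCF header carries the same ID.
    """
    return [
        "perl", "vcf2maf.pl",
        "--input-vcf", vcf_path,
        "--output-maf", maf_path,
        "--tumor-id", sample_id,      # value for the Tumor_Sample_Barcode column
        "--vcf-tumor-id", sample_id,  # genotype column name in the VCF header
    ]


# Hypothetical file names built from the sample ID in the original post.
sample = "858444493-fy5hF6Dg"
cmd = vcf2maf_command(sample + ".vcf", sample + ".maf", sample)
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run` once the remaining options for the local vcf2maf and VEP installation are added.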

Malatpure, Abhijeet

Feb 8, 2021, 7:12:35 PM
to JJ Gao, cBioPortal for Cancer Genomics Discussion Group
I used VCF2MAF to convert the germline VCF files to MAF. However, because they are germline samples (not tumor), I mapped them to the Matched_Norm_Sample_Barcode field during conversion instead. 

Is this not the correct way to process it?

Regards,
Abhi

JJ Gao

Feb 8, 2021, 8:42:18 PM
to Malatpure, Abhijeet, cBioPortal for Cancer Genomics Discussion Group
I see. We are actually not importing Matched_Norm_Sample_Barcode, and Tumor_Sample_Barcode is required. A workaround would be to use the normal sample ID as the tumor ID.
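
If the MAF files have already been generated, the workaround can also be applied afterwards. The sketch below is one possible way to do it, not an official cBioPortal tool: it copies Matched_Norm_Sample_Barcode into Tumor_Sample_Barcode wherever the latter is empty. The function name is made up for illustration, and it assumes a tab-separated MAF with optional leading `#` comment lines followed by a header row.

```python
import io

TUMOR = "Tumor_Sample_Barcode"
NORMAL = "Matched_Norm_Sample_Barcode"


def fill_tumor_barcode(maf_in, maf_out):
    """Copy the normal barcode into empty tumor barcode cells.

    maf_in and maf_out are text file objects. Returns the number of rows changed.
    """
    tumor_col = normal_col = None
    filled = 0
    for line in maf_in:
        # Pass comment lines (e.g. "#version 2.4") and blank lines through untouched.
        if line.startswith("#") or not line.strip():
            maf_out.write(line)
            continue
        fields = line.rstrip("\r\n").split("\t")
        if tumor_col is None:
            # First non-comment line is the header: locate the two columns.
            tumor_col = fields.index(TUMOR)
            normal_col = fields.index(NORMAL)
        elif len(fields) > max(tumor_col, normal_col) and not fields[tumor_col].strip():
            fields[tumor_col] = fields[normal_col]
            filled += 1
        maf_out.write("\t".join(fields) + "\n")
    return filled


# Small in-memory demonstration with a made-up two-row MAF.
demo_in = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\tMatched_Norm_Sample_Barcode\n"
    "GENE1\t\t858444493-fy5hF6Dg\n"
    "GENE2\t\t858444493-fy5hF6Dg\n"
)
demo_out = io.StringIO()
print(fill_tumor_barcode(demo_in, demo_out), "rows updated")
```

For real files, open the merged MAF for reading and a new file for writing, pass both to the function, and then re-run the validator on the result.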