Re: questions about vcf2maf

Cyriac Kandoth

Oct 28, 2015, 1:19:19 PM
to Ewa A. Grabowska, cbiop...@googlegroups.com
Hi Ewa,

Let me CC the cBioPortal team, to make sure I'm not talking rubbish below...

1. Yes, cBioPortal expects a MAF format tab-delimited file, or at least the 4 minimum required columns explained here - https://github.com/cBioPortal/cbioportal/wiki/File-Formats#mutation-data.
2. vcf2maf uses VEP under the hood, which does not report Entrez IDs. Entrez IDs are not required by cBioPortal. Let us know if something breaks (e.g. the protein structure view) and we can show you how to backfill Entrez IDs.
3. Amino_Acid_Change is equivalent to HGVSp_Short, and I believe cBioPortal treats that as a valid alias. If it does not, let us know so we can fix it. As a workaround, you may have to rename the column.
4. Thanks for that bug report. We'll handle it separately as a vcf2maf issue - https://github.com/mskcc/vcf2maf/issues/24 - can you update that issue with a test variant where ALLELE_NUM is not reported?
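
For the column-rename workaround in point 3, here is a minimal sketch in Python. It assumes the MAF is tab-delimited, that any lines starting with "#" before the header are comments to be copied through unchanged, and that the column names are exactly HGVSp_Short and Amino_Acid_Change as mentioned above. The function name rename_maf_column is made up for illustration and is not part of vcf2maf or cBioPortal.

```python
import io


def rename_maf_column(src, dst, old="HGVSp_Short", new="Amino_Acid_Change"):
    """Copy a tab-delimited MAF from src to dst, renaming one header column.

    src and dst are open text file objects. Leading comment lines (starting
    with '#') are copied through unchanged; the first non-comment line is
    treated as the header. Data rows are never modified.
    """
    header_done = False
    for line in src:
        if not header_done and line.startswith("#"):
            dst.write(line)  # comment line, pass through
            continue
        if not header_done:
            names = line.rstrip("\r\n").split("\t")
            if old not in names:
                raise ValueError(f"column {old!r} not found in MAF header")
            names = [new if name == old else name for name in names]
            dst.write("\t".join(names) + "\n")
            header_done = True
        else:
            dst.write(line)


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass open() file handles.
    maf = io.StringIO("Hugo_Symbol\tHGVSp_Short\nTP53\tp.R175H\n")
    out = io.StringIO()
    rename_maf_column(maf, out)
    print(out.getvalue(), end="")
```

Only the header line is rewritten, so this is only needed if cBioPortal turns out not to accept HGVSp_Short as an alias.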

Cheers!

~Cyriac

On Fri, Oct 23, 2015 at 11:30 AM, Ewa A. Grabowska <egrab...@nygenome.org> wrote:
Hi Cyriac,

I’m a bioinformatics programmer at the New York Genome Center. I have been trying to set up a local version of cBioPortal, but unfortunately some of the SnpEff/GATK annotations are not really cbio-friendly. Therefore I decided to try out your tool, vcf2maf. I hope you can help me with a few problems that I’ve been having.

  1. First of all, my assumption is that the output from vcf2maf should be compatible with cbioportal: is that right?
  2. cBioPortal relies on Entrez gene IDs being provided, but vcf2maf seems to be setting all Entrez IDs to '0' ($maf_line{Entrez_Gene_Id} = '0';)
  3. Cbioportal takes an AA_change column which seems to be missing in vcf2maf output files.
  4. ALLELE_NUM for some reason is not always present in the *.vep.vcf file, which causes some downstream problems. Precisely, this fragment is crashing:
       # Skip effects on other ALT alleles
       push( @all_effects, \%effect ) if( $effect{ALLELE_NUM} == $var_allele_idx );
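
To narrow down which records trigger the problem in point 4, a rough Python sketch like the one below could scan the *.vep.vcf for variants whose VEP annotation has no ALLELE_NUM value. It assumes the usual VEP layout: a `##INFO=<ID=CSQ,...Format: a|b|c">` header line naming the pipe-separated sub-fields, and a `CSQ=` key in the INFO column with comma-separated entries. The function names are hypothetical, and this is not how vcf2maf itself parses the file.

```python
import io
import re


def csq_fields(header_line):
    """Return the CSQ sub-field names listed after 'Format:' in the header."""
    match = re.search(r'Format: ([^">]+)', header_line)
    return match.group(1).strip().split("|") if match else []


def records_missing_allele_num(vcf):
    """Return (chrom, pos, ref, alt) for records lacking an ALLELE_NUM value.

    A record is reported if it has no CSQ key, if the CSQ header does not
    list ALLELE_NUM, or if any CSQ entry has an empty ALLELE_NUM slot.
    """
    fields, missing = [], []
    for line in vcf:
        line = line.rstrip("\r\n")
        if line.startswith("##INFO=<ID=CSQ"):
            fields = csq_fields(line)
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        key = (cols[0], cols[1], cols[3], cols[4])
        # INFO is the 8th VCF column; find the CSQ=... item if there is one.
        csq = next((kv[4:] for kv in cols[7].split(";") if kv.startswith("CSQ=")), None)
        idx = fields.index("ALLELE_NUM") if "ALLELE_NUM" in fields else None
        if csq is None or idx is None:
            missing.append(key)
            continue
        for entry in csq.split(","):
            values = entry.split("|")
            if idx >= len(values) or values[idx] == "":
                missing.append(key)
                break
    return missing


if __name__ == "__main__":
    rows = [
        '##INFO=<ID=CSQ,Number=.,Type=String,Description="Format: Allele|Consequence|ALLELE_NUM">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]),
        "\t".join(["chr1", "100", ".", "A", "T", ".", ".", "CSQ=T|missense_variant|1"]),
        "\t".join(["chr1", "200", ".", "G", "C", ".", ".", "CSQ=C|intron_variant|"]),
    ]
    print(records_missing_allele_num(io.StringIO("\n".join(rows) + "\n")))
```

Any record it reports would be a candidate test variant to attach to a bug report.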

Thanks so much in advance!
Best,

Ewa Grabowska


Ewa Grabowska | Bioinformatics Programmer
New York Genome Center
101 Avenue of the Americas
New York, NY 10013 
O: (646) 977-7039 | M: (917) 753-6442 
egrab...@nygenome.org
