MutSig reference files not accessible for download


29april...@gmail.com

Jul 2, 2018, 9:13:04 AM
to Gdac-users

Hello,


I’m running MutSigCV 1.4.1 and cannot access all the required reference files, either from the download page or from the “How to run MutSigCV” page.

I assume I need the following files:

Starting with v1.3 of the code, MutSigCV has a preprocessing module that takes care of organizing the "categ" and "effect" information.  This makes it easy to run MutSigCV when all you have is a MAF file.

To run MutSigCV in this way, please first download the following four reference files:

  • genome reference sequence: chr_files_hg18.zip or chr_files_hg19.zip
    • unzip this file to yield a directory (chr_files_hg18/ or chr_files_hg19/) of chr*.txt files
  • mutation_type_dictionary_file.txt
  • exome_full192.coverage.txt.zip
    • unzip this file to yield exome_full192.coverage.txt
  • gene.covariates.txt

 

 

This is the command I used. For this test run with LUAD data, the mutation type dictionary file is reported as missing.

./run_MutSigCV.sh /mnt/isilon/cbmi/variome/gaonkark/bcbio/anaconda/MutSigCV_1.41/MATLAB_RUN_Compliler/v901 LUSC.maf LUSC.coverage.txt LUSC.coverage.txt output.txt

------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/mnt/isilon/cbmi/variome/gaonkark/bcbio/anaconda/MutSigCV_1.41/MATLAB_RUN_Compliler/v901/runtime/glnxa64:/mnt/isilon/cbmi/variome/gaonkark/bcbio/anaconda/MutSigCV_1.41/MATLAB_RUN_Compliler/v901/bin/glnxa64:/mnt/isilon/cbmi/variome/gaonkark/bcbio/anaconda/MutSigCV_1.41/MATLAB_RUN_Compliler/v901/sys/os/glnxa64:/mnt/isilon/cbmi/variome/gaonkark/bcbio/anaconda/MutSigCV_1.41/MATLAB_RUN_Compliler/v901/sys/opengl/lib/glnxa64

======================================
  MutSigCV
  v1.4

  (c) Mike Lawrence and Gaddy Getz
  Broad Institute of MIT and Harvard
======================================

MutSigCV: PREPROCESS
--------------------
Loading mutation_file...
Loading coverage file...
Processing mutation "effect"...
NOTE:  This version now ignores "is_coding" and "is_silent".
       Requires Variant_Classification/type column and mutation_type_dictionary so we can assign nulls.
Error using MutSigCV>MutSig_preprocess (line 291)
missing mutation_type_dictionary_file

Error in MutSigCV (line 184)

 

Please let me know how to access these files, and whether you need more information from me about the issue.
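
For anyone hitting the same "missing mutation_type_dictionary_file" error, it may help to confirm which reference files are actually on disk before launching a run. The following is a minimal sketch, not part of MutSigCV: the helper name `missing_references` and the idea of keeping all references in a single directory are assumptions made for illustration. The file names are the ones from the reference list quoted above (with hg19 picked as an example).

```python
from pathlib import Path

# Names taken from the reference list quoted above; hg19 is chosen as an example.
REQUIRED_REFERENCES = [
    "chr_files_hg19",                     # unzipped directory of chr*.txt files
    "mutation_type_dictionary_file.txt",
    "exome_full192.coverage.txt",         # unzipped from exome_full192.coverage.txt.zip
    "gene.covariates.txt",
]


def missing_references(reference_dir, required=REQUIRED_REFERENCES):
    """Return the required reference names that do not exist under reference_dir.

    Hypothetical helper: it only checks that each file or directory exists,
    not that its contents are valid.
    """
    base = Path(reference_dir)
    return [name for name in required if not (base / name).exists()]
```

Calling `missing_references("/path/to/references")` returns an empty list when everything is in place, and otherwise the names still to be obtained.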

 

Thank you,

Krutika

 

Michael Noble

Jul 5, 2018, 9:03:55 PM
to 29april...@gmail.com, Gdac-users
Dear Krutika,

I'm sorry you're having difficulty with this tool. In the past the CGA group maintained a help forum at https://software.broadinstitute.org/cancer/cga, but that site is under transition and not being actively monitored.  No one in the current GDAC is an author of that tool, so the best advice I can give is that you try to contact the corresponding author(s) at the address given in the MutSig paper.

Best,
Mike
---
Michael S. Noble
Associate Director for Data Science
Cancer Genome Computational Analysis
Broad Institute of MIT and Harvard


Krutika Gaonkar

Jul 6, 2018, 2:36:46 PM
to Michael Noble, Gdac-users
Hello Michael,

Thank you for getting back to me. I'll get in contact with the authors of the tool as you suggested.

Thanks,
Krutika

beth.bou...@gmail.com

Jul 26, 2018, 4:41:23 PM
to Gdac-users
Dear Krutika,

I am wondering if you were able to get in touch with the authors of the MutSig tool to obtain the file you needed.  I am also looking for a copy of the mutation_type_dictionary_file.txt that was previously available from the Broad Institute.

Hoping that you were able to solve your problem, so that you can help me : )

--
Beth B.

Krutika Gaonkar

Jul 27, 2018, 10:56:22 AM
to beth.bou...@gmail.com, Gdac-users
Hi Beth,

Julian Hess <jh...@broadinstitute.org> pointed me to the latest version of MutSig2CV, which can be downloaded using the link below (it includes the scripts and the reference files needed, including mutation_type_dictionary_file.txt).


I downloaded the tar file and was able to install and run the tool successfully by following the README file.

Hope that helps!

Thanks,
Krutika



Qi

Sep 16, 2018, 10:30:10 AM
to Gdac-users
Hi Krutika,

I ran into the same trouble you described earlier when trying to use MutSigCV. I need the reference files, including exome_full192.coverage.txt and mutation_type_dictionary_file.txt. Unfortunately, I couldn't open the MutSig2CV URL you provided. Could you please send the tar file via email? My email address is karen...@gmail.com.

Thanks,
Qi


mckf...@gmail.com

Nov 15, 2018, 10:37:20 AM
to Gdac-users, beth.bou...@gmail.com
Hi Krutika,

I could not download from the link you shared. Would you be able to send the file to my email, mckf...@gmail.com? Thank you very much!

Best,

Wenhu


Gdac-users

Nov 15, 2018, 11:46:55 AM
to Gdac-users, beth.bou...@gmail.com
Hi Wenhu,

The link has changed due to internal IT changes, apologies for the confusion. The updated link is here. I'll try asking the developers to update their software page.

Regards,
David

--
David Heiman
Senior Software Engineer
GDAN Processing Genome Data Analysis Center
CPTAC Proteogenomic Data Analysis Center
The Broad Institute of MIT and Harvard
415 Main Street
Cambridge, MA 02142

Wenhu Cao

Nov 15, 2018, 11:53:37 AM
to gdac-...@broadinstitute.org
Hi David,

Thanks very much for your prompt reply and useful link!

BW,

Wenhu



--
Wenhu Cao (曹文虎)

Molecular Diagnostics of Oncogenic Infections
PhD student

Wechat: mckf111
 
German Cancer Research Center (DKFZ)
Foundation under Public Law
Im Neuenheimer Feld 280
69120 Heidelberg
Germany


kog...@gmail.com

Dec 4, 2018, 8:03:50 PM
to Gdac-users
Hi,

Thank you for the useful MutSig2CV link above.
It ran almost correctly in my environment, and I'm wondering whether I can configure the gene target list.
Should I override reference/target_list.hg19.v1a.txt, or pass some argument to the MutSig2CV executable?

Also, I couldn't find the test directory, which should contain test/input/params.txt etc.
Is it provided separately somewhere?

Best,
Yasunori Kogure

ja...@kubikova.eu

Feb 8, 2019, 9:58:23 AM
to Gdac-users, beth.bou...@gmail.com
Hi,
I downloaded MutSig2CV.tar from the link below, but after unpacking it I found neither the MutSigCV.m nor the run_MutSigCV.sh mentioned in the installation guide. Can you help me out, please?
Thanks
Jana

Gdac-users

Feb 8, 2019, 10:13:26 AM
to Gdac-users, beth.bou...@gmail.com
Hi Jana,

Please follow the instructions in the README.txt file in the archive you downloaded. The website has not been updated with documentation for MutSig2CV.

Regards,
David


guog...@gmail.com

Feb 13, 2019, 12:54:46 PM
to Gdac-users
Hi,

Did you solve the problem? I also need to find the parameter file, because I really do not want MutSig2CV to remove any duplicated samples.

Thank you!

Ying

guog...@gmail.com

Feb 13, 2019, 12:54:46 PM
to Gdac-users
Hi there,

Thank you for the link to MutSig2CV. It works! The only thing I'm missing is the test/ folder containing the parameter file. Could anyone who has it post it? For reference, below is what I copied from the README file. Thanks in advance!



**************************
*
* MUTSIG CONFIGURATION:
*
**************************

MutSig's algorithm and run parameters are configured via a two column, tab-
delimited text file.  Here is a list of all available parameters, possible
options, and default values.  A sample parameters file (set to defaults) can
be found in test/input/params.txt.

number_of_categories_to_discover: Integer.  Number of SNV categories to discover
  (see "mutcategs.txt" section above for details.) Default: 5

skip_permutations: Boolean.  If true, MutSig will not perform
  clustering/functional significance tests; pCL and pFN will be set to NaN.
  Default: false

maxperm: Integer.  Specifies maximum number of permutations to perform for
  pCL/pFN permutation tests.  Default: 1e5

remove_duplicate_patients: Boolean.  Finds and removes duplicate patients in the
  cohort by comparing mutation overlap between patients.  This should be
  disabled when high levels of overlap are expected between samples (e.g.
  primary/met combined cohorts).  Default: true
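
Until a sample file is posted, a parameters file can be put together from the description above. This is only a sketch under stated assumptions: it covers just the four parameters quoted here, the values are the defaults the README excerpt lists, and the exact spelling MutSig2CV accepts for booleans and numbers (`true`/`false`, `1e5`) is assumed rather than confirmed. The helper name `write_params` is made up for illustration.

```python
# Defaults copied from the README excerpt above. How MutSig2CV parses these
# literals is an assumption; compare against test/input/params.txt if available.
DEFAULT_PARAMS = {
    "number_of_categories_to_discover": "5",
    "skip_permutations": "false",
    "maxperm": "1e5",
    "remove_duplicate_patients": "true",
}


def write_params(path, **overrides):
    """Write a two-column, tab-delimited parameters file and return the values used."""
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**DEFAULT_PARAMS, **overrides}
    with open(path, "w", newline="\n") as handle:
        for name, value in params.items():
            handle.write(f"{name}\t{value}\n")
    return params
```

For a cohort where duplicate removal should be switched off, the call would be `write_params("params.txt", remove_duplicate_patients="false")`.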

Gdac-users

Feb 13, 2019, 2:07:17 PM
to Gdac-users
Hi Ying,

I've contacted the developer, and will let you know when the archive has been updated with an example params.txt file.

Regards,
David


Y

Feb 13, 2019, 7:36:12 PM
to gdac-...@broadinstitute.org
Thank you David. Really appreciate your help!

Have a nice evening.

Ying 


ja...@kubikova.eu

Feb 16, 2019, 7:08:23 AM
to Gdac-users
Thank you very much David.
I managed to install and run MutSig2CV based on the README.txt.

The results, however, are completely different from those of the online MutSigCV 1.3 tool (run on the same dataset).
The reason might be that I supplied neither a coverage file nor a covariates file. While previous versions of MutSigCV used these files, the current installation (linked above) does not seem to require them. README.txt does not discuss the topic.

So my questions are:
  • do I need those files? e.g.,
    • am I supposed to create (CovGen) and supply a file with the coverage information?
    • the same for the covariates file?
  • if yes, how do I specify the files when invoking MutSig2CV from the command line? In the params file?
    • MutSig2CV <MAF file> <output directory> [params file]
Best regards,
Jana Kubikova

Gdac-users

Mar 25, 2019, 3:52:26 PM
to Gdac-users
Hi Jana,

Thank you for your patience. MutSig2CV is not simply a new version of MutSigCV, but a collection of new and updated algorithms, so its results are not expected to match those of MutSigCV. For differences within our own results, please see firebrowse.org.

The coverage files are no longer required, and the covariates table is baked into the files you downloaded (reference/covariates_transformed.v5a.txt).
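
In practice this means the invocation needs only the MAF file, an output directory, and optionally a params file, as in the usage line quoted earlier in the thread. A minimal sketch of assembling that command from Python follows. The function name and the file names in the example are hypothetical, and the executable is assumed to be reachable as `MutSig2CV`; adjust the path to wherever the archive was unpacked.

```python
import shlex


def mutsig2cv_command(maf_file, output_dir, params_file=None, executable="MutSig2CV"):
    """Build the argument list: MutSig2CV <MAF file> <output directory> [params file].

    No coverage or covariates arguments are added, since the usage line
    quoted in this thread does not take them.
    """
    args = [executable, maf_file, output_dir]
    if params_file is not None:
        args.append(params_file)
    return args


# Example with made-up file names; prints the command line that would be run.
print(shlex.join(mutsig2cv_command("cohort.maf", "results", "params.txt")))
```

The returned list can be handed to `subprocess.run(args, check=True)` to launch the run.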

More details and publication references can be found in our FAQ.

Regards,
David