This repository contains a simple Excel spreadsheet for creating oncoplots to illustrate genetic mutations in a patient population. It does not require or depend on any other software to use. Edit the gray-shaded cells with a designated mutation type ID (below) to identify the observed mutation (rows) for each patient (columns).
This repository contains a simple Excel spreadsheet for creating swimmer plots to illustrate patient stories. It does not require or depend on any other software to use. Edit any of the cells with blue text to customize the swimmer plots, which are available in both horizontal and vertical orientations.
The CoMut plot is widely used in cancer research publications as a visual summary of mutational landscapes in cancer cohorts. This summary plot displays gene mutation rates and sample mutation burden alongside relevant clinical details, and inspecting it is a common first step in analyzing the recurrence and co-occurrence of gene mutations across samples. The cBioPortal and iCoMut are two web-based tools that allow users to create intricate visualizations from pre-loaded TCGA and ICGC data. For custom data analysis, only a limited number of command-line packages are currently available, making the production of CoMut plots difficult, especially for researchers without advanced bioinformatics skills. To address the need for comparing custom data with TCGA/ICGC data, we have created CoMutPlotter, a web-based tool for producing publication-quality graphs in an easy-to-use and automatic manner.
We introduce a web-based tool named CoMutPlotter to lower the barriers between complex cancer genomic data and researchers, providing intuitive access to mutational profiles from TCGA/ICGC projects as well as custom cohort studies. To help translate cancer mutation profiles into biological insights and clinical applications, CoMutPlotter supports a wide variety of file formats, including Mutation Annotation Format (MAF), tab-separated values (TSV) and Variant Call Format (VCF) files.
In summary, CoMutPlotter is the first tool of its kind that accepts VCF files, the most widely used file format, as input. CoMutPlotter also provides the much-requested function of comparing mutation patterns between a custom cohort and TCGA/ICGC projects. Contributions of COSMIC mutational signatures in individual samples are also included in the summary plot, which is a unique feature of our tool.
With the rapid evolution of next-generation sequencing (NGS) technologies combined with dropping costs, whole-exome sequencing (WES) has become a widely accepted application for clinical research and diagnostic purposes. In the past few years, over 10,000 exomes across 40 distinct types of human cancer were generated by The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). The Broad Institute has released the GATK Best Practice workflow tailored to somatic variant discovery. Researchers can follow this standardized analysis protocol, making their results comparable to those of TCGA/ICGC projects. Variant annotation is a relatively mature and tractable task thanks to state-of-the-art packages such as ANNOVAR [1], VEP [2], SnpEff [3], and Oncotator [4]. However, visualizing and interpreting genomic data from high-throughput technologies in an intuitive and convenient way remains challenging. Inconsistent file formats used for mutation profiles may introduce additional problems in subsequent data integration, visualization and comparison.
The CoMut plot [5,6,7] is widely used in cancer research publications as a visual summary of mutational landscapes in cancer cohorts. This summary plot displays gene mutation rates and sample mutation burden alongside relevant clinical details, and inspecting it is a common first step in analyzing the recurrence and co-occurrence of gene mutations across samples. Two web-based applications, the cBioPortal [8] and iCoMut, allow users to create intricate visualizations from pre-loaded TCGA data. For custom data analysis, only certain file formats such as MAF and TSV are supported at this stage, and only through command-line packages [6, 7], making the production of customizable plots difficult, especially for non-bioinformatics researchers.
To address the need for comparing custom data with TCGA/ICGC data, we have created CoMutPlotter, a web-based tool for producing publication-quality graphs and for translating cancer mutation profiles into biological insights and clinical applications. CoMutPlotter supports a wide variety of file formats, including Mutation Annotation Format (MAF), tab-separated values (TSV) and Variant Call Format (VCF) files. It is worth noting that CoMutPlotter is the first tool of its kind that directly supports VCF files, the dominant output format of variant discovery pipelines such as the GATK Toolkit [9], VarScan [10], and SAMtools [11]. Deciphering signatures of the mutational processes in human cancer is a new trend in the cancer research community [12,13,14] because these signatures are footprints of molecular aberrations occurring in tumors. Alexandrov et al. identified a list of 30 reference signatures, about half of which can be attributed either to endogenous processes, such as enzymatic activity of DNA cytidine deaminases (AID/APOBEC), deficiency of DNA mismatch repair, or mutations in POLE, or to exogenous mutagens such as tobacco, ultraviolet light and toxic chemicals [15].
Our specific aim in constructing CoMutPlotter is to lower the barriers between complex cancer genomic data and researchers. In addition to showing the mutation burden and mutation types of individual samples, we also allow the user to plot clinical features alongside their respective samples, providing intuitive access to mutational profiles from TCGA/ICGC as well as custom cohort studies together with their clinical attributes. CoMutPlotter also provides the much-requested function of comparing mutational landscapes between a custom cohort and TCGA/ICGC projects. To gain insight into the mutational processes that have altered the cancer genome, contributions of COSMIC signatures are quantified at sample resolution and integrated into the summary plot as a dot matrix, which is a unique feature of CoMutPlotter. CoMutPlotter is freely available at
CoMutPlotter provides an intuitive web interface for receiving mutation profiles obtained from cancer sequencing projects. Mutation Annotation Format (MAF) is widely used in TCGA cancer studies for storing mutation profiles and is also the basis for many downstream analyses such as variant annotation, driver gene detection, mutual exclusivity analysis, and mutational signature identification. In addition to MAF files, CoMutPlotter includes functions to convert ICGC tab-separated values (TSV) files and standard Variant Call Format (VCF) files to MAF, making the tool accessible to a wider range of researchers. CoMutPlotter not only provides complete functions for performing the analyses mentioned above but also offers an interactive framework to present and summarize the important characteristics of the multidimensional analysis results from a custom cancer cohort. For the convenience of comparative analysis between custom data and TCGA/ICGC data, 73 mutation profiles were downloaded from the TCGA and ICGC Data Portals and compiled as a pre-loaded database. PHP and R scripts are used to summarize all the generated results into an integrative plot that captures the global characteristics of a mutation profile and reveals the co-occurrence of mutations across samples. Download links are also provided for publication-quality figures, the list of significantly mutated genes and a detailed annotation table (Fig. 1).
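As an illustration of the two per-cohort summaries such an integrative plot is built around, the following minimal sketch computes the fraction of samples mutated per gene and the mutation count per sample from MAF-like records. It is not the PHP/R code used by CoMutPlotter; the record layout, the sample data and the function name `summarize` are assumptions made for this example, and samples without any mutation are assumed absent from the input.

```python
from collections import defaultdict

# Hypothetical MAF-like records: (sample ID, gene symbol, variant classification).
records = [
    ("S1", "TP53", "Missense_Mutation"),
    ("S1", "KRAS", "Missense_Mutation"),
    ("S2", "TP53", "Nonsense_Mutation"),
    ("S3", "KRAS", "Missense_Mutation"),
    ("S3", "KRAS", "Silent"),
]

def summarize(records):
    """Return (gene -> fraction of samples mutated, sample -> mutation count)."""
    samples = {sample for sample, _gene, _cls in records}
    samples_by_gene = defaultdict(set)
    burden = defaultdict(int)
    for sample, gene, _cls in records:
        samples_by_gene[gene].add(sample)  # a gene counts once per sample
        burden[sample] += 1                # every record adds to the burden
    rate = {g: len(s) / len(samples) for g, s in samples_by_gene.items()}
    return rate, dict(burden)

gene_rate, sample_burden = summarize(records)
```

Counting each gene at most once per sample keeps the gene mutation rate a cohort-level frequency, while the sample burden counts every recorded mutation.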
CoMutPlotter accepts the three dominant formats for mutation profiles: MAF, TSV, and VCF. To make data management and analysis more efficient, mutation profiles in other formats are converted to MAF before entering subsequent analyses. A custom script for file format conversion is available for download ( _tutorial/implementation.html#for-custom-study-with-large-number-of-vcf-files) for users dealing with a study cohort that has a large number of VCF files. To perform in-depth comparisons between clinical features or study designs within a cancer cohort, a demographic profile can also be uploaded along with the mutation profiles. Detailed instructions on the usage of the custom script and the acceptable format of the demographic file can be found on the tutorial page ( _data_input).
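To make the idea of VCF-to-MAF conversion concrete, the sketch below maps single-nucleotide variants from VCF data lines onto a handful of standard MAF columns. It is not the authors' conversion script: the function name `vcf_snvs_to_maf_rows` and the sample lines are hypothetical, indels are deliberately skipped because their coordinates differ between the two formats, and gene-level columns such as Hugo_Symbol and Variant_Classification are omitted because they require a variant annotator.

```python
BASES = {"A", "C", "G", "T"}

def vcf_snvs_to_maf_rows(vcf_lines, sample_id):
    """Convert SNV records from VCF text lines into minimal MAF-like dicts."""
    rows = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header line
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):  # one row per alternate allele
            if ref in BASES and allele in BASES:
                rows.append({
                    "Chromosome": chrom,
                    "Start_Position": int(pos),
                    "End_Position": int(pos),
                    "Reference_Allele": ref,
                    "Tumor_Seq_Allele2": allele,
                    "Variant_Type": "SNP",
                    "Tumor_Sample_Barcode": sample_id,
                })
    return rows

# Hypothetical input: two SNV sites (one multi-allelic) and one deletion.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "17\t7577120\t.\tC\tT\t.\tPASS\t.",
    "12\t25398284\t.\tC\tA,G\t.\tPASS\t.",
    "1\t100\t.\tAT\tA\t.\tPASS\t.",
]
maf_rows = vcf_snvs_to_maf_rows(example_vcf, "SAMPLE_01")
```

Splitting multi-allelic sites into one row per alternate allele mirrors the one-variant-per-row layout of MAF files.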
International cancer projects underway through The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) aim to establish a comprehensive catalogue of cancer-associated genes across all cancer types. However, most existing analytical methods fail to account for the mutational heterogeneity that affects the background mutation rate, which may lead to the identification of many spurious genes. Lawrence et al. developed a method named MutSigCV [17] to address mutational heterogeneity, which is correlated with transcriptional activity, DNA replication timing, and mutation frequency variability across patients. To facilitate the identification of genes truly associated with cancer and to make driver gene detection more accessible to users, CoMutPlotter incorporates MutSigCV as a critical analysis module. The mutation profiles uploaded by users are converted to MAF format as mentioned above and then subjected to MutSigCV to determine significantly mutated genes with a false discovery rate (q-value) less than or equal to 0.1. Since the mutation profiles of 73 cancer projects have been downloaded from the TCGA/ICGC Data Portals, we also applied MutSigCV to identify driver genes in the individual cancer projects. Based on these pre-calculated results, users can easily compare the resulting gene lists between a custom study cohort and published cancer projects.
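The q-value threshold and the gene list comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the result records, their `gene` and `q` keys, the example values and the function name `significant_genes` are hypothetical and do not reproduce MutSigCV output or CoMutPlotter code.

```python
Q_CUTOFF = 0.1  # false discovery rate threshold stated in the text

def significant_genes(results, cutoff=Q_CUTOFF):
    """Return the set of genes whose q-value is at or below the cutoff."""
    return {row["gene"] for row in results if row["q"] <= cutoff}

# Hypothetical per-gene results for a custom cohort and a reference project.
custom_results = [
    {"gene": "TP53", "q": 0.0001},
    {"gene": "KRAS", "q": 0.1},
    {"gene": "TTN", "q": 0.85},
]
reference_results = [
    {"gene": "TP53", "q": 0.00001},
    {"gene": "PIK3CA", "q": 0.02},
    {"gene": "KRAS", "q": 0.4},
]

custom_sig = significant_genes(custom_results)
reference_sig = significant_genes(reference_results)

shared = custom_sig & reference_sig       # significant in both cohorts
custom_only = custom_sig - reference_sig  # significant only in the custom cohort
```

Set operations make the comparison symmetric: swapping the operands of the difference yields the genes significant only in the reference project.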
Mutational signatures are patterns of somatic mutations hidden in cancer genomes, which can be represented as different combinations of the 96 possible trinucleotide mutation contexts. Each mutational signature may be associated with specific mutational processes resulting from exogenous and endogenous mutagens such as ultraviolet radiation, tobacco-related exposures and abnormal enzyme activity. To date, 30 distinct mutational signatures have been identified and catalogued in the COSMIC database using the WTSI Mutational Signature Analysis Framework [12]. However, the existing WTSI analysis framework requires large cohorts and substantial computing resources. Moreover, quantifying known signatures in individual samples is not possible under the current WTSI framework when sample sizes are small. For the identification and quantification of known signatures, the R package deconstructSigs [18] was used to determine the composition of mutational signatures in individual tumor samples. A dot matrix plot shows the percentage contribution of the identified signatures in each sample. The proposed etiology of each signature can be downloaded as a summary table, which may help to explore combinations of mutational signatures that are representative of distinct groups of patients, to identify potential therapeutic targets and to reveal new connections between mutational processes and clinical features.
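The count of 96 contexts follows from 6 pyrimidine-referenced substitution types multiplied by 4 possible 5' bases and 4 possible 3' bases. The minimal sketch below enumerates them and assigns a single-base substitution to its context; the label format (for example `A[C>T]G`) and the helper name `context_label` are choices made for this illustration, not part of deconstructSigs or CoMutPlotter.

```python
from itertools import product

BASES = "ACGT"
# Substitutions are conventionally reported with a pyrimidine (C or T) reference.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# 6 substitution types x 4 five-prime bases x 4 three-prime bases = 96 contexts.
CONTEXTS = [
    f"{five}[{sub}]{three}"
    for sub in SUBSTITUTIONS
    for five, three in product(BASES, repeat=2)
]

def context_label(trinucleotide, alt):
    """Label a substitution at the middle base of a reference trinucleotide."""
    if trinucleotide[1] in "AG":
        # Purine reference: report the mutation on the opposite strand.
        trinucleotide = "".join(COMPLEMENT[b] for b in reversed(trinucleotide))
        alt = COMPLEMENT[alt]
    five, ref, three = trinucleotide
    return f"{five}[{ref}>{alt}]{three}"

label_forward = context_label("ACG", "T")  # pyrimidine reference, used as is
label_reverse = context_label("TGA", "A")  # purine reference, strand-flipped
```

Folding purine-referenced mutations onto the opposite strand is what reduces the 12 possible substitutions to 6 and the 192 strand-specific contexts to 96.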