Re: MARATHON pipeline

Gene Urrutia

Jun 14, 2018, 12:53:04 PM
to Jiang, Yuchao, canopy_phylogeny, Oliver Zill, n...@wharton.upenn.edu
Hi Oliver,

Thank you for your interest in Canopy and the Marathon pipeline.

Canopy uses data from multiple temporally and/or spatially separated tumor samples with matched controls.  Data from a single tumor sample with its matched control are insufficient to infer tumor phylogeny, so Canopy needs a minimum of 2 tumor samples.

FALCON can be run with a single tumor/normal pair to infer allele-specific copy number.

Please see Table 1 at https://github.com/yuchaojiang/MARATHON for more information on the inputs to each analysis type.

I am copying this to the canopy user group.

Thanks,
Gene

On Wed, Jun 13, 2018 at 6:34 PM, Jiang, Yuchao <yuc...@email.unc.edu> wrote:
Hi Oliver,

Thanks for your interest. Gene, who maintains MARATHON, is cc’ed here and can help you with your questions.

Yuchao

On Jun 13, 2018, at 4:07 PM, Oliver Zill <zill....@gene.com> wrote:

Dear Drs. Jiang and Zhang,

I am interested in trying out your MARATHON pipeline software for tumor copy number and phylogeny estimation on some cancer samples.  First of all, thank you very much for putting all these tools together and making them open source; I really appreciate it.  In my particular case, I am interested in determining allele-specific copy number and mutation clonality from paired tumor/normal WES, but using single samples only (i.e., a single tumor region plus matched normal, not multiple tumor samples as input).  Do FALCON and Canopy both support single-sample T/N analysis?  My impression from reading the Canopy paper is that this tool operates on multi-sample data when it reconstructs the tumor phylogeny.  It is a bit difficult to tell from the MARATHON GitHub page and Bioinformatics paper whether single-T/N analysis is supported.  Please let me know if it is possible to use it on single-sample data and, if so, whether the documentation has any pointers on how to properly configure the pipeline for single-sample vs. multi-sample analysis.
Thanks very much,

Oliver Zill

--
Oliver Zill, PhD
Bioinformatics Scientist 
Genentech, Inc.


Oliver Zill

Jun 14, 2018, 1:02:59 PM
to Gene Urrutia, Jiang, Yuchao, canopy_phylogeny, Oliver Zill, n...@wharton.upenn.edu
Hi Gene,

Thanks for your reply.  Can you also tell me whether MARATHON supports the hg38 assembly, in addition to hg19?
Thanks again,

Oliver

Jiang, Yuchao

Jun 14, 2018, 2:28:37 PM
to Oliver Zill, Gene Urrutia, Jiang, Yuchao, canopy_phylogeny, n...@wharton.upenn.edu
Hi Oliver,

FALCON takes as input allele-specific read counts at germline heterozygous loci, and MARATHON takes as input somatic SNVs and CNAs (the CNAs are output by FALCON). All of these inputs are independent of the genome assembly: the choice of reference genome only affects the calling of germline heterozygous loci and somatic SNVs.

Hope this clarifies.
Yuchao
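
To illustrate the kind of input described above (allele-specific read counts at germline heterozygous loci), here is a minimal Python sketch. It is not FALCON's or MARATHON's API; the `LocusCounts` type, the function names, and the depth and allele-fraction thresholds are all hypothetical choices made for illustration. The point is that once the loci have been called against some reference, the counts themselves carry no assembly-specific information beyond the coordinate labels.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocusCounts:
    """Read counts at one SNP position (hypothetical container, not a FALCON type)."""
    chrom: str
    pos: int          # coordinate in whichever assembly the reads were aligned to
    normal_ref: int   # reads supporting the reference allele in the normal sample
    normal_alt: int   # reads supporting the alternate allele in the normal sample
    tumor_ref: int
    tumor_alt: int


def is_germline_het(locus: LocusCounts, min_depth: int = 20,
                    low: float = 0.3, high: float = 0.7) -> bool:
    """Call a locus heterozygous in the normal sample.

    The thresholds are illustrative assumptions, not values taken from FALCON.
    """
    depth = locus.normal_ref + locus.normal_alt
    if depth < min_depth:
        return False
    alt_fraction = locus.normal_alt / depth
    return low <= alt_fraction <= high


def allele_specific_counts(loci: List[LocusCounts]) -> List[LocusCounts]:
    """Keep only the germline heterozygous loci, with their tumor and normal counts."""
    return [locus for locus in loci if is_germline_het(locus)]


loci = [
    LocusCounts("chr1", 1000, 25, 23, 40, 12),  # balanced in normal: kept
    LocusCounts("chr1", 2000, 50, 0, 45, 1),    # homozygous reference: dropped
    LocusCounts("chr1", 3000, 5, 4, 8, 7),      # normal depth too low: dropped
]
selected = allele_specific_counts(loci)
for locus in selected:
    print(locus.chrom, locus.pos, locus.tumor_ref, locus.tumor_alt)
```

Aligning to hg19 or hg38 would change the `pos` values and possibly which loci get called, but not the structure of the table handed to the downstream tools.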