Selection of variants and CN

Vince R

Sep 28, 2016, 11:22:48 AM

to canopy_phylogeny

Dear Yuchao,

Thanks for the excellent tools you are developing. I'm a happy user of CODEX and have recently tried Canopy. My question is quite general: what is the optimal set of variants and allele-specific copy numbers that should be selected and fed to Canopy?

More specifically, do you need as input a few CNA-free SNAs, other SNAs overlapping with CNAs, and a few CNAs (just as in the breast cancer example in your paper), or is it unnecessary to have all three types in the set?

Also, it is unclear to me whether WM and Wm need to be provided for all SNAs, even where the falcon segmentation algorithm did not call a CNA, since the ratio between WM and Wm may still differ slightly from 1:1 at that SNA position.

Thanks again for your contributions, and kind regards,

Vince.

Yuchao Jiang

Sep 28, 2016, 9:58:26 PM

to Vince R, canopy_phylogeny, Zhang, Nancy R

Dear Vince,

Thank you very much for your encouraging feedback! You raise a very good question: generating a clean set of inputs for Canopy is non-trivial. While this is not the focus of our paper, an input with a high false discovery rate will only lead to garbage in, garbage out. We are currently working on automating the pipeline for both CNAs and SNAs, as well as offering guidance on selecting informative CNAs. By informative, we mean that the SNAs or CNAs show distinct patterns between different samples (from the same patient, since we are looking at intratumor heterogeneity). For SNAs, this means that the observed VAFs differ between samples (see Figure 4B in our paper); a heatmap is a good way to visualize this. For CNAs, this means that WM and Wm differ between samples (see Supplementary Figure S13 in our paper); IGV is a good tool for visualization.
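As an illustration of the "informative" criterion for SNAs described above, here is a minimal Python sketch that computes per-sample VAFs from read counts and keeps the SNAs whose VAF varies across samples. It is not part of Canopy: the function names, the toy counts, and the `min_spread` cutoff of 0.1 are all assumptions made for the example, and any real cutoff would need to account for sequencing depth and noise.

```python
# Hypothetical helpers, not part of Canopy: flag SNAs whose VAF differs across samples.

def vaf_matrix(mutant_reads, total_reads):
    """Per-SNA, per-sample VAF = mutant reads / total reads (None when depth is 0)."""
    return [
        [(m / t if t > 0 else None) for m, t in zip(m_row, t_row)]
        for m_row, t_row in zip(mutant_reads, total_reads)
    ]


def informative_snas(mutant_reads, total_reads, min_spread=0.1):
    """Return indices of SNAs whose VAF range across samples is >= min_spread.

    min_spread is an arbitrary illustrative cutoff, not a recommended value.
    """
    keep = []
    for i, row in enumerate(vaf_matrix(mutant_reads, total_reads)):
        vafs = [v for v in row if v is not None]
        if len(vafs) >= 2 and max(vafs) - min(vafs) >= min_spread:
            keep.append(i)
    return keep


# Toy counts: 3 SNAs (rows) x 3 samples from the same patient (columns).
mutant_reads = [[10, 40, 5], [20, 21, 19], [0, 30, 30]]
total_reads = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]

print(informative_snas(mutant_reads, total_reads))  # SNA 1 is flat across samples
```

The same VAF matrix could be passed to any heatmap routine for the visual check mentioned above.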

That said, you don’t have to have all three types in the mutation input. If you don’t have CNAs, you can feed Canopy SNAs only. Similarly, you can use CNAs only, or a combination of both.

WM and Wm are both for CNAs. The Y matrix specifies whether an SNA lies within a CNA. Canopy doesn’t need major and minor copy numbers for each SNA (which PhyloWGS requires). For SNAs that are CNA-free, the copy number ratio may differ from 1:1, but this might very well be due to false calls by the CNA calling software, and Canopy doesn’t aim to adjust for upstream calls.
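To make the idea of an SNA-by-CNA overlap indicator concrete, here is a small Python sketch that marks which SNAs fall inside which CNA segments. It only illustrates the overlap logic: the function name, the tuple layout, and the coordinates are assumptions for the example, and the exact layout Canopy expects for Y (column names, any additional columns) should be taken from the user manual rather than from this sketch.

```python
# Hypothetical helper, not part of Canopy: SNA-by-CNA overlap indicator.

def overlap_matrix(snas, cnas):
    """Build a 0/1 matrix with one row per SNA and one column per CNA.

    snas: list of (chrom, pos)
    cnas: list of (chrom, start, end), with inclusive coordinates (an assumption)
    """
    return [
        [1 if (c_chrom == s_chrom and start <= pos <= end) else 0
         for (c_chrom, start, end) in cnas]
        for (s_chrom, pos) in snas
    ]


# Toy coordinates, made up for illustration.
snas = [("chr1", 1500), ("chr1", 9000), ("chr2", 300)]
cnas = [("chr1", 1000, 2000), ("chr2", 100, 500)]

overlap = overlap_matrix(snas, cnas)
for row in overlap:
    print(row)

# SNAs that overlap no CNA segment have an all-zero row in this indicator.
cna_free = [i for i, row in enumerate(overlap) if sum(row) == 0]
print(cna_free)
```

Here the second SNA overlaps neither segment, so it is the only one reported as CNA-free.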

I hope this clarifies things. I am cc’ing my advisor, Nancy. Nancy, please feel free to add to this.

Cheers,

Yuchao

ame...@b612.email

Nov 20, 2018, 4:52:34 AM

to canopy_phylogeny

Hi Yuchao

If an SNA does not lie in any CNA, should I still include it in the Y matrix (with a row of all 0s)?

Sincerely,

Minfang

On Thursday, September 29, 2016 at 9:58:26 AM UTC+8, Yuchao Jiang wrote:

Yuchao Jiang

Nov 20, 2018, 3:19:08 PM

to ame...@b612.email, canopy_p...@googlegroups.com

This is clearly stated in the user manual. I recommend going through it thoroughly.
