Running PyClone on panel-seq data with few mutations

11033...@qq.com

Feb 19, 2021, 3:56:44 AM
to Pyclone User Group
Hi Andrew,
I want to use PyClone to analyze my sequencing data from the FoundationOne CDx panel (~300 genes, mostly cancer genes, coverage ~1000X). The number of mutations per sample ranges from 4 to 50 (median = 12). I know the mutation count is rather small, but I really want to explore ITH in my data. Do you think it's reasonable to apply PyClone to my data? And do you have any suggestions for analyzing such panel-seq data with few mutations?
Thanks!

Yang

Andrew

Feb 19, 2021, 11:46:31 AM
to Pyclone User Group
Hi Yang,

That shouldn't be a problem. PyClone was originally designed with this setup in mind, i.e. few mutations and reasonably high depth. You may face a few challenges, though:

1. Getting copy number calls from panel-seq data can be difficult.
2. Similarly, getting an estimate of tumour purity can be difficult. You could use pathologists' estimates, but typically the estimates come as a byproduct of CNV calling.
3. Using a single sample from a patient does not generate very accurate results. This is not a PyClone issue per se, just a challenge of doing clonal inference from bulk data.

Point 1 is probably the biggest issue. Without CNV data you essentially end up clustering on VAF, which is sub-optimal. It is a long shot, but if you happen to also have full-coverage scRNA data for these samples, we put out a new approach which looks promising in the single-sample, low-mutation regime: https://www.biorxiv.org/content/10.1101/2021.02.16.431009v1
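
To illustrate why clustering on raw VAF is sub-optimal, here is a minimal Python sketch. It uses a simplified textbook relation between VAF and cellular prevalence, not PyClone's actual likelihood, and it assumes that every tumour cell carries the same total copy number at the locus. The function names and the `mutated_copies` parameter are hypothetical and exist only for this example.

```python
def expected_vaf(cellular_prevalence, purity, total_cn, mutated_copies=1, normal_cn=2):
    """Expected VAF under a simplified model (assumption: all tumour cells
    share the same total copy number at this locus)."""
    # Mutant alleles come only from tumour cells that carry the mutation.
    mutant_alleles = cellular_prevalence * purity * mutated_copies
    # All alleles at the locus: tumour cells plus contaminating normal cells.
    all_alleles = purity * total_cn + (1.0 - purity) * normal_cn
    return mutant_alleles / all_alleles


def prevalence_from_vaf(vaf, purity, total_cn, mutated_copies=1, normal_cn=2):
    """Invert the relation above to recover cellular prevalence from a VAF."""
    all_alleles = purity * total_cn + (1.0 - purity) * normal_cn
    return vaf * all_alleles / (purity * mutated_copies)


# Two clonal mutations (prevalence 1.0) in a sample with 60% purity:
vaf_diploid = expected_vaf(1.0, purity=0.6, total_cn=2)
vaf_amplified = expected_vaf(1.0, purity=0.6, total_cn=4)
print(round(vaf_diploid, 4), round(vaf_amplified, 4))
```

Under these assumptions the two mutations are present in the same fraction of tumour cells, yet their expected VAFs differ (about 0.30 versus about 0.19), so a VAF-only clustering would tend to split them into separate clusters.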

Best wishes,
Andy

11033...@qq.com

Feb 19, 2021, 8:33:52 PM
to Pyclone User Group
Got it! Thanks for your kind advice!

Yang

11033...@qq.com

Feb 22, 2021, 3:43:33 AM
to Pyclone User Group
Hi Andy,
Sorry to bother you again, but I have another problem. I got the purity and total_cn for my panel-seq data from a CNVkit + PureCN pipeline, and the results seem reasonable. However, when I tried to get minor_cn using the VAF of germline SNPs (given by GATK haplotyper), I found that the VAFs of most SNPs are not around 50% but below 20%. As a result, the calculated minor_cn is always 0 (e.g. total_cn=2, minor_cn=0; total_cn=8, minor_cn=0). This seems strange to me, but I have no idea how to solve it.
So I wonder whether I could give PyClone only purity and total_cn, without minor_cn and major_cn. I think that should be better than using VAF only, and it would also let me avoid the strange minor_cn results.
Or, if minor_cn and major_cn are mandatory for PyClone, is it OK to use a fake minor_cn? For instance, if total_cn = 1, set minor_cn to 0; if total_cn = 2, set minor_cn to 1; and so on, i.e. minor_cn is always equal to floor(total_cn / 2).

Thanks!

Yang

Andrew

Feb 22, 2021, 9:11:10 AM
to Pyclone User Group
Hi Yang,

Total copy number is fine. There is a flag called "--prior", and the argument "total_copy_number" will do what you want. In the input file, set major_cn to the total and minor_cn to 0. The only downsides are that the posteriors for the cellular prevalence will have a larger standard deviation, and you may get fewer clusters than the truth.
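
A minimal Python sketch of preparing such an input file follows. It assumes the standard PyClone tab-separated input columns (mutation_id, ref_counts, var_counts, normal_cn, minor_cn, major_cn). The helper names, the mutation IDs, the read counts, and the default normal_cn of 2 are made up for illustration.

```python
import csv
import io

# Column layout assumed from the PyClone input format.
PYCLONE_COLUMNS = ["mutation_id", "ref_counts", "var_counts",
                   "normal_cn", "minor_cn", "major_cn"]


def to_pyclone_rows(mutations):
    """Encode total copy number as major_cn = total_cn and minor_cn = 0."""
    rows = []
    for m in mutations:
        rows.append({
            "mutation_id": m["mutation_id"],
            "ref_counts": m["ref_counts"],
            "var_counts": m["var_counts"],
            "normal_cn": m.get("normal_cn", 2),  # assumption: autosomal locus
            "minor_cn": 0,
            "major_cn": m["total_cn"],
        })
    return rows


def write_tsv(rows, handle):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.DictWriter(handle, fieldnames=PYCLONE_COLUMNS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical mutations; the IDs and counts are not from real data.
example = [
    {"mutation_id": "mut_1", "ref_counts": 700, "var_counts": 300, "total_cn": 2},
    {"mutation_id": "mut_2", "ref_counts": 850, "var_counts": 150, "total_cn": 8},
]
rows = to_pyclone_rows(example)
buffer = io.StringIO()
write_tsv(rows, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

The resulting file would then be passed to PyClone with "--prior total_copy_number". The exact command line depends on the installed version, so it is worth checking the tool's own help output before running it.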

Cheers,
Andy


11033...@qq.com

Feb 22, 2021, 9:47:41 PM
to Pyclone User Group
Andy, thanks a lot for your quick reply. I'll try it!