Examples of major_cn in PyClone-VI

Zhuo Xm

Mar 17, 2021, 1:33:06 PM
to Pyclone User Group
Dear Dr. Roth,

I am trying to generate the TSV input files for PyClone. Because the example TSV in the PyClone GitHub repository does not seem to include CNV data, I borrowed the example TSV from PyClone-VI (https://zenodo.org/record/4268826#.YFIS1ZNKhTY). (I assume the definitions are the same in PyClone and PyClone-VI.)

I am a little confused about major_cn in the PyClone-VI examples:

mutation_id sample_id ref_counts alt_counts normal_cn major_cn minor_cn tumour_content
CRUK0100:1:1153814:G R1 260 128 2 4 0 0.68
CRUK0100:1:3782573:G R1 82 2 2 4 0 0.68
CRUK0100:3:126227155:G R1 124 32 2 3 0 0.68


My understanding is that the major_cn and minor_cn values are derived from the CNV data by intersecting it with the SNV data. major_cn refers to the copy number of the major allele (most of the time the ref), and minor_cn refers to the copy number of the minor allele (most of the time the alt).

CRUK0100:1:3782573:G is easy to understand: this locus has a duplication with 4 copies and the majority of them are ref, so major_cn is 4.

But CRUK0100:1:1153814:G is confusing: the locus still has a duplication with 4 copies, but the ref:alt ratio seems closer to 2:1, so how do we get major_cn 4 and minor_cn 0?

Please correct me.

best wishes

Ming



Andrew

Mar 17, 2021, 4:56:20 PM
to Pyclone User Group
Hi Ming,

The CN values are not directly related to the copy number of the mutation. They are inferred by allele-specific copy number tools, which consider SNPs, and they correspond to copy number changes of the maternally and paternally inherited chromosomes. If we had perfect information, they would be the maternal and paternal copy numbers. However, we typically cannot infer this, so we just call the larger of the two values the major and the smaller the minor. The number of mutated copies of an SNV does not need to correspond to either of these values; it depends on when the mutation happened relative to the copy number events. PyClone handles this by marginalizing (summing) over all possible scenarios compatible with the input CN values.

A simple example. Let M denote the maternal and P the paternal chromosomes.

1. For an autosome in a non-malignant cell, the CNV genotype is M/P, i.e. one maternal and one paternal copy. This is major_cn=1 and minor_cn=1.
2. Now we could duplicate the maternal chromosome and have the genotype MM/P. This is major_cn=2 and minor_cn=1.
3. An SNV could occur on one of the maternal alleles, but the CNV genotype is still MM/P. Thus major_cn=2 and minor_cn=1 still.
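
To make the point about mutated copies concrete, here is a minimal Python sketch. It is not PyClone's code: the function names are made up for illustration, and it assumes a simplified model in which every tumour cell has the same total copy number and the SNV may sit on anywhere from 1 to major_cn copies. It lists the variant allele fraction (VAF) expected under each scenario for CRUK0100:1:1153814:G and compares it with the observed counts.

```python
def expected_vaf(mutated_copies, major_cn, minor_cn, normal_cn,
                 tumour_content, ccf=1.0):
    """Expected VAF under a simplified mixture model (an assumption).

    ccf is the fraction of tumour cells carrying the mutation.
    """
    total_cn = major_cn + minor_cn
    # Average number of allele copies contributed by tumour and normal cells.
    tumour_alleles = tumour_content * total_cn
    normal_alleles = (1.0 - tumour_content) * normal_cn
    return tumour_content * ccf * mutated_copies / (tumour_alleles + normal_alleles)


def candidate_multiplicities(major_cn, minor_cn):
    """Hypothetical enumeration: the SNV may be on 1..major_cn copies."""
    return list(range(1, major_cn + 1))


# Row from the example TSV: ref=260, alt=128, normal_cn=2,
# major_cn=4, minor_cn=0, tumour_content=0.68.
observed_vaf = 128 / (260 + 128)
print(f"observed VAF: {observed_vaf:.3f}")
for m in candidate_multiplicities(4, 0):
    vaf = expected_vaf(m, major_cn=4, minor_cn=0, normal_cn=2, tumour_content=0.68)
    print(f"mutated copies = {m}: expected clonal VAF = {vaf:.3f}")
```

The observed VAF does not have to match the scenario with 4 mutated copies; a smaller number of mutated copies, possibly with ccf below 1, can also be compatible with major_cn=4 and minor_cn=0.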

Cheers,
Andy

Zhuo Xm

Mar 17, 2021, 10:12:32 PM
to pyclone-u...@googlegroups.com
Hey Dr. Andy,

Thanks for your speedy reply. This is very helpful. 

In some CNV calling pipelines, the output also includes a heterozygosity level for estimating loss of heterozygosity in a CNV segment. Can this be used to estimate major_cn?
For example:
A normal segment on an autosome has 2 copies and its heterozygosity is around 50%, so every locus in this segment has major_cn 1 and minor_cn 1.
A segment has 3 copies and its heterozygosity is around 30%, so every locus in this segment has major_cn 2 and minor_cn 1.
A segment has 1 copy and its heterozygosity is around 5%, so every locus in this segment has major_cn 1 and minor_cn 0.

Best

Ming

Andrew

Apr 14, 2021, 12:27:21 PM
to Pyclone User Group
That might be a bit noisy, but it should work.
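
For readers who want to try the heuristic discussed above, a rough Python sketch follows. It is an illustration only, not part of PyClone: the function name is hypothetical, and it assumes the pipeline's "heterozygosity" value can be read as the fraction of copies belonging to the less abundant allele, with no correction for normal-cell contamination. Both assumptions add to the noise mentioned in the reply.

```python
def split_total_cn(total_cn, minor_allele_fraction):
    """Guess (major_cn, minor_cn) for a segment.

    total_cn: integer total copy number of the segment.
    minor_allele_fraction: assumed share of the less abundant allele (0 to 0.5).
    """
    # Round the implied number of minor-allele copies to the nearest integer.
    minor_cn = round(total_cn * minor_allele_fraction)
    # The minor allele can never have more copies than the major allele.
    minor_cn = max(0, min(minor_cn, total_cn // 2))
    major_cn = total_cn - minor_cn
    return major_cn, minor_cn


# The three segments from the question above.
for total_cn, het in [(2, 0.50), (3, 0.30), (1, 0.05)]:
    major_cn, minor_cn = split_total_cn(total_cn, het)
    print(f"total_cn={total_cn}, het={het}: major_cn={major_cn}, minor_cn={minor_cn}")
```

Values close to a rounding boundary are the ones most likely to be assigned wrongly, so checking those segments by eye is advisable.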