Confusion about cBioPortal


刘博(Bo Liu)

Aug 13, 2019, 1:07:32 PM
to cbiop...@googlegroups.com

Hello,

I have recently started using cBioPortal. It is really helpful, but I have some questions for which I could not find an explanation.

1. What does the value shown for each segment in the CN Segment view (the segment mean) actually represent? Is it the raw copy number, a log-transformed copy number, or something else?

2. What does "Fraction of copy number altered genome" on the Summary page mean?

Best wishes,
Ben

Robert Sheridan

Aug 14, 2019, 5:06:56 PM
to cBioPortal for Cancer Genomics Discussion Group
Hello Ben.

I will try to give you some answers.

1. The values that are shown in the IGV viewer come from data provided in .seg file format (https://software.broadinstitute.org/software/igv/SEG) which allows a numerical value for each referenced genomic segment. The numeric value is computed by some segmented copy number calling algorithm. We have used different algorithms over time, so different studies may have been computed in different ways, but the basic guidelines for interpreting the .seg numeric values are:
- segments which are computed to have unaltered (diploid) copy number have numeric values equal to zero
- segments which show deletion of alleles at a locus have negative numeric values
- segments which show amplification of alleles at a locus have positive numeric values
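
As a rough illustration of those three guidelines, here is a minimal Python sketch. The function name and the example rows are hypothetical and are not taken from any cBioPortal or IGV code; the only thing it relies on is the sign of the per-segment value.

```python
def classify_segment(seg_value: float) -> str:
    """Label a segment using only the sign of its .seg numeric value."""
    if seg_value < 0:
        return "deletion"
    if seg_value > 0:
        return "amplification"
    return "diploid"


# Hypothetical rows: (sample ID, chromosome, start, end, segment value)
segments = [
    ("SAMPLE_1", "1", 1000, 50000, 0.0),
    ("SAMPLE_1", "1", 50001, 90000, -0.8),
    ("SAMPLE_1", "2", 1000, 70000, 0.6),
]

# The segment value is the last field of each row.
labels = [classify_segment(row[-1]) for row in segments]
print(labels)
```

In real data the values are rarely exactly zero, so a tolerance around zero is normally applied rather than a strict sign test; the 0.2 cutoff described in answer 2 is one example of that.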

These numeric values might come from the seg.mean field in the output from our Roslin analysis pipeline for example. See : https://github.com/mskcc/roslin-variant/wiki/Roslin-Output-v2.4.1 I am not an expert on this code (maybe others can comment), but there is a link to the code base : https://github.com/mskcc/facets-suite and in that code you can see a computation here: https://github.com/mskcc/facets-suite/blob/master/facets_qc.R

center_igv_file <- function(outfile){
  
  igv.adj <- out$IGV
  
  if(out$dipLogR <= 0){igv.adj$seg.mean = igv.adj$seg.mean + abs(out$dipLogR)}
  if(out$dipLogR > 0){igv.adj$seg.mean = igv.adj$seg.mean - out$dipLogR}
  
  write.table(igv.adj, file = outfile, quote = F, row.names = F, col.names = T,
              sep = "\t")
  
}

Based on a quick look at this code, it appears to me that there has been a computation to determine the diploid LogR (Log ratio?) copy number level and that number is subtracted from the computed mean copy number level for the segment.
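
To make that reading concrete, here is a small Python sketch that mirrors the two branches of the R function above. It is my own paraphrase, not part of facets-suite, and the names center_seg_means and dip_log_r are hypothetical. Both branches reduce to seg.mean minus dipLogR, since adding abs(dipLogR) when dipLogR is not positive is the same as subtracting dipLogR.

```python
def center_seg_means(seg_means, dip_log_r):
    """Shift segment means so that the diploid level sits at zero.

    Mirrors the two branches of the quoted R code.
    """
    if dip_log_r <= 0:
        # abs(dip_log_r) == -dip_log_r here, so this is m - dip_log_r
        return [m + abs(dip_log_r) for m in seg_means]
    return [m - dip_log_r for m in seg_means]


# Hypothetical values: a negative and a positive diploid log ratio
raw = [0.5, -0.3, -0.2]
centered_neg = center_seg_means(raw, -0.2)
centered_pos = center_seg_means(raw, 0.1)
print(centered_neg)
print(centered_pos)
```

With a diploid log ratio of -0.2, a segment whose raw mean is -0.2 ends up at zero after centering, which matches the guideline that unaltered segments have a value of zero.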

The diploid LogR appears to be determined through a curve fitting function (see https://github.com/mskcc/facets-suite/blob/master/funcs.R)

Here is a reference to the FACETS publication : https://www.ncbi.nlm.nih.gov/pubmed/30393068

2. "Fraction of copy number altered genome" measures what fraction of the genome differs significantly from diploid. When studies are imported, this value is computed by totaling the length of all copy number segments for a sample whose seg.mean differs from zero by at least 0.2 (seg.mean <= -0.2 OR seg.mean >= 0.2). The total length of these altered segments is then divided by the total length of all recorded segments for the sample to give the fraction. See code here: https://github.com/cBioPortal/cbioportal/blob/master/core/src/main/java/org/mskcc/cbio/portal/dao/DaoCopyNumberSegment.java
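
The calculation described above can be sketched in Python as follows. This is an illustration of the description only, not a translation of DaoCopyNumberSegment.java. The function name, the tuple layout, and the choice of end minus start as the segment length are all assumptions on my part, and the portal code may count lengths slightly differently.

```python
def fraction_genome_altered(segments, cutoff=0.2):
    """Fraction of segmented length whose |seg.mean| is at least `cutoff`.

    `segments` is a list of (start, end, seg_mean) tuples for ONE sample.
    Segment length is taken as end - start (an assumption).
    """
    total_length = 0
    altered_length = 0
    for start, end, seg_mean in segments:
        length = end - start
        total_length += length
        if abs(seg_mean) >= cutoff:  # seg.mean <= -0.2 OR seg.mean >= 0.2
            altered_length += length
    if total_length == 0:
        raise ValueError("no segment length recorded for this sample")
    return altered_length / total_length


# Hypothetical segments for one sample
sample_segments = [
    (0, 100, 0.0),     # unaltered
    (100, 150, 0.5),   # altered (gain)
    (150, 200, -0.25), # altered (loss)
    (200, 300, 0.1),   # below the cutoff, counted as unaltered
]
fga = fraction_genome_altered(sample_segments)
print(fga)
```

Note that the denominator is the length covered by recorded segments, not the full length of the reference genome, so regions with no segment data do not count for or against the fraction.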
