Dear Yanyan,
Here is an explanation of how the thresholds are set:
The table in all_thresholded.by_genes.txt (the part of the GISTIC output that cBioPortal uses to determine the copy-number status of each gene in each sample) is obtained by applying both low- and high-level thresholds to the gene copy levels of all the samples. The entries with value +/- 2 exceed the high-level thresholds for amps/dels, and those with +/- 1 exceed the low-level thresholds but not the high-level thresholds. The low-level thresholds are just the 'amp_thresh' and 'del_thresh' noise threshold input values to GISTIC (typically 0.1 or 0.3) and are the same for every sample.
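To make the mapping concrete, here is a minimal Python sketch of the thresholding described above. It is only an illustration, not the GISTIC code itself: the function name and argument names are hypothetical, del_thresh is assumed to be given as a positive magnitude (so deletions lie below -del_thresh), the high-level deletion cutoff is assumed to be a negative number, and the high-level cutoffs for the sample are simply taken as given inputs.

```python
def threshold_gene(value, amp_thresh, del_thresh, high_amp_cutoff, high_del_cutoff):
    """Map one gene-level copy-number value to -2, -1, 0, 1 or 2.

    Hypothetical helper for illustration only.
    amp_thresh / del_thresh: low-level noise thresholds (positive magnitudes).
    high_amp_cutoff / high_del_cutoff: high-level cutoffs for this sample
    (high_del_cutoff is assumed to be negative).
    """
    # Assumption of this sketch: a value must clear the low-level threshold
    # before it can be called high-level, even if the sample's high-level
    # cutoff happens to be closer to zero than the low-level threshold.
    if value > amp_thresh:
        return 2 if value > high_amp_cutoff else 1
    if value < -del_thresh:
        return -2 if value < high_del_cutoff else -1
    return 0


# Made-up copy-number values for one sample, keyed by invented gene labels.
sample_values = {"GENE_A": 1.2, "GENE_B": 0.5, "GENE_C": 0.05,
                 "GENE_D": -0.4, "GENE_E": -1.1}
calls = {
    gene: threshold_gene(v, amp_thresh=0.1, del_thresh=0.1,
                         high_amp_cutoff=0.8, high_del_cutoff=-0.9)
    for gene, v in sample_values.items()
}
print(calls)
```

The low-level thresholds are passed identically for every sample, while the two high-level cutoffs would change from sample to sample.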
By contrast, the high-level thresholds are calculated on a sample-by-sample basis and are based on the maximum (or minimum) median arm-level amplification (or deletion) copy number found in the sample. The idea, for deletions anyway, is that this level is a good approximation for hemizygous loss given the purity and ploidy of the sample. The actual cutoffs used for each sample can be found in the output file sample_cutoffs.txt. All GISTIC output files for TCGA are available at:
If you want to learn more about GISTIC, please refer to:
Mermel C, Schumacher S, et al. (2011). "GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers." Genome Biology, 12:R41.
If you have any further questions, you can write to me directly.
Regards,
Andrew
___________________________
Andrew Cherniack, PhD
Research Scientist II
Broad Institute
7 Cambridge Center
Cambridge, Mass 02142
email: ache...@broadinstitute.org
___________________________
Dear Sirs and Madams,
I am a student majoring in bioinformatics at Harbin Medical University in China. Recently, I read your paper "The cBio Cancer Genomics Portal: An Open Platform for Exploring Multidimensional Cancer Genomics Data", and I went to http://www.cbioportal.org/public-portal/index.do to view the portal. It is a useful tool for me. I have some questions:
1. The website provides results labeled "Putative copy-number alterations from GISTIC". I want to know which data these results are derived from: aCGH or SNP 6.0?
2. I know that GISTIC can calculate the significant copy-number alterations across samples. But how is the altered status of genes in each sample (-2, -1, 0, 1, 2) determined, and what are the thresholds? Were these thresholds applied to the mean values of segments?
I have carefully read the FAQ, but I still can't find the answer.
I am looking forward to your reply.
Best wishes.
Sincerely Yours,
Yanyan Ping