Hi Laura,
Thank you for your question.
The discrepancy comes mainly from the different copy number pipelines behind the two studies. TCGA Firehose Legacy uses GISTIC-based copy number calls, whereas TCGA GDC 2025 uses ASCAT-based processing. As a result, the same genomic event can be classified differently in the two studies (for example, as a homozygous deletion in one and a shallow deletion in the other).
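If you would like to see which samples are affected for a given gene, a small comparison along the following lines may help. This is only a rough sketch: the sample IDs and values are made up, the function name find_discordant_calls is hypothetical, and it assumes you have already exported the discrete copy number calls for one gene from each study. The numeric codes follow the usual cBioPortal convention (-2 deep deletion, -1 shallow deletion, 0 diploid, 1 gain, 2 amplification).

```python
# Discrete copy number codes as displayed in cBioPortal.
CNA_LABELS = {
    -2: "deep (homozygous) deletion",
    -1: "shallow deletion",
    0: "diploid",
    1: "gain",
    2: "amplification",
}


def find_discordant_calls(calls_a, calls_b):
    """Return {sample_id: (label_a, label_b)} for shared samples whose calls differ.

    Hypothetical helper for illustration; not part of cBioPortal.
    """
    discordant = {}
    # Only samples present in both studies can be compared.
    for sample_id in sorted(calls_a.keys() & calls_b.keys()):
        a, b = calls_a[sample_id], calls_b[sample_id]
        if a != b:
            discordant[sample_id] = (CNA_LABELS[a], CNA_LABELS[b])
    return discordant


# Made-up calls for a single gene, for illustration only.
gistic_calls = {"SAMPLE-01": -2, "SAMPLE-02": -1, "SAMPLE-03": 0}
ascat_calls = {"SAMPLE-01": -1, "SAMPLE-02": -1, "SAMPLE-03": 0, "SAMPLE-04": 2}

for sample_id, (label_a, label_b) in find_discordant_calls(gistic_calls, ascat_calls).items():
    print(f"{sample_id}: {label_a} vs. {label_b}")
```

Samples that appear in only one study are skipped, since there is nothing to compare them against.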
For most downstream analyses, we generally recommend using the PanCancer Atlas cohort when available, as it includes a broader range of data types and underwent additional manual review as part of the PanCancer Atlas project. Note also that the GDC cohort is based on the hg38 reference genome, while PanCancer Atlas and most other cBioPortal studies use hg19, so genomic coordinates are not directly comparable between them.
To avoid interpretation issues, we recommend choosing a single cohort and using it for all of your mutation, copy number, expression, and survival analyses.
I hope this helps clarify the differences. Please let us know if you have any further questions.
Best regards,
Anusha