Gene expression data ICGC / TCGA


saras...@gmail.com

Jun 29, 2020, 12:42:54 PM
to UCSC Xena and Cancer Genomics Browser
Hello,

I am currently working on the ICGC OV-AU data set, more specifically on the gene expression data. I am using the normalized_read_count and was wondering which TCGA Ovarian Cancer gene expression set would be the equivalent, as there are several [IlluminaHiSeq pancan normalized, IlluminaHiSeq percentile UNC, IlluminaHiSeq UNC].

Furthermore, as I am working with gene Ensembl IDs I would prefer if the TCGA data also had the IDs rather than the gene names.

On the browser there are two TCGA Ovarian Cancer datasets: one that is plain TCGA [https://xenabrowser.net/datapages/?cohort=TCGA%20Ovarian%20Cancer%20(OV)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] and one that is GDC TCGA OV [https://xenabrowser.net/datapages/?cohort=GDC%20TCGA%20Ovarian%20Cancer%20(OV)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443]. Could you please also let me know the difference between the two?

When I choose the GDC TCGA dataset, which one should I pick to match the ICGC OV-AU gene expression normalization? The options are 'counts', 'FPKM', and 'FPKM-UQ'.

Thank you very much in advance!

Kind regards,

Sara

Mary Goldman

Jun 29, 2020, 1:14:10 PM
to saras...@gmail.com, UCSC Xena and Cancer Genomics Browser
Hi Sara,

Please see my replies inline below.

Best,
Mary
-----
Mary Goldman, Design and Outreach Engineer
Revealing life's code


---------- Forwarded message ---------
From: <saras...@gmail.com>
Date: Mon, Jun 29, 2020 at 9:42 AM
Subject: [ucsc-cancer-genomics-browser] Gene expression data ICGC / TCGA
To: UCSC Xena and Cancer Genomics Browser <ucsc-cancer-ge...@googlegroups.com>


Hello,

I am currently working on the ICGC OV-AU data set, more specifically on the gene expression data. I am using the normalized_read_count and was wondering which TCGA Ovarian Cancer gene expression set would be the equivalent, as there are several [IlluminaHiSeq pancan normalized, IlluminaHiSeq percentile UNC, IlluminaHiSeq UNC].

--> What do you mean by equivalent? Are you hoping to combine the datasets or perhaps something else?

Furthermore, as I am working with gene Ensembl IDs I would prefer if the TCGA data also had the IDs rather than the gene names.

--> We offer a mapping file between Ensembl IDs and Hugo gene names here: https://gdc.xenahubs.net/download/gencode.v22.annotation.gene.probeMap
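
For readers following along, here is a minimal sketch of how a mapping file like this might be used once downloaded. It assumes the probeMap is tab-separated with a header row, the Ensembl ID in the first column and the gene name in the second, and that the IDs may carry a version suffix such as ".5". The function names and the sample rows are hypothetical placeholders, not real annotation entries, so please check them against the actual file.

```python
import csv
import io


def load_probemap(handle):
    """Build {ensembl_id: gene_name} from a tab-separated probeMap.

    Assumption: the first row is a header, column 0 holds the Ensembl ID
    and column 1 holds the gene name.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row
    mapping = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        mapping[row[0]] = row[1]
    return mapping


def strip_version(ensembl_id):
    """Drop a trailing version suffix, e.g. 'ENSG...1.3' -> 'ENSG...1'."""
    return ensembl_id.split(".", 1)[0]


def genes_to_ids(mapping):
    """Invert the mapping: gene name -> list of unversioned Ensembl IDs."""
    inverted = {}
    for ensembl_id, gene in mapping.items():
        inverted.setdefault(gene, []).append(strip_version(ensembl_id))
    return inverted


# Hypothetical sample rows standing in for the downloaded file.
SAMPLE = (
    "id\tgene\tchrom\tchromStart\tchromEnd\tstrand\n"
    "ENSG00000000001.1\tGENE_A\tchr1\t100\t200\t+\n"
    "ENSG00000000002.3\tGENE_B\tchr2\t300\t400\t-\n"
)

id_to_gene = load_probemap(io.StringIO(SAMPLE))
gene_to_ids = genes_to_ids(id_to_gene)
```

With a real download, the same functions could be called on `open("gencode.v22.annotation.gene.probeMap", newline="")` in place of the `io.StringIO` sample. A gene name can map to more than one Ensembl ID, which is why the inverted lookup returns a list.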

On the browser there are two TCGA Ovarian Cancer datasets: one that is plain TCGA [https://xenabrowser.net/datapages/?cohort=TCGA%20Ovarian%20Cancer%20(OV)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443] and one that is GDC TCGA OV [https://xenabrowser.net/datapages/?cohort=GDC%20TCGA%20Ovarian%20Cancer%20(OV)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443]. Could you please also let me know the difference between the two?

--> The plain TCGA dataset is the legacy TCGA data that came out with the original TCGA publication. The GDC then took the original TCGA data and reprocessed it using their harmonized pipelines (https://gdc.cancer.gov/about-data/data-processing/genomic-data-processing). That reprocessed data is what the GDC dataset is.

When I choose the GDC TCGA dataset, which one should I pick to match the ICGC OV-AU gene expression normalization? The options are 'counts', 'FPKM', and 'FPKM-UQ'.

--> I'm sorry, but again I am not sure what you mean by match. Please let me know.

Thank you very much in advance!

Kind regards,

Sara
