Questions about TCGA Provisional vs. PanCancer Atlas


Oba,Junna

Jul 26, 2018, 9:12:46 PM
to cbiop...@googlegroups.com

Hello,

 

I would like to ask questions about mRNA expression TCGA data, Provisional vs. PanCancer Atlas.

 

When I select a PanCancer Atlas study and look for “tumors with mRNA data” in the “Select Patient/Case Set” panel, I tend to see mRNA data from the U133 microarray (green highlight). When I select a Provisional study, I tend to see RNA-seq (V2) data. Some Provisional studies have not only RNA-seq V2 but also Agilent microarray and/or U133 microarray sets. Some cancer types have no option to select “Tumors with mRNA data”, for example Kidney Renal Papillary Cell Carcinoma (both Provisional and PanCancer Atlas).

Also, comparing the number of samples for each cancer type, there are some discrepancies between Provisional and PanCancer Atlas.

 

My questions are:

1) Do Provisional and PanCancer Atlas come from the same or overlapping samples?

2) In general, does TCGA Provisional have RNA-seq data and TCGA PanCancer Atlas microarray mRNA expression data? If so, were these analyses done separately?

3) For RNA-seq data, can we look at the actual RSEM or FPKM levels, and not the z-score?

 

I have pasted below what I have seen, and attached an Excel spreadsheet comparing the sample numbers for each cancer type between Provisional and PanCancer Atlas.

I would really appreciate your answers and suggestions.

 

Thank you,

 

Junna Oba, MD, PhD

JO...@mdanderson.org

 

 

 

----------------------------------------------------------------------------------------------------------------------------------------------------------------

| Abbreviation | Study Name | cBioPortal PanCancer | cBioPortal provisional | Note |
|---|---|---|---|---|
| ACC | Adrenocortical Carcinoma | 78 | 79 | |
| BLCA | Bladder Urothelial Carcinoma | 407 | 408 | |
| BRCA | Breast Invasive Carcinoma | 1081 | 1100 | provisional also has mRNA data from n=529 for Agilent microarray |
| CESC | Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma | 293 | 306 | |
| CHOL | Cholangiocarcinoma | 36 | 36 | |
| COAD | Colon Adenocarcinoma | 276 | NA | Colorectal adenocarcinoma (COREAD), provisional: 222 for Agilent microarray, 382 for RNA-seq V2 |
| ESCA | Esophageal Carcinoma | 181 | 185 | |
| GBM | Glioblastoma Multiforme | 160 | 166 | provisional also has mRNA data from n=401 for Agilent microarray and n=528 for U133 microarray |
| HNSC | Head and Neck Squamous Cell Carcinoma | 515 | 522 | provisional also has mRNA data from n=72 for Agilent microarray |
| KICH | Kidney Chromophobe | 65 | 66 | |
| KIRC | Kidney Renal Clear Cell Carcinoma | 510 | 534 | |
| KIRP | Kidney Renal Papillary Cell Carcinoma | NA | NA | |
| LGG | Brain Lower Grade Glioma | 514 | 530 | provisional also has mRNA data from n=27 for Agilent microarray |
| LIHC | Liver Hepatocellular Carcinoma | 366 | 373 | |
| LUAD | Lung Adenocarcinoma | 510 | 517 | provisional also has mRNA data from n=32 for Agilent microarray |
| LUSC | Lung Squamous Cell Carcinoma | 484 | NA | |
| MESO | Mesothelioma | 87 | 87 | |
| OV | Ovarian Serous Cystadenocarcinoma | 299 | 307 | provisional also has mRNA data from n=558 for Agilent microarray and n=535 for U133 microarray |
| PAAD | Pancreatic Adenocarcinoma | 177 | 179 | |
| PCPG | Pheochromocytoma and Paraganglioma | NA | 184 | |
| PRAD | Prostate Adenocarcinoma | 493 | 498 | |
| READ | Rectum Adenocarcinoma | 89 | NA | Colorectal adenocarcinoma (COREAD), provisional: 222 for Agilent microarray, 382 for RNA-seq V2 |
| SARC | Sarcoma | NA | NA | |
| SKCM | Skin Cutaneous Melanoma | 443 | 472 | |
| STAD | Stomach Adenocarcinoma | 401 | 415 | provisional also has mRNA data from n=36 for RNA-seq (not V2) |
| TGCT | Testicular Germ Cell Cancer | NA | NA | |
| THCA | Thyroid Carcinoma | NA | NA | |
| THYM | Thymoma | 119 | 120 | |
| UCEC | Uterine Corpus Endometrial Carcinoma | 173 | 177 | provisional also has mRNA data from n=54 for Agilent microarray |
| UCS | Uterine Carcinosarcoma | 57 | 57 | |
| UVM | Uveal Melanoma | 80 | 80 | |

cBioPortal PanCancer column: Tumor Samples with mRNA data (U133 microarray only)

cBioPortal provisional column: Tumor Samples with mRNA data (RNA-seq V2)
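
The per-study differences in the table above can be tallied with a short script. This is only a sketch: the numbers are copied from the table, `None` stands in for each NA entry, and the `counts` dictionary and `count_differences` helper are names made up for illustration, not anything provided by cBioPortal.

```python
# Sample counts copied from the table above as (PanCancer, Provisional).
# None marks an "NA" entry, i.e. no mRNA case set was listed.
counts = {
    "ACC": (78, 79), "BLCA": (407, 408), "BRCA": (1081, 1100),
    "CESC": (293, 306), "CHOL": (36, 36), "COAD": (276, None),
    "ESCA": (181, 185), "GBM": (160, 166), "HNSC": (515, 522),
    "KICH": (65, 66), "KIRC": (510, 534), "KIRP": (None, None),
    "LGG": (514, 530), "LIHC": (366, 373), "LUAD": (510, 517),
    "LUSC": (484, None), "MESO": (87, 87), "OV": (299, 307),
    "PAAD": (177, 179), "PCPG": (None, 184), "PRAD": (493, 498),
    "READ": (89, None), "SARC": (None, None), "SKCM": (443, 472),
    "STAD": (401, 415), "TGCT": (None, None), "THCA": (None, None),
    "THYM": (119, 120), "UCEC": (173, 177), "UCS": (57, 57),
    "UVM": (80, 80),
}


def count_differences(counts):
    """Return {abbreviation: provisional - pancancer} for rows with both counts."""
    return {
        study: prov - pan
        for study, (pan, prov) in counts.items()
        if pan is not None and prov is not None
    }


diffs = count_differences(counts)

# List studies from the largest to the smallest discrepancy.
for study, diff in sorted(diffs.items(), key=lambda item: -item[1]):
    print(f"{study}: Provisional lists {diff} more sample(s) than PanCancer Atlas")
```

Rows with an NA on either side are skipped rather than treated as zero, since NA here means the case set was not offered, not that the study has no samples.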

 

 

 

Breast Invasive Carcinoma (TCGA, Provisional) has “Tumor Samples with mRNA data (Agilent microarray) (529)” and “Tumor Samples with mRNA data (RNA-seq V2) (1100)”

 

Glioblastoma Multiforme (TCGA, Provisional) has “Tumor Samples with mRNA data (Agilent microarray) (401)”, “Tumor Samples with mRNA data (RNA-seq V2) (166)”, and “Tumor Samples with mRNA data (U133 microarray) (528)”.

 

Uveal Melanoma (TCGA, PanCancer Atlas) has “Tumor Samples with mRNA data (U133 microarray only) (80)”, and

 

Uveal Melanoma (TCGA, Provisional) has “Tumor Samples with mRNA data (RNA-seq V2) (80)”

 

Kidney Renal Papillary Cell Carcinoma (TCGA, Provisional and TCGA, PanCancer Atlas), in both settings, did not offer “tumors with mRNA data” in the “Select Patient/Case Set” panel.

 

The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems.

mRNA_expression_data_for_TCGA_cBioportal.xlsx

Kelsey Zhu

Jul 26, 2018, 11:46:49 PM
to Oba,Junna, cbiop...@googlegroups.com
Hi Junna, 

Provisional and PanCancer Atlas share some samples. Analyses were done separately. 

For RNA-seq data, you can look at the actual RSEM level by clicking the "Plots" tab in the query result view and then selecting "mRNA Expression (RNA Seq V2 RSEM)" from the Profile Name dropdown box (please see attached).


Best!

kelsey


--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+unsubscribe@googlegroups.com.
To post to this group, send email to cbiop...@googlegroups.com.
Visit this group at https://groups.google.com/group/cbioportal.
To view this discussion on the web visit https://groups.google.com/d/msgid/cbioportal/F84D8832-43E5-4417-AA07-7A826B5C3E8C%40mdanderson.org.
For more options, visit https://groups.google.com/d/optout.

Screen Shot 2018-07-26 at 11.34.03 PM.png

JJ Gao

Jul 29, 2018, 12:42:27 PM
to JO...@mdanderson.org, cBioPortal for Cancer Genomics Discussion Group, Kelsey Zhu
Hi Junna,

Thank you for contacting us and providing the detailed comparison between pancan and provisional.

Provisional and Pancan expression data overlap to a very large extent, both in the samples and in the data itself. The provisional datasets in cBioPortal were retrieved from the Firehose run of January 28, 2016 (https://gdac.broadinstitute.org/). The Pancan studies were compiled for the Pancan analysis published in Cell (https://www.cell.com/pb-assets/consortium/pancanceratlas/pancani3/index.html). Their underlying data and samples come from the same TCGA source, but the data processing might be different.
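
For anyone who wants to check the overlap for a particular cancer type, a minimal sketch is below. It assumes you have already obtained the sample IDs of the two studies as plain lists; the `sample_overlap` helper and the placeholder IDs are made up for illustration and are not real TCGA samples.

```python
def sample_overlap(provisional_ids, pancan_ids):
    """Summarise how two collections of sample IDs overlap."""
    provisional = set(provisional_ids)
    pancan = set(pancan_ids)
    return {
        "shared": sorted(provisional & pancan),
        "provisional_only": sorted(provisional - pancan),
        "pancan_only": sorted(pancan - provisional),
    }


# Placeholder IDs for illustration only; substitute the real sample lists.
provisional_ids = ["SAMPLE-A", "SAMPLE-B", "SAMPLE-C", "SAMPLE-D"]
pancan_ids = ["SAMPLE-B", "SAMPLE-C", "SAMPLE-D", "SAMPLE-E"]

overlap = sample_overlap(provisional_ids, pancan_ids)
print(
    len(overlap["shared"]), "shared;",
    len(overlap["provisional_only"]), "only in Provisional;",
    len(overlap["pancan_only"]), "only in PanCancer Atlas",
)
```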

The Pancan data has both microarray data and RNAseq v2 data, while the Provisional data may be missing one or both for certain cancer types. We recommend using the Pancan data for TCGA data analysis.

You can use the DOWNLOAD DATA query form to download the RSEM data. You can also use the Plots tab to look at the original data, as Kelsey has pointed out.
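
To illustrate how raw RSEM values relate to a z-score, here is a minimal sketch for one gene across a handful of samples. It assumes a log2(x + 1) transform and a plain z-score over all the samples given; the reference population and transform that cBioPortal itself uses may differ. The `log2_rsem` and `z_scores` helpers and the input values are made up for illustration.

```python
import math
import statistics


def log2_rsem(values):
    """Apply log2(x + 1) to raw RSEM values (the +1 avoids log of zero)."""
    return [math.log2(v + 1.0) for v in values]


def z_scores(values):
    """Standardise values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]


# Made-up RSEM values for one gene in four samples.
rsem = [0.0, 1.0, 3.0, 7.0]
logged = log2_rsem(rsem)
z = z_scores(logged)

for raw, score in zip(rsem, z):
    print(f"RSEM {raw:>5.1f} -> z-score {score:+.3f}")
```

A z-score only says how far a sample sits from the mean of the chosen reference set, so the raw or log-transformed RSEM values are the ones to use when absolute expression levels matter.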

image.png

I hope this is helpful. Please feel free to contact us for additional questions.

Best,
-JJ



Oba,Junna

Jul 30, 2018, 5:24:54 AM
to JJ Gao, Kelsey Zhu, cBioPortal for Cancer Genomics Discussion Group

Hi JJ and Kelsey,

 

Thank you for your response and advice.

 

I am sorry if this is a silly question, but do I understand correctly that you suggest using the PanCancer data from GDC, from this page (https://www.cell.com/pb-assets/consortium/pancanceratlas/pancani3/index.html), rather than cBioPortal itself, because cBioPortal currently shows “U133 microarray only” rather than “RNA-seq V2” when I select “PanCancer Atlas” and then “Select Patient/Case Set”?

 

For example, Uveal Melanoma (TCGA, PanCancer Atlas) has “Tumor Samples with mRNA data (U133 microarray only) (80)”. This is the case for all the other cancer types when I select “TCGA, PanCancer Atlas”.

 

 

Thank you very much for your help and support.

 

Sincerely,

 

Junna Oba


JJ Gao

Jul 30, 2018, 10:57:32 AM
to JO...@mdanderson.org, Kelsey Zhu, cBioPortal for Cancer Genomics Discussion Group
Hi Junna,

For TCGA PanCancer studies, you can select "All tumors" when querying. There is a bug that keeps us from generating the proper case list for RNASeq v2 data (we will fix that soon). But you can query the data by selecting the RNASeqV2 genetic profile in the "Select Genome Profile" section.

Thanks,
-JJ

JJ Gao

Jul 30, 2018, 11:29:30 AM
to JO...@mdanderson.org, Kelsey Zhu, cBioPortal for Cancer Genomics Discussion Group
You are very welcome. Thanks for contacting us and using cBioPortal. -jj


Oba,Junna

Jul 30, 2018, 12:29:23 PM
to JJ Gao, Kelsey Zhu, cBioPortal for Cancer Genomics Discussion Group

Hi JJ,

 

Thank you for your quick response and answer.

This has answered my question, and I understand now.

 

Thank you again for your help and support.

 

Sincerely,

 

Junna Oba

 
