[#10119]: Protein level by the Clinical Proteomic Tumor Analysis Consortium (CPTAC)


Support

Jan 31, 2019, 10:55:56 AM
to b.alm...@liverpool.ac.uk, cbiop...@googlegroups.com
Almarzouq Batool,

Thank you for contacting us. This is an automated response confirming the receipt of your ticket. Our team will get back to you as soon as possible. When replying, please make sure that the ticket ID is kept in the subject so that we can track your replies.

Ticket ID: 10119
Subject: Protein level by the Clinical Proteomic Tumor Analysis Consortium (CPTAC)
Department: GDC HelpDesk
Type: Issue
Status: Open
Priority: Normal


Kind regards,
Support


--
Please follow GDC twitter at @NCIGDC_Updates for the latest information regarding data, tools, new features, and events.

Almarzouq, Batool

Jan 31, 2019, 10:55:56 AM
to cbiop...@googlegroups.com, sup...@nci-gdc.datacommons.io


Dear cbioportal team, 

I have a question regarding the z-score of the protein level by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) in cBioPortal. Although I went through the CPTAC publication and website, I’m still not sure how to read it. To make the question easier, I will give a specific example of a query I made. I selected triple negative breast cancer from the TCGA database, then inquired about the protein expression of FANCD2 (please see the attachments).

My question is: how can we interpret the protein expression of FANCD2 in TCGA-AR-A0U4-1, which is highlighted in the attached graph? Are we using 2 as the level that we compare against (>2 increase in the expression, <2 decrease in the expression)? In other words, do we say that mRNA expression is elevated (>2 z-score) whereas the protein expression is decreased (<2 z-score) in sample TCGA-AR-A0U4-1? In general, can we say that we see a decrease in FANCD2 protein expression in all of the samples?

I’m trying to compare the cBioPortal analysis to the analysis I did with the raw data that I downloaded from their website. I did my analysis in R but want to confirm it with cBioPortal.
I’m really sorry for the very long question.

Many Thanks,
Batool 


Pieter Lukasse

Feb 1, 2019, 9:30:29 AM
to Almarzouq, Batool, cbiop...@googlegroups.com, sup...@nci-gdc.datacommons.io
Hi Batool,

It is a good question, thanks. In general, the z-score in cBioPortal is calculated for the sample value based on the distribution of values found for the same gene in a set of normal reference samples. When normal reference samples are not available, we sometimes fall back to other strategies. See here for more details: https://github.com/cBioPortal/cbioportal/blob/master/docs/Z-Score-normalization-script.md. See also all z-score related questions here: https://github.com/cBioPortal/cbioportal/blob/master/docs/FAQ.md
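
As a rough illustration of the calculation described above, here is a minimal Python sketch of a z-score computed against a reference set. The function name `zscore_against_reference` and the numbers are hypothetical, and this is not the cBioPortal normalization script itself; using the sample standard deviation (n - 1) is also an assumption of this sketch.

```python
import statistics

def zscore_against_reference(value, reference_values):
    """Z-score of `value` relative to the reference distribution for one gene."""
    mean = statistics.mean(reference_values)
    # Sample standard deviation (n - 1); an assumption for this sketch
    sd = statistics.stdev(reference_values)
    if sd == 0:
        raise ValueError("reference values have zero spread")
    return (value - mean) / sd

# Hypothetical expression values for one gene in a reference set
reference = [3.0, 5.0, 7.0]

z_high = zscore_against_reference(9.0, reference)  # above the reference mean
z_low = zscore_against_reference(1.0, reference)   # below the reference mean
```

A value above the reference mean gives a positive z-score and a value below it gives a negative one, so the sign alone only says on which side of the reference mean the sample lies.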

Maybe you can share the specific study you are querying (we have multiple versions of TCGA data) so that we can give you more details about what was done in this specific case?

Thanks,

Pieter Lukasse


E.   pie...@thehyve.nl

T.   +31(0)30 700 9713

W.  www.thehyve.nl


We empower scientists by building on open source software


On Thu, Jan 31, 2019 at 16:55, Almarzouq, Batool <B.Alm...@liverpool.ac.uk> wrote:
--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+...@googlegroups.com.
To post to this group, send email to cbiop...@googlegroups.com.
Visit this group at https://groups.google.com/group/cbioportal.
To view this discussion on the web visit https://groups.google.com/d/msgid/cbioportal/3156A3F7-B2A8-4DF8-9846-A2C75A447BC4%40liverpool.ac.uk.
For more options, visit https://groups.google.com/d/optout.

Almarzouq, Batool

Feb 5, 2019, 9:15:06 AM
to Pieter Lukasse, cbiop...@googlegroups.com, sup...@nci-gdc.datacommons.io
Dear Pieter, 

Thank you so much for your reply.

The study I meant is Breast Invasive Carcinoma (TCGA, Provisional). However, I selected only triple negative breast cancer patients (116 patients) by selecting the patients who are negative for the HER2, ER and PR receptors from the view summary tab. Please see the attachment for screenshots. As I mentioned in my previous email, I’d be grateful if you could give more details about FANCD2 expression in the graphs I attached earlier. Can you explain how the negative values were calculated and what they represent?

Many Thanks,
Batool 



<Screen Shot 2019-01-30 at 22.14.37.png>
<Screen Shot 2019-01-30 at 22.03.27.png>


Almarzouq, Batool

Feb 25, 2019, 9:45:30 AM
to sup...@nci-gdc.datacommons.io, b.alm...@liverpool.ac.uk, pie...@thehyve.nl, cbiop...@googlegroups.com
Dear Sir, 

I’m sorry to email again, but I emailed earlier to ask about the z-score of the protein level by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) in cBioPortal.

Although I went through the CPTAC publication, website and FAQs, I’m still not sure how to read it. To make the question easier, I will give a specific example of a query I made. I selected triple negative breast cancer from the TCGA database, then inquired about the protein expression of FANCD2 (please see the attachments). I understand that the z-score in cBioPortal is calculated for the sample value based on the distribution of values found for the same gene in a set of normal reference samples. However, why is the default value 2 and not 1?

My question is: how can we interpret the protein expression of FANCD2 in TCGA-AR-A0U4-1, which is highlighted in the attached graph?
Are we using 2 as the level that we compare against (>2 increase in the expression, <2 or <-2 decrease in the expression)? In other words, do we say that mRNA expression is elevated (>2 z-score) whereas the protein expression is decreased (<2 z-score) in sample TCGA-AR-A0U4-1? In general, can we say that we see a decrease in FANCD2 protein expression, or no change in FANCD2 expression, in all of the samples?

The study I’m looking at is Breast Invasive Carcinoma (TCGA, Provisional). However, I selected only triple negative breast cancer patients (116 patients) by selecting the patients who are negative for the HER2, ER and PR receptors from the view summary tab. Please see the attachment for screenshots.

I’m trying to compare the cBioPortal analysis to the analysis I did with the raw data that I downloaded from their website. I did my analysis in R but want to confirm it with cBioPortal.

Can I also kindly ask how I can access germline mutations in BRCA1/2 in Breast Invasive Carcinoma (TCGA, Provisional)? In cBioPortal, I seem to get no germline mutations, whereas the publications say otherwise. Or do I need to download level 2 of the data, which has special access?

I’m really sorry for the very long question.

Many Thanks,
Batool 








JJ Gao

Feb 25, 2019, 10:57:08 AM
to Almarzouq, Batool, pie...@thehyve.nl, cbiop...@googlegroups.com
Bcc'd the GDC helpdesk, as this question may not be relevant to GDC.

Hi Batool,

Both the protein level and mRNA level z-scores were calculated relative to the same cohort of tumor samples (i.e. TCGA breast tumors in your case). A high or low z-score does not necessarily mean irregular mRNA or protein expression in the tumor. It is not really meaningful to analyze these values in just one sample. However, they are useful for correlation analysis, e.g. correlating mRNA with protein, or for survival analysis (by setting a threshold to separate patients).
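
To make the cohort-relative idea concrete, here is a minimal Python sketch that z-scores hypothetical mRNA and protein values against their own cohort, correlates the two, and splits samples at a chosen threshold. All values, the helper names `cohort_zscores` and `pearson`, and the threshold of 1.0 are assumptions for illustration, not what cBioPortal runs internally.

```python
import statistics

def cohort_zscores(values):
    """Z-score each value against the mean and SD of the same cohort."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1); an assumption here
    return [(v - mean) / sd for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical mRNA and protein values for one gene across five tumor samples
mrna = [2.0, 4.0, 6.0, 8.0, 10.0]
protein = [1.0, 2.0, 2.5, 4.0, 5.5]

mrna_z = cohort_zscores(mrna)
protein_z = cohort_zscores(protein)
r = pearson(mrna_z, protein_z)

# Split samples at a chosen z-score threshold, e.g. for a survival comparison
threshold = 1.0
high = [i for i, z in enumerate(mrna_z) if z > threshold]
```

Because the z-scores are centered on the cohort itself, they always average to zero across the cohort, which is why a single sample's value says little on its own while the correlation and the threshold split remain informative.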

Germline mutations in BRCA1/2 were only included in the "Breast Invasive Carcinoma (TCGA, Nature 2012)" study.

Best,
-JJ


Pieter Lukasse

Mar 19, 2019, 4:51:38 AM
to Almarzouq, Batool, cbiop...@googlegroups.com, sup...@nci-gdc.datacommons.io
Hi Batool

Sorry for the late reply. Thanks for the details.

The negative values are negative z-scores. E.g. a z-score of -2 means that the expression of a gene in a sample is 2 standard deviations below the mean expression value for this same gene in the set of "normal" samples. The links I sent you in my previous email explain how the set of "normal" samples is determined when real normal samples are not available. I am not sure which type of normal samples were used for this study. This is something I need to check with the data curation team. I will ask them to check this and reply here.
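
Written out, the relationship described above is the standard z-score formula. The numbers in the example (a reference mean of 5 and a standard deviation of 2) are hypothetical and only illustrate the arithmetic; they are not taken from the study.

```latex
z = \frac{x - \mu}{\sigma}
\qquad\Longrightarrow\qquad
x = \mu + z\,\sigma

% Hypothetical example: \mu = 5,\ \sigma = 2,\ z = -2
x = 5 + (-2)(2) = 1
```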

Best,

Pieter Lukasse


We empower scientists by building on open source software


On Tue, Feb 5, 2019 at 13:40, Almarzouq, Batool <B.Almarzouq@liverpool.ac.uk> wrote: