About TCGA Data


Gort, Michele

Jul 13, 2016, 9:45:07 AM
to cbiop...@googlegroups.com

To whom it may concern,

I am exporting the TCGA LUAD data from your website using R.

On reviewing the export, I noticed that the field “TOBACCO_SMOKING_HISTORY_INDICATOR” has only the values 1, 2, 3, and 4, along with some missing values.

The clinical form associates a number with each permissible value. The numeric values are:

  • Lifelong Non-smoker (less than 100 cigarettes smoked in Lifetime) = 1
  • Current smoker (includes daily smokers and non-daily smokers or occasional smokers) = 2
  • Current reformed smoker for > 15 years (greater than 15 years) = 3
  • Current reformed smoker for ≤15 years (less than or equal to 15 years) = 4
  • Current reformed smoker, duration not specified = 5
  • Smoking History not documented = 7
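
For anyone decoding the exported column, the permissible values listed above can be written as a small lookup table. This is only a sketch in Python (the export in this thread was done in R); the names SMOKING_HISTORY_LABELS and decode_smoking_history are made up for illustration, and the labels are copied from the list above rather than taken from any official API.

```python
# Code-to-label lookup copied from the permissible values listed above.
# Codes 5 and 7 are defined on the form but may not occur in every dataset.
SMOKING_HISTORY_LABELS = {
    1: "Lifelong Non-smoker (less than 100 cigarettes smoked in Lifetime)",
    2: "Current smoker (includes daily smokers and non-daily smokers or occasional smokers)",
    3: "Current reformed smoker for > 15 years (greater than 15 years)",
    4: "Current reformed smoker for ≤15 years (less than or equal to 15 years)",
    5: "Current reformed smoker, duration not specified",
    7: "Smoking History not documented",
}


def decode_smoking_history(raw):
    """Return the label for a raw value, or None if it is missing or unknown."""
    try:
        # Accepts integers and integer-like strings such as "2" or " 4 ".
        code = int(str(raw).strip())
    except ValueError:
        # Placeholders such as "[Not Available]" or an empty cell end up here.
        return None
    return SMOKING_HISTORY_LABELS.get(code)
```

Returning None for anything that is not a listed code keeps missing entries distinct from code 7, so the two are not silently merged.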

 

The NCI/NIH told me to contact you to find out whether the data were recoded from 1, 2, 3, 4, 5, 7 into the sub-categories 1, 2, 3, 4, or whether the only values in this dataset really are 1-4.

Also, should the missing values be coded as 7?

 

Thanks for your help,

Michele

 


-------------------------------------------------

Michele Gort | Intern

PSM Biostatistics/Bioinformatics Department

333 Bostwick Ave., N.E., Grand Rapids, Michigan 49503

 

Hsiao-Wei Chen

Jul 13, 2016, 2:55:00 PM
to cbiop...@googlegroups.com, Michel...@vai.org
Hi Michele,

The clinical data we use is from the supplementary data (Supplementary Table 7) of this TCGA publication at the link below. In their clinical data, Smoking Status (named "TOBACCO_SMOKING_HISTORY_INDICATOR" in cBioPortal) takes only the first four values (1-4) plus [Not Available]. We didn't convert the data into any sub-categories. Please let me know if this doesn't answer your question.
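
One way to check this in an export is to tally the distinct values in the column and count placeholders separately, rather than recoding them to 7. Below is a minimal Python sketch; the function name, the sample column, and the extra missing markers "" and "NA" are assumptions for illustration, and only "[Not Available]" comes from the description above.

```python
from collections import Counter

# "[Not Available]" is the placeholder mentioned above; "" and "NA" are
# assumed spellings that an export tool might produce for missing cells.
MISSING_MARKERS = {"", "NA", "[Not Available]"}


def summarize_codes(values):
    """Count each observed code and, separately, the number of missing entries."""
    counts = Counter()
    missing = 0
    for value in values:
        text = "" if value is None else str(value).strip()
        if text in MISSING_MARKERS:
            missing += 1
        else:
            counts[text] += 1
    return dict(counts), missing


# Made-up sample column, not real TCGA data.
column = ["1", "4", "[Not Available]", "2", "3", "4", "NA"]
counts, missing = summarize_codes(column)
print(sorted(counts))  # distinct codes that actually occur
print(missing)         # entries treated as missing
```

If codes 5 or 7 were present in a dataset, they would show up among the distinct codes; missing entries are reported on their own and are never converted to a code.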


Thanks,
Annice


--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cbioportal+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

ateo...@umich.edu

Sep 9, 2016, 1:49:19 PM
to cBioPortal for Cancer Genomics Discussion Group, Michel...@vai.org
Hi, I am trying to find out exactly what the numbers for this variable mean. Where can I find this, or what do these numbers mean?

Ritika Kundra

Sep 9, 2016, 5:17:03 PM
to cbiop...@googlegroups.com, Michel...@vai.org
Hi Michele,

You can find the specifics of the values in the NCI CDE Browser:

On the website, under the attribute Patient Smoking History Category -> Value Domain tab, you will find the table shown in the attached screenshot.

Thanks,
Ritika


Screen Shot 2016-09-09 at 5.11.12 PM.png