FAQ: How do we curate the TCGA survival information used in the KM plot?

Jing Zhu

Jul 11, 2013, 3:31:24 AM
to ucsc-cancer-ge...@googlegroups.com
The UCSC Cancer Browser team curates the overall survival (OS) and recurrence-free survival (RFS) information from the TCGA clinical and phenotypic data.

Overall Survival (OS)
The event call is derived from the "vital status" parameter. The time_to_event is in days: it equals days_to_death if the patient is deceased; if the patient is still living, it is max(days_to_last_known_alive, days_to_last_followup). This pair of clinical parameters is called _EVENT and _TIME_TO_EVENT on the cancer browser.
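For readers who want to reproduce the OS rule on their own copy of the clinical data, here is a minimal Python sketch of it. This is not the browser's actual curation code. The function names, the dictionary-per-patient layout, the accepted vital status spellings ("deceased", "living", and so on), and the handling of unparseable values are all assumptions made for illustration.

```python
import math
from typing import Optional, Tuple


def to_days(value) -> Optional[float]:
    """Parse a day count; blanks and non-numeric placeholders become None."""
    try:
        days = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(days) else days


def overall_survival(record: dict) -> Tuple[Optional[int], Optional[float]]:
    """Return (_EVENT, _TIME_TO_EVENT) for one patient record.

    Assumed coding: 1 = death, 0 = censored, None = no data.
    The status spellings below are illustrative guesses.
    """
    status = (record.get("vital_status") or "").strip().lower()
    if status in ("deceased", "dead"):
        return 1, to_days(record.get("days_to_death"))
    if status in ("living", "alive"):
        # Living patients: take the latest of the two follow-up fields.
        fields = ("days_to_last_known_alive", "days_to_last_followup")
        known = [d for d in (to_days(record.get(f)) for f in fields) if d is not None]
        return 0, (max(known) if known else None)
    return None, None
```

A living patient with days_to_last_known_alive of 300 and days_to_last_followup of 950 would be censored at 950 days under this sketch.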

Recurrence Free Survival (RFS)
The event call is derived from the "new_tumor_event_after_initial_treatment" parameter. The time_to_event is in days: it equals max(days_to_new_tumor_event_after_initial_treatment, days_to_tumor_recurrence) if there is an event; if there is no event, it is the overall survival time. This pair of clinical parameters is called _RFS and _RFS_IND on the cancer browser.
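The RFS rule can be sketched the same way. Again this is only an illustration of the rule as described here, not the browser's own code. The function names, the "yes"/"no" spellings for the event field, and passing the already derived OS time in as an argument are assumptions.

```python
import math
from typing import Optional, Tuple


def to_days(value) -> Optional[float]:
    """Parse a day count; blanks and non-numeric placeholders become None."""
    try:
        days = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(days) else days


def recurrence_free_survival(
    record: dict, os_time: Optional[float]
) -> Tuple[Optional[int], Optional[float]]:
    """Return (_RFS_IND, _RFS) for one patient record.

    os_time is the patient's overall survival time in days,
    used as the RFS time when no new tumor event is recorded.
    """
    flag = (record.get("new_tumor_event_after_initial_treatment") or "").strip().lower()
    if flag == "yes":
        fields = (
            "days_to_new_tumor_event_after_initial_treatment",
            "days_to_tumor_recurrence",
        )
        known = [d for d in (to_days(record.get(f)) for f in fields) if d is not None]
        return 1, (max(known) if known else None)
    if flag == "no":
        return 0, os_time
    return None, None
```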

KM plot
If OS data is available, the browser displays the OS KM plot by default. Users can use the KM plot advanced option to select other clinical variables, such as _RFS and _RFS_IND, for the KM plot.

Example: TCGA bladder cancer recurrence-free survival KM plot

Download OS or RFS data for survival statistical analysis
You can download the OS/OS_IND and RFS/RFS_IND pairs through the clinical data download, along with categorical clinical variables (e.g. PAM50 subtype) for survival analysis. The download is a text file in a format that R can read easily (e.g. survdiff in the survival package) to derive a p-value. Please select the "entire clinical cohort" option when downloading clinical data for survival analysis.
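R's survdiff is the tool suggested above. For readers who want to see what a two-group log-rank comparison computes, here is a small stdlib-only Python sketch. The function name and the (time, indicator, group) tuple layout are hypothetical, the p-value uses the 1-degree-of-freedom chi-square approximation, and samples with a missing time or indicator are simply dropped. For real analyses, check results against an established package.

```python
import math
from typing import List, Optional, Tuple

# (time in days, indicator: 1 = event / 0 = censored / None = no data, group label)
Sample = Tuple[Optional[float], Optional[int], str]


def logrank_two_groups(samples: List[Sample], group_a: str) -> Tuple[float, float]:
    """Return (chi-square, p-value) comparing group_a against all other samples."""
    rows = [(t, e, g == group_a) for t, e, g in samples if t is not None and e is not None]
    event_times = sorted({t for t, e, _ in rows if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in rows if ti >= t)               # at risk, both groups
        n_a = sum(1 for ti, _, a in rows if ti >= t and a)       # at risk, group A
        d = sum(1 for ti, e, _ in rows if ti == t and e == 1)    # events at t
        d_a = sum(1 for ti, e, a in rows if ti == t and e == 1 and a)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

To use it with the downloaded file, one would read each row into a (time, indicator, group) tuple, for example (OS, OS_IND, PAM50 subtype), and pass the subtype of interest as group_a.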

happy...@gmail.com

Jun 21, 2015, 11:02:30 PM
to ucsc-cancer-ge...@googlegroups.com
Dear Dr. Zhu,

Thanks so much for the note! It is very helpful. I am wondering whether the RFS information applies to all the diseases covered by TCGA or to bladder cancer only. At http://www.cbioportal.org/web_api.jsp I found clinical survival information on DFS (disease-free survival). Could you explain in more detail how DFS is defined as well? It would be greatly appreciated.

Thanks!

Mary Goldman

Jun 22, 2015, 10:45:40 AM
to happy...@gmail.com, ucsc-cancer-ge...@googlegroups.com
Hello,

The Disease Free Survival data at the cBio Portal is not curated by us. I would recommend asking them how they calculated it (cbiop...@googlegroups.com).

Best,
Mary
-------------
Mary Goldman
UCSC Cancer Browser
https://genome-cancer.ucsc.edu/

--
You received this message because you are subscribed to the Google Groups "UCSC Cancer Genomics Browser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ucsc-cancer-genomics...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Pete Kim

Jun 29, 2015, 2:41:08 PM
to ucsc-cancer-ge...@googlegroups.com, happy...@gmail.com
Hello,

I searched Google for some of the UCSC data columns: OS, OS_IND, RFS, and RFS_IND.
I could not find anything on OS_IND and RFS_IND.
I also could not work out what their values mean.
The columns contain 0, 1, or null.
Could you describe the OS_IND and RFS_IND columns and explain what the values 0 and 1 mean?

Thank you!!

Sincerely,
Pete



On Monday, June 22, 2015 at 11:45:40 PM UTC+9, Mary Goldman wrote:

Mary Goldman

Jun 29, 2015, 5:13:35 PM
to Pete Kim, ucsc-cancer-ge...@googlegroups.com, happysundae
Hi Pete,

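For OS_IND, RFS_IND, and _EVENT: 1 = event (death for OS_IND and _EVENT; for RFS_IND, the new tumor event described in the original post), 0 = censored, and null = no data.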
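To make the coding concrete, here is a minimal Python sketch of how a Kaplan-Meier estimate typically consumes such a pair of columns: rows with indicator 1 lower the curve, rows with indicator 0 only leave the at-risk set, and rows with a null time or indicator are skipped. The function name and list-based inputs are hypothetical, and this is not the browser's plotting code.

```python
from typing import List, Optional, Tuple


def kaplan_meier(
    times: List[Optional[float]], indicators: List[Optional[int]]
) -> List[Tuple[float, float]]:
    """Return [(time, survival probability)] at each time with at least one event."""
    # Skip samples with no data (null time or null indicator).
    pairs = [(t, e) for t, e in zip(times, indicators) if t is not None and e is not None]
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    for t in sorted({t for t, _ in pairs}):
        events = sum(1 for ti, e in pairs if ti == t and e == 1)
        leaving = sum(1 for ti, _ in pairs if ti == t)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored samples (indicator 0) leave the at-risk set without a drop.
        at_risk -= leaving
    return curve
```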

Arvind Murali

Jan 14, 2016, 4:00:53 PM
to UCSC Xena and Cancer Genomics Browser
Dear Dr. Zhu. 

I have a question. 


Several samples have days_to_death or days_to_last_contact as early as 2000 days. TCGA had not even started at that time. Are these patients from other sources? If so, there would be a huge bias: only the surviving patients from those sources would have been recruited to TCGA, while those who had died would never appear in TCGA. How do you account for this? To normalize for this myself, I looked up days_to_collection or days_to_procurement for the samples, but most of those values (about 80%) are empty. The remainder leaves too few samples for any survival analysis. How do you fix this issue in the UCSC genome browser?


Best

Arvind


On Thursday, July 11, 2013 at 3:31:24 AM UTC-4, Jing Zhu wrote:

Jing Zhu

Jan 14, 2016, 4:08:02 PM
to Arvind Murali, UCSC Xena and Cancer Genomics Browser
> days_to_death or days_to_last_contact as early as 2000 days
These values are the number of days since diagnosis, not the year 2000.
