TPM expression values from TCGA

trinc...@gmail.com

Dec 13, 2016, 12:03:54 PM

to UCSC Xena and Cancer Genomics Browser

Dear UCSC Community,

We are interested in analyzing the data that you have processed with UCSC Xena. We downloaded the TPM expression values from TCGA. We expected to find all the TCGA samples with available RNA-seq data in these tables, but some do not appear. For instance, according to the GDC Data Portal (https://gdc-portal.nci.nih.gov/), TCGA-E2-A108 is a BRCA sample with RNA-seq data. However, it does not show up in the "Transcript RSEM tpm" file from the TCGA TARGET GTEx cohort. This seems surprising, since this sample also had RNA-seq data in the previous version of the TCGA portal (https://tcga-data.nci.nih.gov/). Is any filtering applied when the TCGA data are processed? When did Xena last update its tables?

Thanks for helping,

Best regards.
Juanlu Trincado.

Jing Zhu

Dec 13, 2016, 2:18:40 PM

to trinc...@gmail.com, UCSC Xena and Cancer Genomics Browser

Dear Juanlu,

The TCGA TPM expression values are produced at UCSC (https://xenabrowser.net/datapages/?host=https://toil.xenahubs.net). This is not an ingestion of the GDC data. The UCSC group realigned and recalled all TCGA and GTEx samples using our own pipeline. Please see http://dx.doi.org/10.1101/062497 for details.

This case's RNAseq analysis is not part of the UCSC recompute results.
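
For readers who want to check this for their own cases, here is a minimal Python sketch. It assumes the downloaded matrix is tab-delimited with one header row whose first column is a row label and whose remaining columns are sample IDs that begin with the case ID (for example, a case ID followed by a sample-type suffix). The function names and the header below are made up for illustration; they are not part of Xena or the UCSC pipeline.

```python
def samples_in_header(header_line):
    """Return the sample IDs from a tab-delimited header line.

    Assumes the first column is a row label (e.g. "sample") and is skipped.
    """
    return header_line.rstrip("\n").split("\t")[1:]


def find_cases(header_line, case_ids):
    """Map each case ID to the sample columns that belong to it.

    A column matches if it equals the case ID or starts with the case ID
    followed by a hyphen (assumed barcode layout).
    """
    samples = samples_in_header(header_line)
    return {
        case: [s for s in samples if s == case or s.startswith(case + "-")]
        for case in case_ids
    }


# Made-up header line for illustration only; a real file has thousands of columns.
header = "sample\tTCGA-AA-0001-01\tTCGA-BB-0002-01\tTCGA-BB-0002-11\n"

result = find_cases(header, ["TCGA-BB-0002", "TCGA-E2-A108"])
for case, columns in result.items():
    print(case, "->", columns if columns else "not found")
```

With a real download, only the first line of the file needs to be read and passed as `header_line` (via `gzip.open(path, "rt")` if the file is gzip-compressed), so the full matrix never has to be loaded.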

Jing

--
You received this message because you are subscribed to the Google Groups "UCSC Xena and Cancer Genomics Browser" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ucsc-cancer-genomics-browser+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Juan Luis Trincado

Dec 14, 2016, 12:59:06 PM

to UCSC Xena and Cancer Genomics Browser, trinc...@gmail.com

Dear Jing,

Thanks for the quick answer.

I'm afraid I still don't understand this. I thought that Xena included all of TCGA. Is there a part of TCGA that is not computed by UCSC? Why isn't this sample included in the UCSC results, given that it belongs to TCGA?

Thanks again for your time.
Regards,
Juanlu.

Jing Zhu

Dec 14, 2016, 1:26:44 PM

to Juan Luis Trincado, UCSC Xena and Cancer Genomics Browser

I cannot say for sure why this is the case for your example until the internal UCSC team investigates.

There are three possibilities. First, our internal team may not have had the raw sequence files to begin with. Second, data quality: some raw sequence data do not reach the quality we require and are thrown out by the UCSC pipeline. Third, other failures. Together, the second and third account for around 0.5% of all samples with raw sequence available.

Typically, we will push for a case like yours if a particular biological hypothesis you are trying to test hinges on this sample. In most cases, we see that adding or removing one sample does not affect the biological conclusions researchers try to draw.

Jing
