Hello amazing colleagues,
Thank you for providing all these valuable datasets to the open-source community.
I noticed something regarding the isoform quantification data (also used in the Xena Browser for transcript view): many colon/rectum samples from TCGA appear to be missing from the dataset. I would like to know whether these samples are available elsewhere, or whether an issue prevented their inclusion.
I have prepared a list of the missing TCGA colorectal samples (from COADREAD) and attached it here in case it helps with checking.
Hello Mary,
Great, I will have a look. Thanks for the fast reply. 😊
Best,
Theo
From: Mary Goldman <ma...@soe.ucsc.edu>
Date: Monday, 1 September 2025 at 16:36
To: IzThed <teof...@hotmail.com>
Cc: UCSC Xena and Cancer Genomics Browser <ucsc-cancer-ge...@googlegroups.com>
Subject: Re: [ucsc-cancer-genomics-browser] Missing samples from: TCGA TARGET GTEx transcript expression by RSEM using UCSC TOIL RNA-seq recompute
Hello,
So glad that our tool is useful for you! For these samples, it may be helpful to review how the authors who generated this data filtered samples. This is described on our data hub page: https://xenabrowser.net/datapages/?host=https%3A%2F%2Ftoil.xenahubs.net as well as in the manuscript itself: https://www.nature.com/articles/nbt.3772. I will note that the group that produced this dataset does not currently have the funding to process more samples.
Best,
Mary
-----
Mary Goldman (she/her), Design and Outreach Engineer