Request for Information regarding the HPV Status in the TCGA-HNSC Dataset


Arian Michael Daschner

Oct 21, 2025, 10:09:18 AM
to cbiop...@googlegroups.com
Dear Sir or Madam,

My name is Arian Michael Daschner, and I am a doctoral student at the Walter Brendel Centre for Experimental Medicine at LMU Munich, Germany.

For my research, I am currently analyzing the TCGA-HNSC dataset, which I downloaded from the GDC Data Portal. As part of my work, I am trying to evaluate the samples regarding their HPV status. Unfortunately, the HPV status is not available in the files accessible via the GDC Data Portal.

However, I was able to find this information on your website in the table "Head and Neck Squamous Cell Carcinoma (TCGA, PanCancer Atlas)". Below the headline, you cite the original data source, which leads to the GDC website. Some of the data there, particularly the HPV status, do not seem to be accessible at GDC.

Therefore, I would like to kindly ask how you obtained the information regarding the HPV status of the samples, as I could not find a description of the method on your website.

I would very much appreciate your assistance with this matter.

Yours sincerely,
Arian Michael Daschner

Ritika Kundra

Oct 22, 2025, 10:25:55 AM
to Arian Michael Daschner, cbiop...@googlegroups.com
Hi Arian,

Do you mind sharing the attribute name in the portal study you are referring to?

Thanks,
Ritika

--
You received this message because you are subscribed to the Google Groups "cBioPortal for Cancer Genomics Discussion Group" group.

Arian Michael Daschner

Oct 23, 2025, 6:57:52 AM
to Ritika Kundra, Arian Michael Daschner, cbiop...@googlegroups.com
Dear Ritika,

Thank you very much for your quick reply.

In the cBioPortal study "Head and Neck Squamous Cell Carcinoma (TCGA, PanCancer Atlas)", there is a column named "Subtype" in the Clinical Data table, which provides information about the patients’ HPV status.

Below the table headline, there is a note stating “the original data is here”, which links to the GDC website page titled “TCGA-PanCanAtlas Publications”.

From that landing page, I downloaded the “Clinical with Follow-up” table, which contains data from multiple TCGA projects. I opened this table in R, filtered out all samples that were not HNSC, and examined all remaining columns that might provide HPV-related information for TCGA-HNSC patients. These columns were:
hpv_test, hpv_status_by_ish_testing, hpv_status_by_p16_testing, human_papillomavirus_other_type_text, human_papillomavirus_laboratory_procedure_performed_text, and human_papillomavirus_type.
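
A minimal sketch of the filtering step described above, written in Python rather than R (the original analysis code is not shown in this thread). The HPV column names are the ones listed above; the cohort column name `acronym`, the set of placeholder values treated as missing, and the helper names `hnsc_rows` and `column_value_counts` are assumptions for illustration and may need adjusting to the actual file.

```python
import csv
import io
from collections import Counter

# HPV-related column names as listed in the post.
HPV_COLUMNS = [
    "hpv_test",
    "hpv_status_by_ish_testing",
    "hpv_status_by_p16_testing",
    "human_papillomavirus_other_type_text",
    "human_papillomavirus_laboratory_procedure_performed_text",
    "human_papillomavirus_type",
]

# Assumption: placeholder strings that mean "no information" (compared in lowercase).
MISSING = {"", "na", "[not available]", "[not evaluated]", "[unknown]", "[not applicable]"}


def hnsc_rows(handle, type_column="acronym", cohort="HNSC"):
    """Yield rows of a tab-separated clinical table that belong to one cohort.

    `type_column` is an assumed name for the column holding the TCGA project code.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if (row.get(type_column) or "").strip().upper() == cohort:
            yield row


def column_value_counts(rows, columns=HPV_COLUMNS):
    """Count the non-missing values found in each HPV-related column."""
    counts = {col: Counter() for col in columns}
    for row in rows:
        for col in columns:
            value = (row.get(col) or "").strip()
            if value.lower() not in MISSING:
                counts[col][value] += 1
    return counts


if __name__ == "__main__":
    # Tiny made-up table for illustration only (not real TCGA data).
    demo = io.StringIO(
        "patient\tacronym\thpv_status_by_p16_testing\n"
        "P1\tHNSC\tPositive\n"
        "P2\tBRCA\tPositive\n"
        "P3\tHNSC\t[Not Available]\n"
    )
    for column, values in column_value_counts(hnsc_rows(demo)).items():
        print(column, dict(values))
```

To run this on the downloaded table, open the file with `open(path, newline="", encoding="utf-8")` and pass the handle to `hnsc_rows`. Tallying each column separately makes it easy to see how many patients carry any recorded HPV test result at all.
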

After analyzing these columns, I was able to identify 81 HPV-negative and 44 HPV-positive samples.
However, the cBioPortal table lists HPV information for 487 samples in total.

My question, therefore, is how the HPV status data in cBioPortal was collected, since I could not find a description of the method used on the website.

I also searched for this information in the clinical data table that can be downloaded from the GDC Data Portal. Unfortunately, this table didn't provide any information on HPV status.

I would highly appreciate your help with this.

Kind regards,
Arian

