Loading mutation data from two sources?

Zachary Whitfield (WHITFIEZ)

Mar 15, 2021, 3:20:08 PM

to cBioPortal for Cancer Genomics Discussion Group

Hi everyone,

I am supporting a local instance of cBioPortal. For some samples, we have two separate sources of mutation data (method1 and method2). I am wondering if there is a way to load both sources of mutation data, while also tracking where each mutation came from.

I'm able to merge the two MAF files with no issue, but I'm not sure whether there is a way to label each mutation with a 'source' and also let the user filter mutations by source (e.g. only display mutations detected by method1).

I tried to use the 'namespaces' field, but those fields were not kept in the mutations table. According to some issues I found on the GitHub repo, this feature may not be fully implemented yet (here and here)?

Apologies if this has been asked before, I tried to search through previous conversations before posting.

Thanks so much for any help,

Zach

Sjoerd van Hagen

Mar 18, 2021, 10:08:28 AM

to Zachary Whitfield (WHITFIEZ), cBioPortal for Cancer Genomics Discussion Group

Hi Zach,

I think the only option that would work at this time is to have two samples in cBioPortal for each biological sample that you sequenced. You can then add 'source' as a clinical variable to the sample, allowing you to select only the samples from one source at a time, or compare samples from different sources in the group comparison to spot differences. I think this will probably cover your use cases.
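A minimal sketch of this two-samples-per-biological-sample approach, assuming the standard MAF `Tumor_Sample_Barcode` column and the usual cBioPortal clinical sample file layout (four `#` metadata lines followed by the attribute header). The `_method1`/`_method2` ID suffix, the `SOURCE` attribute name, and both helper functions are hypothetical choices for illustration, not something the portal prescribes:

```python
import csv

SOURCES = ("method1", "method2")  # labels taken from the question above


def tag_maf(maf_text, source):
    """Suffix Tumor_Sample_Barcode with the source so each source gets its own sample ID."""
    # Skip blank lines and MAF comment lines such as "#version 2.4".
    lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    tagged = []
    for row in csv.DictReader(lines, delimiter="\t"):
        row["Tumor_Sample_Barcode"] = f"{row['Tumor_Sample_Barcode']}_{source}"
        tagged.append(row)
    return tagged


def clinical_sample_file(sample_to_patient, sources=SOURCES):
    """Build clinical sample file text with one row per (sample, source) pair."""
    out = [
        "#Patient Identifier\tSample Identifier\tSource",
        "#Patient identifier\tSample identifier\tMutation calling source",
        "#STRING\tSTRING\tSTRING",
        "#1\t1\t1",
        "PATIENT_ID\tSAMPLE_ID\tSOURCE",  # SOURCE is an assumed attribute name
    ]
    for sample_id, patient_id in sorted(sample_to_patient.items()):
        for source in sources:
            out.append(f"{patient_id}\t{sample_id}_{source}\t{source}")
    return "\n".join(out) + "\n"
```

Calling `tag_maf` once per input MAF (with the matching source label) and concatenating the results would give a merged MAF in which every mutation belongs to a source-specific sample, while both derived samples still map to the same patient.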

The namespaces method does not work at the moment, and is also not meant for your use case. It is only useful to add annotations to mutations.

I hope this helps.

Best,

Sjoerd.

---

Sjoerd van Hagen

Team Lead cBioPortal & Open Targets

E sjo...@thehyve.nl

T +31 30 700 9713

W thehyve.nl

Zachary Whitfield (WHITFIEZ)

Mar 18, 2021, 11:36:07 AM

to cBioPortal for Cancer Genomics Discussion Group

Hi Sjoerd,

Thank you so much for the response! That definitely is helpful.

I should have mentioned this in my initial post, but we also have RNA-seq expression data for some of these samples. If we create a second sample for the mutations, the linkage between mutation and expression would be lost for the new IDs, correct? For example, when creating an "mRNA vs. mut type" plot.

Do you see a way around that issue?

For the namespaces, my understanding is that it will allow us to annotate the mutations (as you said), but not filter. Is that correct?

Thanks again for your help,

Zach

Sjoerd van Hagen

Mar 18, 2021, 11:41:52 AM

to Zachary Whitfield (WHITFIEZ), cBioPortal for Cancer Genomics Discussion Group

Hi Zach,

You are correct on both counts.

The RNA-seq issue can be circumvented by loading the expression data for both samples. That would allow you to do the plotting, and you would not even notice this when looking at the OncoPrint at the patient level.
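One way to load the expression data for both samples is to duplicate each sample column in the expression matrix. The sketch below assumes a tab-separated matrix with `Hugo_Symbol` and/or `Entrez_Gene_Id` gene columns followed by one column per sample; the `_method1`/`_method2` suffixes and the function name are hypothetical and only need to match whatever sample IDs are used in the MAF:

```python
GENE_COLUMNS = ("Hugo_Symbol", "Entrez_Gene_Id")
SOURCES = ("method1", "method2")  # hypothetical suffixes, one per mutation source


def duplicate_expression_columns(expr_text, sources=SOURCES):
    """Repeat every sample column once per source, renaming it to <sample>_<source>."""
    lines = [ln for ln in expr_text.splitlines() if ln]
    header = lines[0].split("\t")
    is_gene_col = [name in GENE_COLUMNS for name in header]

    # Gene identifier columns are kept once; sample columns are fanned out.
    new_header = []
    for name, gene_col in zip(header, is_gene_col):
        if gene_col:
            new_header.append(name)
        else:
            new_header.extend(f"{name}_{source}" for source in sources)

    out = ["\t".join(new_header)]
    for line in lines[1:]:
        new_fields = []
        for value, gene_col in zip(line.split("\t"), is_gene_col):
            # The same expression value is copied under each source-specific ID.
            new_fields.extend([value] if gene_col else [value] * len(sources))
        out.append("\t".join(new_fields))
    return "\n".join(out) + "\n"
```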

If you think that this may confuse users, there is also the option of loading the samples into two separate studies. One will have the samples from source1, the other from source2, and you would copy all the other data (clinical, expression, etc.) into both. This way it will be very clear to the user, and they can still do comparisons. The only drawback is that you need a bit more storage.
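For the two-study variant, the copied meta files need to point at the new study. A rough sketch, assuming each meta file carries a `cancer_study_identifier:` line; the `<base>_<source>` naming scheme and both helper functions are hypothetical, and other fields (for example case-list stable IDs) may also need updating, so check them against the file-format documentation:

```python
def study_ids(base_study_id, sources=("method1", "method2")):
    """Derive one study identifier per mutation source (assumed naming scheme)."""
    return {source: f"{base_study_id}_{source}" for source in sources}


def retarget_meta(meta_text, new_study_id):
    """Return meta-file text whose cancer_study_identifier points at new_study_id."""
    out = []
    for line in meta_text.splitlines():
        key, sep, _ = line.partition(":")
        if sep and key.strip() == "cancer_study_identifier":
            line = f"cancer_study_identifier: {new_study_id}"
        out.append(line)  # every other line is copied unchanged
    return "\n".join(out) + "\n"
```

Each study directory would then receive its own source-specific MAF plus retargeted copies of the shared clinical and expression files.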

Best,

Sjoerd.

Zachary Whitfield (WHITFIEZ)

Mar 18, 2021, 12:53:35 PM

to cBioPortal for Cancer Genomics Discussion Group

Thanks again, Sjoerd. I will try out both of your suggestions, but I think two individual studies is the best fit for our use case.