Programmatically loading data to mutation heatmap

Kamile Taouk

Jun 23, 2025, 7:57:45 PM
to ProteinPaint
Hi,

I'm currently trying to embed a ProteinPaint instance using this example as a reference.
I've managed to get it rendering correctly, and have substituted the `metadata`, `sampleannotation`, `samplegroup` and `genegroup` fields in the `heatmapJSON` param with my own data successfully. For the `studyview.mutationset` param, I'm trying to supply my own data files with a custom host defined, but the instance immediately crashes and complains about missing genome data.

So these are my questions:
  • What are the details/restrictions around overriding the default `https://proteinpaint.stjude.org` host?
  • Is it possible at all to load data files programmatically, rather than via the UI?
  • Is it possible to pass in the contents of those data files as JSON objects instead, in a similar manner to the `studyview` or `heatmapJSON` param?
  • If the above is not possible - are there other, more appropriate examples to use for the case of an OncoPrint-style heatmap for mutational data?
I've looked at the source code of many of the other examples within ProteinPaint; the mutation heatmap is by far the closest to our intended application.
I've also gone through almost every page of this documentation with no luck.
Finally, I've searched through the source code on GitHub, also to no avail.

Any help would be greatly appreciated!

Thanks,
Kam.



Edgar Sioson

Jun 26, 2025, 7:00:47 PM
to Kamile Taouk, ProteinPaint
Hi Kamile,

Thanks for using ProteinPaint. Please note that the mutation heatmap that you are using is an older version (the "sjcharts" app); it is still in use but has less support. Here are the answers to your questions:

- What are the details/restrictions around overriding the default `https://proteinpaint.stjude.org` host?

The ProteinPaint client code (browser javascript) expects a ProteinPaint server backend as "host", with a `/genomes` and possibly other data routes, depending on what is being visualized or other features that are launched from it, such as when clicking on a heatmap/matrix row label.
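
As a rough illustration of the point above, a custom host could be probed for the `/genomes` route before it is passed to runproteinpaint(). This is only a sketch: the route name comes from the reply, but the plain GET request, the use of the HTTP status as the success signal, and the helper names `genomesUrl` and `hostServesGenomes` are assumptions, not part of the ProteinPaint API.

```javascript
// Sketch only: the /genomes route name comes from the reply above; the plain
// GET request and the reliance on the HTTP status code are assumptions.

// Join a host and the /genomes route without doubling slashes.
function genomesUrl(host) {
  return host.replace(/\/+$/, '') + '/genomes';
}

// Resolve to true if the host answers the /genomes request with a 2xx status.
// `fetchImpl` is injectable so the check can be exercised without a network.
async function hostServesGenomes(host, fetchImpl = globalThis.fetch) {
  try {
    const res = await fetchImpl(genomesUrl(host));
    return Boolean(res && res.ok);
  } catch (err) {
    return false; // network error, CORS rejection, DNS failure, etc.
  }
}
```

A host that only serves static data files would be expected to fail this check, which would be consistent with the "missing genome data" error described in the question.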

- Is it possible at all to load data files programmatically, rather than via the UI?
There is an API to dynamically update a visualized mutation heatmap, but we only support it for St. Jude portals that are using this legacy feature, where runproteinpaint() would return a visualization instance with an update() method. However, please read through to the end for the recommended "MASS UI" approach.

- Is it possible to pass in the contents of those data files as JSON objects instead, in a similar manner to the `studyview` or `heatmapJSON` param?
If you're asking about reloading data as a JSON object after the runproteinpaint() call, then see my previous answer. If you are asking about loading data as JSON instead of tsv text (within the studyview option), that's not supported in the legacy code ("sjcharts" app).

- If the above is not possible - are there other, more appropriate examples to use for the case of an OncoPrint-style heatmap for mutational data?
There are 2 options:
A. If you are familiar with javascript code, you can make your own HTML to view your visualizations. You can use the legacy sjcharts by clearing the "holder" option and calling runproteinpaint() every time that you want to update it. So, you'd remove() the DOM element that holds the visualization before calling runproteinpaint(), then call it with the new data.
B. You can use the more recent "MASS UI" that has more features and support, but this will require restructuring and loading your data into database tables. We support a few St. Jude research groups this way, by helping them build databases with their own custom genetic and clinical data, and this consistent data structure is queried to dynamically filter and update visualizations.
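
Option A can be sketched roughly as follows. The `runproteinpaint()` call and its "holder" option come from the reply above; the helper name `remountHeatmap` and its `doc`, `runpp`, `parent`, `holderId` and `options` parameters are hypothetical, chosen so the logic can be run outside a browser. The sketch assumes the remaining options (host, `studyview`, `heatmapJSON`, and so on) are passed through unchanged.

```javascript
// Sketch only: `runpp` stands in for the global runproteinpaint() and `doc`
// for the browser `document`; all other names here are hypothetical.
function remountHeatmap({ doc, runpp, parent, holderId, options }) {
  // Remove the DOM element that currently holds the visualization, if any.
  const previous = doc.getElementById(holderId);
  if (previous) previous.remove();

  // Recreate an empty holder in the same place.
  const holder = doc.createElement('div');
  holder.id = holderId;
  parent.appendChild(holder);

  // Call runproteinpaint() again with the new data and the fresh holder.
  return runpp({ ...options, holder });
}
```

In a page that has loaded the ProteinPaint client script, this might be called as `remountHeatmap({ doc: document, runpp: runproteinpaint, parent: document.body, holderId: 'heatmap', options: newOptions })` each time the data changes, where `newOptions` is the same argument object that would otherwise be passed to runproteinpaint() directly.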

Regards,
Edgar


Kamile Taouk

Jun 26, 2025, 9:29:40 PM
to Edgar Sioson, ProteinPaint

Thanks for the quick reply Edgar!

 

I’m not at all opposed to using MASS UI, especially since our mutation data is already stored in a database and we’ve got plenty of experience with ETL processes.

Can I trouble you to point me to some resources/links for MASS UI? I wasn’t able to find anything in the GitHub documentation.

 

Thanks,
Kam.

Edgar Sioson

Jun 27, 2025, 1:47:06 AM
to Kamile Taouk, ProteinPaint
Hi,

Unfortunately, we keep the backend database ETL code in a private repo, since it still requires quite a bit of coordination to intake data from the St. Jude research groups that we support. Our team currently does not have the bandwidth to fully document and open-source the ETL code. For each project, there always seems to be an edge case that takes a bit of coding to support before the ETL can be automated.

Also note that only the clinical, demographic, and other precomputed data are captured in SQLite database tables for MASS UI. Genomic data are expected to be in bcf and similar bioinformatics file formats. So, the ProteinPaint backend code uses the SQLite db and/or standard bioinformatics tools, together with reference files (hg19, hg38, snp, etc.), to combine and generate data for visualizations. The only exception to these formatted data sources is when we are contracted to support an external portal that requires querying data from an HTTP API, but that requires coding custom queries.

If you think your organization is interested in licensing or contracts, please let us know. You can preview some capabilities of the MASS UI using our test dataset (note that a few errors are expected due to test-only data truncation). 

Thanks,
Edgar

Kamile Taouk

Jun 27, 2025, 2:16:42 AM
to Edgar Sioson, ProteinPaint

Thanks for that, Edgar.

 

I’m getting an ‘invalid session’ error with the MASS UI link you provided:

 

So to clarify:

  • If we do show interest in using MASS UI, what are the details surrounding our involvement in this project?
  • I can output our mutational data as csv (or equivalent) files – even so, is there no way to launch the `runproteinpaint()` instance with those files pre-loaded? This is the code I’m referring to specifically:

 

Appreciate the help so far!

Edgar Sioson

Jun 27, 2025, 2:54:32 AM
to Kamile Taouk, ProteinPaint
You can try this link instead: click on the CHARTS tab, then the Sample Matrix button, then the Matrix plot button.

I'll answer your other questions tomorrow, it's almost midnight here. 

Edgar

Kamile Taouk

Jun 27, 2025, 4:39:03 PM
to ProteinPaint
Thank you for the quick reply Edgar! I appreciate the help.

Could I trouble you for some links/references to MASS UI? I can't seem to find anything about it in the ProteinPaint documentation.

Thanks,
Kam.

Edgar Sioson

Jun 28, 2025, 1:17:34 PM
to Kamile Taouk, ProteinPaint
Hi Kamile,

Sorry for the delayed follow-up, I'm currently on vacation and not 100% online.

"If we do show interest in using MASS UI, what are the details surrounding our involvement in this project?"
Institutional customers usually arrange paid contracts with us to set up and support portals, including the MASS UI and possibly other built-in applications. In some cases we also set up dedicated machines to host the container, to segregate sensitive research data from other customer portals.

"I can output our mutational data as csv (or equivalent) files – even so, is there no way to launch the `runproteinpaint()` instance with those files pre-loaded? This is the code I’m referring to specifically:"
- These csv files, when used to render the legacy mutational landscape, are accessed by the ProteinPaint server/container using the "tpmasterdir" + dataset entries in the mounted "serverconfig.json". If you want to dynamically update a rendered visualization in the same DOM div, you can code a custom html + javascript page that will remove() the DOM element for the visualization, recreate it, and then call runproteinpaint() again with that DOM element as the "holder" argument.
- The more up-to-date replacement is the Sample Matrix application in the MASS UI, and an example json file can be found here, where the "termgroups" are used to query the sqlite and bcf data in the backend.

"Could I trouble you for some links/references to MASS UI? I can't seem to find anything about it in the ProteinPaint documentation."
- The MASS UI is not well documented for independent developer use. The applications and features are mostly demonstrated using our test data, and are used by multiple research portals in production environments (where we are contracted to provide support).
 
I realize I'm not able to point to clear documentation of the ETL and MASS UI. Per my previous reply, the ProteinPaint team currently does not have the bandwidth to fully document and support independent developer usage of ETL scripts and application APIs (such as for MASS UI). As it is, replying to emails and inquiries like these takes time.

However, if you want to get an idea of the processed/formatted data sources, please refer to the files in the test data folder, where you'd see the sqlite `db` file for clinical/demographic data, and `TermdbTest.bcf.gz`, `TermdbTest_CNV_gene.gz`, and other files for mutation, CNV, etc. These data sources are mounted to the ppfull or ppserver container and used for interactive cohort filtering and visualization updates.

Regards,
Edgar

Kamile Taouk

Jun 30, 2025, 8:29:00 PM
to Edgar Sioson, ProteinPaint

Thank you for your time Edgar, it’s much appreciated.

I’ll investigate further with my team.

 
