Advice for using vega for an interactive heatmap?

Matt Huyck

Nov 30, 2015, 1:31:30 PM

to vega-js

Hello all,

I am building an interactive heatmap tool to allow collaborators to explore how their gene expression data look when analyzed using a novel method created in our lab. I am an experienced developer but new to vega (fresh off the tutorial plus a bit of tinkering with some examples).

I was able to get a working prototype pulled together very quickly by adapting the heatmap example here:

http://vega.github.io/vega-editor/?mode=vega&spec=heatmap&renderer=svg

Kudos to the vega team for making it so easy to get started! I am scratching my head now at how I should go about adding the interactivity we envision.

The key difference between the abstract heatmap I'm building and the map of temperature over time in the example above is that neither axis has a natural order. The gene expression samples sit on the Y axis (where the example has hour of day), and the other axis holds "nodes" comprising the activity of various collections of genes (where the example has date). We would like a user to be able to choose from a list of "sorting" options that re-order the samples and nodes. If possible, we would like to do all of the visualization in the web browser.

For a gene expression heatmap, the typical sorting order is determined by hierarchical clustering (previous work for a manuscript was done using heatmap.2 from R's gplots http://cran.cnr.berkeley.edu/web/packages/gplots/index.html ). I have not yet seen anything done in vega that performs hierarchical clustering (please correct me if I'm wrong!), so it appears I would need to do this using an external JS library. The challenge then becomes how to (a) get vega to call out to this library or (b) write some code that hands off the clustered data to vega for rendering.
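To make the ordering step concrete, here is a minimal sketch in plain JavaScript of how a row order could be derived from hierarchical clustering. It is not heatmap.2's algorithm and uses no particular library: the function names (`euclidean`, `averageLinkage`, `clusterOrder`) are hypothetical, and Euclidean distance with average linkage is an assumed choice. It returns one valid dendrogram leaf order and is only practical for small matrices.

```javascript
// Euclidean distance between two equal-length numeric vectors.
function euclidean(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    var d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Average-linkage distance between two clusters, each given as an
// array of row indices into `rows`.
function averageLinkage(rows, c1, c2) {
  var total = 0;
  for (var i = 0; i < c1.length; i++) {
    for (var j = 0; j < c2.length; j++) {
      total += euclidean(rows[c1[i]], rows[c2[j]]);
    }
  }
  return total / (c1.length * c2.length);
}

// Naive agglomerative clustering. Returns the row indices in the order
// in which the leaves of the resulting dendrogram would be drawn.
function clusterOrder(rows) {
  // Start with one cluster per row; each cluster stores its leaves in order.
  var clusters = rows.map(function (_, i) { return [i]; });
  while (clusters.length > 1) {
    var best = { i: 0, j: 1, dist: Infinity };
    for (var i = 0; i < clusters.length; i++) {
      for (var j = i + 1; j < clusters.length; j++) {
        var dist = averageLinkage(rows, clusters[i], clusters[j]);
        if (dist < best.dist) best = { i: i, j: j, dist: dist };
      }
    }
    // Merge the closest pair; concatenation keeps each side's leaf order.
    var merged = clusters[best.i].concat(clusters[best.j]);
    clusters.splice(best.j, 1);          // remove j first, since j > i
    clusters.splice(best.i, 1, merged);  // replace i with the merged cluster
  }
  return clusters[0] || [];
}
```

Running the same function on the transposed matrix would give an order for the other axis. A real tool would more likely use an established clustering library and pick the distance and linkage to match the analysis.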

Approach (a) seems to be possible using a custom-built transform, if I am following the description here correctly:

https://github.com/vega/vega/wiki/Upgrading-data-transforms-to-2.0

though that page also refers to a plugin architecture in the works, which would seem to be closer to what I'm looking for.

Approach (b) would mean handling the data outside of vega initially, obtaining the clustering results and then invoking vega using the streaming API to insert or update the dataset with those results.
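A rough sketch of what approach (b) might look like follows. The helper names (`toHeatmapRecords`, `replaceDataset`), the record fields (`sample`, `node`, `x`, `y`, `value`), and the dataset name are all hypothetical. The hand-off assumes the Vega 2 streaming calls `view.data(name).remove(...)`, `.insert(...)`, and `view.update()` behave as described in the Vega wiki; that should be checked against the version in use.

```javascript
// Flatten a samples-by-nodes matrix into one record per cell. Each record
// carries its display position (x, y) taken from the chosen orderings,
// plus the original indices so labels can still be looked up.
function toHeatmapRecords(matrix, sampleOrder, nodeOrder) {
  var records = [];
  sampleOrder.forEach(function (sampleIdx, y) {
    nodeOrder.forEach(function (nodeIdx, x) {
      records.push({
        sample: sampleIdx,
        node: nodeIdx,
        x: x,
        y: y,
        value: matrix[sampleIdx][nodeIdx]
      });
    });
  });
  return records;
}

// Replace the contents of a named dataset on a Vega view and redraw.
// ASSUMPTION: `view` is a Vega 2 view whose streaming API supports
// chained remove(predicate) and insert(array), followed by view.update().
function replaceDataset(view, name, records) {
  view.data(name)
    .remove(function () { return true; }) // drop every existing record
    .insert(records);                     // add the re-ordered records
  view.update();
}
```

When the user picks a different sorting option, the page would recompute `sampleOrder` and `nodeOrder`, rebuild the records, and call `replaceDataset` again, with the spec's scales reading the `x` and `y` fields.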

Can anyone comment on whether this would be a good task for vega (our alternative is D3) and, if so, whether approach (a) or (b) -- or perhaps another way I haven't yet identified -- would be a good way to proceed?

Thank you in advance for any words of guidance you have to offer,

-Matt Huyck

P.S. Heatmaps seem like the most natural way to visualize these data, but we are also open to other methods if you have suggestions.

Roy I

Dec 7, 2015, 5:21:47 PM

to vega-js

Hi Matt,

Here is an example of a gene expression heat map built with d3.js. A basic heat map like this one can easily be built in Vega.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4023661/

(javascript with d3.js source code) https://zenodo.org/record/7706#.VmIXz9KrTmE

The latest Vega version (v2.4.0 / 2.4.1), just released, has a "treeify" transform and an example of a cluster dendrogram, but it does not accept a parameter for the height (length) of lines in the hierarchy.

https://github.com/vega/vega/release

https://github.com/vega/vega/wiki/Data-Transforms

The streaming API in this version of Vega does not support inserting or modifying data in tree format. You would have to process the data yourself and have Vega draw or re-draw the entire graph.

Other resources:

Clustering algorithms implemented in JavaScript:

https://code.google.com/p/figue/wiki/Introduction

http://harthur.github.io/clusterfck/

Example of an interactive app (this would be a challenging project for Vega):

http://www.broadinstitute.org/cancer/software/GENE-E/index.html
