How to fake both dimension and group in dc bar chart?

Anon

Nov 29, 2016, 6:30:04 AM
to dc-js user group
I have a dataset in JSON format like the one below.
[{
    "Userid": "276725",
    "ISBN": "034545104X",
    "Rating": "0"
}, {
    "Userid": "276726",
    "ISBN": "0155061224",
    "Rating": "5"
}, {
    "Userid": "276727",
    "ISBN": "0446520802",
    "Rating": "0"
},.......]

So multiple users give ratings to multiple books. I created a dimension on ISBN and a group whose reduce functions calculate the average rating for each book:

var dimisbn = crsfltrratings.dimension(function(d){ return d.ISBN; });
var averagerating = dimisbn.group().reduce(
  // add a record to the group
  function (p, v) {
    ++p.count;
    p.total += parseInt(v.Rating, 10);
    p.average = Math.floor(p.total / p.count);
    return p;
  },
  // remove a record from the group
  function (p, v) {
    --p.count;
    p.total -= parseInt(v.Rating, 10);
    // avoid NaN (0/0) when the last record is filtered out
    p.average = p.count ? Math.floor(p.total / p.count) : 0;
    return p;
  },
  // initial value
  function () {
    return { total: 0, count: 0, average: 0 };
  }
).all();

I want to reduce the data further, to the number of books for each average rating, say 1 to 10:
[{key:1,value:15000},{key:2, value:20000},....]

To achieve this, I used d3.nest to group the data by average:

var expenseMetrics = d3.nest()
  .key(function(d) { return d.value.average; })
  .rollup(function(v) { return v.length; })
  .entries(averagerating);

But I don't know how to plot this data with dc.js, because my dimension and group are entirely different. I want a bar chart with average ratings on the x-axis and the number of books for each rating on the y-axis. Any suggestions, please?

Gordon Woodhull

Nov 29, 2016, 8:30:51 AM
to dc.js user group
Hello,

The old double-reduce, I see...

It's important to remember what the dimension and group are used for. The rule that the chart should use a group from the same dimension mostly has to do with how you want things to filter.

Unless you consider data tables (which use dimension.top() or dimension.bottom() to fetch raw rows of data), the dimension is usually the "controller" and the group is usually the "model", and dc.js is the "view". 

So you're controlling what's shown by filtering on the dimension, and you're reading the data from the group. That's really all there is to dc.js; otherwise it's just a "d3.js cookbook".

So I'd suggest wrapping your nest in a fake group, so that it's run dynamically when other dimensions are filtered:

var averageRatingGroup = {
  all: function() {
    var expenseMetrics = d3.nest()
      .key(function(d) { return d.value.average; })
      .rollup(function(v) { return v.length; })
      .entries(averagerating.all()); 
    return expenseMetrics;
  }
};

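This fetches the reduced averagerating group data each time it's called, and nests it. Note that averagerating here has to be the group object itself, so drop the trailing .all() from your group definition.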
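
For anyone without d3.nest available (it was removed in d3 v6), the same fake group can be sketched in plain JavaScript. This is only a sketch under assumptions: makeAverageCountGroup is a hypothetical helper, and stubGroup with its data is made up to stand in for the real reduced group.

```javascript
// Hypothetical helper: wraps anything with an .all() method that returns
// {key, value: {count, average}} entries, and re-buckets by average on each call.
function makeAverageCountGroup(sourceGroup) {
  return {
    all: function () {
      var counts = {};
      sourceGroup.all().forEach(function (kv) {
        // Skip books that have no ratings left under the current filters.
        if (kv.value.count === 0) { return; }
        var avg = kv.value.average;
        counts[avg] = (counts[avg] || 0) + 1;
      });
      return Object.keys(counts)
        .map(function (k) { return { key: +k, value: counts[k] }; })
        .sort(function (a, b) { return a.key - b.key; });
    }
  };
}

// Stub standing in for the reduced averagerating group (made-up data).
var stubGroup = {
  all: function () {
    return [
      { key: '034545104X', value: { total: 0, count: 2, average: 0 } },
      { key: '0155061224', value: { total: 10, count: 2, average: 5 } },
      { key: '0446520802', value: { total: 5, count: 1, average: 5 } }
    ];
  }
};

var averageCounts = makeAverageCountGroup(stubGroup).all();
```

Unlike d3.nest, which returns string keys, this version returns numeric keys, which tends to be more convenient with a linear x scale.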

We have two options for faking the dimension. Since all charts except the data table only use the dimension for filtering, you could just fake your dimension as {} and specify a filterHandler to your chart. (See e.g. [1] for an example.) Or you could fake the appropriate filter function(s) of the dimension. 

The second approach is a tiny bit longer but I think it's clearer, so let's do that here.

For the fake dimension, we need to figure out how to filter the books by average rating. Assuming that we're not using dimisbn for anything else, we can create a fake dimension that filters on a continuous range of ratings and applies an item-by-item filter to dimisbn:

var averageRatingDimension = {
  filter: function(f) { // #1
    if(f === null)
      dimisbn.filter(null);
    else throw new Error("uh oh don't know what to do here");
  },
  filterRange: function(r) { // #2
    var isbns = {}; // #3
    averagerating.all().forEach(function(kv) { // #4
      if(r[0] <= kv.value.average && kv.value.average < r[1]) { // #5
        isbns[kv.key] = true;
      }
    });
    dimisbn.filterFunction(function(isbn) { // #6 (called with the dimension's value, i.e. the ISBN)
      return !!isbns[isbn];
    });
  }
};

This assumes you're using dc.js >= beta.19, which includes Ethan's optimizations to call the more efficient filterRange when filtering a range. (With earlier versions, you'd be forced to use filterHandler.)

It also assumes you're using a continuous/quantitative scale for your chart, which will enable range filtering. I just think that's more appropriate. If you want an ordinal scale, I think you're also back to the filterHandler approach, but otherwise it's pretty similar.
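
A rough sketch of what the filterHandler route might look like for an ordinal scale, in case it helps. It assumes dc.js calls the handler as handler(dimension, filters) and expects the filters array back, and that for an ordinal chart the filters are the selected keys. makeAverageFilterHandler and the stubs are hypothetical names for illustration; the result would be passed to the chart's .filterHandler().

```javascript
// Hypothetical factory for a dc.js filter handler that ignores the (fake)
// dimension it is given and filters the real ISBN dimension instead.
function makeAverageFilterHandler(realDimension, sourceGroup) {
  return function (fakeDimension, filters) {
    if (filters.length === 0) {
      realDimension.filter(null); // nothing selected: clear the real filter
      return filters;
    }
    // Compare as strings, since nested keys may be strings.
    var wanted = {};
    filters.forEach(function (f) { wanted[String(f)] = true; });
    var isbns = {};
    sourceGroup.all().forEach(function (kv) {
      if (wanted[String(kv.value.average)]) { isbns[kv.key] = true; }
    });
    // crossfilter calls this with the dimension's value (the ISBN).
    realDimension.filterFunction(function (isbn) { return !!isbns[isbn]; });
    return filters;
  };
}

// Stubs (made-up data) that record what the handler does.
var handlerCalls = [];
var stubIsbnDimension = {
  filter: function (v) { handlerCalls.push(['filter', v]); },
  filterFunction: function (fn) { handlerCalls.push(['filterFunction', fn]); }
};
var stubAverages = {
  all: function () {
    return [
      { key: 'A', value: { average: 5 } },
      { key: 'B', value: { average: 7 } }
    ];
  }
};

var handler = makeAverageFilterHandler(stubIsbnDimension, stubAverages);
var returnedFilters = handler({}, ['5']); // select the "5" bar
handler({}, []);                          // clear the selection
```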

Detailed explanation:
1. We expect dc.js to either call .filter(null) or .filterRange(range) - we reset on .filter(null) and barf if we get anything else.
2. .filterRange() will take an array of [low, high) bounds
3. we'll build a hash which maps ISBNs to boolean
4. we'll loop over all the reduced averages and mark ISBNs if they match the range
5. by convention, the range does not include the high value - this is usually the expected behavior for a brush, but you can adjust to taste
6. replace the dimisbn filter with one that only accepts the ISBNs marked with true
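
To make the half-open range in step 5 concrete, here is a small self-contained run of the same logic against stubs. This is a sketch, not tested against dc.js itself: makeAverageRatingDimension is a hypothetical factory version of the fake dimension, and the stub dimension, stub group and their data are invented for the illustration.

```javascript
// Hypothetical factory form of the fake dimension, so it can be exercised
// without crossfilter: it only needs objects with the methods used below.
function makeAverageRatingDimension(realDimension, sourceGroup) {
  return {
    filter: function (f) {
      if (f === null) {
        realDimension.filter(null);
      } else {
        throw new Error('only filter(null) is supported');
      }
    },
    filterRange: function (r) {
      var isbns = {};
      sourceGroup.all().forEach(function (kv) {
        // low bound included, high bound excluded
        if (r[0] <= kv.value.average && kv.value.average < r[1]) {
          isbns[kv.key] = true;
        }
      });
      realDimension.filterFunction(function (isbn) { return !!isbns[isbn]; });
    }
  };
}

// Stubs that record whatever filter the fake dimension installs.
var recordedFilters = [];
var stubDimisbn = {
  filter: function (v) { recordedFilters.push(v); },
  filterFunction: function (fn) { recordedFilters.push(fn); }
};
var stubAveragerating = {
  all: function () {
    return [
      { key: 'A', value: { average: 4 } },
      { key: 'B', value: { average: 5 } },
      { key: 'C', value: { average: 7 } }
    ];
  }
};

var fakeDimension = makeAverageRatingDimension(stubDimisbn, stubAveragerating);
fakeDimension.filterRange([5, 7]);      // brush from 5 up to (not including) 7
var accepts = recordedFilters[0];
var brushed = ['A', 'B', 'C'].filter(function (isbn) { return accepts(isbn); });
fakeDimension.filter(null);             // reset
```

With the brush at [5, 7), only book B (average 5) passes; book C (average 7) sits on the excluded upper bound.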

NOTE: this code is untested. I may have made some mistakes. 

If it doesn't work or if you need further clarification, please build a jsFiddle with example data [2] and charts so that I can help troubleshoot. (bl.ocks are also welcome.) See [3] for fiddles and blocks you can fork to get started with the right dependencies.

Cheers,
Gordon



--
You received this message because you are subscribed to the Google Groups "dc-js user group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dc-js-user-gro...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dc-js-user-group/bd58a5ab-8587-4dfa-b35b-eb82ea449169%40googlegroups.com.

Shea Parkes

Jan 7, 2019, 4:54:11 PM
to dc-js user group
I know this is an older email chain, but I found it very helpful. I first wanted to say: thanks for authoring and maintaining such a useful library. And thanks for keeping the bindings to crossfilter clean enough that we can shim between the libraries with these "fake" groups and dimensions.

We adapted your suggestions above and they worked great.

One item did surprise us though: the closure in #6 (i.e. the closure passed to dimisbn.filterFunction) was updated even when new filters were applied to other charts (e.g. if ISBNs are books as above, filtering to just Fiction or NonFiction in a separate pie chart). We had been worried we'd have to force a recalculation of the closure when filters were applied to other charts.

I wanted to know why that happened, so we did some debugging. When the chart group is redrawn (after any chart is filtered), a chart that uses these fake groups/dimensions is redrawn too (by being in the same chart group), and the redrawing of its brush triggers a re-evaluation of the filters. I worry that this wasn't actually by design, and that it was left that way because in a standard range filter scenario it is practically free (performance-wise) to re-apply the same range filter values over themselves. We are currently trusting that mechanism to refresh the closure, but future users of this logic should beware. This was as of dc.js v3.0.9.

I know there are examples of fake groups in the documentation; if I get some time I'll see about tossing in an example of a fake dimension like the one above as well. (This post was rather hard to find, and we'd done something pretty stupid to get similar functionality before we found it.)

Thanks again!

Gordon Woodhull

Jan 8, 2019, 2:22:59 PM
to dc-js user group
Hi Shea,

You are very welcome!

Thanks for bringing this up. It could cause performance problems, and should be fixed.

It is a consequence of our move to d3v4, which makes it harder to apply a brushing range without receiving a callback about it.

We attempt to detect the reason for a callback in coordinateGridMixin._brushing:

if (d3.event.sourceEvent.type && ['start', 'brush', 'end'].indexOf(d3.event.sourceEvent.type) !== -1) {
    return;
}


but this only catches the recursive case (where the chart is redrawing because the brush changed), not the case where another chart has updated and this causes a redraw of the brush.

You are right that if it's not a regular crossfilter dimension, as with the "fake dimension" here, it may be pretty expensive. (dimension.filterFunction is very expensive, checking every row of the data.)


As a stopgap, you could check if the range has changed. But we definitely should figure out a better way to detect if the _brushing event is meaningful.
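
A minimal sketch of that stopgap, assuming the fake dimension's filterRange is wrapped so that the expensive work only runs when the bounds actually differ from the last call. skipUnchangedRange is a hypothetical helper, not part of dc.js or crossfilter.

```javascript
// Hypothetical wrapper: remembers the last range and skips the expensive
// callback when the same [low, high) bounds come in again.
function skipUnchangedRange(applyRange) {
  var last = null;
  function guarded(r) {
    // Unary plus makes the comparison work for numbers and Dates alike.
    if (last && +last[0] === +r[0] && +last[1] === +r[1]) {
      return false; // unchanged: nothing applied
    }
    last = [r[0], r[1]];
    applyRange(r);
    return true;
  }
  // Call this from filter(null), so that re-brushing the same range
  // after a reset is not mistaken for "unchanged".
  guarded.reset = function () { last = null; };
  return guarded;
}

var expensiveCalls = 0;
var guardedFilterRange = skipUnchangedRange(function (r) {
  expensiveCalls += 1; // stands in for rebuilding the ISBN set and filterFunction
});

var first = guardedFilterRange([3, 6]);   // new range: applied
var second = guardedFilterRange([3, 6]);  // same range again: skipped
var third = guardedFilterRange([3, 7]);   // upper bound moved: applied
guardedFilterRange.reset();
var fourth = guardedFilterRange([3, 7]);  // applied again after a reset
```

One caveat with this fake dimension: skipping an unchanged range also means the ISBN set is not rebuilt when filters on other charts change the averages, so the guard trades freshness for speed.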

Cheers,
Gordon

Isamoor

Jan 8, 2019, 3:49:09 PM
to dc-js-us...@googlegroups.com
Thank you for the swift reply, and for opening a related issue.

I'm replying here (instead of on the issue) because this follow-up question/comment is specific to this use case (and not to the bug documented in the issue).

In the ISBN example, if another filter changed the average rating by user, then we do actually need to redo the filterFunction. This means the current brush redraw bug is a boon at the moment (but please do fix it as time allows; we don't want to rely on a bug to support our use case).

However, it would be good to have a future-proof way of handling this. My first thought would be to use the "preRedraw" hook on the chart. However, that would likely cause some stuttery animation on the other charts if the filterFunction was updated and then triggered another redrawAll. The next option I considered is to put a callback on the "filtered" event of every other chart. That'd be a bit annoying/fragile to keep accurate as you add charts to and remove charts from your solution, but it would likely at least be accurate (and timely enough to get the animation transitions correct). A crossfilter onChange callback would also work, but I'm curious how that would interact with the animations as well.

If you have a moment, Gordon, would you be able to opine on the ~best way to smoothly trigger re-evaluation of that closure? No worries if you're too busy; we can work through some of the ideas above.
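
For reference, the "filtered" callback idea could be sketched roughly as below. The assumptions: each chart exposes dc.js's .on(event, listener) and accepts a namespaced event name such as 'filtered.refreshFake'; refreshOnOtherFilters and the stub charts are hypothetical. How this behaves with transitions is exactly the open question, so treat it as a starting point only.

```javascript
// Hypothetical helper: ask every other chart to call refresh() whenever
// its own filter changes, so the fake dimension can rebuild its closure.
function refreshOnOtherFilters(otherCharts, refresh) {
  otherCharts.forEach(function (chart) {
    // The ".refreshFake" namespace is meant to avoid clobbering other listeners.
    chart.on('filtered.refreshFake', function () { refresh(); });
  });
}

// Stub chart: just enough of the .on() interface to run the sketch.
function makeStubChart() {
  var handlers = {};
  return {
    on: function (name, fn) { handlers[name] = fn; return this; },
    fire: function (name) { handlers[name](this); }
  };
}

var pieStub = makeStubChart();
var rowStub = makeStubChart();
var refreshCount = 0;

refreshOnOtherFilters([pieStub, rowStub], function () {
  refreshCount += 1; // stands in for re-running filterRange with the current brush
});

pieStub.fire('filtered.refreshFake'); // user filters the pie chart
rowStub.fire('filtered.refreshFake'); // user filters the row chart
```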
