Aggregation functions with multiple columns

2,073 views
Skip to first unread message

NA

unread,
Jun 22, 2011, 1:34:35 PM6/22/11
to Google Visualization API
Hi,

Is there a clean approach to allowing DataTable aggregation functions
to accept multiple columns? For example, if a column's aggregation
function is a weighted average of its values weighted by another
column's values, I'd want to pass both columns to my aggregation
function. Is there an easy way to do this? Weighted averages are
extremely common to compute in many real life applications, but I
didn't see a straightforward way of doing this with the current
DataTable implementation. (Making a calculated column and then
averaging that might work for this example, but creating new views
with calculated columns just to compute an aggregation for another
already existing column is a lot of extra work, and can become
cumbersome and errorprone for more complex calculations).

I've been working around this limitation by using currying and
closures to define aggregation functions that use two or more columns,
but present an interface to the DataTable that uses one column. I had
to extend the Function class to permit currying, which doesn't feel
good. Perhaps there's an easier method in js, but this is the only
way that came to mind.

So my questions:

- is there a clean Javascript method of effectively implementing an
aggregation function that accepts multiple rows?
- is there any reason that the aggregation methods don't look at the
column value and, if it's a scalar, do the current implementation, but
if it's an array, pass all of those columns to the aggregation
function?

Sort of the same question ;) I'll put in an enhancement request if
there are no good suggestions, but I want to hear from more
experienced js coders if there's a more elegant solution that I'm
missing.

thanks,

Viz Kid

unread,
Jun 24, 2011, 9:33:00 AM6/24/11
to google-visua...@googlegroups.com

It seems that using a DataView would be the simplest solution and should not require a lot of extra work. I'm not sure I understood your second question as the value in each cell of the table is always one of the basic types we support and it cannot be an array of them.

Best,
  Viz Kid


--
You received this message because you are subscribed to the Google Groups "Google Visualization API" group.
To post to this group, send email to google-visua...@googlegroups.com.
To unsubscribe from this group, send email to google-visualizati...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-visualization-api?hl=en.


NA

unread,
Jun 24, 2011, 9:58:19 PM6/24/11
to Google Visualization API
So can you present an example using DataView? I can't see a
straightforward way to do this, but I'll give you the benefit of the
doubt. Show me how you'd do the following:

table has these columns:

0 1 2 3 4 5 6 7
id, sector, price, shares, weight, f1, f2, f3

I want to aggregate this by sector. The aggregation functions for
price, f1, f2, and f3 is a weighted average. The aggregation for
shares and weight is a sum. The aggregation for id and sector is to
return the string "Many" if there are multiple values in that column,
or if all the entries are the same return that value.

Such aggregation functions might look like:

function WeightedAverage(q,w) {
var wsum = 0;
for (i=0;i<w.length;i++) {wsum+= w[i]*q[i];}
return wsum;
}

function AllSameOrMany(c) {
var r = c[0];
for (var i=0;i<c.length;i++) {if (r !=c[i]){return 'Many';}};
return r;
}

Note that the WeightedAverage function is a general function that
doesn't require the weight to always be in column 4. It also doesn't
know what Tables are. It's used in many places; its existence
predates the google visualization API. This is important for
reusability, maintainability, and interoperability across many
libraries.

What I'd like to do is;

bySector = new google.visualization.data.group(table,[1],
[ {column:0, aggregation:AllSameOrMany, type:'string'}
,{column:[2,4],aggregation:WeightedAverage,type:'number'}
,{column:3, aggregation:google.visualization.data.sum,type:'number'}
,{column:4, aggregation:google.visualization.data.sum,type:'number'}
,{column:[5,4], aggregation:WeightedAverage,type:'number'}
,{column:[6,4], aggregation:WeightedAverage,type:'number'}
,{column:[7,4], aggregation:WeightedAverage,type:'number'}
]
);

Since this syntax doesn't exist, can you show me how to do this with
google.visualization.data.group or with DataViews, without having to
create special versions of my aggregation functions?

In my case, I used currying to wrap functions like WeightedAverage in
a way to accommodate the grouping and DataView APIs.

But I'd like to learn a cleaner way of doing this.

thanks,

NA

unread,
Jun 28, 2011, 10:04:43 PM6/28/11
to Google Visualization API
VIz Kid, can you post that example?

Viz Kid

unread,
Jun 29, 2011, 3:43:15 AM6/29/11
to google-visua...@googlegroups.com

Hi

I stated that this can be done using a DataView, but I did not say that it would be as clean as you would like, especially satisfying your request to use the general existing weighted average function as is. If the WeightedAverage function is indeed what you wrote (I did not see any normalization there so I wasn't sure), you can indeed get the desired outcome using first a DataView to create the weighted columns [2,5,6,7] (by multiplying their value by the weight) and then applying the group call as you did it where replacing the WeightedAverage with the sum function.

I agree that it would be more natural to have the syntax as you wrote it available for this use case but currently it simply does not exist.

Here is a snippet of the code:

view = new google.visualization.DataView(table);
view.setColumns([
  0,
  1,
  {calc: weightedColumn(2, 4), type: 'number'}, 
  3,
  4,
  {calc: weightedColumn(5, 4), type: 'number'}, 
  {calc: weightedColumn(6, 4), type: 'number'}, 
  {calc: weightedColumn(7, 4), type: 'number'}]);

bySector = new google.visualization.data.group(view, [1],

 [ {column:0, aggregation:AllSameOrMany, type:'string'}
  ,{column:2, aggregation:google.visualization.data.sum,type:'number'}

  ,{column:3, aggregation:google.visualization.data.sum,type:'number'}
  ,{column:4, aggregation:google.visualization.data.sum,type:'number'}
  ,{column:5, aggregation:google.visualization.data.sum,type:'number'}
  ,{column:6, aggregation:google.visualization.data.sum,type:'number'}
  ,{column:7, aggregation:google.visualization.data.sum,type:'number'}
 ]
);

function weightedColumn(dataColumnIndex, wightsColumnIndex) {
  return function(dataTable, rowNum) {
    return dataTable.getValue(rowNum, dataColumnIndex) * dataTable.getValue(rowNum, weightsColumnIndex);
  }
}

Best,
  Viz Kid

NA

unread,
Jun 29, 2011, 1:46:46 PM6/29/11
to Google Visualization API
Ok, ty for the example. It's what I had in mind.

I'm going to enter an enhancement request for the syntax I described.

I've personally been part of 3 different datatable implementations in
2 different languages, and we found that supporting this syntax allows
one to do much more sophisticated aggregations. In certain
application areas, such as finance, these types of aggregations are
critical. Google's is the first I see in Javascript. It's nice, a
little different. But I think it would benefit from some of the
experience of other table implementations.

Thanks for taking the time to post the example, I appreciate it.

NA

unread,
Jun 29, 2011, 1:50:38 PM6/29/11
to Google Visualization API
Enhancement request here: http://code.google.com/p/google-visualization-api-issues/issues/detail?id=631

If you found this thread by searching, please vote for this issue to
help raise its priority.


Michelle Stewart

unread,
Jan 8, 2013, 11:28:50 AM1/8/13
to google-visua...@googlegroups.com
I am doing something very similar so I was using this code, but apparently I am missing something obvious because my view isn't working.

I am using arrayToDataTable because of the data I have, but 
<code>
var non_grouped = google.visualization.arrayToDataTable(oeeChartsData);
var view = google.visualization.DataView(non_grouped);
/* Non_grouped Columns
0 - Date, 1 - Shift, 2-AAA, 3- BBB
4 - Downtime, 5 - Scheduled, 6 - Uptime, 
7 - Target, 8- Scrapped, 9- Made
*/
view.setColumns([0,1,2,3,
{calc: getAvail(4,6), type: 'number'},
{calc: getPerf(9,7,6), type: 'number'},
{calc: getQual(9, 8), type: 'number'},
{calc: getOEE(4,7,6,9,8), type: 'number'}]);
</code>
errors on the setColumns line because the view never got created!  All my functions work, any ideas what could be wrong?
non_grouped is a dataTable just fine.  

Thanks in advance,
Michelle

Best,
  Viz Kid

To unsubscribe from this group, send email to google-visualization-api+unsub...@googlegroups.com.

Michelle Stewart

unread,
Jan 8, 2013, 11:33:40 AM1/8/13
to google-visua...@googlegroups.com
NEW it works with var view = new google.visualization.DataView(non_grouped).  I knew it was something obvious!  Thanks

asgallant

unread,
Jan 8, 2013, 11:35:44 AM1/8/13
to google-visua...@googlegroups.com
What is the exact error being thrown here?


On Tuesday, January 8, 2013 11:28:50 AM UTC-5, Michelle Stewart wrote:

Michelle Stewart

unread,
Jan 8, 2013, 11:42:27 AM1/8/13
to google-visua...@googlegroups.com
I got that solved.  I just forgot the new.  It was something to the effect of "Cannot call setColumns on undefined".

I do need help however on figuring out why it seems the values being used for the calculations are the column numbers instead of the column values.

var view = new google.visualization.DataView(non_grouped);
/* Non_grouped Columns
0 - Date, 1 - Shift, 2-Press, 3- Mold,
4 - Downtime, 5 - Scheduled, 6 - Uptime, 
7 - Target, 8- Scrapped, 9- Made
*/
view.setColumns([0,1,2,3,
{calc: getAvail(4,6), type: 'number'},
{calc: getPerf(9,7,6), type: 'number'},
{calc: getQual(9, 8), type: 'number'},
{calc: getOEE(4,7,6,9,8), type: 'number'}]);
/*  View Columns
0 - Date, , 1 - Shift, 2 - Press, 3 - Mold
4 - Availability, 5 - Performance, 6 -  Quality, 7 - OEE
*/
showThis = new google.visualization.data.group(view, [0],
[{column: 4, aggregation: google.visualization.data.avg, type: 'number'},
{column: 5, aggregation: google.visualization.data.avg, type: 'number'},
{column: 6, aggregation: google.visualization.data.avg, type: 'number'},
{column: 7, aggregation: google.visualization.data.avg, type: 'number'}]);
    var chart = new google.visualization.LineChart(document.getElementById('chart-oee'));
    chart.draw(showThis, options);
   $(curTab).show();
}
});
}
function getAvail(d,u){
return u/(u+d);
}
function getPerf(p,t,u){
return (p*t)/u;
}
function getQual(p, s){
return (p-s)/p;
}
function getOEE(d,t,u,p,s){
var avail = getAvail(d,u);
var perf = getPerf(p,t,u);
var qual = getQual(p,s);
return avail*perf*qual;

asgallant

unread,
Jan 8, 2013, 12:08:41 PM1/8/13
to google-visua...@googlegroups.com
The "calc" parameter expects a function, and you are giving it the return value of a function.  So when you have "calc: getAvail(4,6)", it is equivalent to having "calc: 0.6".  Looking at your set up, it seems that what you want is this:

calc: function (dt, row) {
    return getAvail(dt.getValue(row, 4), dt.getValue(row, 6));

Michelle Stewart

unread,
Jan 8, 2013, 12:59:43 PM1/8/13
to google-visua...@googlegroups.com
That works beautifully!  Thank you!

asgallant

unread,
Jan 8, 2013, 2:06:28 PM1/8/13
to google-visua...@googlegroups.com
You're welcome.
Reply all
Reply to author
Forward
0 new messages