As you might know, I'm working on a secondary indexing module.
It allows you to create indexes that can either be written to directly as a data type, or automatically index HASH objects as if they were table rows, and then retrieve them based on a WHERE clause.
I've just added an aggregation engine to this module. The idea is that if you have inline data in the index (say you index name, age, and height: they are all tightly packed into the index and can be retrieved efficiently), you can also run aggregate queries on it.
So for example with the index above, I can get a histogram of people's heights, or average height by age, etc.
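To make those two queries concrete, here is a rough sketch in plain Python of what "average height by age" and a height histogram compute. It is not the module's code or API: the rows, the function names, and the 10 cm bucket width are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for the (name, age, height) tuples packed in the index.
rows = [
    ("alice", 30, 165),
    ("bob", 30, 175),
    ("carol", 41, 160),
    ("dave", 41, 180),
    ("erin", 41, 170),
]


def average_height_by_age(rows):
    """Group rows by age and return {age: mean height}."""
    totals = defaultdict(lambda: [0, 0])  # age -> [sum of heights, row count]
    for _name, age, height in rows:
        totals[age][0] += height
        totals[age][1] += 1
    return {age: total / count for age, (total, count) in totals.items()}


def height_histogram(rows, bucket_cm=10):
    """Count rows per height bucket; each key is the lower bound of its bucket."""
    counts = defaultdict(int)
    for _name, _age, height in rows:
        counts[(height // bucket_cm) * bucket_cm] += 1
    return dict(counts)


print(average_height_by_age(rows))  # {30: 170.0, 41: 170.0}
print(height_histogram(rows))       # {160: 2, 170: 2, 180: 1}
```

Both functions make a single pass over the rows, which is the access pattern that tightly packed inline data is good for.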
Here is a demo of it running with the secondary index module:
Right now it's still very basic, and supports 3 functions - average of a series of numbers, sum of a series, and count_distinct of a series of strings or numbers. But I'll add many more functions soon.
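For anyone wondering what those three functions boil down to, here is a minimal sketch of them as one-pass accumulators. This is my illustration only: the class names and the add/result interface are assumptions, not the engine's actual interface.

```python
class Average:
    """Running mean of a numeric series."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def result(self):
        # An empty series has no mean; return None instead of dividing by zero.
        return self.total / self.count if self.count else None


class Sum:
    """Running sum of a numeric series."""

    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value

    def result(self):
        return self.total


class CountDistinct:
    """Exact distinct count of strings or numbers, kept in a set."""

    def __init__(self):
        self.seen = set()

    def add(self, value):
        self.seen.add(value)

    def result(self):
        return len(self.seen)


def aggregate(accumulator, series):
    """Feed every value of the series to the accumulator, then return its result."""
    for value in series:
        accumulator.add(value)
    return accumulator.result()


print(aggregate(Average(), [160, 170, 180]))             # 170.0
print(aggregate(Sum(), [1, 2, 3]))                       # 6
print(aggregate(CountDistinct(), ["ann", "bob", "ann"]))  # 2
```

Note that the set-based count_distinct is exact and its memory grows with the number of distinct values.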
The important thing, which might interest members of this group, is that I'm going to decouple the engine from the module so that it can be used in any module to aggregate any data series.
So it can be used for the Graph module, the Time Series module, the Search module, etc. It includes a basic type system, a framework to add functions, and a parser that creates aggregation pipelines from a simple grammar.
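To give a feel for how those three pieces could fit together, here is a toy sketch: a registry for adding functions, a per-function list of accepted value types, and a parser for a one-call grammar of the form func(field). Everything here is hypothetical, including the grammar, the function names, and the helpers; the real engine's grammar and type system may look quite different.

```python
import re

# Hypothetical registry: function name -> (accepted value types, implementation).
REGISTRY = {}


def register(name, accepts):
    """Decorator that adds an aggregation function to the registry."""
    def wrap(fn):
        REGISTRY[name] = (accepts, fn)
        return fn
    return wrap


@register("sum", accepts=(int, float))
def _sum(series):
    return sum(series)


@register("avg", accepts=(int, float))
def _avg(series):
    return sum(series) / len(series)


@register("count_distinct", accepts=(int, float, str))
def _count_distinct(series):
    return len(set(series))


# Toy grammar: exactly one call, written as func(field).
CALL = re.compile(r"^\s*(\w+)\s*\(\s*(\w+)\s*\)\s*$")


def parse(expr):
    """Parse 'func(field)' into (function name, field name)."""
    match = CALL.match(expr)
    if match is None:
        raise ValueError(f"cannot parse {expr!r}")
    name, field = match.groups()
    if name not in REGISTRY:
        raise ValueError(f"unknown function {name!r}")
    return name, field


def run(expr, columns):
    """Type-check the named column against the function, then apply it."""
    name, field = parse(expr)
    accepts, fn = REGISTRY[name]
    series = columns[field]
    for value in series:
        if not isinstance(value, accepts):
            raise TypeError(f"{name} does not accept {type(value).__name__} values")
    return fn(series)


# Hypothetical columns, standing in for any module's data series.
columns = {"name": ["ann", "bob", "ann"], "height": [160, 170, 180]}
print(run("avg(height)", columns))           # 170.0
print(run("count_distinct(name)", columns))  # 2
```

Because run only needs a mapping from field names to series, the same sketch works regardless of which module the series came from, which is the point of decoupling the engine.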
I plan on opening the source of the library and module next week.