Generic arrayReduce or fold


Tobia

Mar 8, 2017, 4:26:43 AM
to ClickHouse
Hello

I was looking through the extensive set of array manipulation functions, but I couldn't find a generic reduce or fold combinator.

There is arrayReduce(), which is briefly mentioned under arrayUniq(). It's not properly documented, but it seems to accept only one of the predefined aggregate functions, not a lambda.

I thought I could use something like this:

  arrayReduce(func, initialValue, arr1, ...)

where the lambda would accept the current accumulator, the array elements, and return the new accumulator value:

  (currentValue, elem1, ...) -> newValue
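To make the intended semantics concrete, here is a minimal sketch in Python rather than ClickHouse syntax. The name array_fold and its argument order simply mirror the hypothetical call above; nothing here is an existing ClickHouse function.

```python
from functools import reduce

def array_fold(func, initial, *arrays):
    """Hypothetical fold: func(acc, elem1, ...) -> new acc, over parallel arrays."""
    if len({len(a) for a in arrays}) > 1:
        raise ValueError("all arrays must have the same length")
    # zip(*arrays) yields one tuple of elements per array position
    return reduce(lambda acc, elems: func(acc, *elems), zip(*arrays), initial)

# Weighted sum over two parallel arrays: 1*10 + 2*20 + 3*30
weighted = array_fold(lambda acc, x, w: acc + x * w, 0, [1, 2, 3], [10, 20, 30])
print(weighted)  # 140
```

With a single array this reduces to an ordinary left fold, and an empty array returns the initial value unchanged.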

Does it exist? This would be very useful for writing all sorts of algorithms.

For example, I just wrote a query to count the number of days in an array of date intervals (beginDate, endDate) without counting duplicate days when some intervals are overlapping. This would have been much easier (and probably faster) if I could use a fold combinator, because I would be able to keep an accumulator tuple with (running total of days, maximum date seen so far).
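As a rough illustration of that accumulator, the sketch below folds over the intervals in Python. It assumes inclusive (beginDate, endDate) pairs and sorts them by begin date first, which is what makes "maximum date seen so far" sufficient. The function name count_covered_days is made up for this example.

```python
from datetime import date, timedelta
from functools import reduce

def count_covered_days(intervals):
    """Count distinct days covered by inclusive (begin, end) date intervals."""
    def step(acc, interval):
        total, max_seen = acc
        begin, end = interval
        # Only days after the latest date already counted are new
        if max_seen is not None and begin <= max_seen:
            begin = max_seen + timedelta(days=1)
        if end >= begin:
            total += (end - begin).days + 1
            max_seen = end
        return total, max_seen

    # Accumulator tuple: (running total of days, maximum date seen so far)
    total, _ = reduce(step, sorted(intervals), (0, None))
    return total

days = count_covered_days([
    (date(2017, 1, 3), date(2017, 1, 10)),
    (date(2017, 1, 1), date(2017, 1, 5)),    # overlaps the first interval
    (date(2017, 1, 20), date(2017, 1, 21)),
])
print(days)  # 12
```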

As another example, how would you compute the product of an array of numbers? This would be trivial with fold or reduce:

  arrayReduce((acc, x) -> acc * x, 1, numberArray)


-Tobia

man...@gmail.com

Oct 23, 2017, 1:56:45 PM
to ClickHouse
Hello.

We initially intended to implement it, but it turned out to be difficult in a vectorized execution engine.

Each "ordinary" function in ClickHouse processes data not as single values but as arrays.
All functions are called internally with batches of values, and calling a function on a single value at a time would be very inefficient.
But to implement reduce, you pass a lambda that, by its nature, accumulates by processing single values.
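The contrast can be sketched in Python as a toy model, not as ClickHouse internals. Both function names below are invented for illustration: one stands in for an "ordinary" function that handles a whole batch per call, the other for a fold whose lambda has to run once per element because each step needs the previous accumulator.

```python
def add_one_batch(values):
    """Stand-in for an 'ordinary' function: one call processes a whole batch."""
    return [v + 1 for v in values]

def fold_per_value(func, initial, values):
    """Stand-in for a lambda-based fold: one lambda call per element."""
    acc, calls = initial, 0
    for v in values:
        acc = func(acc, v)  # each step depends on the previous accumulator
        calls += 1
    return acc, calls

batch = list(range(1000))
shifted = add_one_batch(batch)                                   # 1 call for 1000 values
total, calls = fold_per_value(lambda acc, v: acc + v, 0, batch)  # 1000 lambda calls
print(calls)  # 1000
```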

This limitation can be overcome with runtime code generation, but we don't have that for "ordinary" functions.