Custom Aggregate Functions (PSQL)

116 views
Skip to first unread message

Adriano dos Santos Fernandes

unread,
Apr 29, 2026, 6:45:25 AMApr 29
to firebir...@googlegroups.com
Custom aggregate functions extend `CREATE FUNCTION` with an `AGGREGATE`
form implemented in PSQL.

They allow users to define aggregate routines with their own per-group
or per-window-frame local state, step logic and final result calculation.

## Syntax

```sql
{ recreate | { create [or alter] } | alter } aggregate function
<function-name>
[ ( <input-parameter> [, <input-parameter> ...] ) ]
returns <return-type>
[sql security {invoker | definer}]
as
[<local declarations>]
begin
[on start do <statement>]

on step do <statement>

on finish do <statement>
end
```

Where:

- `ON START DO` is optional
- `ON STEP DO` is required
- `ON FINISH DO` is required

The sections must appear in this order: optional `ON START DO`, then `ON
STEP DO`, then `ON FINISH DO`.

## Execution Model

An aggregate function is invoked once for each aggregate instance:
- once per group in grouped aggregation
- once per window frame in windowed aggregation

For each instance:
1. A private execution context is allocated
2. Local variables declared in the routine body start as `NULL`
3. `ON START DO` is executed once, if present
4. `ON STEP DO` is executed for each qualifying row
5. `ON FINISH DO` is executed once and must return the aggregate result

Local variables declared in the routine body are visible to `ON START
DO`, `ON STEP DO` and `ON FINISH DO`, and persist between step calls for
the same aggregate instance.

## Invocation

Aggregate functions are called using normal function call syntax:

```sql
<aggregate-name>(<args>)
<aggregate-name>(<args>) filter (where <condition>)
<aggregate-name>(<args>) over (<window-specification>)
```

They are valid only in aggregate-capable SQL contexts, such as:
- the select list of aggregate queries
- `HAVING`
- `ORDER BY`
- windowed `OVER (...)` usage

They are not valid as scalar function calls.

## Notes

- Custom aggregate functions share the normal function namespace.

## Restrictions

Current restrictions:
- local sub-functions declared with `DECLARE FUNCTION` cannot be
aggregate functions
- `EXTERNAL` functions cannot be aggregate functions
- `DETERMINISTIC` is not allowed for aggregate functions

## Examples

### Sum Ignoring `NULL`

```sql
set term ^;

create or alter aggregate function sum_i(v integer)
returns bigint
as
declare total bigint;
declare seen boolean;
begin
on start do
begin
-- May be initialized inline within DECLARE as well
total = 0;
seen = false;
end

on step do
begin
if (v is not null) then
begin
total = total + v;
seen = true;
end
end

on finish do
begin
if (not seen) then
return null;

return total;
end
end^

set term ;^
```

Usage:

```sql
select sum_i(salary)
from employee;
```

### Count Non-`NULL` Values Without `ON START DO`

```sql
set term ^;

create or alter aggregate function count_not_null(v integer)
returns bigint
as
declare cnt bigint;
begin
on step do
begin
if (v is not null) then
begin
if (cnt is null) then
cnt = 0;

cnt = cnt + 1;
end
end

on finish do
begin
return coalesce(cnt, 0);
end
end^

set term ;^
```

Usage:

```sql
select department,
count_not_null(salary) as employees_with_salary
from employee
group by department;
```

### Window Usage

```sql
select department,
salary,
sum_i(salary) over (partition by department) as dept_sum
from employee;
```

### Using `FILTER`

```sql
select sum_i(amount) filter (where status = 'A') as approved_amount,
sum_i(amount) filter (where status = 'P') as pending_amount
from payments;
```


Adriano

Dimitry Sibiryakov

unread,
Apr 29, 2026, 7:30:51 AMApr 29
to firebir...@googlegroups.com
Adriano dos Santos Fernandes wrote 29.04.2026 12:45:
> Where:
>
> - `ON START DO` is optional
> - `ON STEP DO` is required
> - `ON FINISH DO` is required
>
> The sections must appear in this order: optional `ON START DO`, then `ON
> STEP DO`, then `ON FINISH DO`.

Why such restriction?

> ## Execution Model
>
> An aggregate function is invoked once for each aggregate instance:
> - once per group in grouped aggregation
> - once per window frame in windowed aggregation
>
> For each instance:
> 1. A private execution context is allocated
> 2. Local variables declared in the routine body start as `NULL`
> 3. `ON START DO` is executed once, if present
> 4. `ON STEP DO` is executed for each qualifying row
> 5. `ON FINISH DO` is executed once and must return the aggregate result

What happen to returned value assigned in other blocks?
What are values of parameters in ON START and ON FINISH blocks?

> Local variables declared in the routine body are visible to `ON START
> DO`, `ON STEP DO` and `ON FINISH DO`, and persist between step calls for
> the same aggregate instance.

Do they persist also between groups allowing "running total" kind of functions?

> - `EXTERNAL` functions cannot be aggregate functions

Why such restriction?

--
WBR, SD.

Denis Simonov

unread,
Apr 29, 2026, 12:52:53 PMApr 29
to firebird-devel
>  - `EXTERNAL` functions cannot be aggregate functions

I hope this is a temporary restriction. Its implementation has simply been delayed.

Can user-defined aggregate functions be used to calculate cumulative sums? That is, using `OVER(ORDER BY ...)`

Are there any restrictions on using the `WITHIN GROUP` clause for user-defined aggregate functions?

Am I correct in understanding that there shouldn't be any statements outside the `on start do`, `on step do`, and `on finish do` blocks?

Can user-defined aggregate functions be packaged?

среда, 29 апреля 2026 г. в 14:30:51 UTC+3, Dimitry Sibiryakov:

Mark Rotteveel

unread,
Apr 30, 2026, 11:39:48 AM (14 days ago) Apr 30
to firebir...@googlegroups.com
On 29-04-2026 12:45, Adriano dos Santos Fernandes wrote:
> Custom aggregate functions extend `CREATE FUNCTION` with an `AGGREGATE`
> form implemented in PSQL.
[..]
> ## Execution Model
>
> An aggregate function is invoked once for each aggregate instance:
> - once per group in grouped aggregation
> - once per window frame in windowed aggregation


My mental model conflicts with your use of "once" here. If I understand
correctly, it is invoked *for each* row (i.e. step) in the
group/partition, restoring the function state for that group/partition,
so use of once seems off to me.


> For each instance:
> 1. A private execution context is allocated
> 2. Local variables declared in the routine body start as `NULL`
> 3. `ON START DO` is executed once, if present

What are the values of the function parameters in ON START? Or are the
parameters not accessible then?

> 4. `ON STEP DO` is executed for each qualifying row

Why the term ON STEP, why not ON ROW, or maybe ON VALUE?

> 5. `ON FINISH DO` is executed once and must return the aggregate result

Same as for 3, What are the values of the function parameters in ON FINISH?

> Local variables declared in the routine body are visible to `ON START
> DO`, `ON STEP DO` and `ON FINISH DO`, and persist between step calls for
> the same aggregate instance.


See also my earlier comment regarding once.

> ## Invocation
>
> Aggregate functions are called using normal function call syntax:
>
> ```sql
> <aggregate-name>(<args>)
> <aggregate-name>(<args>) filter (where <condition>)
> <aggregate-name>(<args>) over (<window-specification>)
> ```
>
> They are valid only in aggregate-capable SQL contexts, such as:
> - the select list of aggregate queries
> - `HAVING`
> - `ORDER BY`
> - windowed `OVER (...)` usage


Do they work with window frames like BETWEEN UNBOUNDED PRECEDING AND
CURRENT ROW, or, say BETWEEN 10 PRECEDING AND 10 FOLLOWING, if so how?

Mark
--
Mark Rotteveel

Jim Starkey

unread,
Apr 30, 2026, 11:56:33 AM (14 days ago) Apr 30
to firebir...@googlegroups.com
I have a suggestion: Add a few more optional clauses:

    [ at top of <expression> do <statement> ]

    [ at bottom of <expression> do <statement> ]

where the clauses would be executed after and before, respectively, the
<expression> changes value.  The expression would generally reference
something in the sort order, but neither requirements nor the
implementation would require this.

The Amorphous Data Access Language has these in a generalize for-loop. 
For Firebird, I suspect that the control structure might be useful in
other contexts.
--
Jim Starkey

Denis Simonov

unread,
Apr 30, 2026, 12:02:22 PM (14 days ago) Apr 30
to firebird-devel


четверг, 30 апреля 2026 г. в 18:39:48 UTC+3, Mark Rotteveel:
I'm also interested in this. I looked at the Custom Aggregate implementation in other DBMSs, for example, Postgres https://www.postgresql.org/docs/current/xaggr.html

They separate the function for processing a new row and the function for removing the old row from the frame. Perhaps we should do this in a different syntax.

```
on [add] step do <statement>

on sub step do <statement>
```
 


Mark
--
Mark Rotteveel

Adriano dos Santos Fernandes

unread,
May 4, 2026, 8:14:50 PM (10 days ago) May 4
to firebir...@googlegroups.com
On 4/29/26 13:52, Denis Simonov wrote:
>>  - `EXTERNAL` functions cannot be aggregate functions
>
> I hope this is a temporary restriction. Its implementation has simply
> been delayed.
>

Yes.


> Can user-defined aggregate functions be used to calculate cumulative
> sums? That is, using `OVER(ORDER BY ...)`
>

it uses same mechanism as internal aggregate functions, so yes.


> Are there any restrictions on using the `WITHIN GROUP` clause for user-
> defined aggregate functions?
>

WITHIN GROUP is currently hard syntax for some functions.

AFAIK it just put order before aggregation, so the way I would see would
be to make it a soft syntax valid for all custom aggregate functions or
to the ones having some flag.


> Am I correct in understanding that there shouldn't be any statements
> outside the `on start do`, `on step do`, and `on finish do` blocks?
>

Yes.


> Can user-defined aggregate functions be packaged?
>

Yes.


Adriano

Adriano dos Santos Fernandes

unread,
May 4, 2026, 8:22:50 PM (10 days ago) May 4
to firebir...@googlegroups.com
On 4/30/26 13:02, Denis Simonov wrote:
>
>
> четверг, 30 апреля 2026 г. в 18:39:48 UTC+3, Mark Rotteveel:
>
> On 29-04-2026 12:45, Adriano dos Santos Fernandes wrote:
> > ## Invocation
> >
> > Aggregate functions are called using normal function call syntax:
> >
> > ```sql
> > <aggregate-name>(<args>)
> > <aggregate-name>(<args>) filter (where <condition>)
> > <aggregate-name>(<args>) over (<window-specification>)
> > ```
> >
> > They are valid only in aggregate-capable SQL contexts, such as:
> > - the select list of aggregate queries
> > - `HAVING`
> > - `ORDER BY`
> > - windowed `OVER (...)` usage
>
>
> Do they work with window frames like BETWEEN UNBOUNDED PRECEDING AND
> CURRENT ROW, or, say BETWEEN 10 PRECEDING AND 10 FOLLOWING, if so how?
>

Yes, like internal aggregate functions.

>
> I'm also interested in this. I looked at the Custom Aggregate
> implementation in other DBMSs, for example, Postgres https://
> www.postgresql.org/docs/current/xaggr.html
>
> They separate the function for processing a new row and the function for
> removing the old row from the frame. Perhaps we should do this in a
> different syntax.
>
> ```
> on [add] step do <statement>
>
> on sub step do <statement>
> ```
>  

That optimization could be planned to be an optional enhancement.

Currently our internal aggregate functions do not do this and are still
usable and nobody complained.


Adriano

Dimitry Sibiryakov

unread,
May 5, 2026, 8:02:15 AM (9 days ago) May 5
to firebir...@googlegroups.com
Adriano dos Santos Fernandes wrote 05.05.2026 2:14:
>> Am I correct in understanding that there shouldn't be any statements
>> outside the `on start do`, `on step do`, and `on finish do` blocks?
>>
> Yes.

Then what's the reason to gather functionality of three independent functions
into one?

--
WBR, SD.

Mark Rotteveel

unread,
May 5, 2026, 11:23:56 AM (9 days ago) May 5
to firebir...@googlegroups.com
I'd say to have a single unit that defines the whole thing, and not have
it scattered in three different things.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 5, 2026, 11:27:41 AM (9 days ago) May 5
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 05.05.2026 17:23:
>>    Then what's the reason to gather functionality of three independent
>> functions into one?
> I'd say to have a single unit that defines the whole thing, and not have it
> scattered in three different things.

Isn't it the purpose of packages?

--
WBR, SD.

Mark Rotteveel

unread,
May 5, 2026, 12:31:33 PM (9 days ago) May 5
to firebir...@googlegroups.com
1) A package is not an aggregate function
2) The user defines the package header *and* body

What you're suggesting would means that you essentially have a single
package header as if it's an interface and a lot of different bodies
with different names implementing that package body.

I'm not saying it wouldn't be possible, but it would be a big deviation
from how packages currently work.

That seems in my opinion like a worse deviation from that concept than
the proposed syntax for aggregate functions.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 5, 2026, 12:53:42 PM (9 days ago) May 5
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 05.05.2026 18:31:
> 1) A package is not an aggregate function
> 2) The user defines the package header *and* body
>
> What you're suggesting would means that you essentially have a single package
> header as if it's an interface and a lot of different bodies with different
> names implementing that package body.

No, what I'm suggesting is a simple declaration of a custom aggregate
function as a package instead of a strange compound PSQL function.
This package must simply have some functions with predefined names that will
be called instead of pieces of the function in current proposal.

Example from starting post is easily transformed into package:

create or alter package sum_i as begin
declare variable total bigint;
declare variable seen boolean;
procedure start();
procedure step(v integer);
function finish() returns integer;
end^

It should be not hard to make SQL parser to look for a package when it
encounters an unknown aggregate function.

Of course, it would require to implement package variables first.

> That seems in my opinion like a worse deviation from that concept than the
> proposed syntax for aggregate functions.

The concept is just a hack with pieces of a functions being called as if they
were independent subfunctions. It isn't even consistent with universal triggers
(that use about the same ideology: being called in different situation with
different outcomes according to INSERTING/UPDATING/DELETING flags).
As an Firebird user I would prefer Oracle-type of custom aggregates based on
SQL standard user-defined types. In the worst case - MS-type external
interfaces. But what is proposed in this topic look like another
not-like-any-other hack, aimed to easy implementation instead of usage.

--
WBR, SD.

Denis Simonov

unread,
May 5, 2026, 1:39:11 PM (9 days ago) May 5
to firebird-devel
What's proposed in this thread looks much better than what's available in Oracle, MSSQL, or Postgres. Although it does look unusual.


> Example from starting post is easily transformed into package:

> create or alter package sum_i as begin
> declare variable total bigint;
> declare variable seen boolean;
> procedure start();
> procedure step(v integer);
> function finish() returns integer;
> end^

Using a package as an aggregate function is nonsense. How can you distinguish a regular package from a package implementing an aggregate? Where are the aggregate function arguments passed here? A package isn't an interface implementation. For what you want to work, a package must be a type, not just a module with a set of procedures and functions.

вторник, 5 мая 2026 г. в 19:53:42 UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

unread,
May 5, 2026, 3:18:06 PM (9 days ago) May 5
to firebir...@googlegroups.com
Denis Simonov wrote 05.05.2026 19:39:
> What's proposed in this thread looks much better than what's available in Oracle, MSSQL, or Postgres.

In this case you see something that I don't. It is not as clear as Oracle and
MS implementations and not as versatile as Postrgres one.
And as I said it doesn't even consistent with other Firebird features.

> Using a package as an aggregate function is nonsense. How can you distinguish a
> regular package from a package implementing an aggregate?

Why it must be distinguished? Any package that fits interface requirements is
ok. If someone try to use package without finish() function, they may get a
clear error "Package XXX doesn't provide interfaces required for aggregate" and
that's all.

> Where are the aggregate function arguments passed here?

Exactly in the same place as in the starting message: as parameters of step()
function. And the result is the result of finish() function. These functions are
just separated "on start", "on step" and "on finish" blocks from the starting post.

> A package isn't an interface implementation.

What? Package IS a interface. And package body IS the implementation of this
interface. If you think different, I would like to see your definition of
"interface".

> For what you want to work, a package must be a type, not just a module with a set of procedures and functions.

Yes, and with all these constants, declared temporary tables, etc it finally
become a something useful, not just a mini-schema for routines only.

--
WBR, SD.

Denis Simonov

unread,
May 6, 2026, 2:40:30 AM (8 days ago) May 6
to firebird-devel


вторник, 5 мая 2026 г. в 22:18:06 UTC+3, Dimitry Sibiryakov:
Denis Simonov wrote 05.05.2026 19:39:

> Using a package as an aggregate function is nonsense. How can you distinguish a
> regular package from a package implementing an aggregate?

Why it must be distinguished? Any package that fits interface requirements is
ok. If someone try to use package without finish() function, they may get a
clear error "Package XXX doesn't provide interfaces required for aggregate" and
that's all.


By interface, I mean an abstract class with pure virtual methods (generalized for aggregates). These methods must be implemented in descendants (classes) for a specific aggregate function.

But we don't have object-oriented programming support in PSQL, so we have to come up with other solutions.
 
> Where are the aggregate function arguments passed here?

Exactly in the same place as in the starting message: as parameters of step()
function. And the result is the result of finish() function. These functions are
just separated "on start", "on step" and "on finish" blocks from the starting post.

> A package isn't an interface implementation.

What? Package IS a interface. And package body IS the implementation of this
interface. If you think different, I would like to see your definition of
"interface".

The user must explicitly declare their intentions in the code - "CREATE AGGREGATE FUNCTION", 
and not so that if they wanted to write a package, they accidentally named the function `step`, now it is perceived as an aggregate function.
 
> For what you want to work, a package must be a type, not just a module with a set of procedures and functions.

Yes, and with all these constants, declared temporary tables, etc it finally
become a something useful, not just a mini-schema for routines only.

You're misunderstanding packages. They should be understood as they were originally intended in Oracle. There, they're something like modules (a carbon copy of ADA). In a language more familiar to you, it's comparable to a unit in Pascal. It also has 'interface' and 'implimetation' sections.

So, if you add variables to a package, it will have state, but this state is shared across sessions. This means you can't create multiple instances of the same package with different states. Accordingly, in this example, your implementation would turn into a pumpkin.

SELECT sum_i(Field_1), sum_i(Field_2)
FROM T
 


--
WBR, SD.

Denis Simonov

unread,
May 6, 2026, 2:53:19 AM (8 days ago) May 6
to firebird-devel
> WITHIN GROUP is currently hard syntax for some functions.

> AFAIK it just put order before aggregation, so the way I would see would
> be to make it a soft syntax valid for all custom aggregate functions or
> to the ones having some flag.

This is indeed true.
Ordering isn't meaningful for every aggregate function, so it makes sense to introduce a special flag for those functions that can use the optional WITHIN GROUP clause. 
This flag is only intended for custom aggregates. For user-defined aggregate functions, limit the WITHIN GROUP functionality to only the ability 
to order the dataset within each group before aggregation. Special features, such as passing sorting items as an argument to the function, 
should be reserved for built-in aggregate functions. This isn't a requirement right now, but rather a suggestion for the future.

> That optimization could be planned to be an optional enhancement.

> Currently our internal aggregate functions do not do this and are still
> usable and nobody complained.

We have some optimizations to be made for the analytical functions for sliding windows. It's not exactly required, but it's worth keeping in mind for the future.

P.S. Regarding window frames, we still need to implement the EXCLUDE clause; that's the only part of the windowing that hasn't been done yet.

среда, 6 мая 2026 г. в 09:40:30 UTC+3, Denis Simonov:

Dimitry Sibiryakov

unread,
May 6, 2026, 4:36:12 AM (8 days ago) May 6
to firebir...@googlegroups.com
Denis Simonov wrote 06.05.2026 8:40:
> By interface, I mean an abstract class with pure virtual methods (generalized
> for aggregates). These methods must be implemented in descendants (classes) for
> a specific aggregate function.
>
> But we don't have object-oriented programming support in PSQL, so we have to
> come up with other solutions.

Pay attention to C++ Named Requirements:
https://en.cppreference.com/cpp/named_req
They is an example of interfaces that doesn't require abstract classes and
derivation, they just name requirements like in our case is "to have these methods".

> The user must explicitly declare their intentions in the code - "CREATE AGGREGATE FUNCTION",
> and not so that if they wanted to write a package, they accidentally named the function `step`, now it is perceived as an aggregate function.

Agree.
What I disagree is just a strange design with "on XXX" blocks that caused my
and Mark's initial questions. IMHO it should be either split into several
routines (and this is where packages are useful) or approach similar to
universal trigger should be used:

if (init) then ...
else if (step) then ...
else if (finish) then ...

The latter is simpler.

> You're misunderstanding packages. They should be understood as they were originally intended in Oracle. There, they're something like modules (a carbon copy of ADA). In a language more familiar to you, it's comparable to a unit in Pascal. It also has 'interface' and 'implimetation' sections.
>
> So, if you add variables to a package, it will have state, but this state is shared across sessions. This means you can't create multiple instances of the same package with different states.

No, the state isn't shared across sessions. Oracle packages are
"session-local". It is enough for them because Oracle doesn't allow multiple
transactions per session. "Database-local" packages would be indeed not
distinguishable from schemas.
Firebird can make a step further and invent "transaction-local",
"request-local" or "field-local" package instances that can make them suitable
for implementation of interface for aggregate functions.

--
WBR, SD.

Denis Simonov

unread,
May 6, 2026, 5:14:52 AM (8 days ago) May 6
to firebird-devel
You're right, in Oracle, package states are session-local. This is a translation error. But I don't think it's worth changing, as it will break people's minds.

In the proposed syntax, the on init, on step and on finish blocks are labels. And the aggregate function itself is a state machine. 
Roughly speaking, we can consider the state parameter to be implicitly passed. Then it can be logically represented as follows:

switch (state)
{
    case INIT:
    {
        // executed at the beginning of each data extraction group/frame
        // used to initialize the initial value
        break;
    }

    case STEP:
    {
        // executed for each row in the group. The aggregation argument must be processed here
        break;
    }

    case FINISH:
    {
        // executed when each group/frame has run out of data
        // generate the final result from the accumulator variables
        break;
    }
}


I don't know what could be unclear here.

среда, 6 мая 2026 г. в 11:36:12 UTC+3, Dimitry Sibiryakov:

Dimitry Sibiryakov

unread,
May 6, 2026, 5:32:25 AM (8 days ago) May 6
to firebir...@googlegroups.com
Denis Simonov wrote 06.05.2026 11:14:
> You're right, in Oracle, package states are session-local. This is a translation
> error. But I don't think it's worth changing, as it will break people's minds.

It depends. Currently Firebird packages are database-local, they don't have
state at all.
But in contrast to Oracle, Firebird allows parallel transactions and if
packages that are used in these transactions share state it could make them less
useful.
Consider, for example package UTL_HTTP: in several transactions you send
several different requests and then wait for responses to parse in parallel.
Doing so in a single transaction sequentially would make process slower. Doing
so in separate attachments would make process less convenient. But the package
needs to keep state for each request individually. Of course, it may be
workarounded, but such workaround would make implementation more complicated.

> In the proposed syntax, the on init, on step and on finish blocks are labels.
> And the aggregate function itself is a state machine.
> Roughly speaking, we can consider the state parameter to be implicitly passed.
> Then it can be logically represented as follow

Yes, that's what I was talking about comparing these aggregate functions with
universal triggers. In the triggers we have INSERTING/UPDATING/DELETING. In the
functions we can have INIT/STEP/FINISH with the same usage pattern.

--
WBR, SD.

Adriano dos Santos Fernandes

unread,
May 6, 2026, 9:32:27 PM (8 days ago) May 6
to firebir...@googlegroups.com
Updated syntax proposal:

{ recreate | { create [or alter] } | alter } aggregate function
<function-name>
[ ( <input-parameter> [, <input-parameter> ...] ) ]
returns <return-type>
[sql security {invoker | definer}]
as
[<local declarations>]
begin
[on start do <statement>]

on accumulate do <statement>

on group do <statement>

[on finish do <statement>]
end

ON START / ON FINISH runs once per instance.

An aggregate would run a series of ON ACCUMULATE, then ON GROUP.

ON GROUP is who returns the value with `RETURN <value>`.

Others sections may use `EXIT` but not `RETURN`.

For window functions, ON ACCUMULATE may continue to run after ON GROUP.
For example:

custom_sum(1) over (order by n) for data (1, 1, 2, 3):

on_start
on_accumulate
on_accumulate
on_group - returns 2
on_accumulate
on_group - returns 3
on_accumulate
on_group - returns 4
on_finish

In the future "revert accumulation" op may exist for optimal performance
in window functions, so rows from a frame may be removed while the frame
is sliding.


Adriano

Mark Rotteveel

unread,
May 7, 2026, 3:02:51 AM (7 days ago) May 7
to firebir...@googlegroups.com
On 07-05-2026 03:32, Adriano dos Santos Fernandes wrote:
> Updated syntax proposal:
>
> { recreate | { create [or alter] } | alter } aggregate function
> <function-name>
> [ ( <input-parameter> [, <input-parameter> ...] ) ]
> returns <return-type>
> [sql security {invoker | definer}]
> as
> [<local declarations>]
> begin
> [on start do <statement>]
>
> on accumulate do <statement>
>
> on group do <statement>
>
> [on finish do <statement>]
> end
>
> ON START / ON FINISH runs once per instance.

And the instance is a (fixed) window frame or group, right?

> An aggregate would run a series of ON ACCUMULATE, then ON GROUP.
>
> ON GROUP is who returns the value with `RETURN <value>`.
>
> Others sections may use `EXIT` but not `RETURN`.
>
> For window functions, ON ACCUMULATE may continue to run after ON GROUP.
> For example:
>
> custom_sum(1) over (order by n) for data (1, 1, 2, 3):
>
[..]
>
> In the future "revert accumulation" op may exist for optimal performance
> in window functions, so rows from a frame may be removed while the frame
> is sliding.
I'd suggest to use "ON ADD" instead of "ON ACCUMULATE" as the reasonable
opposite is then "ON REMOVE", while ACCUMULATE doesn't have an obvious
antonym (see https://www.merriam-webster.com/thesaurus/accumulate).
Decrease comes close, but is IMHO not very obvious in this context,
while add/remove are pretty self-explanatory in my view.

What is (or can be) the purpose of "ON FINISH" if "ON GROUP" is what
returns a result?

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 7, 2026, 3:07:10 AM (7 days ago) May 7
to firebir...@googlegroups.com
Adriano dos Santos Fernandes wrote 07.05.2026 3:32:
> Updated syntax proposal:

Why do you reject suggestion to use the same syntax as in universal trigger?
It is expandable and don't require new BLR verbs.

--
WBR, SD.

Mark Rotteveel

unread,
May 7, 2026, 3:25:13 AM (7 days ago) May 7
to firebir...@googlegroups.com
Contrary to triggers, these functions must do certain things for them to
work. Enforcing that in its syntax sounds like a good idea to me. Also,
that things have a specific PSQL syntax doesn't (or IMO, shouldn't)
automatically mean they require new BLR verbs.

Mark
--
Mark Rotteveel

Dimitry Sibiryakov

unread,
May 7, 2026, 3:52:39 AM (7 days ago) May 7
to firebir...@googlegroups.com
'Mark Rotteveel' via firebird-devel wrote 07.05.2026 9:25:
> Contrary to triggers, these functions must do certain things for them to work.
"...otherwise they won't work" and IMHO it is enough level of enforcing.

State variables provides user freedom in designing of the execution flow and
allow, for example, to share pieces of code between aggregation phases without
using of subroutines.


--
WBR, SD.

Dmitry Yemanov

unread,
May 7, 2026, 3:55:29 AM (7 days ago) May 7
to firebir...@googlegroups.com
07.05.2026 04:32, Adriano dos Santos Fernandes wrote:
> Updated syntax proposal:
>
> { recreate | { create [or alter] } | alter } aggregate function
> <function-name>
> [ ( <input-parameter> [, <input-parameter> ...] ) ]
> returns <return-type>
> [sql security {invoker | definer}]
> as
> [<local declarations>]
> begin
> [on start do <statement>]
>
> on accumulate do <statement>
>
> on group do <statement>
>
> [on finish do <statement>]
> end

Maybe we could combine the original idea with the package-based approach
by Dimitry Sibiryakov the following way:

{ recreate | { create [or alter] } | alter } aggregate function
<function-name>
[ ( <input-parameter> [, <input-parameter> ...] ) ]
returns <return-type>
[sql security {invoker | definer}]
as
declare on start handler
as
begin
end

declare on group handler
as
begin
end

begin
end

I.e. replace predefined package routines with sub-routines explicitly
declared as handlers. Given that sub-routines have access to parent
parameters, it should work the same way as with DO <statement> blocks.

Of course it could be done using ON START DO CALL MYPROC1 etc, but maybe
the explicit syntax for state handlers would be taken as a better option.

Not insisting, just an idea.


Dmitry

Dimitry Sibiryakov

unread,
May 7, 2026, 4:06:39 AM (7 days ago) May 7
to firebir...@googlegroups.com
Dmitry Yemanov wrote 07.05.2026 9:55:
> Maybe we could combine the original idea with the package-based approach by
> Dimitry Sibiryakov the following way:

It still leaves an open question "what the purpose of "main" begin-end block
if code inside of it is never executed".
Not a big deal if documented, of course, but make a bad impression.

--
WBR, SD.

Dmitry Yemanov

unread,
May 7, 2026, 4:11:59 AM (7 days ago) May 7
to firebir...@googlegroups.com
create aggregate function ...
as
on start handler
as begin ... end

on group handler
as begin ... end
;


Dmitry

Dimitry Sibiryakov

unread,
May 7, 2026, 4:14:21 AM (7 days ago) May 7
to firebir...@googlegroups.com
Dmitry Yemanov wrote 07.05.2026 10:11:
> create aggregate function ...
> as
>   on start handler
>   as begin ... end
>
>   on group handler
>   as begin ... end
> ;

That's good. Will require a new system table or somehow can fit to existing
RDB$FUNCTIONS?

--
WBR, SD.

Dmitry Yemanov

unread,
May 7, 2026, 4:22:22 AM (7 days ago) May 7
to firebir...@googlegroups.com
07.05.2026 11:14, 'Dimitry Sibiryakov' via firebird-devel wrote:
>
>> create aggregate function ...
>> as
>>    on start handler
>>    as begin ... end
>>
>>    on group handler
>>    as begin ... end
>> ;
>
>   That's good. Will require a new system table or somehow can fit to
> existing RDB$FUNCTIONS?

If handlers will be BLR-encoded as implicit sub-routines, then I don't
foresee any need for schema changes.


Dmitry

Adriano dos Santos Fernandes

unread,
May 7, 2026, 5:45:31 AM (7 days ago) May 7
to firebir...@googlegroups.com
On 5/7/26 04:02, 'Mark Rotteveel' via firebird-devel wrote:
> On 07-05-2026 03:32, Adriano dos Santos Fernandes wrote:
>> Updated syntax proposal:
>>
>> { recreate | { create [or alter] } | alter } aggregate function
>> <function-name>
>>      [ ( <input-parameter> [, <input-parameter> ...] ) ]
>>      returns <return-type>
>>      [sql security {invoker | definer}]
>> as
>>      [<local declarations>]
>> begin
>>      [on start do <statement>]
>>
>>      on accumulate do <statement>
>>
>>      on group do <statement>
>>
>>      [on finish do <statement>]
>> end
>>
>> ON START / ON FINISH runs once per instance.
>
> And the instance is a (fixed) window frame or group, right?
>

In the case of OVER (ORDER BY ...), the window bottom frame slides, it's
why it allows new rows after a group is returned.


>> An aggregate would run a series of ON ACCUMULATE, then ON GROUP.
>>
>> ON GROUP is who returns the value with `RETURN <value>`.
>>
>> Others sections may use `EXIT` but not `RETURN`.
>>
>> For window functions, ON ACCUMULATE may continue to run after ON GROUP.
>> For example:
>>
>> custom_sum(1) over (order by n) for data (1, 1, 2, 3):
>>
> [..]
>>
>> In the future "revert accumulation" op may exist for optimal performance
>> in window functions, so rows from a frame may be removed while the frame
>> is sliding.
> I'd suggest to use "ON ADD" instead of "ON ACCUMULATE" as the reasonable
> opposite is then "ON REMOVE", while ACCUMULATE doesn't have an obvious
> antonym (see https://www.merriam-webster.com/thesaurus/accumulate).
> Decrease comes close, but is IMHO not very obvious in this context,
> while add/remove are pretty self-explanatory in my view.
>
> What is (or can be) the purpose of "ON FINISH" if "ON GROUP" is what
> returns a result?
>

Because ON GROUP may be called more than one time. ON FINISH is optional
cleanup.


Adriano

Denis Simonov

unread,
May 7, 2026, 7:22:48 AM (7 days ago) May 7
to firebird-devel
Is ACCUMULATE a keyword or a reserved word?

Is it possible to initialize initial values ​​through input arguments if a literal value is passed in?

For example:

recreate aggregate function sum_i(i int, s int)
returns int
as
  declare acc int;
begin
  on start do
    acc = s;

  on accumulate do
    acc = acc + i;

  on group do
     return acc;
end

select sum_i(field, 4) from t;


четверг, 7 мая 2026 г. в 12:45:31 UTC+3, Adriano dos Santos Fernandes:

Adriano dos Santos Fernandes

unread,
May 7, 2026, 7:55:52 AM (7 days ago) May 7
to firebir...@googlegroups.com
On 5/7/26 08:22, Denis Simonov wrote:
> Is ACCUMULATE a keyword or a reserved word?
>

Non-reserved word.


> Is it possible to initialize initial values ​​through input arguments if
> a literal value is passed in?
>

Yes.


Adriano

Reply all
Reply to author
Forward
0 new messages