bean-query transaction grouping for render?

29 views
Skip to first unread message

Eric Altendorf

unread,
Aug 3, 2023, 2:28:08 AM8/3/23
to bean...@googlegroups.com
When you do something like
  SELECT account, SUM(position) ...

you get a nice table where the list of positions for each account is grouped (with expand=true, it renders like a nested set of rows within the row) and each account name is only printed once.

When you do something like
  SELECT id, date, narration, account, cost(position), units(position)...

You get one row per posting, and one transaction's information (e.g., ie, date, narration) gets repeated for each row.  Visually this just makes it hard to see the grouping of postings in a transaction.

Is there a way to visually group postings in the same transaction, the way aggregated data is grouped when you do the SUM()?  I'm not sure where this logic would belong; one could have a generic string CONCAT() aggregation operator I suppose, but that doesn't seem right, or one could hack it as an option into render_text(), but that also seems awkward.

thanks,
eric

Daniele Nicolodi

unread,
Aug 3, 2023, 7:25:42 AM8/3/23
to bean...@googlegroups.com
On 03/08/23 08:27, Eric Altendorf wrote:
> When you do something like
>   SELECT account, SUM(position) ...
>
> you get a nice table where the list of positions for each account is
> grouped (with expand=true, it renders like a nested set of rows within
> the row) and each account name is only printed once.

This is because the query compiler infers that you are running an
aggregate query and thus adds an implicit 'GROUP BY account' to it.

expand=true changes the representation of the inventory type returned by
sum(position) from an horizontal one to a vertical one, but does not
introduce any notion of grouping.

> When you do something like
>   SELECT id, date, narration, account, cost(position), units(position)...
>
> You get one row per posting, and one transaction's information (e.g.,
> ie, date, narration) gets repeated for each row.  Visually this just
> makes it hard to see the grouping of postings in a transaction.

A SELECT without an explicit FROM clause specifying a table is
implicitly run over the #postings table, which is just a concatenation
of the posting in all the transactions in the ledger.

> Is there a way to visually group postings in the same transaction, the
> way aggregated data is grouped when you do the SUM()?  I'm not sure
> where this logic would belong; one could have a generic string CONCAT()
> aggregation operator I suppose, but that doesn't seem right, or one
> could hack it as an option into render_text(), but that also seems awkward.

The SELECT results set does not have any knowledge that the values in
the columns belong to postings and that these postings belong to
transactions. I don't see a way to implement what you are asking.

To customize the rendering of the results set, you can implement your
own beanquery.query_render.render_text() function. Postings are always
iterated grouped by transaction, thus if you know that you are always
going to have the 'id' column in the results set, you can just emit an
empty line every time the value in this column changes.

However, if you want to see postings grouped by transaction, maybe the
PRINT statement is what you want, instead of the SELECT statement.

Cheers,
Dan

Eric Altendorf

unread,
Aug 4, 2023, 1:40:18 AM8/4/23
to bean...@googlegroups.com
On Thu, Aug 3, 2023 at 4:25 AM Daniele Nicolodi <dan...@grinta.net> wrote:
On 03/08/23 08:27, Eric Altendorf wrote:
> When you do something like
>    SELECT account, SUM(position) ...
>
> you get a nice table where the list of positions for each account is
> grouped (with expand=true, it renders like a nested set of rows within
> the row) and each account name is only printed once.

This is because the query compiler infers that you are running an
aggregate query and thus adds an implicit 'GROUP BY account' to it.

Thanks, that's what I thought aggregation was doing but I wasn't sure.
 
expand=true changes the representation of the inventory type returned by
sum(position) from an horizontal one to a vertical one, but does not
introduce any notion of grouping.

Right, I figured that was just a rendering feature.
 
> When you do something like
>    SELECT id, date, narration, account, cost(position), units(position)...
>
> You get one row per posting, and one transaction's information (e.g.,
> ie, date, narration) gets repeated for each row.  Visually this just
> makes it hard to see the grouping of postings in a transaction.

A SELECT without an explicit FROM clause specifying a table is
implicitly run over the #postings table, which is just a concatenation
of the posting in all the transactions in the ledger.

Are you implying that if an explicit FROM clause specifies something else?  I thought all SELECT queries were selecting from the postings table.  In my case, I do happen to have a FROM clause, the full query is (now) more like:

SELECT date, narration, account, units(position) as amount, round(number(cost(position)) / number(units(position)), 4) as cost_each, cost(position) as total_cost FROM has_account("PnL") and quarter(date) = "2021-Q1"

> Is there a way to visually group postings in the same transaction, the
> way aggregated data is grouped when you do the SUM()?  I'm not sure
> where this logic would belong; one could have a generic string CONCAT()
> aggregation operator I suppose, but that doesn't seem right, or one
> could hack it as an option into render_text(), but that also seems awkward.

The SELECT results set does not have any knowledge that the values in
the columns belong to postings and that these postings belong to
transactions. I don't see a way to implement what you are asking.

Yeah, I looked at the code and I think the cleanest way to do this would be to have the query engine, in addition to returning the dtype for each column, return a flag marking whether the column is sourced from a transaction or a posting.  This could be a separate flag or maybe you could even work it into the existing `dtype` interface with a Generic as a marker type, e.g., instead of putting `datetime.date` in dtype, since it comes from the Transaction, you put in `TxCol[datetime.date]` for some `TxCol` wrapper/marker Generic.  (But I'm guessing there, I'm not a Python expert.)  The ColumnRenderer base class could unpack the nested type and dispatch as normal, but also set a flag saying that column renderers for TxCol types are eligible for ditto behavior.  Then, render_text() could take an "enable ditto" argument, and upon rendering, any ditto columns are only rendered in full in their first seen row, otherwise they are rendered as a ditto mark (or blank).

I don't think this would be horrible, but I would also understand if you don't like the idea :)

To customize the rendering of the results set, you can implement your
own beanquery.query_render.render_text() function. Postings are always
iterated grouped by transaction, thus if you know that you are always
going to have the 'id' column in the results set, you can just emit an
empty line every time the value in this column changes.

Thanks, I had taken a quick look there, but I'll look more closely.

However, if you want to see postings grouped by transaction, maybe the
PRINT statement is what you want, instead of the SELECT statement.

I don't think that will let me do what I want, but it's a good point I'd forgotten about, so thanks for reminding me of that option.

thanks,
eric


 

Cheers,
Dan

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/fd01ff7f-7bae-bf87-fc62-4f900f5fbd2c%40grinta.net.
Reply all
Reply to author
Forward
0 new messages