new version of beanquery

178 views
Skip to first unread message

Chary Chary

unread,
Sep 5, 2024, 11:08:13 AM9/5/24
to Beancount

I have been looking at the latest version of beanquery as well as messages here and as far as I understand, the only currently available beanquery document is quite out of date with the new features.


Just want to double check my understanding


The documentation says, that 


since we have a single table of data, we replace the table name in FROM by a filtering expression which applies over transactions, and the WHERE clause applies to data pulled from the resulting list of postings:

SELECT <target1>, <target2>, …
FROM <entry-filter-expression>
WHERE <posting-filter-expression>;

Both filtering expressions are optional. If no filtering expressions are provided, all postings will be enumerated over. Note that since the transactions are always filtered in date order, the results will be processed and returned in this order by default.



As far as I can see, there are more than one tables available now (not only postings)


beanquery> .tables

accounts

balances

commodities

documents

entries

events

notes

postings

prices

transactions


Now the FROM statement needs to contain table, but preceded by #

e.g.

select date, narration from #transactions


If FROM is omitted, then it is assumed to be FROM #postings


Question: Is this correct?


posting has also all the transaction information, but it is called entry there


 beanquery> .describe postings     

table postings:

  id (str)

  type (str)

  filename (str)

  lineno (int)

  date (date)

  year (int)

  month (int)

  day (int)

  flag (str)

  payee (str)

  narration (str)

  description (str)

  tags (set)

  links (set)

  meta (dict)

  location (str)

  posting_flag (str)

  account (str)

  other_accounts (set)

  number (decimal)

  currency (str)

  cost_number (decimal)

  cost_currency (str)

  cost_date (date)

  cost_label (str)

  position (position)

  price (amount)

  weight (amount)

  balance (inventory)

  entry (transaction)


beanquery> .describe transactions

table transactions:

  meta (metadata)

  date (date)

  flag (str)

  payee (str)

  narration (str)

  tags (frozenset)

  links (frozenset)

beanquery>


So, 

select date, account, narration

Gives the same result as


select entry.date, account, entry.narration


Question: why is it done like this? What is the reason to have transaction information (e.g. date ) both in posting as well as in posting.entry? 

So far the only place I use it is to access transaction meta as an opposite to posting meta


select date, account, narration, entry.meta as transaction_meta, meta as posting_meta


Question: is it correct, that it is still not possible to create queries, which join tables with each other like in traditional SQL?


Say for every posting I want to list an account as well as the date the account was open.


select #postings.account,  #accounts.open where #postings.account = #accounts.account


I know above does not work, but is something like this possible?

Daniele Nicolodi

unread,
Sep 5, 2024, 12:19:37 PM9/5/24
to bean...@googlegroups.com
On 05/09/24 17:08, Chary Chary wrote:
> I have been looking at the latest version of beanquery as well as
> messages here and as far as I understand, the only currently available
> beanquery document is quite out of date with the new features.

Yes, writing documentation is high on the to-do list.

> since we have a single table of data, we replace the table name in FROM
> by a filtering expression which applies over transactions, and the WHERE
> clause applies to data pulled from the resulting list of postings:
>
> SELECT <target1>, <target2>, …
> FROM <entry-filter-expression>
> WHERE <posting-filter-expression>;
>
> Both filtering expressions are optional. If no filtering expressions are
> provided, all postings will be enumerated over. Note that since the
> transactions are always filtered in date order, the results will be
> processed and returned in this order by default.

This was the old way of doing things, indeed, and it is still supported.

However, it turns out that splitting the row filtering into an
entry-filter and a posting-filter is unnecessary and it complicates both
the understanding and the implementation.

> As far as I can see, there are more than one tables available now (not
> only postings)

Correct.

> Now the FROM statement needs to contain table, but preceded by #
>
> e.g.
>
> select date, narration from #transactions
>
> If FROM is omitted, then it is assumed to be FROM #postings

Correct, but incomplete. Using a entry-filter expression is still
supported for backward compatibility. Backward compatibility is the
reason why table names in the FROM clause need to be prefixed with "#":

> posting has also all the transaction information, but it is called entry
> there

It is called entry because there is an "entry_meta()" function that is
equivalent to "entry.meta[]" and I wanted a similar name. Other names
considered where "transaction", "txn", and "parent", I thought the
latter was particularly interesting because it can be generalized to
apply to all objects contained in another. If there is strong preference
for another name, it is very easy to add an alias and eventually
deprecated the less favored name.

> So,
>
> select date, account, narration
>
> Gives the same result as
>
> select entry.date, account, entry.narration
>
> Question: why is it done like this? What is the reason to have
> transaction information (e.g. date ) both in posting as well as in
> posting.entry?

Just backward compatibility. The old way of doing things will be
deprecate and eventually removed to make the data model more consistent
and "orthogonal". However, the capability of accessing fields of
structured types in beanquery has only beed recently (compared to how
much time I can dedicate to the project) introduced, and before starting
to emit deprecation warnings I want to have some documentation in place,
and I haven't had time to work on the documentation. Thus for now we
live with the duplication.

> Question: is it correct, that it is still not possible to create
> queries, which join tables with each other like in traditional SQL?

Implementing joins is also very high on the to-do list, but it has not
been done yet. I would like to implement joins as part of a refactoring
of the beanquery evaluation engine, and big plans tend to be put aside
when there is not enough time to dedicate to them...

> Say for every posting I want to list an account as well as the date the
> account was open.
>
> select #postings.account, #accounts.open where #postings.account =
> #accounts.account
>
> I know above does not work, but is something like this possible?

In this specific case you can use a specialized function that fetches
the information you are after:

SELECT account, open_date(account) FROM #postings

There are several similar function in BQL that have been introduced to
work-around limitations of the query engine. Funnily, some of the
refactoring that would make the query engine more flexible is
complicated by keeping support for these features. But I think backward
compatibility is more important and we have been striving to maintain
it, unless there were strong reasons for not doing so.

Cheers,
Dan

Chary Chary

unread,
Sep 5, 2024, 12:55:24 PM9/5/24
to Beancount
Dan,

thanks! it is clear.

beanquery has definately made a big step forward!

Chary Chary

unread,
Sep 5, 2024, 1:29:06 PM9/5/24
to Beancount
Another small question:


On Thursday, September 5, 2024 at 6:19:37 PM UTC+2 dan...@grinta.net wrote:

> So,
>
> select date, account, narration
>
> Gives the same result as
>
> select entry.date, account, entry.narration
>
> Question: why is it done like this? What is the reason to have
> transaction information (e.g. date ) both in posting as well as in
> posting.entry?

Just backward compatibility. The old way of doing things will be
deprecate and eventually removed to make the data model more consistent
and "orthogonal". However, the capability of accessing fields of
structured types in beanquery has only beed recently (compared to how
much time I can dedicate to the project) introduced, and before starting
to emit deprecation warnings I want to have some documentation in place,
and I haven't had time to work on the documentation. Thus for now we
live with the duplication.



So, the long term plan is to support only this notation:

select entry.date, account, entry.narration

and drop down support for this notation:

select date, account, narration

Is this correct?
 

 

Martin Blais

unread,
Sep 8, 2024, 3:07:03 PM9/8/24
to bean...@googlegroups.com
Yep, Dan's been doing a great job with beanquery!

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/b5494278-ebe9-4033-918c-c2fa93060ebdn%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages