[ANN] beangrep - grep-like filter for Beancount


Stefano Zacchiroli

May 12, 2024, 8:35:00 AM
to bean...@googlegroups.com
Hello beancounters, I've just released a little tool that I've needed
for quite a while but didn't have[^]: beangrep, a grep-like filter for
Beancount ledgers.

[^]: with the exception of grep.py from beanlabs at
https://github.com/beancount/beanlabs/blob/master/beanlabs/scripts/grep.py
which is more limited than my needs.

It is meant to be way simpler than beanquery, but is super useful for
quick queries via the CLI.

Beangrep is available at: https://github.com/zacchiro/beangrep
Its README is also attached to this email.

The tool is almost feature complete for me, so it doesn't really have a
roadmap. But I welcome feedback and suggestions for improvements (or,
even better, patches!), that I'll be happy to consider. It is also not
uploaded to pypi yet, but if that's useful for others I'll be happy to
take care of that too.

Thanks for all the beans!
Cheers
--
Stefano Zacchiroli . za...@upsilon.cc . https://upsilon.cc/zack _. ^ ._
Full professor of Computer Science o o o \/|V|\/
Télécom Paris, Polytechnic Institute of Paris o o o </> <\>
Co-founder & CTO Software Heritage o o o o /\|^|/\
https://twitter.com/zacchiro . https://mastodon.xyz/@zacchiro '" V "'
README.md

Justus Pendleton

May 13, 2024, 10:36:15 PM
to Beancount
This is pretty great! I often need to find some previous transaction and don't remember where it is across multiple beancount files. I'll do a grep, which really just tells me which file(s) and line(s) to look at. Then I need to switch to an editor to actually see more context.

Anyway, a few thoughts from using it for a few minutes.

- Maybe make -s/--somewhere/--anywhere a flag-less default so you can use it more like regular grep? That is: bean-grep foo my.beancount is equivalent to bean-grep -s foo my.beancount
- A way to filter out transactions from closed accounts? Maybe even make that the default?

It feels like it might be nice to have it auto-infer the type of search. 2002-12-30 means --date. #tag means --tag. I think you could do it for --meta, too? But maybe there's no good way to handle --account, --payee, and --narration without just turning it into --anywhere.

Stefano Zacchiroli

May 14, 2024, 4:18:21 AM
to bean...@googlegroups.com
Thanks for your feedback Justus.

I've noted down your suggestions as issues here
https://github.com/zacchiro/beangrep/issues

The first one (making --somewhere a flag-less default) is something I've
played with already. I ended up not including it, because it was a bit
tricky to detect the "no criteria given" situation, but I'll give it
another stab.

The second one (auto-detect predicate type where possible) is a great
idea. A sweet spot needs to be found to avoid making the heuristic
overzealous (e.g., you want to detect 2024-05-14 and 2024-05 as dates, but
probably not 2734, which is more likely to be an amount), but I'll give
it a try.

Also remember: patches welcome ;-))

Meanwhile, I've uploaded beangrep to pypi, so it can now be installed
with a simple "pip install beangrep".

Cheers

Stefano Zacchiroli

May 15, 2024, 12:27:58 PM
to bean...@googlegroups.com
On Mon, May 13, 2024 at 07:36:15PM -0700, Justus Pendleton wrote:
> - Maybe make -s/--somewhere/--anywhere a flag-less default so you can use
> it more like regular grep? That is: bean-grep foo my.beancount is
> equivalent to bean-grep -s foo my.beancount
> - A way to filter out transactions from closed accounts? Maybe even make
> that the default?
>
> It feels like it might be nice to have it auto-infer the type of search.
> 2002-12-30 means --date. #tag means --tag. I think you could do it for
> --meta, too? But maybe there's no good way to handle --account, --payee,
> and --narration without just turning it into --anywhere.

Both done in version 0.3.0. They provide a much nicer user experience,
thanks for the idea! In terms of what is auto-detectable, I stopped at:
(full!) dates, #tag, ^link, and key:val for metadata. I don't think one
can go much further than that without usability issues, but shout if you
have other ideas.
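
The auto-detection rules listed above (full dates, #tag, ^link, key:val) can be sketched roughly as follows. This is only an illustration, not beangrep's actual code: the function name `infer_criterion`, the returned labels, and the exact regular expressions are all assumptions made for the example.

```python
import re
from datetime import date


def infer_criterion(pattern: str) -> tuple[str, str]:
    """Guess which kind of search a bare pattern denotes (hypothetical helper)."""
    # Only full ISO dates qualify: "2024-05" or "2734" deliberately fall through.
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", pattern):
        try:
            date.fromisoformat(pattern)
            return ("date", pattern)
        except ValueError:
            pass  # e.g. "2024-13-40" has the right shape but is not a real date
    if pattern.startswith("#") and len(pattern) > 1:
        return ("tag", pattern[1:])
    if pattern.startswith("^") and len(pattern) > 1:
        return ("link", pattern[1:])
    # key:val metadata. Assumption: keys start with a lowercase letter, so an
    # account name such as "Assets:Bank" is not mistaken for metadata.
    if re.fullmatch(r"[a-z][A-Za-z0-9_-]*:.+", pattern):
        return ("meta", pattern)
    # Everything else is treated as a "match anywhere" pattern.
    return ("somewhere", pattern)


for example in ["2024-05-14", "2734", "#trip", "^invoice-42", "category:food", "Assets:Bank"]:
    print(example, "->", infer_criterion(example))
```

One trade-off of such a heuristic: a pattern starting with `^` could also be meant as a regular-expression anchor, so an explicit flag remains useful as an escape hatch.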

Martin Michlmayr

May 16, 2024, 2:50:13 AM
to bean...@googlegroups.com
* Stefano Zacchiroli <za...@upsilon.cc> [2024-05-12 14:34]:
> Hello beancounters, I've just released a little tool that I've needed
> for quite a while but didn't have[^]: beangrep, a grep-like filter for
> Beancount ledgers.

I didn't even know that I needed this tool, but I absolutely do!

I often look up old transactions and beangrep makes this much easier.
Thank you!

--
Martin Michlmayr
https://www.cyrius.com/

Daniele Nicolodi

May 20, 2024, 5:22:37 PM
to bean...@googlegroups.com
Hello Stefano,

thank you for sharing this tool. It looks very useful.

From the perspective of the maintainer of bean-query, I wonder whether
a tool like this could have been implemented as a front-end that
interprets the command line options and translates them into a query for
bean-query in the form 'PRINT FROM ...'

If you thought about this approach, I would like to know which
shortcomings of bean-query prevented implementing bean-grep this way.
Some extensions to the PRINT statement were discussed here
https://github.com/beancount/beanquery/issues/123

Cheers,
Dan

Stefano Zacchiroli

May 21, 2024, 3:34:34 AM
to bean...@googlegroups.com
On Mon, May 20, 2024 at 11:22:33PM +0200, Daniele Nicolodi wrote:
> From the perspective of the maintainer of bean-query, I wonder whether a
> tool like this could have been implemented as a front-end that interprets
> the command line options and translates them into a query for bean-query in
> the form 'PRINT FROM ...'
>
> If you thought about this approach, I would like to know which shortcomings
> of bean-query didn't allow to implement bean-grep this way.

Probably a disappointment to you (sorry!), but I didn't consider that
approach. I'm familiar with the Python API of Beancount, so that was my
go-to design choice. Hence I don't think I have useful feedback to give
you on the applicability of the approach you propose.

Just a gut feeling, though: if the translation you suggest requires
generating queries as textual strings, that would make me feel itchy,
due to the usual SQL-style problems of generating invalid syntax,
possibly involuntary SQL-injections, etc. If OTOH there is an abstract
(AST-based?) API to do the same, it would be less of a problem.
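
A minimal sketch of the concern, for illustration only: the helper `naive_query` is hypothetical, and the column name `narration` and the `~` match operator are assumptions about the query language rather than something confirmed in this thread. No query is executed here; the code only shows the text that string splicing would produce.

```python
def naive_query(term: str) -> str:
    # Splices user input straight into the query text: the risky pattern.
    return f"SELECT date, narration WHERE narration ~ '{term}'"


# A harmless search term yields a well-formed string literal.
print(naive_query("coffee"))

# An apostrophe in the term closes the literal early, leaving invalid syntax.
print(naive_query("Joe's Diner"))

# A crafted term changes the structure of the condition itself.
print(naive_query("x' OR 'a' = 'a"))
```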

Daniele Nicolodi

May 21, 2024, 9:57:46 AM
to bean...@googlegroups.com
On 21/05/24 09:34, Stefano Zacchiroli wrote:
> On Mon, May 20, 2024 at 11:22:33PM +0200, Daniele Nicolodi wrote:
>> From the perspective of the maintainer of bean-query, I wonder whether a
>> tool like this could have been implemented as a front-end that interprets
>> the command line options and translates them into a query for bean-query in
>> the form 'PRINT FROM ...'
>>
>> If you thought about this approach, I would like to know which shortcomings
>> of bean-query didn't allow to implement bean-grep this way.
>
> Probably a disappointment to you (sorry!), but I didn't consider that
> approach. I'm familiar with the Python API of Beancount, so that was my
> go-to design choice. Hence I don't think I've useful feedback to give
> you on the applicability of the approach you propose.
>
> Just a gut feeling, though: if the translation you suggest requires
> generating queries as textual strings, that would make me feel itchy,
> due to the usual SQL-style problems of generating invalid syntax,
> possibly involuntary SQL-injections, etc. If OTOH there is an abstract
> (AST-based?) API to do the same, it would be less of a problem.

beanquery parses a query into an AST representation which is then
"compiled" into a tree of evaluator nodes (for lack of a better name,
if someone has a better idea of how these should be called, please let
me know) that are then executed. An example of both can be obtained by
running the '.explain' command in the shell:

beanquery> .explain select date + 1 from #postings
parsed statement
----------------
(select
  targets: (
    (target
      expression: (add
        left: (column
          name: 'date')
        right: (constant
          value: 1))))
  from-clause: (table
    name: 'postings'))

compiled query
--------------
EvalQuery(table=<beanquery.query_env.PostingsTable object at
0x1064cdb50>, c_targets=[EvalTarget(c_expr=Add[date,int](date(<class
'datetime.date'>), EvalConstant(1)), name='date + 1',
is_aggregate=False)], c_where=None, group_indexes=None,
having_index=None, order_spec=None, limit=None, distinct=None)

The parser AST is displayed in an s-expression-like format, inspired by
the one produced by tree-sitter, but it is just Python classes:

>>> beanquery.parser.parse('SELECT 1+1 FROM #')
Select(
    targets=[
        Target(
            expression=Add(
                left=Constant(value=1),
                right=Constant(value=1)),
            name=None)
    ],
    from_clause=Table(name=''),
    where_clause=None,
    group_by=None,
    order_by=None,
    pivot_by=None,
    limit=None,
    distinct=None
)

(reformatted for readability)

The public API exposes the possibility to directly pass the parser AST.
The evaluator nodes are not public API, and writing the evaluation nodes
by hand becomes tedious for anything non-trivial.

On the other hand, beanquery exposes a DB-API 2.0 compatible API with
parameter placeholders and parameter substitution:

>>> from datetime import date
>>> import beanquery
>>> conn = beanquery.connect('beancount:tests/test01.beancount')
>>> curs = conn.execute(
...     'SELECT date WHERE date > %s',
...     (date.today(),)
... )
>>> curs.fetchall()
[]
>>> curs = conn.execute(
...     'SELECT date WHERE date > %(today)s',
...     {'today': date.today()})
>>> curs.fetchall()
[]

Query parameters are not interpolated in the query, but the AST has
direct support for them:

>>> beanquery.parser.parse(
...     'SELECT date FROM #postings WHERE date > %(today)s')
Select(
    targets=[
        Target(
            expression=Column(name='date'), name=None)
    ],
    from_clause=Table(name='postings'),
    where_clause=Greater(
        left=Column(name='date'),
        right=Placeholder(name='today')
              ^^^^^^^^^^^^^^^^^^^^^^^^^
    ),
    group_by=None,
    order_by=None,
    pivot_by=None,
    limit=None,
    distinct=None
)
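
As a rough sketch of how a grep-like front-end could build on named placeholders, the helper below turns command-line style criteria into a query string plus a parameter dictionary, so user input never becomes part of the query text. Everything in it is hypothetical: the `build_query` name, the `CONDITIONS` mapping, the selected columns, and the use of `~` as a match operator are assumptions, and whether every operator accepts a placeholder operand has not been checked.

```python
from datetime import date

# Hypothetical mapping from grep-style options to query conditions.
# Column names and the "~" operator are assumptions for illustration.
CONDITIONS = {
    "payee": "payee ~ %(payee)s",
    "narration": "narration ~ %(narration)s",
    "account": "account ~ %(account)s",
    "date": "date = %(date)s",
}


def build_query(**criteria):
    """Return (query, params) with user values kept out of the query text."""
    unknown = set(criteria) - set(CONDITIONS)
    if unknown:
        raise ValueError(f"unsupported criteria: {sorted(unknown)}")
    query = "SELECT date, payee, narration, account"
    # Only the fixed condition templates are joined into the query string.
    where = " AND ".join(CONDITIONS[key] for key in criteria)
    if where:
        query += " WHERE " + where
    return query, dict(criteria)


query, params = build_query(payee="Joe's Diner", date=date(2024, 5, 14))
print(query)
print(params)
```

The resulting pair would then be passed along the lines of `conn.execute(query, params)`, following the named-placeholder form shown earlier in this message.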


Some day I need to document all this...

Cheers,
Dan
