autobean.refactor: an ergonomic and lossless beancount editing library


Archimedes Smith

Oct 5, 2022, 5:39:31 PM
to Beancount
Hi beancounters,

One thing I have long wanted in the beancount ecosystem is tooling to programmatically edit my ledger. For example:
  • Formatting
    • Sort ledger entries, without losing formatting, comments, or pushtag.
    • Sort postings / meta.
    • Format non-transactions and details (e.g. inside cost) that bean-format doesn't do.
    • Enforce formatting rules.
  • Refactoring
    • Rearrange my accounts based on narration / tags / comments.
    • Optimize existing ledgers with plugins.
    • Ease plugin adoption with an automatic onboarding script that inserts the "plugin" directive and optimizes the existing ledger.
    • Automatically migrate from v2 to v3 syntax.
    • Automatically migrate from / to other bookkeeping systems.
  • Importing
    • Generate postings with total price (@@), which the official models don't support.
    • Let importers insert transactions directly to the right file, at the right line.
    • Let importers find and augment existing transactions.
      • A receipt OCR importer may find the transaction previously imported from the bank and add postings about what I bought.
    • Automatically generate refund transaction in an editor extension.
There are some existing ways to edit a ledger:
  • Manual editing always works, but takes time.
  • Plugins are great, but sometimes I want to land the changes in the file.
  • beancount.loader applies plugins, but drops spacing, comments, and directives such as pushtag.
  • beancount.parser doesn't apply plugins, and also drops spacing, comments, and directives.
As a solution to all of the above use cases, I've come up with autobean.refactor, a pure Python library providing an ergonomic yet powerful interface for editing beancount files. It's now about 80% complete, so I'm sharing it here, though there are still some missing pieces (coming soon):
  • Documentation
  • Performance improvements
  • Support for out-of-line tags / links in transaction
You can play with it here: https://replit.com/@SEIAROTg/autobeanrefactor-example#main.py or check out more examples in tests.

Anyone interested in this project? Any bug reports / suggestions would be appreciated.

Regards,
SEIAROTg

Ben Blount

Oct 5, 2022, 10:59:43 PM
to Beancount
This is great, thanks for writing it! 


Red S

Oct 6, 2022, 4:20:58 AM
to Beancount
Very cool project! I've run into many/most of the use cases you mentioned.

Question: would there be a set of editing guarantees that autobean.refactor provides? Examples could be:
- every input line that is not modified by a plugin will appear exactly once in the output
- no modifications within a line will occur (other than those made by plugins)
- etc.

Also: what do I do after cloning your repo to run a hello world example?

Thanks again for sharing, and look forward to seeing this project grow and mature!

Stefano Zacchiroli

Oct 6, 2022, 6:49:46 AM
to bean...@googlegroups.com
On Wed, Oct 05, 2022 at 02:39:31PM -0700, Archimedes Smith wrote:
> Anyone interested in this project? Any bug reports / suggestions would be
> appreciated.

This is an amazing project. I wanted to have something similar for a
very long time. Now, in terms of suggestions/questions:

- I had already bookmarked your repo in the past (due to other cool
stuff you have in there). In spite of that, I'd never have found this
project in there without this announcement email of yours. Would you
consider splitting autobean.refactor out into its own project? It'd
help a lot with visibility.

- I could really use some API reference documentation for the library,
as the tests are hard to navigate when trying to understand how the
manipulation logic works.

- My initial UX idea for something like this was more like a user
language connected to some CLI driver (in the style of sed/awk/perl,
but with a DSL for Beancount). Is that something you've considered? Of
course, if it will ever come to be, it will be "just" an additional
layer on top of the manipulation library that you've now.

Here are some of my frequent use cases for these manipulations, in case
it helps you double-check whether they are already supported or not:

- Rename accounts in batch, without messing up indentation. Bonus point:
detect that the rename is not creating clashes (unless explicitly
requested).

- Change the alignment column of amounts (purely syntactical change).

- Move metadata "up", from postings to the main transactions or
vice-versa (move them "down").

- Select transactions by some predicate (e.g., they have a given
narration or metadata) and perform in batch modifications to selected
ones (e.g., add/remove a metadata, set the narration to something,
etc.)

- Ensure consistent ordering in metadata postings (e.g., "bank-label" is
always first, followed by "author", then "card", etc.).

Hope this helps,
Cheers
--
Stefano Zacchiroli . za...@upsilon.cc . upsilon.cc/zack
Full professor of Computer Science
Télécom Paris, Polytechnic Institute of Paris
Co-founder & CTO Software Heritage
Former Debian Project Leader & OSI Board Director

Archimedes Smith

Oct 6, 2022, 8:09:57 AM
to Beancount
@Red

I'm advertising it as "lossless" editing, with strong guarantees that:
  • With no editing, parsing + printing will output exactly the same contents.
  • When updating something (e.g. account in a posting), no character outside that object may be modified.
  • When inserting / removing something (e.g. new transaction), no character outside that object or its surrounding spacing may be modified.
  • When moving something (e.g. reordering transaction), no character except the surrounding spacing may be modified.
I'll also include that in the doc I'm writing.
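
To make the first two guarantees concrete, here is a minimal sketch of the idea using only the Python standard library. It is not autobean.refactor's API or implementation: the toy tokenizer and the names `tokenize`, `render`, and `replace_token` are assumptions made up for illustration. The point is that if every input character lands in exactly one token, printing without edits reproduces the input, and updating one token cannot touch characters outside it.

```python
import re

# Toy tokenizer (illustrative only, not the library's grammar): whitespace,
# comments, quoted strings, and everything else each become tokens. The final
# alternative catches a stray quote so that no character is ever dropped.
TOKEN_RE = re.compile(r'\s+|;[^\n]*|"[^"\n]*"|[^\s;"]+|"')


def tokenize(text: str) -> list[str]:
    """Split text into tokens that together cover every character."""
    return TOKEN_RE.findall(text)


def render(tokens: list[str]) -> str:
    """Print tokens back out; with no edits this is the original text."""
    return "".join(tokens)


def replace_token(tokens: list[str], old: str, new: str) -> list[str]:
    """Update matching tokens only; spacing and comment tokens are untouched."""
    return [new if token == old else token for token in tokens]


ledger = (
    '2022-10-05 * "Tea shop"   ; keep this comment\n'
    '  Expenses:Food      4.50 USD\n'
    '  Assets:Cash\n'
)

tokens = tokenize(ledger)
# Guarantee 1: parsing + printing with no editing is the identity.
assert render(tokens) == ledger

# Guarantee 2: updating the account changes nothing outside that token.
edited = render(replace_token(tokens, "Expenses:Food", "Expenses:Food:Tea"))
print(edited)
```

Because a comment is a single token here, an account name mentioned inside a comment would be left alone, which is one practical difference from a plain text substitution.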

> What do I do after cloning your repo to run a hello world example?

The replit example should work out of the box. If you are cloning the repo you'll need:
  • The example main.py from replit (it's not in the repo)
  • Python 3.10+ (I used some new typing features)
  • Lark (pip install lark)
  • PYTHONPATH covering the repo
@Stefano

> split the project
Certainly a good idea, considering this has fairly different dependencies and Python version requirements. One thing that stopped me from doing so is that github.com/autobean has been claimed by someone else. I'll reconsider this after finalizing the development.

> API reference doc
Hopefully the replit example helps a bit. The full reference should come in a few weeks.

> CLI interface
I think this library is more powerful than what command-line expressiveness allows, which is why it's a library for now. Having a CLI for simple / common cases is certainly a good idea, and we can always create it at a higher level as you said. I'm happy for some such CLI commands to be included in this project, but I don't have a concrete plan right now. Contributions are always welcome.

> Use case coverage check
I can confirm all the use cases you listed are supported. Note that this is a relatively low level AST manipulation library and thus:
  • There is no dedicated interface for column alignment but you can do that by setting the number of spaces based on token positions.
  • There is no dedicated interface for clash detection or order enforcement, but you'll get a lot more information than beancount.loader (+spacing, +comments, +directives, +ordering, ...) so they are all doable by using this library.
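
As a rough illustration of "setting the number of spaces based on token positions", here is a standard-library sketch that works on plain text lines rather than on the library's AST. The function name `align_amounts`, the regex, and the choice to right-align the number so it ends at a given column are all assumptions for this example. It only handles simple postings (no posting flags, no amount-less postings).

```python
import re

# Simplified posting shape (an assumption): indent, account, gap, number, rest.
POSTING_RE = re.compile(
    r"^(?P<indent>[ \t]+)(?P<account>[A-Z][\w:-]*)(?P<gap>[ \t]+)"
    r"(?P<number>-?[\d.,]+)(?P<rest>.*)$"
)


def align_amounts(text: str, column: int) -> str:
    """Rewrite only the gap between account and number on posting lines,
    so that the number's last character ends at `column`."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\n")
        newline = line[len(body):]
        match = POSTING_RE.match(body)
        if match:
            head = match["indent"] + match["account"]
            # Always keep at least one space, even if the account is long.
            width = max(1, column - len(head) - len(match["number"]))
            body = head + " " * width + match["number"] + match["rest"]
        out.append(body + newline)
    return "".join(out)


example = (
    '2022-10-05 * "Tea"\n'
    "  Expenses:Food  4.50 USD\n"
    "  Assets:Cash   -4.50 USD ; paid\n"
)
print(align_amounts(example, 30))
```

Only the spacing between the account and the number changes; the transaction line, the currency, and the trailing comment are passed through as they were.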

Red S

Oct 7, 2022, 2:46:49 PM
to Beancount
> I'm advertising it as "lossless" editing with strong guarantee that:

Great!

>> What do I do after cloning your repo to run a hello world example?
>
> The replit example should work out of the box. If you are cloning the repo you'll need:
>   • The example main.py from replit (it's not in the repo)
>   • Python 3.10+ (I used some new typing features)
>   • Lark (pip install lark)
>   • PYTHONPATH covering the repo

I've got all the other bullet points above covered except for the first one. "git grep replit" in your repo didn't turn up anything. I'm not familiar with replit, and a quick (10 second) search didn't bring up much. If you could please give me a pointer here, that'd be great. Thank you!

Red S

Oct 10, 2022, 3:19:50 AM
to Beancount
> I've got all the other bullet points above covered except for the first one. "git grep replit" in your repo didn't turn up anything. I'm not familiar with replit, and a quick (10 second) search didn't bring up much. If you could please give me a pointer here, that'd be great. Thank you!

For anyone following along, OP pointed me to the link in the original post that I missed (thank you!).

Red S

Oct 16, 2022, 12:35:23 AM
to Beancount
For anyone like myself on Ubuntu 20.04 and needing to install everything to get this working:

$ lsb_release -a  # my host for reference
No LSB modules are available.
Distributor ID:    Ubuntu
Description:    Ubuntu 20.04.5 LTS
Release:    20.04
Codename:    focal


sudo apt update && sudo apt upgrade -y
sudo apt install software-properties-common -y

# Background on the deadsnakes PPA:
# https://askubuntu.com/questions/1398568/installing-python-who-is-deadsnakes-and-why-should-i-trust-them
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.10 python3.10-distutils python3.10-dev -y
wget https://bootstrap.pypa.io/get-pip.py
python3.10 get-pip.py
# Lark is a required dependency (see the list earlier in the thread)
python3.10 -m pip install lark
# Assumes the repo was cloned into ./autobean under the current directory
export PYTHONPATH="$(pwd)/autobean"

# Download main.py from https://replit.com/@SEIAROTg/autobeanrefactor-example#main.py
python3.10 main.py


Red S

Oct 16, 2022, 12:56:24 AM
to Beancount
I played around with it for just a few minutes. Very nicely done, OP!

I'm wondering what is next in line for developing this. Perhaps I'm missing it, but the main thing I see is developing high level functions for the functionality you described in your original post, and others like Stefano brought up. For example, a high-level rename_account(old_name, new_name) function. Curious what your thoughts are.

Archimedes Smith

Oct 19, 2022, 3:22:05 PM
to Beancount
Glad it feels useful to you.

I'd like to limit the scope of this project to AST manipulation, and the things currently in my plan are:
  • Performance improvements. Most operations currently take O(number of tokens) time unnecessarily.
  • Stabilize the interface (e.g. simpler interface to get token position, print model, parse files recursively following `include`, ...)
  • Documentation
  • Support of out-of-line tags / links.
I'll probably build higher level stuff in separate projects afterwards but I don't have a concrete plan yet (the first one might be a formatter that also supports sorting). Also feel free to build your own stuff!

Regarding `rename_account(old_name, new_name)`, I personally feel it is a simple task where regex substitution does better than this library. This library would be more useful for slightly more complex cases like "I'd like to split all my expenses on tea, as marked by a posting meta, into a new subaccount", but those use cases tend to be very case-specific and difficult to generalize. If you have some ideas about what people might commonly need, feel free to build higher level tools on top of it!
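
For reference, a regex-based rename along those lines could look like the sketch below. This is plain standard-library Python, not part of autobean.refactor, and the boundary rules are an assumption: the old name must not be preceded by another account character, and it may be followed by ":" so that subaccounts are renamed along with the parent.

```python
import re


def rename_account(text: str, old_name: str, new_name: str) -> str:
    """Rename an account and its subaccounts with a regex substitution."""
    pattern = re.compile(
        r"(?<![\w:-])"          # not in the middle of a longer account name
        + re.escape(old_name)
        + r"(?![\w-])"          # name must end here; a following ':' is allowed
    )
    # A function replacement avoids backslash handling in new_name.
    return pattern.sub(lambda match: new_name, text)


ledger = (
    "2022-10-05 open Expenses:Food\n"
    "2022-10-05 open Expenses:Foodstuff\n"
    '2022-10-06 * "Tea"\n'
    "  Expenses:Food:Tea    4.50 USD\n"
    "  Assets:Cash\n"
)
print(rename_account(ledger, "Expenses:Food", "Expenses:Dining"))
```

Note that a text-level substitution like this also rewrites matches inside comments and narration strings, and it does not check whether the new name clashes with an existing account.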

Red S

Oct 19, 2022, 9:48:57 PM
to Beancount
Thanks for that clarity. That scope makes perfect sense. I'll build what I need and share, and hope others do the same :).

Agree with rename: I was thinking of that as a "hello world" starter high-level function, but there are probably more meaningful things. For example, to "bake in" plugin-transformed transactions. Eg: zerosum. I do have per-account files, and need to first find a good way to write back to the same location a transaction was read from.

Thanks, and I look forward to the updates!