Beancount to DuckDB


Martin Blais

Apr 18, 2026, 4:26:28 PM
to Daniele Nicolodi, Beancount
I've been wanting to do something like this for at least ten years... so much fun today.
I hacked together a quick prototype of exporting beancount to duckdb.

DuckDB is great because it has:
- user-defined struct types
- list types
- custom UDFs
- custom aggregators
You can define all these things in either C++ or Python (I used Python).

I used Gemini 3 to create code that would ingest to a local duckdb and make queries similar to beanquery, with an inventory aggregator and custom formatter output. The output looks close to beanquery's output!  I'm amazed.  (NOTE: I haven't read the source code much yet.)

Here are some example queries I had it make up:

@Daniele Nicolodi - check this out. 
We'd have to look at the details of course - or have the AI do it for us - but these couple of Python files could potentially replace beanquery.


Teck Pei Ting (Xeraph)

Apr 20, 2026, 10:42:47 PM
to Beancount
Could this be a solution for folks who are having issues with long reload times? See https://groups.google.com/g/beancount/c/hhmwVu4py3M

I am thinking of some sort of "Archive" function where the parts of the journal files that do not require changes are stored in a database. Would this cut down parsing time, say by going from 50,000 transactions in file form to 45,000 transactions in the database plus 5,000 transactions in files?

Martin Blais

Apr 20, 2026, 11:03:11 PM
to Beancount
Not really. Any transaction can have a downstream effect on any other one, because of inventory booking rules applied against accumulated inventory state and because of plugin-specific behaviors, so *in general* partial caching is not possible. I'm sure we could think of ways to partition the transactions into disjoint subsets and make reasonable assumptions about the locality of plugin behavior, but that's a separate idea, and fwiw I've never tried it, because the obvious solution is a faster implementation, which should be fast enough.
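A minimal sketch of the booking dependency described above, assuming a simplified FIFO model. This is not Beancount's actual booking code, and `fifo_sale_cost`, the lot sizes, and the prices are all hypothetical. It shows only that the cost basis of a recent sale changes when earlier lots are left out, which is why later transactions cannot be booked in isolation from earlier ones.

```python
from collections import deque
from decimal import Decimal


def fifo_sale_cost(buys, units_sold):
    """Return the cost basis of selling `units_sold`, consuming the oldest lots first.

    `buys` is a list of (units, cost_per_unit) pairs in chronological order.
    Simplified illustration only; real booking handles many more cases.
    """
    lots = deque((Decimal(units), Decimal(cost)) for units, cost in buys)
    remaining = Decimal(units_sold)
    total = Decimal(0)
    while remaining > 0:
        if not lots:
            raise ValueError("sale exceeds units held")
        units, cost = lots.popleft()
        taken = min(units, remaining)
        total += taken * cost
        remaining -= taken
    return total


# Hypothetical data: one old ("archived") lot and one recent lot.
archived_buys = [(10, "100")]
recent_buys = [(10, "120")]

# Booking the sale against the full history uses the old, cheaper lot.
with_history = fifo_sale_cost(archived_buys + recent_buys, 5)

# Booking the same sale against only the recent part uses the newer lot.
without_history = fifo_sale_cost(recent_buys, 5)

print(with_history, without_history)
```

The two results differ, so the recent portion of a ledger cannot simply be processed on its own. Plugins add further dependencies that this sketch does not model.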

The duckdb prototype is basically a POC showing that we could possibly replace Beanquery by extending a DB engine that already exists. I feel like Beanquery is about two things: supporting custom tables for reporting (e.g. open, close) and supporting inventory aggregation. The former can be implemented with a special table specification statement and the latter with struct types and a custom aggregator. This demo shows that the latter is feasible. An extensible SQL engine is hard to build and maintain, and I'm showing here that with just a few tricks we can leverage a really great one (duckdb).
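To make the "custom aggregator" idea concrete, here is a rough sketch of an inventory-style aggregate registered with an existing SQL engine. It is not the prototype's code. To stay runnable with only the standard library it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the `postings` table, the `inventory` function name, and the sample rows are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict
from decimal import Decimal


class InventorySum:
    """Aggregate (number, currency) pairs into one total per currency."""

    def __init__(self):
        self.totals = defaultdict(Decimal)  # missing currencies start at 0

    def step(self, number, currency):
        # Called once per row in the group.
        self.totals[currency] += Decimal(number)

    def finalize(self):
        # SQLite has no struct type, so the inventory is rendered as text here.
        return ", ".join(f"{n} {c}" for c, n in sorted(self.totals.items()))


con = sqlite3.connect(":memory:")
con.create_aggregate("inventory", 2, InventorySum)

# Hypothetical, flattened postings table; numbers are stored as text
# so that Decimal sees the exact digits.
con.execute("CREATE TABLE postings (account TEXT, number TEXT, currency TEXT)")
con.executemany(
    "INSERT INTO postings VALUES (?, ?, ?)",
    [
        ("Assets:Broker", "10", "HOOL"),
        ("Assets:Broker", "-3", "HOOL"),
        ("Assets:Broker", "250.00", "USD"),
        ("Assets:Cash", "40.50", "USD"),
    ],
)

rows = con.execute(
    "SELECT account, inventory(number, currency) FROM postings "
    "GROUP BY account ORDER BY account"
).fetchall()
for account, inv in rows:
    print(account, inv)
```

In an engine with struct and list types, the aggregate could presumably return a structured value instead of a string; the text rendering here is only a limitation of the stand-in.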

About a faster parser: I've been playing with AI models and I have a fun new idea: translate beancount's core and tests into a spec and test cases that could be used to prompt a properly validated generation of the same core concepts in any language. I think this is possible, or soon will be, with the latest frontier models. Beancount could then become just a carefully written set of markdown documents plus a validation test suite complete enough to be unambiguous, with various implementations (roll your own at will). If this works, it could point to an interesting future for the development and specification of software: a description complete enough that the code itself becomes entirely unnecessary. Think: literate programming without the programming bits.

Also, I've toyed around with AI-directed translation of some core parts of the Python implementation to C (parser + inventory). Definitely doable, and a more pragmatic way to speed up the current implementation incrementally. These are oddball ideas though; they need more work.




Justus Pendleton

Apr 21, 2026, 9:55:56 PM
to Beancount
>  I have a fun new idea to translate beancount's core and tests into a spec and test cases that could be used to prompt a properly validated generation of the same core concepts to any language

limabean has done (some? all?) of this by translating the existing beancount tests into a beancount input file and matching protobuf expected output file. Theoretically that can be used by any language, I think?


Whether building on that or starting from scratch, I think it is a great idea. It is clear that the basic concepts of plain text accounting are reasonably popular but are sometimes held back by implementation details.

I thought rustledger did something similar, but looking again I don't think they have.

Martin Blais

Apr 22, 2026, 8:51:45 AM
to bean...@googlegroups.com
I'm thinking more of a per-module document with the test cases in it and commentary around them explaining why each test exists.
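One possible shape for such test cases, sketched under heavy assumptions: each case carries its input, its expected output, and a `why` field holding the commentary, and any implementation can be run against the list. The case format, the `toy_balance` function, and the one-posting-per-line input are all made up for illustration and are not part of Beancount or of any existing test suite.

```python
from decimal import Decimal


def toy_balance(text):
    """Toy implementation under test: sum '<number> <currency>' lines per currency."""
    totals = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        number, currency = line.split()
        totals[currency] = totals.get(currency, Decimal(0)) + Decimal(number)
    return {currency: str(total) for currency, total in totals.items()}


# Hypothetical spec cases: data only, so any language could consume them.
SPEC_CASES = [
    {
        "name": "amounts-sum-per-currency",
        "why": "Amounts in different currencies must never be added together.",
        "input": "10.00 USD\n-4.00 USD\n3 CAD\n",
        "expected": {"USD": "6.00", "CAD": "3"},
    },
]


def run_cases(implementation, cases):
    """Return the names of the cases whose output differs from the expectation."""
    return [c["name"] for c in cases if implementation(c["input"]) != c["expected"]]


failures = run_cases(toy_balance, SPEC_CASES)
print("failures:", failures)
```

Keeping the cases as plain data, separate from the harness, is what would let the same document validate implementations in other languages.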

