document organization: unifying statements with everything else


Stefano Zacchiroli

May 4, 2018, 2:42:11 AM
to bean...@googlegroups.com
Heya beancounters,

I'm struggling to organize my financial documents in a way that is
both consistent and supported by Beancount/Fava and I'd like to hear
from others what best practices you're using.

The main disconnect I notice is between statement documents and
everything else:

1) via the "documents" option beancount has great support for bank
   statements and similar documents. You just drop them using a name
   like documents/Assets/Bank/Checking/YYYY-MM-DD.pdf and they show up
   in your ledger. I want no more/less than this.

2) but I also want to store other documents and associate them to either
   transactions as a whole or even individual transaction postings.
   Examples are: receipts (for payments, donations, etc.), invoices,
   paychecks, etc.
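
To make the layout in (1) concrete, here is a minimal Python sketch of how a dated file under a documents root could be mapped to an account and a date. It is an illustration only, not Beancount's actual implementation: `scan_documents` and the regular expression are hypothetical, and the sketch assumes that file names begin with a YYYY-MM-DD date and that the directory path below the root spells out the account name.

```python
import datetime
import re
from pathlib import Path

# Assumption: a document's file name starts with an ISO date,
# e.g. 2018-05-04.pdf or 2018-05-04.statement.pdf.
DATE_PREFIX = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")


def scan_documents(root):
    """Yield (account, date, path) for every dated file below `root`.

    Hypothetical helper: the account name is derived from the directory
    path relative to `root`, e.g. Assets/Bank/Checking becomes
    Assets:Bank:Checking.
    """
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        match = DATE_PREFIX.match(path.name)
        if match is None:
            continue  # Not a dated document; ignore it.
        try:
            date = datetime.date(*map(int, match.groups()))
        except ValueError:
            continue  # Looks like a date but is not a valid one.
        account = ":".join(path.parent.relative_to(root).parts)
        if not account:
            continue  # Files directly under the root have no account.
        yield account, date, path
```

Calling `list(scan_documents("documents"))` on a tree like the one described in (1) would produce one tuple per statement file. Undated files are skipped rather than reported, which is a design choice of this sketch.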

As far as I can tell, Beancount itself has no direct support for (2);
please correct me if I'm wrong. You can drop them into the same
directory structure as for (1), but that doesn't associate them with
individual transactions or postings. Fava, OTOH, has the
"link_statements" plugin, which supports (2). Aside from a lack of
generality[^], it allows linking documents to transactions/legs.

[^]: https://github.com/beancount/fava/issues/740

But where do you store the documents for use case (2)? If you store them
in the same directory as for (1), there is a usage clash: documents
appear both as associated with transactions and in the journal flow (at
least conceptually, no matter what Fava actually does), which I find
annoying.

It seems to me we lack a Beancount *native* (as opposed to
Fava-supported) way of associating documents to either transactions or
postings. Or am I missing something here? Is this planned and/or being
discussed anywhere else?

Thanks in advance for your thoughts / comments on this.
Cheers
--
Stefano Zacchiroli . za...@upsilon.cc . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

Justus Pendleton

May 4, 2018, 5:24:05 AM
to Beancount
On Friday, May 4, 2018 at 1:42:11 PM UTC+7, Stefano Zacchiroli wrote:
> It seems to me we lack a Beancount *native* (as opposed to
> Fava-supported) way of associating documents to either transactions or
> postings. Or am I missing something here? Is this planned and/or being
> discussed anywhere else?

Isn't the link_statements plugin just a regular beancount plugin that can be used even if you don't use fava? What would be the difference between that plugin and a beancount native way? I'm not quite following what you want here; could you elaborate?

Martin Blais

May 4, 2018, 9:13:18 AM
to Beancount
On Fri, May 4, 2018 at 2:42 AM, Stefano Zacchiroli <za...@upsilon.cc> wrote:
> Heya beancounters,
>
> I'm struggling to organize my financial documents in a way that is
> both consistent and supported by Beancount/Fava and I'd like to hear
> from others what best practices you're using.
>
> The main disconnect I notice is between statement documents and
> everything else:
>
> 1) via the "documents" option beancount has great support for bank
>    statements and similar documents. You just drop them using a name
>    like documents/Assets/Bank/Checking/YYYY-MM-DD.pdf and they show up
>    in your ledger. I want no more/less than this.

In relational terms, this is a join of the list of accounts and the documents.
You obtain an association list of
  (account, document)
  ...
Where each document belongs to at most one account.


> 2) but I also want to store other documents and associate them to either
>    transactions as a whole or even individual transaction postings.
>    Examples are: receipts (for payments, donations, etc.), invoices,
>    paychecks, etc.

In relational terms, this is a join of the transactions and the documents.
You want to obtain an association list of 
  (transaction, document)
Where each document belongs to at most one transaction.
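
The two joins can be sketched in plain Python. This is only an illustration under assumptions: the `Document` and `Transaction` tuples are simplified stand-ins rather than Beancount's real data types, and the `document` metadata key is a hypothetical name.

```python
from collections import defaultdict
from typing import NamedTuple


class Document(NamedTuple):
    """Stand-in for a declared document (not Beancount's own type)."""
    account: str
    filename: str


class Transaction(NamedTuple):
    """Stand-in for a transaction carrying a metadata dict."""
    narration: str
    meta: dict


def join_accounts(accounts, documents):
    """Join (1): group documents by the account they are filed under."""
    by_account = defaultdict(list)
    for doc in documents:
        if doc.account in accounts:
            by_account[doc.account].append(doc)
    return dict(by_account)


def join_transactions(transactions, documents, key="document"):
    """Join (2): pair each transaction with the document its metadata names.

    Each document may be claimed by at most one transaction; a second
    claim raises an error.
    """
    by_name = {doc.filename: doc for doc in documents}
    claimed = set()
    pairs = []
    for txn in transactions:
        name = txn.meta.get(key)
        if name is None or name not in by_name:
            continue  # No link, or a link to an unknown document.
        if name in claimed:
            raise ValueError(f"{name} is linked from more than one transaction")
        claimed.add(name)
        pairs.append((txn, by_name[name]))
    return pairs
```

In this sketch the first join keys on the account name and the second on an exact file name; a partial match on the file name would be a variation of the second.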


> As far as I can tell, Beancount itself has no direct support for (2);
> please correct me if I'm wrong.

It supports neither. The web interface performs the first join implicitly by grouping all the directives by account and then rendering journals for any account.


> You can drop them into the same
> directory structure as for (1), but that doesn't associate them with
> individual transactions or postings. Fava, OTOH, has the
> "link_statements" plugin, which supports (2). Aside from a lack of
> generality[^], it allows linking documents to transactions/legs.
>
> [^]: https://github.com/beancount/fava/issues/740
>
> But where do you store the documents for use case (2)? If you store them
> in the same directory as for (1), there is a usage clash: documents
> appear both as associated with transactions and in the journal flow (at
> least conceptually, no matter what Fava actually does), which I find
> annoying.
>
> It seems to me we lack a Beancount *native* (as opposed to
> Fava-supported) way of associating documents to either transactions or
> postings. Or am I missing something here? Is this planned and/or being
> discussed anywhere else?

I could add clean APIs to perform either of these joins, given a particular meta-data field.
For the transactions/documents join, the match could be partial (e.g. unique substring on the document filenames).
Let me know.
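
As a rough illustration of the partial match, here is a small Python sketch that resolves a metadata value as a unique substring of the known document file names. The function name `match_unique` is hypothetical, and failing loudly on an ambiguous or empty match is an assumption about the desired behaviour.

```python
def match_unique(fragment, filenames):
    """Return the single file name that contains `fragment`.

    Raises LookupError when no file name matches or when more than one
    does, since the substring is meant to identify exactly one document.
    """
    matches = [name for name in filenames if fragment in name]
    if len(matches) != 1:
        raise LookupError(
            f"{fragment!r} matched {len(matches)} documents, expected exactly 1"
        )
    return matches[0]
```

With file names such as `2018-05-04.grocery-receipt.pdf` and `2018-05-04.statement.pdf`, the fragment `grocery` would resolve to the first one, while `2018-05-04` would be rejected as ambiguous.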

Also, how do you need to query this?
What do you want to produce?


 


--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+unsubscribe@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/20180504064209.GI11188%40upsilon.cc.
For more options, visit https://groups.google.com/d/optout.

Stefano Zacchiroli

May 5, 2018, 4:07:20 AM
to bean...@googlegroups.com
Heya,

On Fri, May 04, 2018 at 02:24:04AM -0700, Justus Pendleton wrote:
> Isn't the link_statements plugin just a regular beancount plugin that
> can be used even if you don't use fava? What would be the difference
> between that plugin and a beancount native way? I'm not quite
> following what you want here; could you elaborate?

Sure. So, I see the following issues with the link_statements plugin:

- it is not distributed with Beancount, so while the documents option is
  built-in (hinting that it is the official way of linking documents to
  txns), link_statements comes across as a sub-par / unsupported option

  -> this would be easy to solve if Martin is OK with distributing it as
     part of Beancount, but I think we need a more general discussion
     about the requirements for linking documents to different elements
     of the Beancount data model, which is what I'm trying (probably
     badly, sorry about that) to start here

- it is not an alternative to document entries. It requires you to have
  free-floating document entries in your global ledger and then
  heuristically associate them to transactions/postings

  -> this feels awkward. A receipt from the grocery store does not belong
     to the daily flow of transactions happening that day. It really
     belongs to a transaction. This smells as if we need a more
     expressive way of informing Beancount of which metadata keys point
     to documents, or something like that

- it is not generic (not all documents are "statements")

  -> this is a minor point, which looks like it will be fixed soon

Hope this clarifies,

Stefano Zacchiroli

May 5, 2018, 4:18:04 AM
to bean...@googlegroups.com
On Fri, May 04, 2018 at 09:12:55AM -0400, Martin Blais wrote:
> > 1) via the "documents" option beancount has great support for bank
> In relational terms, this is a join of the list of accounts and the
> documents.
[...]
> 2) but I also want to store other documents and associate them to either
> > transactions as a whole or even individual transaction postings.
> > Examples are: receipts (for payments, donations, etc.), invoices,
> > paychecks, etc.
> In relational terms, this is a join of the transactions and the documents.
> You want to obtain an association list of
> (transaction, document)
> Where each document belongs to at most one transaction.

Ack on (1).

On (2) you also need to support associating documents to individual
postings --- you can do it in various ways in a relational model, I'm
not sure which one you prefer.

> > As far as I can tell Beancount itself has no direct support for (2),
> > please correct me if I'm wrong.
>
> It supports neither. The web interface performs the first join implicitly
> by grouping all the directives by account and then rendering journals for
> any account.

Well, no. The fact that the "documents" option exists makes Beancount de
facto support use case (1). You can just drop documents in dirs, and
they will show up in the transaction flow. I'm wondering whether we can
have something similar for associating documents to individual
transactions / postings. I'm not clear on how/if the data model can
support that.

What is suboptimal right now is that people using link_statements end up
with document entries in the global ledger flow for something that is
txn/posting-specific.

> I could add clean APIs to perform either of these joins, given a particular
> meta-data field.
> For the transactions/documents join, the match could be partial (e.g.
> unique substring on the document filenames).
> Let me know.
>
> Also, how do you need to query this?
> What do you want to produce?

Ideally, I'd like to:

1) "type" (in the sense of type systems) metadata entries, letting
   Beancount know that a specific metadata key should point to a
   document (via a URI, or a path, I don't particularly care). This
   will already allow a number of nice checks:

   - return an error if the link is dangling

   - query the txn and ask: do you link to any document? <- this will in
     turn allow fava to render document links associated to txn /
     postings even if there are no matching document entries in the
     global ledger flow

2) understand where to put the actual documents on disk, in a way that
   doesn't get in the way of the "documents" option. This looks
   complicated because:

   - on the one hand I want a single dir hierarchy where to put
     documents, that works for both global documents (that should appear
     in the global ledger flow) and transaction-specific documents

   - on the other hand I don't want to have document entries generated
     for transaction-specific documents

   And the two look incompatible.

   I guess that if we inform Beancount of which metadata entries point
   to documents, it will be possible for the implementation of the
   "documents" option to exclude them and not generate matching
   "document" entries. But I'm not sure if you'd consider this a hack or
   not.
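
The dangling-link check from point 1) could look roughly like the Python sketch below. It is hedged throughout: entries are modelled as plain dicts with a "meta" dict rather than as Beancount directives, `find_dangling_links` is a hypothetical name, and the list of "document-typed" metadata keys is passed in explicitly because Beancount has no such declaration today.

```python
from pathlib import Path


def find_dangling_links(entries, root, document_keys=("document",)):
    """Return (entry, key, value) triples whose target file is missing.

    Assumptions: each entry is a dict with an optional "meta" dict (a
    stand-in for a Beancount directive), and `document_keys` lists the
    metadata keys declared to point at documents, given as paths
    relative to `root`.
    """
    root = Path(root)
    dangling = []
    for entry in entries:
        meta = entry.get("meta", {})
        for key in document_keys:
            value = meta.get(key)
            if value is not None and not (root / value).is_file():
                dangling.append((entry, key, value))
    return dangling
```

An empty result means every typed metadata link resolves to a file on disk; each returned triple could be turned into a loader error.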

Martin Blais

May 6, 2018, 12:34:41 PM
to Beancount
On Sat, May 5, 2018 at 4:18 AM, Stefano Zacchiroli <za...@upsilon.cc> wrote:
> On Fri, May 04, 2018 at 09:12:55AM -0400, Martin Blais wrote:
> > > 1) via the "documents" option beancount has great support for bank
> > In relational terms, this is a join of the list of accounts and the
> > documents.
> [...]
> > 2) but I also want to store other documents and associate them to either
> > >    transactions as a whole or even individual transaction postings.
> > >    Examples are: receipts (for payments, donations, etc.), invoices,
> > >    paychecks, etc.
> > In relational terms, this is a join of the transactions and the documents.
> > You want to obtain an association list of
> >   (transaction, document)
> > Where each document belongs to at most one transaction.
>
> Ack on (1).
>
> On (2) you also need to support associating documents to individual
> postings --- you can do it in various ways in a relational model, I'm
> not sure which one you prefer.

I would interpret one of the metadata fields as a substring of the set of existing documents (the list of which is provided by the full set of Document directives, or perhaps the union of those associated with the accounts of the transactions), possibly matching multiple documents.
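
A rough Python sketch of that matching rule, under stated assumptions: `Document` is a simplified stand-in type, `matching_documents` is a hypothetical helper, and passing `txn_accounts` selects the narrower candidate set (documents filed under the transaction's own accounts).

```python
from typing import NamedTuple


class Document(NamedTuple):
    """Stand-in for a Document directive (not Beancount's own type)."""
    account: str
    filename: str


def matching_documents(fragment, documents, txn_accounts=None):
    """Return every document whose file name contains `fragment`.

    If `txn_accounts` is given, only documents filed under one of those
    accounts are considered. Several documents may match.
    """
    if txn_accounts is not None:
        documents = [d for d in documents if d.account in txn_accounts]
    return [d for d in documents if fragment in d.filename]
```

Unlike a unique-substring rule, this variant returns a list, so one metadata value can pull in several related files.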

 

> > > As far as I can tell Beancount itself has no direct support for (2),
> > > please correct me if I'm wrong.
> >
> > It supports neither. The web interface performs the first join implicitly
> > by grouping all the directives by account and then rendering journals for
> > any account.
>
> Well, no. The fact that the "documents" option exists makes Beancount de
> facto support use case (1). You can just drop documents in dirs, and
> they will show up in the transaction flow. I'm wondering whether we can
> have something similar for associating documents to individual
> transactions / postings. I'm not clear on how/if the data model can
> support that.

As above, it's a join. The stream of Document directives provides the set of available documents, and a metadata field on the transaction associates the set of matching documents with the transactions, providing a new data structure for the join, or perhaps just updating metadata fields with the list of documents (*).


> What is suboptimal right now is that people using link_statements end up
> with document entries in the global ledger flow for something that is
> txn/posting-specific.

It would be possible to not list the documents that are associated with transactions in the journals, if that's what you mean.
That's a web UI option.


> > I could add clean APIs to perform either of these joins, given a particular
> > meta-data field.
> > For the transactions/documents join, the match could be partial (e.g.
> > unique substring on the document filenames).
> > Let me know.
> >
> > Also, how do you need to query this?
> > What do you want to produce?
>
> Ideally, I'd like to:
>
> 1) "type" (in the sense of type systems) metadata entries, letting
>    Beancount know that a specific metadata key should point to a
>    document (via a URI, or a path, I don't particularly care).

Same as I describe above. Imagine a plugin that runs the join I'm proposing and removes matched Document directives from the list and moves them to metadata.

 
>    This will already allow a number of nice checks:
>
>    - return an error if the link is dangling
>
>    - query the txn and ask: do you link to any document? <- this will in
>      turn allow fava to render document links associated to txn /
>      postings even if there are no matching document entries in the
>      global ledger flow

What do you mean by "query"? The web interface can do whatever it wants.
It could run the suggested plugin and if the metadata field is available render those "transaction documents" differently.



> 2) understand where to put the actual documents on disk, in a way that
>    doesn't get in the way of the "documents" option. This looks
>    complicated because:
>
>    - on the one hand I want a single dir hierarchy where to put
>      documents, that works for both global documents (that should appear
>      in the global ledger flow) and transaction-specific documents
>
>    - on the other hand I don't want to have document entries generated
>      for transaction-specific documents
>
>    And the two look incompatible.

I don't see why not. The purpose of the Document directive is to declare the existence of a document associated with the ledger. The plugin could take some of those out of the stream to move them into the transaction's metadata.

So far, metadata is intended to be used only by users and by plugins.

BTW, I'm happy to support something like this in Beancount itself, that functionality could move out of Fava, it's not web-specific.
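
A minimal Python sketch of such a plugin follows. It mirrors the general shape of a Beancount plugin function (entries and options in, entries and errors out) but uses simplified stand-in types so that it runs on its own. The `document` and `documents` metadata keys, the `attach_documents` name and the substring rule are all assumptions made for illustration.

```python
from typing import NamedTuple


class Document(NamedTuple):
    """Stand-in for a Document directive."""
    date: str
    account: str
    filename: str


class Transaction(NamedTuple):
    """Stand-in for a Transaction directive with a metadata dict."""
    date: str
    narration: str
    meta: dict


def attach_documents(entries, options_map=None, key="document"):
    """Move matched Document entries into their transaction's metadata.

    For each transaction carrying `key`, every Document whose file name
    contains that value is recorded under the "documents" metadata key
    and dropped from the returned entry stream. Unmatched values are
    reported as errors.
    """
    errors = []
    documents = [e for e in entries if isinstance(e, Document)]
    matched = set()
    new_entries = []
    for entry in entries:
        if isinstance(entry, Transaction) and key in entry.meta:
            fragment = entry.meta[key]
            hits = [d for d in documents if fragment in d.filename]
            if hits:
                matched.update(hits)
                meta = dict(entry.meta, documents=[d.filename for d in hits])
                entry = entry._replace(meta=meta)
            else:
                errors.append(
                    f"{entry.date} {entry.narration}: "
                    f"no document matches {fragment!r}"
                )
        new_entries.append(entry)
    # Drop only the Document entries that were attached to a transaction.
    new_entries = [
        e for e in new_entries
        if not (isinstance(e, Document) and e in matched)
    ]
    return new_entries, errors
```

Statements that no transaction refers to stay in the stream and keep showing up in the account journals, while receipts end up only in the metadata of the transaction that names them.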


(At this stage, this thread has taken longer to write than the feature would take to program...)




>    I guess that if we inform Beancount of which metadata entries point
>    to documents, it will be possible for the implementation of the
>    "documents" option to exclude them and not generate matching
>    "document" entries. But I'm not sure if you'd consider this a hack or
>    not.


Stefano Zacchiroli

May 6, 2018, 1:01:12 PM
to bean...@googlegroups.com
On Sun, May 06, 2018 at 12:34:17PM -0400, Martin Blais wrote:
> It would be possible to not list the documents that are associated with
> transactions in the journals, if that's what you mean.
> That's a web UI option.
[...]
> Same as I describe above. Imagine a plugin that runs the join I'm proposing
> and removes matched Document directives from the list and moves them to
> metadata.

This is the part that wasn't entirely clear to me. I thought that the
intended meaning of document directives was to stick a document to a
specific point in time in the transaction journal.

The interpretation you're suggesting here, instead, is that a directive
just makes Beancount aware of the existence of a document; the date is
just an attribute documents have (because it's required for all Beancount
directives), and documents have no special meaning other than what
plugins / UIs make of them.

This addresses my concern, thank you. I will stop worrying about mixing
transaction-specific documents and statements.

> BTW, I'm happy to support something like this in Beancount itself,
> that functionality could move out of Fava, it's not web-specific.

That would be helpful and would, I think, lead more users to associate
documents with transactions. But it is of course up to the Fava people to
decide whether they want to move the plugin over or not.

Martin Blais

May 6, 2018, 1:31:31 PM
to Beancount
On Sun, May 6, 2018 at 1:01 PM, Stefano Zacchiroli <za...@upsilon.cc> wrote:
> On Sun, May 06, 2018 at 12:34:17PM -0400, Martin Blais wrote:
> > It would be possible to not list the documents that are associated with
> > transactions in the journals, if that's what you mean.
> > That's a web UI option.
> [...]
> > Same as I describe above. Imagine a plugin that runs the join I'm proposing
> > and removes matched Document directives from the list and moves them to
> > metadata.
>
> This is the part that wasn't entirely clear to me. I thought that the
> intended meaning of document directives was to stick a document to a
> specific point in time in the transaction journal.
>
> The interpretation you're suggesting here, instead, is that a directive
> just makes Beancount aware of the existence of a document; the date is
> just an attribute documents have (because it's required for all Beancount
> directives), and documents have no special meaning other than what
> plugins / UIs make of them.

That's correct.
The Document directives create a list of documents.
Documents have a date, and currently are required by the grammar to have an associated account (though that could be changed).




> This addresses my concern, thank you. I will stop worrying about mixing
> transaction-specific documents and statements.

Oh yes, you should definitely not have to worry about that.


> > BTW, I'm happy to support something like this in Beancount itself,
> > that functionality could move out of Fava, it's not web-specific.
>
> That would be helpful and would, I think, lead more users to associate
> documents with transactions. But it is of course up to the Fava people to
> decide whether they want to move the plugin over or not.
