Beancount Revisited

Runar Petursson

May 16, 2020, 11:01:12 PM
to bean...@googlegroups.com
Hey Everyone (and esp Martin!),

After over 10 years on my TODO list I've finally gotten around to migrating to Beancount.  It's Awesome!  I'm glad I waited from those early versions, as the product has really come along and is a pleasure to work with.

Of course, I've fallen into most of the newbie traps, and will have to iterate the workflow the hard way.  I'd like to describe my envisioned workflow, so someone can stop me before I go too far down the rabbit hole (though I'm already a week in).

I have about 20 years' worth of various data I'd eventually like to import/organize from various sources.  These have existing Account tags and Categories etc.  For now, I'm starting with 2020.  Existing data is coming from sources including tons of OFX files, Excel sheets, CSV, Google Sheets, PDFs, various trading exchanges and a bunch of crypto stuff.  I'll tackle them one at a time, as those workflows are fairly well defined and you have great existing tooling.  I'm using the full bean-file workflow.

Importers are easy and intuitive.  I've written a new IB importer based on the great ibflex library.  I've also rolled an OFX importer based on ofxtools, which were both so robust that they worked almost immediately on all of my banks.  The thing I'd like to explore here is configuring the importer on the account 'open' meta as well as rules integration.  I think the current 'import.py' has already become difficult to manage as I have too many accounts.

My real mental block was around how to organize my beans.  Single file? Where do I put new transactions?  What about staging transactions from imports?  Auto-match/tag/payee?  What about other entities (wholly owned companies, partially owned companies).  How would I track passive income, trading income etc.  There seem to be about as many workflows as users.

The general structure I've decided on is to have a yearly bean file-- "2020.bean", linked from a root file.  The root file has all of the accounts 'open' and 'close', common commodities, options and all includes.  This file is manually managed.  There are some additional files (prices, trading transactions), but they tend to be verbose and simple formats.

My yearly file is "round-trippable" in that I am able to read it, insert/modify entries and rewrite it using a simple CLI tool.  This allows me to inject new transactions in the right place from the importers.  The tradeoff is I have a fixed sort order (mostly by date/type) and no comments.  Because it's generated though, I can inject Folds which make the file quite easy to navigate.  My big insight here was that I'd think of my year in chronological order and not on a per account basis (much like beancount internally).
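
A minimal sketch of the "insert in the right place" step, assuming entries are held as plain (date, type, text) records rather than Beancount's own directive objects. The names Entry, TYPE_ORDER and insert_sorted are hypothetical and only illustrate a fixed date/type sort order; they are not part of Beancount or of the CLI tool mentioned above.

```python
import bisect
from datetime import date
from typing import List, NamedTuple

# Hypothetical ordering of directive types within a single day (an assumption).
TYPE_ORDER = {"open": 0, "balance": 1, "txn": 2, "close": 3}


class Entry(NamedTuple):
    day: date
    kind: str
    text: str


def sort_key(entry: Entry):
    """Fixed sort order: first by date, then by directive type."""
    return (entry.day, TYPE_ORDER.get(entry.kind, 99))


def insert_sorted(journal: List[Entry], new: Entry) -> int:
    """Insert `new` after any existing entries with an equal key; return its index."""
    keys = [sort_key(e) for e in journal]
    index = bisect.bisect_right(keys, sort_key(new))
    journal.insert(index, new)
    return index


journal = [
    Entry(date(2020, 5, 1), "txn", '2020-05-01 * "Rent"'),
    Entry(date(2020, 5, 16), "txn", '2020-05-16 * "Groceries"'),
]
# An imported, not yet reviewed transaction lands between the two existing ones.
position = insert_sorted(journal, Entry(date(2020, 5, 10), "txn", '2020-05-10 ! "Imported"'))
```

Using bisect_right keeps same-day entries in arrival order, which keeps diffs small when the file is rewritten.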

This file becomes my "Personal Journal" and I am mostly working at the end of the file (today).  I also make use of transaction flags, so that once it's got a '*', I don't auto-do anything.  Diff/git/balance are my friends for any batch changes.  Are we free to add new flags--I've seen some conflicting assertions on that.

I've mostly handled the duplication with a combination of unique key (in meta-data) and obvious duplicate checking.
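
A rough sketch of that two-level duplicate check, assuming each entry is a plain dict with a "uid" key in its metadata (for example an OFX transaction id). The dict layout, the "uid" key and split_duplicates are illustrative assumptions, not Beancount API.

```python
from typing import Dict, Iterable, List, Tuple


def split_duplicates(existing: Iterable[Dict], imported: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (new, duplicates) from `imported`, compared against `existing`."""
    existing = list(existing)
    # Level 1: unique key stored in metadata.
    seen_ids = {e["meta"]["uid"] for e in existing if "uid" in e.get("meta", {})}
    # Level 2: "obvious" duplicates, same date, amount and account.
    seen_obvious = {(e["date"], e["amount"], e["account"]) for e in existing}

    new, duplicates = [], []
    for entry in imported:
        uid = entry.get("meta", {}).get("uid")
        obvious = (entry["date"], entry["amount"], entry["account"])
        if (uid is not None and uid in seen_ids) or obvious in seen_obvious:
            duplicates.append(entry)
            continue
        new.append(entry)
        if uid is not None:
            seen_ids.add(uid)
        seen_obvious.add(obvious)
    return new, duplicates


checking = "Assets:Bank:Checking"
existing = [{"date": "2020-05-01", "amount": "-12.50", "account": checking, "meta": {"uid": "FITID-001"}}]
imported = [
    # Same uid, different posted date: caught by the unique key.
    {"date": "2020-05-02", "amount": "-12.50", "account": checking, "meta": {"uid": "FITID-001"}},
    # No uid, but same date/amount/account: caught by the obvious check.
    {"date": "2020-05-01", "amount": "-12.50", "account": checking, "meta": {}},
    # Genuinely new.
    {"date": "2020-05-03", "amount": "-40.00", "account": checking, "meta": {"uid": "FITID-002"}},
]
new, duplicates = split_duplicates(existing, imported)
```

The obvious check will also flag two genuine same-day purchases of the same amount, so its hits are better reviewed than silently dropped.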

Now that I have a fixed format and am able to modify the yearly file en masse, I was able to start thinking about auto-tagging/matching in a more robust fashion.  To that end I'm building a regular expression rules engine.  This is a similar path to the one many people seem to have gone down: finding the magic sweet spot between learning from existing transactions, having matching rules and a neural network/AI :-).  I'm quite simple, so I just read a "rules.yaml" and modify the existing entries, applying the account modifications.  It will also allow tagging, payee, even injection of meta-data (parsed from the Narration for example).  I have lots of ideas here and am in danger of falling into a black hole of something super complicated that nobody else would find usable.  But the workflow already looks promising in early alpha.
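
To make the idea concrete, here is a small sketch of such a rules pass. It assumes the rules have already been loaded from "rules.yaml" into a list of dicts, and that entries are plain dicts. The rule schema ("match", "account", "payee", "tags"), the account names and apply_rules are all assumptions for illustration, not the actual coolbeans format.

```python
import re
from typing import Dict, List

# Stand-in for the parsed contents of a "rules.yaml"; this schema is an assumption.
RULES = [
    {"match": r"(?i)\bairbnb\b", "account": "Income:Rental:AirBnB", "tags": ["rental"]},
    {"match": r"(?i)^uber\s+trip\s+(?P<trip_id>\w+)", "account": "Expenses:Transport:Taxi", "payee": "Uber"},
]


def apply_rules(entry: Dict, rules: List[Dict]) -> Dict:
    """Return a copy of `entry` updated by the first matching rule.

    Entries already flagged '*' count as reviewed and are returned untouched.
    """
    if entry.get("flag") == "*":
        return entry
    for rule in rules:
        found = re.search(rule["match"], entry["narration"])
        if not found:
            continue
        updated = dict(entry)
        updated["account"] = rule["account"]
        if "payee" in rule:
            updated["payee"] = rule["payee"]
        updated["tags"] = sorted(set(entry.get("tags", [])) | set(rule.get("tags", [])))
        # Named groups in the pattern become metadata parsed from the narration.
        updated["meta"] = {**entry.get("meta", {}), **found.groupdict()}
        return updated
    return entry


imported = {"flag": "!", "narration": "UBER TRIP AB12CD", "account": "Expenses:Uncategorized"}
categorized = apply_rules(imported, RULES)
reviewed = apply_rules({"flag": "*", "narration": "Airbnb payout"}, RULES)
```

First match wins here, which keeps rule interactions easy to reason about; ordering the rules from most to least specific is then the only convention to remember.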

I also haven't quite figured out how best to handle assets held in companies.  While it seems obvious that a separate legal entity needs its own set of books (and files its own taxes), I have several companies where I'd like to integrate portions of the balance sheet/expenses into my personal bean file.  An example is a legal entity that manages an AirBnB and owns the property.  So, do I reflect those properties on my balance sheet?  It becomes even more messy when there's a business partner involved. 

I've kind of decided to keep each legal entity in its own bean project.  I reflect advances to/from owner (there tend to be a lot of those) on each project as well as a "liquidation" value of the company (owner-equity).  Then I might just write some scripts to generate owner-equity/share as a price directives into my personal reports.  And keep the personal bean simple.  The downside is my personal rolled up reports hide a fairly large part of my passive income/expenses.  I can't answer a simple question like "What % of my investments are in Real Estate." or "How much money did I make from AirBnB" without perhaps combining multiple entities.  This seems like a reporting issue and not a structural one though--and could be solved at a higher reporting level.  Anyone have experience with this?

On the horizon I'd like to write:

- Google Sheets Round-Trip.  Be able to export simple transactions to a sheet, let my assistant tag them and read them back in.  I saw some work on reporting to Google Sheets in the project already.
- Crypto -- Haven't even started thinking about this, though saw a few importers online.  For now I just make note of the balances.
- VIM -- I've started some tools on this front, would like to be able to do most common tasks from within vim, like merging files, applying rules, and intelligently modifying entries.  Seems that most of the python hooks are already in beancount (like parsing single entry etc.). I'm already using the existing VIM plugin, but would like to do so much more.

I'll release some of the code on github if anyone is interested in any of these sorts of features, but would want to make sure it's a complete usable workflow first.

And Martin, if there's anything I can help with let me know, though I keep waiting for you to switch to GitHub!

Best,

Runar

Martin Michlmayr

May 16, 2020, 11:10:43 PM
to bean...@googlegroups.com
* Runar Petursson <ru...@runar.net> [2020-05-17 11:00]:
> I've also rolled an OFX importer based on ofxtools, which were both
> so robust that they worked almost immediately on all of my banks.

What was the reason you built your own rather than using the one
shipped with beancount?

And yes, I hope you'll publish your scripts.
--
Martin Michlmayr
https://www.cyrius.com/

Runar Petursson

May 16, 2020, 11:54:05 PM
to bean...@googlegroups.com
On Sun, May 17, 2020 at 11:10 AM Martin Michlmayr <t...@cyrius.com> wrote:
* Runar Petursson <ru...@runar.net> [2020-05-17 11:00]:
> I've also rolled an OFX importer based on ofxtools, which were both
> so robust that they worked almost immediately on all of my banks.

What was the reason you built your own rather than using the one
shipped with beancount?

I wanted to have better handling of duplicates by tagging transactions with unique IDs, but this might just be my naive understanding of how it's currently handled.  It's also going to be the foundation for my dynamic configuration.  I found the ofxtools library to do a great job of hiding all of the ugly parsing, so it's more just mapping fields than doing any real work.
 
And yes, I hope you'll publish your scripts.
I uploaded the wip to https://github.com/runarp/coolbeans -- please excuse the mess.  I'm iterating fairly quickly on it.
 
I'd love to hear your approach to your rules engine.

--
Martin Michlmayr
https://www.cyrius.com/

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/20200517030955.GB21479%40jirafa.cyrius.com.

Martin Blais

May 17, 2020, 11:59:34 AM
to Beancount
On Sat, May 16, 2020 at 11:01 PM Runar Petursson <ru...@runar.net> wrote:
Hey Everyone (and esp Martin!),

After over 10 years on my TODO list I've finally gotten around to migrating to Beancount.  It's Awesome!  I'm glad I waited from those early versions, as the product has really come along and is a pleasure to work with.

Whoa a blast from the past

A bit of history for everyone's benefit: Runar is the friend who introduced me to Quicken. We were colleagues at a company in California back then, and during a visit with some friends dropping by his place, he happened to be finishing updating his accounts on his personal computer and showed me how he was doing his bookkeeping using Quicken. This was the first mention I ever heard of "double-entry accounting."  That demo was my first encounter with the methods. Without that chance demo I might never have walked this far down the bookkeeping rabbit hole...

Runar, I still remember vividly in that discussion 13 years ago, you explained you were "using Quicken but it wasn't the right thing to do, that you ought to be using double-entry accounting instead."  I didn't understand anything at the time and I got curious so I looked into it. This led me to try to replace my various hopeless incomplete spreadsheets and text files, and I found John Wiegley's Ledger, which is the original inspiration for a text-based accounting system. And then I had some ideas that required Python so I built Beancount v1, which at first was compatible with Ledger syntax, and a few years thereafter I redesigned the syntax and rewrote the whole thing as Beancount v2 (today's version). Beancount v3 is in the works (more below).
 

Of course, I've fallen into most of the newbie traps, and will have to iterate the workflow the hard way.  I'd like to describe my envisioned workflow, so someone can stop me before I go too far down the rabbit hole (though I'm already a week in).

I have about 20 years worth of various data I'd eventually like to import/organize from various sources.  These have existing Account tags and Categories etc..  For now, I'm starting with 2020. 

I think that's the right approach. Are you still using Quicken or have you migrated to something else? There exist converters out there for a variety of formats, made by others. I think the best way to get going is to export in a parseable format and write a converter script.
See this doc in which I list all the contributions I've come across and which are posted on this list, there might be something that fits the bill:

One problem you will encounter for sure is that those 20 years will become slow to process. I'm in that situation too and I'm going to address this with a C++ rewrite (more at the end of this email).


Existing data is coming from sources including tons of OFX Files, Excel sheets, CSV,  Google Sheets, PDF's, various trading exchanges and a bunch of crypto stuff.  I'll tackle them one at a time, as those workflows are fairly well defined and you have great existing tooling.  I'm using the full bean-file workflow.

Importers are easy and intuitive.  I've written a new IB importer based on the great ibflex library.  I've also rolled an OFX importer based on ofxtools, which were both so robust that they worked almost immediately on all of my banks.  The thing I'd like to explore here is configuring the importer on the account 'open' meta as well as rules integration.  I think the current 'import.py' has already become difficult to manage as I have too many accounts.

Thanks for the success reports. I haven't looked at ofxtools in a while and I wasn't aware of ibflex (I have to do that eventually, I mainly use another broker but started investing in commodities in IB, so far reconciling that one by hand). With all the regulations it seems IB is the only broker that offers a decent solution when moving between countries.


My real mental block was around how to organize my beans.  Single file?

That's what I do and recommend. I use beancount major mode and outline minor mode in Emacs to get around the file, and lots of i-search.
Many people here use multiple files but make sure to define all your options in the top-level file.

 
Where do I put new transactions? 

Depends on the account type. I tend to organize them in sections by account. See the large auto-generated example file in the source code.
e.g. a credit card will have a dedicated section. Payments to it remain in the checking account (you have to choose one side).
(v3 will likely offer a better solution for keeping all of a transaction's postings together.)

 
What about staging transactions from imports? 

I copy-paste them manually. You can come up with a system (e.g., insert a tag and have your script automatically insert the imported transaction in your Ledger).

 
Auto-match/tag/payee? 

That's really quite custom and you should write a plugin. Beancount doesn't include something reusable for this yet. Do that in your importer (have your importer load the contents of your ledger if desired).


What about other entities (wholly owned companies, partially owned companies). 

I would use separate Ledgers for these. I go further: I even keep a separate Ledger for my kid's expenses.
I outline some methods and tools here:
Overall you'll have to design an account system that works the way you prefer. 
Techniques involve:
- Keep a running account in your personal ledger for transfers and accruals with your other entities
- Use scripts to ensure a subset of transactions match between ledgers
- Import from and export to spreadsheets
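
As one possible shape for the "ensure a subset of transactions match between ledgers" script: the sketch below assumes each transfer between you and an entity carries a shared link id in both ledgers, with opposite signs. The tuple layout and unmatched_transfers are hypothetical; extracting the transfers from each ledger is left out.

```python
from decimal import Decimal
from typing import Dict, List, Tuple

Transfer = Tuple[str, str, Decimal]  # (date, link id, signed amount)


def unmatched_transfers(personal: List[Transfer], company: List[Transfer]) -> Dict[str, Decimal]:
    """Return the link ids whose amounts in the two ledgers do not cancel out."""
    totals: Dict[str, Decimal] = {}
    for _day, link, amount in personal + company:
        totals[link] = totals.get(link, Decimal("0")) + amount
    return {link: total for link, total in totals.items() if total != 0}


# Personal ledger: advances to the company show up as a receivable (positive).
personal = [
    ("2020-05-01", "adv-001", Decimal("500.00")),
    ("2020-05-10", "adv-002", Decimal("120.00")),
]
# Company ledger: the same advances as a payable to the owner (negative).
company = [
    ("2020-05-01", "adv-001", Decimal("-500.00")),
]
missing = unmatched_transfers(personal, company)
```

Here "adv-002" is reported because it was never booked on the company side.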

 
How would I track passive income, trading income etc.  There seem to be about as many workflows as users.

For very large number of trades, you may find the system insufficient. I'd done some motions toward adding metadata to postings in order to be able to pick up buy/sell pairs in the past, here: https://bitbucket.org/blais/beancount/src/tip/beancount/plugins/book_conversions.py  But this was done before the current booking code implementation. The v3 booking code will insert a reference to the augmenting posting from each reducing posting, and this should make it easy to scan the stream of reductions and join both legs to produce a table of trades (and a function will be provided that does that).

For crypto, a lot of people are asking questions on this list, refer to the list, but you will find that it's difficult to use with Beancount if you want to think of those things *both* as currencies to spend and as investments simultaneously. I think tracking some set of coins as investments, and another set of coins to be spent, separately, fits the current model better. If crypto ever actually takes off (the criteria being: people actually use them routinely to buy and sell real goods & services, not speculate) I'll have to review the design so that the routine "sell investment to buy goods" scenario is easier to input and book.

 
The general structure I've decided on is to have a yearly bean file-- "2020.bean", linked from a root file.  The root file has all of the accounts 'open' and 'close', common commodities, options and all includes.  This file is manually managed.  There are some additional files (prices, trading transactions), but they tend to be verbose and simple formats.

My yearly file is "round-trippable" in that I am able to read it, insert/modify entries and rewrite it using a simple CLI tool. 

Make sure NOT to use the printer to output back transactions that have been read, because the parsing & plugins processing may insert a lot of details you don't want to have reproduced in the source. Work with the source. (The separation between parsed source and processed stream of directives will be more radical and better delineated in the next version.)

 
This allows me to inject new transactions in the right place from the importers.  The tradeoff is I have a fixed sort order (mostly by date/type) and no comments.  Because it's generated though, I can inject Folds which make the file quite easy to navigate.  My big insight here was that I'd think of my year in chronological order and not on a per account basis (much like beancount internally).

This file becomes my "Personal Journal" and I am mostly working at the end of the file (today).  I also make use of transaction flags, so that once it's got a '*', I don't auto-do anything.  Diff/git/balance are my friends for any batch changes.  Are we free to add new flags--I've seen some conflicting assertions on that.

Not at the moment. Only a small subset of characters is supported.
The current limitation is due to the syntax definition and lexer implementation.
(v3 will have a better story around the flag syntax; this needs to be tightened. Use metadata. v3 will also have a UTF8 parser beyond the strings - I've already prototyped something.)
 
 
I've mostly handled the duplication with a combination of unique key (in meta-data) and obvious duplicate checking.

Now that I have a fixed format and am able to modify the yearly file en masse, I was able to start thinking about auto-tagging/matching in a more robust fashion.  To that end I'm building a regular expression rules engine.  This is a similar path to the one many people seem to have gone down: finding the magic sweet spot between learning from existing transactions, having matching rules and a neural network/AI :-).  I'm quite simple, so I just read a "rules.yaml" and modify the existing entries, applying the account modifications.  It will also allow tagging, payee, even injection of meta-data (parsed from the Narration for example).  I have lots of ideas here and am in danger of falling into a black hole of something super complicated that nobody else would find usable.  But the workflow already looks promising in early alpha.

I would recommend keeping it simple. This is something where a few regexp rules can take you 95% of the way, and you'd want to review the remaining 5% of imported categorizations anyway.


I also haven't quite figured out how best to handle assets held in companies.  While it seems obvious that a separate legal entity needs its own set of books (and files its own taxes),

Yes
 
I have several companies where I'd like to integrate portions of the balance sheet/expenses into my personal bean file.  An example is a legal entity that manages an AirBnB and owns the property.  So, do I reflect those properties on my balance sheet?  It becomes even more messy when there's a business partner involved. 

A while ago I discussed the idea of merging ledgers into a single ledger in a number of ways, in the context of Fund Accounting. 
See this doc for details of that discussion with Carl Hauser:

The simplest thing to do is to keep shares of the other company on your personal ledger and to insert a price directive assessing the approximate value per share based on the quarterly balance sheet of the company.
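
A small sketch of generating such a price directive from a balance-sheet figure. The commodity name RENTALCO, the equity and share numbers and the price_directive helper are made up for illustration; only the output line format (date, "price", commodity, amount, currency) follows Beancount's price directive syntax.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def price_directive(day: date, commodity: str, equity: Decimal, shares: Decimal,
                    currency: str = "USD") -> str:
    """Format a price directive valuing one share at equity / shares outstanding."""
    per_share = (equity / shares).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return f"{day.isoformat()} price {commodity} {per_share} {currency}"


# Hypothetical quarter-end figures: 250,000 USD of owner equity over 1,000 shares.
line = price_directive(date(2020, 3, 31), "RENTALCO", Decimal("250000"), Decimal("1000"))
```

The resulting line can be appended to a prices file that the personal ledger includes, so the share holding is revalued each quarter without touching any transactions.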


I've kind of decided to keep each legal entity in its own bean project.  I reflect advances to/from owner (there tend to be a lot of those) on each project as well as a "liquidation" value of the company (owner-equity). 

You can use tags and A/R to extract expenses to be repaid from the companies. I used to do that back in the day, it made it possible to track spending from a personal credit card for a business expense (overall I think it's worth keeping separate credit cards though, less work or possibility for confusion).


Then I might just write some scripts to generate owner-equity/share as a price directives into my personal reports.  And keep the personal bean simple.

Yep!

 
  The downside is my personal rolled up reports hide a fairly large part of my passive income/expenses.  I can't answer a simple question like "What % of my investments are in Real Estate." or "How much money did I make from AirBnB" without perhaps combining multiple entities.  This seems like a reporting issue and not a structural one though--and could be solved at a higher reporting level.  Anyone have experience with this?

I don't, but I'm pretty confident if you keep books in Beancount for the other entities you should be able to script it.
(That's the joy of text files and power over your data!)

  
On the horizon I'd like to write:

- Google Sheets Round-Trip.  Be able to export simple transactions to a sheet, let my assistant tag them and read them back in.  I saw some work on reporting to Google Sheets in the project already.

On the outbound side, I do this routinely:
The essence of the method is to 
a. pull a bunch of tables from Beancount, using metadata
b. join those datasets into a single table
c. export the table to a sheet in an existing google sheets doc
There's a single row for each posting. This updated data sheet is never edited manually (it gets overwritten by the script).
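
The join-and-export steps can be sketched roughly as follows, with CSV text standing in for the upload to a sheet. The column names, the account_info table and both helper functions are assumptions for illustration; the actual scripts referred to here are not reproduced.

```python
import csv
import io
from typing import Dict, List


def join_postings(postings: List[Dict[str, str]],
                  account_info: Dict[str, Dict[str, str]]) -> List[Dict[str, str]]:
    """Left-join each posting with the metadata row for its account."""
    return [{**posting, **account_info.get(posting["account"], {})} for posting in postings]


def to_csv(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Render one row per posting; missing columns are left empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


postings = [
    {"date": "2020-05-01", "account": "Expenses:Food", "amount": "12.50"},
    {"date": "2020-05-02", "account": "Expenses:Rent", "amount": "900.00"},
]
# Hypothetical per-account metadata table, e.g. pulled from 'open' directives.
account_info = {"Expenses:Food": {"category": "Household"}}

csv_text = to_csv(join_postings(postings, account_info), ["date", "account", "amount", "category"])
```

Because the sheet is regenerated from scratch on every run, any manual annotations belong in a separate sheet that references this one.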

On the inbound side, I do a lot of these with my wife. It works reasonably well (I always have to review the sheets doc and do a few manual fixups but it's pretty light).
See this:
and (again) this:

There's generic CSV-to-sheets doc code here:
There's some sheets-to-Beancount code here (I used this):
I think this last code could be made more general, e.g. given "some" sheet without an expected format, one could automatically detect from the data which column is for what (date, account, description, amount, etc.) and convert into a Beancount file.
Eventually, these libraries will get removed and this will be incorporated into the SQL query language, which will support sources and sinks of various types, including google sheets.

 
- Crypto -- Haven't even started thinking about this, though saw a few importers online.  For now I just make note of the balances.

Like I mentioned, if you're going to hold these things at cost you can use the automated booking (e.g. FIFO) to spend, but in practice it may not reflect the real activity in the account (you'd have to match the booking algorithm of CoinBase or whatever other platform you're using).  It's far from perfect. Best is to set aside some coins and spend them the way you would a currency and separate the currency usage from the speculation role (please don't call it "investing"), but then you won't have perfect / total returns. Not a huge deal though (IMO), it would be the same as if you were investing in ISK or EUR, you'd have a cash account attached to the investing account, but crypto fans tend to be maniacs about this.
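
For readers unfamiliar with FIFO booking, here is a standalone sketch of the idea: a sale consumes the oldest lots first, and the cost basis is whatever those lots cost. This is an illustration with made-up numbers, not Beancount's booking code, and reduce_fifo is a hypothetical helper.

```python
from collections import deque
from decimal import Decimal
from typing import Deque, List, Tuple

Lot = Tuple[Decimal, Decimal]  # (units held, cost per unit)


def reduce_fifo(lots: Deque[Lot], units_to_sell: Decimal) -> List[Lot]:
    """Remove units from the oldest lots first; return the (units, cost) pieces consumed."""
    consumed: List[Lot] = []
    remaining = units_to_sell
    while remaining > 0:
        if not lots:
            raise ValueError("not enough units held")
        units, cost = lots[0]
        taken = min(units, remaining)
        consumed.append((taken, cost))
        remaining -= taken
        if taken == units:
            lots.popleft()  # lot fully used up
        else:
            lots[0] = (units - taken, cost)  # lot partially used
    return consumed


# Two purchases, oldest first; then 0.8 units are spent.
lots = deque([(Decimal("0.5"), Decimal("7000")), (Decimal("1.0"), Decimal("9000"))])
consumed = reduce_fifo(lots, Decimal("0.8"))
cost_basis = sum(units * cost for units, cost in consumed)
```

If the platform books sales differently (for example by specific lot), the cost basis computed this way will drift from the platform's own statements, which is the mismatch described above.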


- VIM -- I've started some tools on this front, would like to be able to do most common tasks from within vim, like merging files, applying rules, and intelligently modifying entries.  Seems that most of the python hooks are already in beancount (like parsing single entry etc.). I'm already using the existing VIM plugin, but would like to do so much more.

Is VIM still around? *giggles*
 
 
I'll release some of the code on github if anyone is interested in any of these sorts of features, but would want to make sure it's a complete usable workflow first.

And Martin, if there's anything I can help with let me know, though I keep waiting for you to switch to GitHub!

The plan is to hold off as long as possible and do it all at the last minute, hoping for a big green button I can push (or run a script) and have all my tickets move over.
Looks like July 1st is the new deadline now.
Still not 100% decided between Git and Heptapod but after all the complaining probably Git/Github (better just me complaining than 50% of the community?).

About future development: for a good little while now I've refrained from changing anything too serious in the name of stability, but I've had some real annoyance around performance (it's too slow) and clear ideas about how to fix this and what the next version ought to be like that keep being confirmed. I sketched some of these in a thread about a year ago:
I'm writing a doc now with the goals for the next version, outlining all the items to address.

There have been some quiet developments in the background recently. The main things that had been holding me back from sliding into a v3 rewrite were the need to set up a solid & stable build platform for a C++ version, with a reliable build system and minimal, versioned (fixed) dependencies, something for the ages, something that won't break regularly, and a severe lack of time (I've been working like an animal the past couple of years). But I've come up with another weird idea recently and I've been building another DSL over the last couple of weekends, and that one requires a fast parser and a similar base of C++/Python/protobuf/UTF8 that I want for Beancount, and I think I've resolved all the main hurdles in the process of doing that. A Bazel build for Beancount is in the works.

 

 
Best,

Runar


Martin Michlmayr

May 18, 2020, 2:47:31 AM
to bean...@googlegroups.com
* Runar Petursson <ru...@runar.net> [2020-05-17 11:00]:
> My real mental block was around how to organize my beans. Single file?

My impression is that most people here have a single file or one file
per year.

I split per year and also per account. I have one file for each
account and then I have one journal for cash expenses. This works
for me and I don't see the attraction of putting everything into
a single file.

> Where do I put new transactions?

At the end of the respective file in my case... really easy.

All individual files have to be date ordered and I use the
"file_ordering.py" plugin to verify that's the case:
https://github.com/zacchiro/beancount-plugins-zack/

I found a number of date errors because of this already. (Usually cut & paste errors)
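
For illustration only, the kind of check involved can be sketched without Beancount at all. This is not the file_ordering.py plugin, just a standalone approximation that assumes every directive starts with a YYYY-MM-DD date in the first column; out_of_order_lines is a hypothetical helper.

```python
import re
from datetime import date
from typing import List, Tuple

# Directives start with a date in column one; indented postings do not match.
DATE_AT_LINE_START = re.compile(r"^(\d{4})-(\d{2})-(\d{2})\s")


def out_of_order_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for each dated line earlier than the dated line before it."""
    problems = []
    previous = None
    for number, line in enumerate(text.splitlines(), start=1):
        found = DATE_AT_LINE_START.match(line)
        if not found:
            continue
        current = date(*map(int, found.groups()))
        if previous is not None and current < previous:
            problems.append((number, line))
        previous = current
    return problems


sample = """\
2020-05-01 * "Coffee"
  Expenses:Food   3.50 USD
  Assets:Cash
2020-04-30 * "Pasted from last month"
  Expenses:Food   4.00 USD
  Assets:Cash
2020-05-02 * "Lunch"
  Expenses:Food   9.00 USD
  Assets:Cash
"""
problems = out_of_order_lines(sample)
```

In this sample only line 4 is reported, which is the typical cut & paste case.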

> What about other entities (wholly owned companies, partially owned companies).

You'd probably show them at cost. In theory you could do a
consolidation according to IFRS, but do you really want to go
there...?

> How would I track passive income, trading income etc.

It's just income for beancount. I guess what you're really asking
here is what kind of reports would make sense on a *reporting* level.
e.g. passive income as a percentage of total income, passive income
as a percentage of your target annual income for financial
independence, etc.
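
As a sketch of the first of those ratios: the snippet below assumes income totals per account have already been pulled out of the ledger and are expressed as positive numbers (Beancount itself records income with a negative sign). The account names, PASSIVE_PREFIXES and passive_share are illustrative assumptions.

```python
from decimal import Decimal
from typing import Dict

# Which income accounts count as "passive" is a personal choice; this list is an assumption.
PASSIVE_PREFIXES = ("Income:Rental", "Income:Dividends", "Income:Interest")


def passive_share(income_by_account: Dict[str, Decimal]) -> Decimal:
    """Passive income as a percentage of total income, rounded to one decimal."""
    total = sum(income_by_account.values(), Decimal("0"))
    if total == 0:
        return Decimal("0")
    passive = sum(
        (amount for account, amount in income_by_account.items()
         if account.startswith(PASSIVE_PREFIXES)),
        Decimal("0"),
    )
    return (passive / total * 100).quantize(Decimal("0.1"))


income = {
    "Income:Salary": Decimal("8000"),
    "Income:Rental:AirBnB": Decimal("1500"),
    "Income:Dividends": Decimal("500"),
}
share = passive_share(income)
```

With these made-up figures, 2000 of 10000 is passive, so the share comes out at 20.0 percent.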

I think that's an interesting idea and I think some fava plugins would
be best for that. Maybe additions to https://github.com/redstreet/fava-investor

> I was able to start thinking about auto-tagging/matching in a more
> robust fashion. To that regard I'm building a regular expression
> rules engine. This is a similar path that many people seem to have
> gone down. Finding the magic sweet-spot between learning from
> existing transactions, having matching rules and a neural network/AI
> :-). I'm quite simple, so I just read a "rules.yaml" and modify the
> existing entries, applying the account modifications.

For the record, I've recently developed a similar "rules-importer" for
Software in the Public Interest, Inc, a non-profit that moved from
ledger to beancount. Runar's code does pretty much the same as my code
does. It would probably be best to agree on one code base and get
this into beancount.

> I also haven't quite figured out how best to handle assets held in
> companies. While it seems obvious that a separate legal entity
> needs its own set of books (and files its own taxes), I have several
> companies where I'd like to integrate portions of the balance
> sheet/expenses into my personal bean file. An example is a legal
> entity that manages an AirBnB and owns the property.

You could look at how consolidations are done in IFRS (or another
accounting standard).

I wonder if a plugin could be written for that, hmm...

> - VIM -- I've started some tools on this front, would like to be
> able to do most common tasks from within vim, like merging files,
> applying rules, and intelligently modifying entries. Seems that
> most of the python hooks are already in beancount (like parsing
> single entry etc.). I'm already using the existing VIM plugin, but
> would like to do so much more.

Sounds interesting. (And yes, Martin, please still use vim ;)

Aamer Abbas

May 18, 2020, 8:59:59 AM
to Beancount
Here's a sample of my directory structure for inspiration. Might be overkill for some, but it really works well for me.

At the top level I have:
  • documents - for pdfs, etc. The subdirectory structure mirrors the account structure.
  • example_queries.txt - Some sample queries I use for various data aggregation. Just for my own reference.
  • importers - This includes the python code for extractors and filers
  • includes - I have all the real data here. accounts.beancount and commodities.beancount only have the declaration statements, not any real data. events.beancount has life events (travel, marriage, child birth etc).
    • the journals subfolder has the actual journal data
    • past_years is where I archive files at the end of the year and start with fresh, blank files
    • prices.beancount has the price records for stocks, home value, etc
  • personal.beancount - has "option" and "plugin" declaration. The rest of the file just does an "include" on all the files in the includes subfolder
  • personal.import - has the config for extractors/filers
  • plugins - this subfolder has some 3rd party plugins I use and some I've written myself
  • price_sources - I have some code here that gets data for crypto (https://github.com/aamerabbas/beancount-coinmarketcap)

.
├── README
├── documents
│   ├── ...
├── downloads
├── example_queries.txt
├── importers
│   ├── __init__.py
│   ├── capital_one_card_extract
│   │   ├── ...
│   ├── capital_one_card_file
│   │   ├── ...
├── includes
│   ├── accounts.beancount
│   ├── commodities.beancount
│   ├── events.beancount
│   ├── journals
│   │   ├── banks
│   │   │   ├── wells_fargo.beancount
│   │   │   └── ...
│   │   ├── cards
│   │   │   ├── amex.beancount
│   │   │   └── ...
│   │   ├── cash.beancount
│   │   ├── crypto.beancount
│   │   ├── gift_cards.beancount
│   │   ├── pending.beancount
│   │   ├── real_estate.beancount
│   │   ├── retirement.beancount
│   │   ├── rsu.beancount
│   │   └── stocks.beancount
│   ├── past_years
│   │   └── 2019
│   │       ├── events.beancount
│   │       ├── journals
│   │       │   ├── banks
│   │       │   │   ├── wells_fargo.beancount
│   │       │   │   └── ...
│   │       │   ├── cards
│   │       │   │   ├── amex.beancount
│   │       │   │   └── ...
│   │       │   ├── cash.beancount
│   │       │   ├── crypto.beancount
│   │       │   ├── gift_cards.beancount
│   │       │   ├── pending.beancount
│   │       │   ├── real_estate.beancount
│   │       │   ├── retirement.beancount
│   │       │   ├── rsu.beancount
│   │       │   └── stocks.beancount
│   │       └── prices.beancount
│   └── prices.beancount
├── personal.beancount
├── personal.import
├── plugins
│   ├── __init__.py
│   ├── validate_unused_accounts.py
│   └── ...
└── price_sources
    ├── __init__.py
    └── coinmarketcap.py


Runar Petursson

May 18, 2020, 11:05:39 PM
to bean...@googlegroups.com
On Sun, May 17, 2020 at 11:59 PM Martin Blais <bl...@furius.ca> wrote:
On Sat, May 16, 2020 at 11:01 PM Runar Petursson <ru...@runar.net> wrote:
Hey Everyone (and esp Martin!),

After over 10 years on my TODO list I've finally gotten around to migrating to Beancount.  It's Awesome!  I'm glad I waited from those early versions, as the product has really come along and is a pleasure to work with.

Whoa a blast from the past

A bit of history for everyone's benefit: Runar is the friend who introduced me to Quicken. We were colleagues at a company in California back then, and during a visit with some friends dropping by his place, he happened to be finishing updating his accounts on his personal computer and showed me how he was doing his bookkeeping using Quicken. This was the first mention I ever heard of "double-entry accounting."  That demo was my first encounter with the methods. Without that chance demo I may have never walked this far down the bookkeeping rabbit hole...

Runar, I still remember vividly in that discussion 13 years ago, you explained you were "using Quicken but it wasn't the right thing to do, that you ought to be using double-entry accounting instead."  I didn't understand anything at the time and I got curious so I looked into it. This led me to try to replace my various hopelessly incomplete spreadsheets and text files, and I found John Wiegley's Ledger, which is the original inspiration for a text-based accounting system. Then I had some ideas that required Python, so I built Beancount v1, which at first was compatible with Ledger syntax. A few years thereafter I redesigned the syntax and rewrote the whole thing as Beancount v2 (today's version). Beancount v3 is in the works (more below).
 
Haha, I do as well.  I think the discussion on the elegance of the accounting formula and the "Equity" account starting at zero struck a chord.
 

Of course, I've fallen into most of the newbie traps, and will have to iterate the workflow the hard way.  I'd like to describe my envisioned workflow, so someone can stop me before I go too far down the rabbit hole (though I'm already a week in).

I have about 20 years' worth of various data I'd eventually like to import/organize from various sources.  These have existing Account tags and Categories etc.  For now, I'm starting with 2020.

I think that's the right approach. Are you still using Quicken or have you migrated to something else? There exist converters out there for a variety of formats, made by others. I think the best way to get going is to export in a parseable format and write a converter script.
See this doc in which I list all the contributions I've come across and which are posted on this list, there might be something that fits the bill:

One problem you will encounter for sure is that those 20 years will become slow to process. I'm in that situation too and I'm going to address this with a C++ rewrite (more at the end of this email).
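
For the converter script suggested above, here is a minimal sketch that assumes the old data was exported as CSV. The column names (`Date`, `Description`, `Amount`), the ISO date format, and the function name `csv_to_beancount` are all hypothetical; a real export will need its own mapping. Entries are flagged `!` and balanced against a placeholder account so they can be reviewed by hand.

```python
import csv
import io
from decimal import Decimal


def csv_to_beancount(text, account, currency="USD"):
    """Convert rows with Date, Description, Amount columns into transaction text."""
    out = []
    for row in csv.DictReader(io.StringIO(text)):
        amount = Decimal(row["Amount"])
        # Double quotes would end the narration string, so swap them out.
        narration = row["Description"].replace('"', "'")
        out.append(
            f'{row["Date"]} ! "{narration}"\n'
            f"  {account}  {amount:.2f} {currency}\n"
            f"  Expenses:Unknown\n"
        )
    return "\n".join(out)


exported = "Date,Description,Amount\n2020-05-02,Coffee,-4.5\n"
converted = csv_to_beancount(exported, "Assets:Bank:Checking")
```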

I eventually migrated to GnuCash but at some point during my last startup I fell too far behind.  Now I'm trying to pick up the pieces and have a 7 year gap.  I am worried about the performance, seems like a lot of things for historic data could be processed once and pickled/stored without re-evaluating.
 


Existing data is coming from sources including tons of OFX Files, Excel sheets, CSV,  Google Sheets, PDF's, various trading exchanges and a bunch of crypto stuff.  I'll tackle them one at a time, as those workflows are fairly well defined and you have great existing tooling.  I'm using the full bean-file workflow.

Importers are easy and intuitive.  I've written a new IB importer based on the great ibflex library.  I've also rolled an OFX importer based on ofxtools, which were both so robust that they worked almost immediately on all of my banks.  The thing I'd like to explore here is configuring the importer on the account 'open' meta as well as rules integration.  I think the current 'import.py' has already become difficult to manage as I have too many accounts.

Thanks for the success reports. I haven't looked at ofxtools in a while and I wasn't aware of ibflex (I have to do that eventually, I mainly use another broker but started investing in commodities in IB, so far reconciling that one by hand). With all the regulations it seems IB is the only broker that offers a decent solution when moving between countries.

I still have some work to do on the importer, so far I've focused on the trades and not so much on the 20 different types of subtle cash modifications/sweeps/dividends they do.  I just use a cash padding directive for that.  

My real mental block was around how to organize my beans.  Single file?

That's what I do and recommend. I use beancount major mode and outline minor mode in Emacs to get around the file, and lots of i-search.
Many people here use multiple files but make sure to define all your options in the top-level file.

 
Where do I put new transactions? 

Depends on the account type. I tend to organize them in sections by account. See the large auto-generated example file in the source code.
e.g. a credit card will have a dedicated section. Payments to it remain in the checking account (you have to choose one side).
(v3 will likely offer a better solution for keeping all of a transaction's postings together.)

 
What about staging transactions from imports? 

I copy-paste them manually. You can come up with a system (e.g., insert a tag and have your script automatically insert the imported transaction in your Ledger).
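
One way to sketch the "insert a tag and have your script insert the transaction" idea, working on the source text rather than on parsed entries. The marker comment `; IMPORT-HERE` and the function name `insert_at_marker` are hypothetical, not Beancount conventions.

```python
def insert_at_marker(source, new_entries, marker="; IMPORT-HERE"):
    """Insert new_entries just above the first line that equals marker."""
    lines = source.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == marker:
            block = new_entries.strip("\n").splitlines()
            # The marker stays in place so the next import lands at the same spot.
            return "\n".join(lines[:i] + block + [""] + lines[i:]) + "\n"
    raise ValueError(f"marker {marker!r} not found")


ledger = """2020-05-01 * "Coffee"
  Expenses:Food      3.00 USD
  Assets:Cash

; IMPORT-HERE
"""
staged = """2020-05-02 ! "Imported"
  Assets:Bank      -20.00 USD
  Expenses:Unknown
"""
updated = insert_at_marker(ledger, staged)
```

Because only text is spliced in, comments and hand formatting elsewhere in the file are left untouched.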

 
Auto-match/tag/payee? 

That's really quite custom and you should write a plugin. Beancount doesn't include something reusable for this yet. Do that in your importer (have your importer load the contents of your ledger if desired).
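
A minimal sketch of what such importer-side matching could look like, assuming simple regular-expression rules on the raw bank description. The rule table, account names, and the `categorize` function are made up for illustration; nothing here is part of Beancount.

```python
import re

# Hypothetical rules: (pattern on the raw description, payee, account).
RULES = [
    (re.compile(r"AMZN|AMAZON", re.I), "Amazon", "Expenses:Shopping"),
    (re.compile(r"SHELL|CHEVRON", re.I), "Gas Station", "Expenses:Car:Fuel"),
]


def categorize(description, default_account="Expenses:Unknown"):
    """Return (payee, account) for the first matching rule, else a fallback."""
    for pattern, payee, account in RULES:
        if pattern.search(description):
            return payee, account
    return description.title(), default_account
```

An importer could call `categorize` on each row before building the transaction, leaving unmatched rows on the fallback account for manual review.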


What about other entities (wholly owned companies, partially owned companies). 

I would use separate Ledgers for these. I go further: I even keep a separate Ledger for my kid's expenses.
I outline some methods and tools here:
Overall you'll have to design an account system that works the way you prefer. 
Techniques involve:
- Keep a running account in your personal ledger for transfers and accruals with your other entities
- Use scripts to ensure a subset of transactions match between ledgers
- Import from and export to spreadsheets
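
As a rough illustration of the second technique (checking that a subset of transactions match between ledgers), the sketch below compares transfers as plain `(date, amount)` tuples. How those tuples are extracted from each ledger is left out, and the function name `unmatched` is hypothetical.

```python
from collections import Counter
from datetime import date
from decimal import Decimal


def unmatched(personal, company):
    """Compare transfers recorded in two ledgers.

    Each argument is a list of (date, amount) tuples as seen from that ledger;
    a transfer should appear with the opposite sign in the other ledger.
    Returns (missing_in_company, missing_in_personal), both expressed with the
    personal ledger's sign convention.
    """
    a = Counter(personal)
    b = Counter((d, -amt) for d, amt in company)
    return sorted((a - b).elements()), sorted((b - a).elements())


personal = [(date(2020, 5, 1), Decimal("-100.00")), (date(2020, 5, 9), Decimal("-40.00"))]
company = [(date(2020, 5, 1), Decimal("100.00"))]
missing_in_company, missing_in_personal = unmatched(personal, company)
```

Using `Counter` rather than a set keeps two identical transfers on the same day from cancelling against a single counterpart.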

This is very helpful.  All I can say is, CONGRATS on being a dad :-)
 
 
How would I track passive income, trading income etc.  There seem to be about as many workflows as users.

For very large number of trades, you may find the system insufficient. I'd done some motions toward adding metadata to postings in order to be able to pick up buy/sell pairs in the past, here: https://bitbucket.org/blais/beancount/src/tip/beancount/plugins/book_conversions.py  But this was done before the current booking code implementation. The v3 booking code will insert a reference to the augmenting posting from each reducing posting, and this should make it easy to scan the stream of reductions and join both legs to produce a table of trades (and a function will be provided that does that).
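
To make the "join both legs to produce a table of trades" idea concrete, here is a small stand-in that pairs sales with purchases in FIFO order. It is not Beancount's booking code and does not use its data structures; the `(date, units, price)` tuple layout and the name `match_trades` are assumptions for the example.

```python
from collections import deque
from decimal import Decimal


def match_trades(postings):
    """Pair each sale with the oldest open purchases (FIFO).

    postings: iterable of (date, units, price); positive units buy, negative sell.
    Returns rows of (buy_date, sell_date, units, buy_price, sell_price, pnl).
    """
    lots = deque()  # open lots as [date, remaining_units, price]
    trades = []
    for when, units, price in postings:
        if units > 0:
            lots.append([when, units, price])
            continue
        remaining = -units
        while remaining > 0:
            if not lots:
                raise ValueError(f"sale on {when} exceeds open position")
            lot = lots[0]
            used = min(lot[1], remaining)
            trades.append((lot[0], when, used, lot[2], price, used * (price - lot[2])))
            lot[1] -= used
            remaining -= used
            if lot[1] == 0:
                lots.popleft()
    return trades


rows = match_trades([
    ("2020-01-02", Decimal(10), Decimal(100)),
    ("2020-02-03", Decimal(5), Decimal(110)),
    ("2020-03-04", Decimal(-12), Decimal(120)),
])
```

Here the single sale of 12 units is split across the two purchase lots, giving two rows in the trade table.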

For crypto, a lot of people are asking questions on this list, refer to the list, but you will find that it's difficult to use with Beancount if you want to think of those things *both* as currencies to spend and as investment simultaneously. I think tracking some set of coins as investments, and another set of coins to be spent, separately, fits the current model better. If crypto ever actually takes off (the criteria being: people actually use them routinely to buy and sell real goods & services, not speculate) I'll have to review the design so that the routine "sell investment to buy goods" scenario is easier to input and book.

I agree it depends on the coin.  BTC, ETH and a few others I look at as currencies in their own right and treat them as such. It makes no more sense to track lots or PnL against BTC/USD than it does CAD/USD.  I'm as worried about USD exposure as I am EUR or BTC.  I'm happy to just track the balance-sheet in USD over time.  There are some exceptions to this, through derivative markets, but those tend to be easy to track in lots anyway (with their own symbol).  Other Coins I would have very few buy/sell touch points and might be tempted to track in lots.

 

 
The general structure I've decided on is to have a yearly bean file-- "2020.bean", linked from a root file.  The root file has all of the accounts 'open' and 'close', common commodities, options and all includes.  This file is manually managed.  There are some additional files (prices, trading transactions), but they tend to be verbose and simple formats.

My yearly file is "round-trippable" in that I am able to read it, insert/modify entries and rewrite it using a simple CLI tool. 

Make sure NOT to use the printer to output back transactions that have been read, because the parsing & plugins processing may insert a lot of details you don't want to have reproduced in the source. Work with the source. (The separation between parsed source and processed stream of directives will be more radical and better delineated in the next version.)
ok, I'll make note of this.  So far I'm using the printer and almost no plug-ins, but I'm guessing this will bite me.  Thanks.
 

 
This allows me to inject new transactions in the right place from the importers.  The tradeoff is I have a fixed sort order (mostly by date/type) and no comments.  Because it's generated though, I can inject Folds which make the file quite easy to navigate.  My big insight here was that I'd think of my year in chronological order and not on a per account basis (much like beancount internally).

This file becomes my "Personal Journal" and I am mostly working at the end of the file (today).  I also make use of transaction flags, so that once it's got a '*', I don't auto-do anything.  Diff/git/balance are my friends for any batch changes.  Are we free to add new flags--I've seen some conflicting assertions on that.

Not at the moment. Only a small subset of characters are supported
Current limitation is due to syntax definition and lexer implementation.
(v3 will have a better story around the flag syntax; this needs to be tightened. Use metadata. v3 will also have a UTF8 parser beyond the strings - I've already prototyped something.)
 
Noted.  I'll stick with the defined flags at least then, '!' vs '*' seem safe. 
This is great, I'll try it as a starting point.
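
A small sketch of how a script might honor the '!' vs '*' convention described above, by listing only the transactions still flagged '!' and leaving everything marked '*' alone. It scans the source text with a regular expression; the pattern only considers the '*' and '!' flags, and the name `pending_transactions` is hypothetical.

```python
import re

# Transaction header: a date followed by a flag. Only '*' and '!' are considered here.
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+([*!])\s")


def pending_transactions(source):
    """Return (line_number, header_line) for each transaction still flagged '!'."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = HEADER.match(line)
        if match and match.group(2) == "!":
            found.append((number, line))
    return found


journal = """2020-05-01 * "Coffee"
  Expenses:Food      3.00 USD
  Assets:Cash

2020-05-02 ! "Imported"
  Assets:Bank      -20.00 USD
  Expenses:Unknown
"""
todo = pending_transactions(journal)
```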
 
 
- Crypto -- Haven't even started thinking about this, though saw a few importers online.  For now I just make note of the balances.

Like I mentioned, if you're going to hold these things at cost you can use the automated booking (e.g. FIFO) to spend, but in practice it may not reflect the real activity in the account (you'd have to match the booking algorithm of CoinBase or whatever other platform you're using).  It's far from perfect. Best is to set aside some coins and spend them the way you would a currency and separate the currency usage from the speculation role (please don't call it "investing"), but then you won't have perfect / total returns. Not a huge deal though (IMO), it would be the same as if you were investing in ISK or EUR, you'd have a cash account attached to the investing account, but crypto fans tend to be maniacs about this.


- VIM -- I've started some tools on this front, would like to be able to do most common tasks from within vim, like merging files, applying rules, and intelligently modifying entries.  Seems that most of the python hooks are already in beancount (like parsing single entry etc.). I'm already using the existing VIM plugin, but would like to do so much more.

Is VIM still around? *giggles*
 
 
I'll release some of the code on github if anyone is interested in any of these sorts of features, but would want to make sure it's a complete usable workflow first.

And Martin, if there's anything I can help with let me know, though I keep waiting for you to switch to GitHub!

The plan is to hold off as long as possible and do it all at the last minute, hoping for a big green button I can push (or run a script) and have all my tickets move over.
Looks like July 1st is the new deadline now.
Still not 100% decided between Git and Heptapod but after all the complaining probably Git/Github (better just me complaining than 50% of the community?).

About future development: for a good little while now I've refrained from changing anything too serious in the name of stability, but I've had some real annoyance around performance (it's too slow) and clear ideas about how to fix this and what the next version ought to be like that keep being confirmed. I sketched some of these in a thread about a year ago:
I'm writing a doc now with the goals for the next version, outlining all the items to address.

There have been some quiet developments in the background recently. The main things holding me back from sliding into a v3 rewrite have been setting up a solid and stable build platform for a C++ version (a reliable build system with minimal, versioned (fixed) dependencies, something for the ages that won't break regularly) and a severe lack of time (I've been working like an animal the past couple of years). But I came up with another weird idea recently and have been building another DSL over the last couple of weekends. That one requires a fast parser and the same base of C++/Python/protobuf/UTF8 that I want for Beancount, and I think I've resolved all the main hurdles in the process of doing that. A Bazel build for Beancount is in the works.

Very interesting.  I doubt I can contribute much on the C++ front, but am a happy tester!
 
 
Best,

Runar


Martin Blais

May 18, 2020, 11:44:45 PM5/18/20
to Beancount
On Mon, May 18, 2020 at 11:05 PM Runar Petursson <ru...@runar.net> wrote:



I eventually migrated to GnuCash but at some point during my last startup I fell too far behind.  Now I'm trying to pick up the pieces and have a 7 year gap.  I am worried about the performance, seems like a lot of things for historic data could be processed once and pickled/stored without re-evaluating.

There's already a pickle cache. It helps a lot, especially for web browsing, but as soon as you change the input file you have to pay the cost again.
I see no reason that the C++ rewrite shouldn't process large files instantly. This would allow some cool new ideas in Emacs about editing with interactive rendering of the contents of accounts before / after a transaction you're editing (like bean-doctor context, but updated instantly as you type).
Thanks!
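
The trade-off Martin describes (a cache helps until the input changes) can be illustrated with a generic content-keyed pickle cache. This is a stand-in for the idea only, not Beancount's actual cache; the cache file name, `load_cached`, and the `parse` callback are assumptions.

```python
import hashlib
import pickle
from pathlib import Path


def load_cached(path, parse, cache_dir=None):
    """Return parse(text) for the file, reusing a pickle while the content is unchanged."""
    path = Path(path)
    data = path.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    cache = Path(cache_dir or path.parent) / f".{path.name}.cache.pickle"
    if cache.exists():
        with cache.open("rb") as f:
            saved_digest, result = pickle.load(f)
        if saved_digest == digest:
            return result  # input unchanged: skip the expensive parse
    # Any edit to the file changes the digest, so the full cost is paid again.
    result = parse(data.decode("utf-8"))
    with cache.open("wb") as f:
        pickle.dump((digest, result), f)
    return result
```

Splitting old years into separate files would let each one keep its own cache entry under a scheme like this, though whether that helps in practice depends on how the real loader combines files.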

 
 
 

Make sure NOT to use the printer to output back transactions that have been read, because the parsing & plugins processing may insert a lot of details you don't want to have reproduced in the source. Work with the source. (The separation between parsed source and processed stream of directives will be more radical and better delineated in the next version.)
ok, I'll make note of this.  So far I'm using the printer and almost no plug-ins, but I'm guessing this will bite me.  Thanks.

The thing to keep in mind is that after parsing, the plugins can alter transactions significantly. 
Printing will output the resulting modified transactions.

 

Justus Pendleton

May 19, 2020, 9:44:28 AM5/19/20
to Beancount
On Sunday, May 17, 2020 at 10:01:12 AM UTC+7, Runar Petursson wrote:
My real mental block was around how to organize my beans.  Single file? Where do I put new transactions?  What about staging transactions from imports?  Auto-match/tag/payee?  What about other entities (wholly owned companies, partially owned companies).  How would I track passive income, trading income etc.  There seem to be about as many workflows as users.

I have things split into multiple files. When I first started I saw that beancount supported that and it appealed to my OCDness along with vague "well, I wouldn't put all my source code in a single file..." feelings.

I mostly regret it. Beancount has a few admittedly work-aroundable unresolved bugs around supporting multiple files (like plugins having to be defined in the main file; options have to be redefined in every sub-file; possibly others since per Martin's own statement "File includes were bolted on as per a quick request a long time ago") and most 3rd party tooling definitely doesn't understand the concept and won't work as expected. The sub-files have no reference to the containing file. So any file-centric tooling will struggle to figure out what is going on.

As an example: say you have main.bean, accounts.bean, commodities.bean, prices.bean, and balance-statements.bean. If you open up balance-statements.bean in an editor it won't find any open/close statements and won't be able to provide any tooling support. Open up prices.bean and it won't find any commodity statements.
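
A sketch of how file-centric tooling could work around this if it were told where the main file is: follow the `include` directives from the root and collect every `open` directive along the way. This is a plain-text scan written for illustration, not Beancount's loader; it assumes include paths are relative to the including file and ignores glob patterns. The name `collect_accounts` is hypothetical.

```python
import re
from pathlib import Path

INCLUDE = re.compile(r'^include\s+"([^"]+)"')
OPEN = re.compile(r"^\d{4}-\d{2}-\d{2}\s+open\s+(\S+)")


def collect_accounts(root, seen=None):
    """Follow include directives from root and return every opened account name."""
    root = Path(root).resolve()
    seen = set() if seen is None else seen
    if root in seen:  # guard against include cycles
        return set()
    seen.add(root)
    accounts = set()
    for line in root.read_text(encoding="utf-8").splitlines():
        included = INCLUDE.match(line)
        opened = OPEN.match(line)
        if included:
            accounts |= collect_accounts(root.parent / included.group(1), seen)
        elif opened:
            accounts.add(opened.group(1))
    return accounts
```

Called on main.bean, this would return the accounts opened in accounts.bean as well, which is the information an editor lacks when it only sees balance-statements.bean.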

Worse, you have to understand enough about beancount to know that this kind of multiple-file approach is a bad idea and why your editor isn't working the way you think it should. Luckily, beancount makes things easy to move around, so this isn't exactly the end of the world.

There is something theoretically appealing about having each automated/external data source going into its own file. Yahoo price quotes go here; IBKR imports go there. But it doesn't actually matter in practice and the downside is running into some corner case issue in beancount or external tooling with multiple files.

In practice, I think just using code folding is easier & better than splitting things into multiple files. If there were a "smart" beancount file sorter (that kept comments and whatnot) it would make this approach even better. I've been meaning to write something like that for a long time and just....haven't.
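
For what it's worth, a very rough sketch of what such a comment-preserving sorter might look like. It works on text only and assumes the file contains nothing but dated directives, their indented postings/metadata, comments, and blank lines (no `option`, `include`, `pushtag`, and so on). Comments directly above an entry travel with it, and blank lines are normalized to one between entries. The name `sort_entries` is hypothetical.

```python
import re

DATED = re.compile(r"^\d{4}-\d{2}-\d{2}\s")


def sort_entries(source):
    """Sort dated directives by date, keeping comments and postings attached."""
    blocks, pending = [], []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines are re-inserted between entries on output
        if DATED.match(line):
            blocks.append((line[:10], pending + [line]))
            pending = []
        elif line[0] in " \t" and blocks and not pending:
            blocks[-1][1].append(line)  # posting or metadata of the current entry
        else:
            pending.append(line)  # comment waiting for the next entry
    blocks.sort(key=lambda b: b[0])  # stable sort keeps same-day order
    chunks = ["\n".join(lines) for _, lines in blocks]
    if pending:
        chunks.append("\n".join(pending))  # trailing comments stay at the end
    return "\n\n".join(chunks) + "\n"


unsorted = """; groceries
2020-05-03 * "Market"
  Expenses:Food   12.00 USD
  Assets:Cash

; rent for May
2020-05-01 * "Landlord"
  Expenses:Rent  900.00 USD
  Assets:Bank
"""
tidy = sort_entries(unsorted)
```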

Patrick Ruckstuhl

May 19, 2020, 10:01:11 AM5/19/20
to bean...@googlegroups.com

I'm splitting up things quite a bit and it works quite well for me.

I have lots of automatic price updates, each written to its own file, and most importers also use their own file.

Fava works very well with multiple files and it has support for automatically placing new transactions in the correct file.


Regards,

Patrick
