Re: Getting started; assigning accounts to bank .csv data

Martin Blais

Feb 1, 2016, 11:41:25 PM

to ledger-cli, Beancount

On Mon, Feb 1, 2016 at 1:13 PM, John Hendy <jw.h...@gmail.com> wrote:

Greetings,

It's a fresh year and I've been seeing ledger come up on the Org-mode
mailing list for some time and decided to give it a try. I'm coming
from Moneydance and just wanted to get away from the tedious GUI
method of adding information, as well as have flexibility to generate
my own reports/visualizations with python or R, etc. [1]

Consider that I'm about a week into reading through docs here and
there during evenings. My first step was going to be importing a
downloaded .csv from my bank to get started. I'm still trying to
verify I get the terminology, so I'll use this from the manual:

From 5.1 Basic format:
```
This transaction has a date, a payee or description, a target account
(the first posting), and a source account (the second posting). Each
posting specifies what action is taken related to that account.
```

From 7.2.1.2 The convert command:
```
The fields ledger can recognize contain these case-insensitive strings
date, posted, code, payee or desc or description, amount, cost,total,
and note.
```

For my purposes, I import my finances primarily to "categorize" (what
I believe here is called adding an account) and assign a payee so that
I can track my spending against a budget. So, I'm surprised there's no
special column keyword I can add for "account". It appears that all I
can do is pass, say, `--account "assets:checking"` to have ledger know
it's against assets:checking? Is that correct?

From trying to google "import csv account ledger" or similar
variations, I've been surprised that the only tools to do something
like this appear to be interactive one-trans-at-a-time programs like
icsv2ledger and reckon (granted, they can learn or follow rules). I
could quickly go through my bank's .csv and add exp:food:dining,
exp:auto:fuel to my ~100 transactions a month and have those imported
just like the other column data.

Keep in mind that part of the process of importing (they like to call it "reconciling") involves

- Manually reviewing the transactions for correctness or fraud

- Merging new transactions with previous transactions imported from the other side (e.g. a payment from a bank account to pay off one's credit card will typically be imported from both the bank AND credit card accounts; you must merge the corresponding transactions together)

- Assigning the right category (you can automate this with a script I suppose; frankly it's not much work, I do all of mine manually with the help of auto-completion from Emacs, which is the most important feature IMO)

- Moving the resulting transactions to the right place in your file.

- Verifying balances visually, or inserting a balance directive which asserts what the final account balance should be (for correctness) after the new transactions.

If you do it often enough and you have editing chops, you get used to the dance and it's a breeze.

I think the fourth step can be hypothetically solved using heuristics.

I feel like I must be missing something with respect to getting the
from/to accounts added to the bank data.

Perhaps to take a step back...
- are the majority of folks writing their transactions by hand in ledger format?

Can't say about others, but for me I want to say that about half the importing is semi-automatic.

- Credit cards and banks import from downloads but I need to categorize manually (as described above), fairly good quality downloads.

- Investment accounts fully automated buys but I need to manually edit sales in some accounts. Great quality of downloads.

- Payroll stubs and vesting and a few other things are provided only as PDFs and I don't bother trying to extract (though I've made some headway towards this, it's incomplete; it turns out fully automating table extraction from PDF isn't trivial. The best OSS solution is TabulaPDF by far but you still need to manually identify where the table is).

- Cash transactions: I have to enter those by hand. I only book non-food expenses as individual transactions directly, and for food maybe once every six months I'll count my wallet balance and insert one transaction per month to debit away the cash account toward food. If you do this, you end up with surprisingly few transactions to book manually, maybe a few/week. I suppose it could depend on lifestyle choices.

It takes me less than 1 hour/week to run through the active accounts, usually first thing Saturday morning when I get up. Most of the pain is logging with user/passwords into the various institutions and clicking the right buttons to generate the downloaded files. Extraction and filing is automated using importers I wrote against LedgerHub. Less active accounts are updated every quarter or when I feel like it.

- is there some better way to import bulk data (e.g. via ledger's
convert function) and post-edit once it's in ledger format? It seemed
a .csv in LO calc was pretty convenient vs. scrolling through a long
text file

- any other pointers along the above lines would be most welcome.

Check out LedgerHub for ideas.

Original design doc:

http://furius.ca/ledgerhub/doc/design

Post-mortem:

http://furius.ca/ledgerhub/doc/postmortem

The project is being killed right now, rewritten much better and simpler and migrated into the Beancount project; if you do end up looking at the code make sure you're checking out the "stable" branch, it's a bit of a riot on the default branch right now, it will be broken.

Essentially, I'm defining a config (in Python) as a list of "importer" objects and boil the process down to three steps:

1. Identify: Given a messy list of downloaded files (e.g. in ~/Downloads), automatically identify which importer is supposed to handle them

2. Extract: Extracting transactions and statement date from each file, if possible

3. File: Filing away the downloads to a directory hierarchy which mirrors the chart of accounts, for preservation, e.g. in a personal git repo.

You could think of adding

0. Fetch: Automatically download the files

but that's too hard. Personally I just don't have the stamina to implement this for myself. Given the nature of today's websites and the castles of JavaScript used to implement them, this would be a nightmare to implement for too little payoff. I love the idea of full automation, but I just don't have the time. Note that if you don't mind the nature of their business (they sell your data), you could potentially try to use Yodlee to pull much of it from a single place.
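
The identify / extract / file steps above can be sketched roughly as follows. This is a minimal illustration only, not LedgerHub's or Beancount's actual API: the `CsvImporter` class, its fields, and the `process` function are hypothetical names, and "extract" is reduced to returning the raw data rows.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass
class CsvImporter:
    """Hypothetical importer that claims CSV files with a known header."""
    account: str        # e.g. "Assets:Checking"
    header_prefix: str  # start of the header line this bank uses

    def identify(self, path: Path) -> bool:
        # Step 1: decide from the file itself whether this importer handles it.
        if path.suffix.lower() != ".csv":
            return False
        with path.open(encoding="utf-8", errors="replace") as f:
            return f.readline().strip().startswith(self.header_prefix)

    def extract(self, path: Path) -> list[str]:
        # Step 2: placeholder. A real importer would build transactions here.
        with path.open(encoding="utf-8", errors="replace") as f:
            return [line.rstrip("\n") for line in f.readlines()[1:] if line.strip()]

    def file_dir(self, root: Path) -> Path:
        # Step 3: mirror the chart of accounts, "Assets:Checking" -> root/Assets/Checking
        return root.joinpath(*self.account.split(":"))


def process(downloads: Path, root: Path, importers: list[CsvImporter]) -> dict[str, list[str]]:
    """Run identify, extract and file over every file in the downloads directory."""
    extracted = {}
    for path in sorted(downloads.iterdir()):
        if not path.is_file():
            continue
        for importer in importers:
            if importer.identify(path):
                extracted[path.name] = importer.extract(path)
                dest = importer.file_dir(root)
                dest.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(dest / path.name))
                break  # first matching importer wins
    return extracted
```

Files that no importer recognizes are simply left where they are, so a messy downloads directory is safe to point this at.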

In any case, you can't really get away without writing at least some code--it's just not realistic, the inputs from different people vary too much. There's very little shared code out there (just basic code for CSV files, like the tools you mention), and there are too few users who share the same accounts to generate the critical mass needed for reuse. A while back I created the LedgerHub project to host shared importer code and provide a framework for doing the above, but it never received many contributions, and honestly I didn't give it the care and attention to quality I should have. More importantly, regression testing for those importers is most easily carried out using actual downloaded files compared to a corresponding expected output, but these files don't share well (they contain lots of personal data) so one ends up with two repositories anyhow. And besides there are several design decisions in some importers that may not please every user, in particular about how you choose your accounts for investments (there are degrees of freedom), so even sharing is not entirely an obvious win.

By the way, I've found that regression testing is the _key_ to maintaining your importer code, because those importers are often written against file formats with no official spec and unexpected surprises show up routinely (e.g. I have XML files with some unescaped "&" characters, which require a custom fix "just for that bank", for instance, lots of nasty surprises), so you really need to be able to reproduce your tests. I think I have to make at least _some_ fix to an importer about once/month, and that sinks maybe a half-hour (involves adding the new file which makes it break, fix the importer code, and potentially update the older expected files for changes).
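
A regression check of the kind described above might look like the sketch below. It assumes a hypothetical layout in which each saved download sits next to a `<name>.expected` file holding the importer's known-good output; `check_importer` and its `update` flag are illustrative names, not part of any existing tool.

```python
import difflib
from pathlib import Path
from typing import Callable


def check_importer(extract: Callable[[Path], str], samples: Path, update: bool = False) -> list[str]:
    """Compare extract(sample) with '<sample>.expected' for every sample file.

    Returns the names of samples whose output differs. With update=True, or
    when no expected file exists yet, the expected file is written instead.
    """
    failures = []
    for sample in sorted(samples.iterdir()):
        if not sample.is_file() or sample.suffix == ".expected":
            continue
        expected_path = sample.with_name(sample.name + ".expected")
        actual = extract(sample)
        if update or not expected_path.exists():
            expected_path.write_text(actual, encoding="utf-8")
            continue
        expected = expected_path.read_text(encoding="utf-8")
        if actual != expected:
            failures.append(sample.name)
            diff = difflib.unified_diff(
                expected.splitlines(), actual.splitlines(),
                fromfile="expected", tofile="actual", lineterm="",
            )
            print("\n".join(diff))
    return failures
```

When a bank changes its format, the workflow is then: drop the breaking file into the samples directory, fix the importer until the returned list is empty, and rerun with `update=True` only for changes that are intended.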

I hope this helps give some color to the process,

I tried to search the list for more of this sort of question, so
forgive me if I've missed something. Replying with links pointing me
in the right direction would be plenty sufficient if this has already
been discussed!

Thanks!
John

[1] http://moneydance.com/

--

---
You received this message because you are subscribed to the Google Groups "Ledger" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ledger-cli+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

John Hendy

Feb 2, 2016, 10:48:08 PM

to Ledger, bean...@googlegroups.com

Thanks for the awesome reply!

Keep in mind that part of the process of importing (they like to call it "reconciling") involves
- Manually reviewing the transactions for correctness or fraud

I'll get there. For better or worse, I take the downloaded bank .csv as "truth" and am mostly interested in getting a better handle on what my money is used on, budgeting, planning, etc.

- Merging new transactions with previous transactions imported from the other side (e.g. a payment from a bank account to pay off one's credit card will typically be imported from both the bank AND credit card accounts; you must merge the corresponding transactions together)

Definitely. Moneydance allowed me to input an account, which would "link" the transaction. Then I'd have to delete or merge the other account's record of the same transaction.

- Assigning the right category (you can automate this with a script I suppose; frankly it's not much work, I do all of mine manually with the help of auto-completion from Emacs, which is the most important feature IMO)

Huh. Yes, I'll definitely have to look into the emacs mode. I assumed once it was in ledger format it would be *a lot* harder to navigate around vs. just doing it while it's already in a spreadsheet format.

- Moving the resulting transactions to the right place in your file.

I'll have to look into this more. I get that this is the ledger list... but is beancount different in this respect? From reading your docs, it sounded like beancount didn't care about order. Or are there other reasons (besides date) that one would have to move transactions around?

It takes me less than 1 hour/week to run through the active accounts, usually first thing Saturday morning when I get up. Most of the pain is logging with user/passwords into the various institutions and clicking the right buttons to generate the downloaded files. Extraction and filing is automated using importers I wrote against LedgerHub. Less active accounts are updated every quarter or when I feel like it.

This is a helpful time estimate/reference. My main account (checking) has ~100 transactions per month. I don't mind categorizing them myself, but I hoped for a quick-ish way to do that. Typing "expenses:blah:blah" is pretty fast in a spreadsheet. While I *use* emacs, I'm no navigation whiz, and going to the right place in a block of text to type the same thing seems super tedious vs. a spreadsheet. Hence I was puzzled that I couldn't use ledger's convert command to just bring in accounts from the .csv along with the rest. After all, all the dates and amounts are there, one can add payees... why not accounts?

You could think of adding
0. Fetch: Automatically download the files
but that's too hard. Personally I just don't have the stamina to implement this for myself. Given the nature of today's websites and the castles of JavaScript used to implement them, this would be a nightmare to implement for too little payoff. I love the idea of full automation, but I just don't have the time. Note that if you don't mind the nature of their business (they sell your data), you could potentially try to use Yodlee to pull much of it from a single place.

Yeah, not interested in that. It's not a big deal to download the few files I need.

In any case, you can't really get away without writing at least some code--it's just not realistic, the inputs from different people vary too much. There's very little shared code out there (just basic code for CSV files, like the tools you mention), and there are too few users who share the same accounts to generate the critical mass needed for reuse. A while back I created the LedgerHub project to host shared importer code and provide a framework for doing the above, but it never received many contributions, and honestly I didn't give it the care and attention to quality I should have. More importantly, regression testing for those importers is most easily carried out using actual downloaded files compared to a corresponding expected output, but these files don't share well (they contain lots of personal data) so one ends up with two repositories anyhow. And besides there are several design decisions in some importers that may not please every user, in particular about how you choose your accounts for investments (there are degrees of freedom), so even sharing is not entirely an obvious win.

That's okay, and I'm cool with trying some code. I primarily use R for data analysis/plotting, but have started getting introduced to python via Coursera recently and hope to dig in more. That's another thing that attracts me to beancount :) That said, these are more just general questions at this point. I'm amazed at how much documentation there is... but for a total noob, I can say it's a bit intimidating and kind of hard to know where one should start! Not to mention having questions and not being sure you're even searching for the right terminology to answer your question.

By the way, I've found that regression testing is the _key_ to maintaining your importer code, because those importers are often written against file formats with no official spec and unexpected surprises show up routinely (e.g. I have XML files with some unescaped "&" characters, which require a custom fix "just for that bank", for instance, lots of nasty surprises), so you really need to be able to reproduce your tests. I think I have to make at least _some_ fix to an importer about once/month, and that sinks maybe a half-hour (involves adding the new file which makes it break, fix the importer code, and potentially update the older expected files for changes).

I hope this helps give some color to the process,

Definitely, and sincere thanks for taking the time to give me some pointers!

John

Martin Blais

Feb 2, 2016, 11:07:29 PM

to ledger-cli, Beancount

BTW, there are some ideas around about automatically merging two incomplete transactions. This problem is the dual of the settlement-date problem, i.e., the two sides of a transaction may settle on different days.

See http://furius.ca/beancount/doc/proposal-settlement for some ruminations and scour the mailing-list, there is more discussion about this.

- Assigning the right category (you can automate this with a script I suppose; frankly it's not much work, I do all of mine manually with the help of auto-completion from Emacs, which is the most important feature IMO)

Huh. Yes, I'll definitely have to look into the emacs mode. I assumed once it was in ledger format it would be *a lot* harder to navigate around vs. just doing it while it's already in a spreadsheet format.

Definitely not, text is there for your pleasure. You typically organize your Ledger input file in the order that makes the most sense for you (minus some constraints: Ledger will report the transactions in the order they appear in the file and the balance assertions are computed as such. Beancount sorts everything by date so order doesn't matter).

- Moving the resulting transactions to the right place in your file.

I'll have to look into this more. I get that this is the ledger list... but is beancount different in this respect? From reading your docs, it sounded like beancount didn't care about order. Or are there other reasons (besides date) that one would have to move transactions around?

In Ledger, the reporting is done in file order. Balance assertions as well.

In Beancount, order is by date, so you don't have to care about how you organize them.

I think - but I'm not 100% sure - that most Ledger users must store their input file by section, and in each section in date order, to minimize the number of out-of-order transactions if they print out a register.

I use org-mode to create sections and each section is stored in date order for some subset of accounts.

This is a helpful time estimate/reference. My main account (checking) has ~100 transactions per month. I don't mind categorizing them myself, but I hoped for a quick-ish way to do that. Typing "expenses:blah:blah" is pretty fast in a spreadsheet. While I *use* emacs, I'm no navigation whiz, and going to the right place in a block of text to type the same thing seems super tedious vs. a spreadsheet. Hence I was puzzled that I couldn't use ledger's convert command to just bring in accounts from the .csv along with the rest. After all, all the dates and amounts are there, one can add payees... why not accounts?

You can probably script that away with a few rules.

I admit that 100 txns/month is more than I have, and I might look into auto-categorizing most of it myself if I were in that situation.

Problem is, everyone's little scripts appear to have little in common.
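
As one illustration of such a script, here is a minimal sketch that turns bank CSV text into Ledger-style entries, taking the category either from an optional hand-filled `account` column or from a few regex rules. Everything here is an assumption for illustration: the column names (`date`, `payee`, `amount`, `account`), the rules, the account names, and the convention that `amount` is the amount spent.

```python
import csv
import io
import re
from decimal import Decimal

# Hypothetical rules; the first matching pattern wins.
RULES = [
    (re.compile(r"co-?op|grocery", re.I), "Expenses:Food:Groceries"),
    (re.compile(r"shell|fuel", re.I), "Expenses:Auto:Fuel"),
]
DEFAULT_ACCOUNT = "Expenses:Unknown"


def categorize(payee: str) -> str:
    """Pick an expense account for a payee from the rule list."""
    for pattern, account in RULES:
        if pattern.search(payee):
            return account
    return DEFAULT_ACCOUNT


def csv_to_ledger(text: str, source_account: str = "Assets:Checking") -> str:
    """Convert CSV text with date, payee, amount and optional account columns."""
    entries = []
    for row in csv.DictReader(io.StringIO(text)):
        # A non-empty "account" cell typed in the spreadsheet overrides the rules.
        account = (row.get("account") or "").strip() or categorize(row["payee"])
        amount = Decimal(row["amount"])
        entries.append(
            f"{row['date']} {row['payee']}\n"
            f"    {account}  ${amount:.2f}\n"
            f"    {source_account}\n"
        )
    return "\n".join(entries)
```

Rows that match no rule fall back to `Expenses:Unknown`, so they are easy to find and fix by hand afterwards.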

That's okay, and I'm cool with trying some code. I primarily use R for data analysis/plotting, but have started getting introduced to python via Coursera recently and hope to dig in more.

R won't be fun for doing this. R makes it a huge pain to even do the kind of data cleaning necessary for prepping data for analysis. Definitely use Python over it, you'll save a lot of time. If you really need some specialized R module, you can create numpy arrays in Python and there is a module that allows you to invoke the R runtime with these. Best of both worlds, but I doubt you'll need it.

Martin Blais

Feb 2, 2016, 11:09:39 PM

to ledger-cli, Beancount

BTW, here's an auto-generated example file that looks similar to how I organize mine using org-mode:

https://bitbucket.org/blais/beancount/src/tip/examples/example.beancount

Erik Hetzner

Feb 3, 2016, 11:48:52 AM

to ledge...@googlegroups.com, bean...@googlegroups.com, John Hendy

On Tue, 02 Feb 2016 19:48:06 -0800,
John Hendy <jw.h...@gmail.com> wrote:
>
>
>
> On Monday, February 1, 2016 at 10:41:26 PM UTC-6, Martin Blais wrote:
> >
> > - Merging new transactions with previous transactions imported from the
> > other side (e.g. a payment from a bank account to pay off one's credit card
> > will typically be imported from both the bank AND credit card accounts; you
> > must merge the corresponding transactions together)
> >
>
> Definitely. Moneydance allowed me to input an account, which would "link"
> the transaction. Then I'd have to delete or merge the other account's
> record of the same transaction.

One needn’t actually merge these. Here is what I do:

2015/12/31 Credit card payment
    Assets:Checking           -$100
    Transfer:AC->LCC           $100

2016/01/01 Payment received
    Liabilities:CreditCard     $100
    Transfer:AC->LCC          -$100

Some kind person on the list pointed out this technique a while ago.

This makes import easier and allows for a difference in transit time. All
Transfer:* accounts should balance to $0, so you have an additional check that
everything is balancing out.
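
The "all Transfer:* accounts should balance to $0" check can be sketched in a few lines. This is only an illustration: it assumes the postings have already been flattened into `(date, account, amount)` tuples (parsing a real journal is left out), and `unbalanced_transfers` is a made-up helper name.

```python
from collections import defaultdict
from decimal import Decimal

# The two transactions above, flattened to (date, account, amount) postings.
POSTINGS = [
    ("2015/12/31", "Assets:Checking", Decimal("-100")),
    ("2015/12/31", "Transfer:AC->LCC", Decimal("100")),
    ("2016/01/01", "Liabilities:CreditCard", Decimal("100")),
    ("2016/01/01", "Transfer:AC->LCC", Decimal("-100")),
]


def unbalanced_transfers(postings, prefix="Transfer:"):
    """Return {account: balance} for transfer accounts that do not sum to zero."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for _date, account, amount in postings:
        if account.startswith(prefix):
            totals[account] += amount
    return {account: total for account, total in totals.items() if total != 0}


# Both sides imported: nothing is left in transit, so the result is empty.
print(unbalanced_transfers(POSTINGS))
# Only the bank side imported so far: the transfer account still holds $100.
print(unbalanced_transfers(POSTINGS[:2]))
```

A non-empty result points either at a payment still in transit or at a side that was never imported.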

best, Erik
--
Sent from my free software system <http://fsf.org/>.

Martin Blais

Feb 3, 2016, 12:01:20 PM

to Beancount, ledger-cli, John Hendy

I think it would be possible to process the stream of transactions and identify close matches based on common accounts and nearby dates, and automatically merge matching transactions into a single one, removing zero balances. The dual operation is assigning individual dates on postings of a single transaction and having the software split them up to obtain the result you describe.

In either case, matching transactions should be linked automatically and it should be possible to report on a set of matching transactions (a-la "bean-doctor linked").
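
A rough sketch of the matching-and-merging idea described above, under simplifying assumptions: each transaction holds at most one posting per account, two transactions "match" when they are within a few days of each other and cancel out exactly on a shared account, and the merged transaction takes the earlier date. The `Txn` class and `try_merge` function are hypothetical and are not part of Beancount.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Txn:
    date: datetime.date
    narration: str
    postings: dict  # account name -> Decimal amount


def try_merge(a: Txn, b: Txn, max_days: int = 3):
    """Merge a and b if their dates are close and they cancel on a shared account."""
    if abs(a.date - b.date) > datetime.timedelta(days=max_days):
        return None
    shared = [
        account for account in a.postings
        if account in b.postings and a.postings[account] + b.postings[account] == 0
    ]
    if not shared:
        return None
    merged = {}
    for postings in (a.postings, b.postings):
        for account, amount in postings.items():
            merged[account] = merged.get(account, Decimal(0)) + amount
    # Drop postings whose combined balance is zero (the in-transit account).
    merged = {account: amount for account, amount in merged.items() if amount != 0}
    return Txn(min(a.date, b.date), f"{a.narration} / {b.narration}", merged)
```

The dual operation (splitting one transaction with per-posting dates into two) would go the other way: insert a transit account and emit one transaction per date.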

John Hendy

Feb 3, 2016, 11:26:32 PM

to ledge...@googlegroups.com, Beancount

On Tue, Feb 2, 2016 at 10:07 PM, Martin Blais <bl...@furius.ca> wrote:
> On Tue, Feb 2, 2016 at 10:48 PM, John Hendy <jw.h...@gmail.com> wrote:
>>
>> On Monday, February 1, 2016 at 10:41:26 PM UTC-6, Martin Blais wrote:
>>>
>>> On Mon, Feb 1, 2016 at 1:13 PM, John Hendy <jw.h...@gmail.com> wrote:
>>>>
>>>> Greetings,
>>>>

[snip]

>>
>>
>> Huh. Yes, I'll definitely have to look into the emacs mode. I assumed once
>> it was in ledger format it would be *a lot* harder to navigate around vs.
>> just doing it while it's already in a spreadsheet format.
>
>
> Definitely not, text is there for your pleasure. You typically organize your
> Ledger input file in the order that makes the most sense for you (minus some
> constraints: Ledger will report the transactions in the order they appear in
> the file and the balance assertions are computed as such. Beancount sorts
> everything by date so order doesn't matter).

So, using the demo.ledger file as an example, if I run `convert` on my
downloaded bank file, I'm going to get something like this:

2010/12/01 * Checking balance
    Expenses:Unknown    $1,000.00
    Equity:Unknown

2010/12/20 * Organic Co-op
    Expenses:Unknown    $ 37.50  ; [=2011/01/01]
    Equity:Unknown      $ -225.00

2011/01/02 Grocery Store
    Expenses:Unknown    $ 65.00
    Equity:Unknown

Would you just go through that and manually change all of those
categories in ledger-mode? I still like starting from the bank .csv,
as it's got transaction ids and the amounts already in there... all I
need to do is add categories. It appears that `convert` defaults to
the above. As this is the primary thing of interest to me, I was sort
of surprised that ledger mode offered no pop-up minibuffer to edit the
account, at least from perusing the manual page. I only see options
for reconciling, reports, changing an amount, etc.

In any case, `convert` got most of my stuff into ledger format and
ledger-mode at least recognizes the blocks, so I'll likely just start
from there. I still have *a lot* more reading to do... for example:

- I noticed in the demo file, the co-op (which I snipped above)
purchases were in one chunk vs. treated as separate transactions. I
wouldn't default to this and am guessing it's just a preference thing
(compared to having one transaction per payment)?

- I still wrestle with deposits and withdrawals. Am I the payee? Is my
bank? Does it matter as long as some assets category goes positive and
another negative?

- I'd love tracking checks *as we write them* vs. just waiting for
them to appear. This used to really annoy me in Moneydance, as I'd go
through the checkbook once a month to see what was written but not
come through. Then I'd have to have these little note entries along
the way to remind me what the total of uncashed checks were to-date so
that the sums added up. I bet there was a better way in Moneydance I'd
missed, and I'm positive there's one in ledger/beancount.

Anyway, still taking it slow but feel like I'm starting to get to a
usable noob state.

Thanks,
John

Michael Norrish

Feb 3, 2016, 11:30:25 PM

to Ledger, Beancount

One way to do two-stage cheques would be something like

2016/1/25 * My Favourite Shop
    Expenses:Groceries              $100
    Liabilities:Unprocessed Checks

2016/1/31 * Check clearing
    Liabilities:Unprocessed Checks  $100
    Assets:Checking Account

You could assuredly add metadata to link the two transactions to be wrt some check #.

Michael

David Glasser

Feb 3, 2016, 11:41:38 PM

to ledge...@googlegroups.com, Beancount

One downside to doing it this way is that before you enter the check clearing transaction, Assets:Checking does not actually answer the question "how much can I take out of my checking account without bouncing a check", which surely is a very important use case.
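
To make that concrete, here is a small Python sketch, assuming a $500 opening balance (the opening-balance postings and the helper names are made up for illustration). Before the clearing entry exists, the checking balance alone overstates what can be withdrawn; adding the negative `Liabilities:Unprocessed Checks` balance gives the spendable figure.

```python
from collections import defaultdict
from decimal import Decimal

# (date, account, amount) postings: an assumed $500 opening balance,
# then the cheque written on 2016/1/25 that has not cleared yet.
postings = [
    ("2016/1/1", "Assets:Checking Account", Decimal("500")),
    ("2016/1/1", "Equity:Opening Balances", Decimal("-500")),
    ("2016/1/25", "Expenses:Groceries", Decimal("100")),
    ("2016/1/25", "Liabilities:Unprocessed Checks", Decimal("-100")),
]


def balances(postings):
    """Sum posting amounts per account."""
    totals = defaultdict(Decimal)
    for _date, account, amount in postings:
        totals[account] += amount
    return totals


def spendable(totals):
    # Outstanding cheques are carried as a negative liability balance,
    # so adding it to the checking balance subtracts them.
    return (totals["Assets:Checking Account"]
            + totals["Liabilities:Unprocessed Checks"])


totals = balances(postings)
print("checking balance:", totals["Assets:Checking Account"])
print("spendable:", spendable(totals))
```

Once the clearing transaction is entered, the liability drops to zero and both figures agree.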

--dave

Martin Blais

Feb 4, 2016, 1:42:41 AM

to Beancount, Ledger

On Wed, Feb 3, 2016 at 11:30 PM, Michael Norrish <michael...@gmail.com> wrote:

> One way to do two-stage cheques would be something like
>
> 2016/1/25 * My Favourite Shop
>     Expenses:Groceries  $100
>     Liabilities:Unprocessed Checks
>
> 2016/1/31 * Check clearing
>     Liabilities:Unprocessed Checks  $100
>     Assets:Checking Account
>
> You could assuredly add metadata to link the two transactions to be wrt some check #.

Yes, and I'm claiming that this linkage can probably be done by the computer.
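
As a rough illustration of what "done by the computer" could look like, here is a Python sketch that pairs each written cheque with the earliest unclaimed clearing entry of the same amount. The record layout (`id`, `date`, `amount`), the `link_checks` name and the 90-day window are assumptions for this example only; this is a naive heuristic, not how any existing plugin works.

```python
from datetime import date
from decimal import Decimal


def link_checks(written, cleared, max_days=90):
    """Pair written cheques with clearing entries by amount and date."""
    links = []
    remaining = sorted(cleared, key=lambda t: t["date"])
    for w in sorted(written, key=lambda t: t["date"]):
        for c in remaining:
            age = (c["date"] - w["date"]).days
            # Same amount, cleared on or after the day it was written.
            if c["amount"] == w["amount"] and 0 <= age <= max_days:
                links.append((w["id"], c["id"]))
                remaining.remove(c)  # each clearing entry is used once
                break
    return links


written = [
    {"id": "w1", "date": date(2016, 1, 25), "amount": Decimal("100")},
    {"id": "w2", "date": date(2016, 1, 28), "amount": Decimal("100")},
]
cleared = [
    {"id": "c1", "date": date(2016, 1, 31), "amount": Decimal("100")},
]

print(link_checks(written, cleared))
```

Cheques with no match (here `w2`) are simply the ones still outstanding. Matching on a check number, when the bank provides one, would be more reliable than matching on amounts.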

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/A57D9595-9C57-4D73-9276-86815B04924F%40gmail.com.

Martin Blais

Feb 4, 2016, 1:46:35 AM

to Beancount, Ledger

This is related to this:

https://bitbucket.org/blais/beancount/src/tip/src/python/beancount/plugins/tag_pending.py?at=default&fileviewer=file-view-default

See this thread for context:

https://groups.google.com/d/msg/beancount/z9sPboW4U3c/EQk25vKcHDQJ

Stefano Zacchiroli

Feb 4, 2016, 3:47:21 AM

to bean...@googlegroups.com, ledger-cli

On Mon, Feb 01, 2016 at 11:41:03PM -0500, Martin Blais wrote:
> Check out LedgerHub for ideas.

[...]

> The project is being killed right now, rewritten much better and simpler
> and migrated into the Beancount project; if you do end up looking at the

[...]

> In any case, you can't really get away without writing at least some
> code--it's just not realistic, the inputs from different people vary too
> much. There's very little shared code out there (just basic codes for CSV
> files, like the ones you mention) but too few users that share the same
> accounts to generate the critical mass needed for reuse. A while back I
> created the LedgerHub project to host shared importer code and provide a
> framework for doing the above, but never received much contributions and
> honestly I didn't put the care and quality attention to it I should have.

I've the feeling that, right now, the lack of a generic framework ---
generic both on the front of data source (CSV, OFX, weird bank formats,
etc.) and on that of output formats (ledger, beancount, etc.) --- for
semi-automatically importing transactions is perhaps the most
significant limiting factor for the adoption of CLI accounting.

I went myself through the ad-hoc automation of my work-flow for
importing transactions from my bank, scripting together the web outside
of browsers suite [1] and icsv2ledger. It works decently enough for me,
but the CLI accounting community cannot really expect every newcomer to
go through that hacking process if it wants the community to flourish.

[1]: http://weboob.org/

Martin is right that the most front-end part of the import chain (web
scraping in most cases) will always remain a case-by-case business. But
there, communities like weboob can fill that niche quite nicely, if only
they will manage to grow and be diverse enough. (Right now that
community is very much skewed toward supporting French banks, with very
sparse support for other international banks.)

But the rest of the toolchain, from CSV down to your favorite CLI
accounting tool can really do better in terms of reference tools and
automation. I'm sorry I haven't had time/energy to contribute myself to
ledgerhub, because the design looked pretty solid; I'm looking forward
to the new rewrite :-)
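
To sketch what I mean by the "CSV down" part, here is a toy Python pipeline: one reader that produces a neutral transaction record, and one small writer per output format. Everything in it (the `Txn` record, the substring rules, the CSV column names, the USD currency) is an assumption for illustration and does not reflect the design of ledgerhub or any existing tool.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass
class Txn:
    day: date
    payee: str
    amount: Decimal
    account: str


def read_csv(text, rules, default="Expenses:Unknown"):
    """Source side: parse rows and pick an account by payee substring."""
    for row in csv.DictReader(io.StringIO(text)):
        payee = row["payee"]
        account = next(
            (acct for key, acct in rules.items() if key.lower() in payee.lower()),
            default,
        )
        day = datetime.strptime(row["date"], "%Y-%m-%d").date()
        yield Txn(day, payee, Decimal(row["amount"]), account)


def to_ledger(t, source="Assets:Checking"):
    """Output side, ledger flavour."""
    return f"{t.day:%Y/%m/%d} * {t.payee}\n    {t.account}  ${t.amount}\n    {source}\n"


def to_beancount(t, source="Assets:Checking"):
    """Output side, beancount flavour (payee quoting is not escaped here)."""
    return f'{t.day:%Y-%m-%d} * "{t.payee}"\n  {t.account}  {t.amount} USD\n  {source}\n'


# Hypothetical rules and bank export.
RULES = {"cafe": "Expenses:Food:Dining", "gas": "Expenses:Auto:Fuel"}
SAMPLE = """date,payee,amount
2016-01-25,Corner Cafe,12.50
2016-01-26,Unknown Shop,30.00
"""

for txn in read_csv(SAMPLE, RULES):
    print(to_ledger(txn))
    print(to_beancount(txn))
```

The point is only that the bank-specific part stays in the reader, while the rules and the writers could be shared.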

</ramble>
--
Stefano Zacchiroli . . . . . . . za...@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . . . . @zacchiro . . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

Martin Blais

Feb 5, 2016, 12:58:10 AM

to ledger-cli, Beancount, hle...@googlegroups.com

On Thu, Feb 4, 2016 at 12:41 PM, Simon Michael <si...@joyful.com> wrote:

> Hi Stefano, I agree. Many things should be more straightforward and
> better documented.
>
> I think we could pick out a few common tasks to focus our
> tool-building/documenting efforts on. Eg:
>
> 1. importing bank data and CSV generally. All of the tools and basic
> generic workflows for this should be described on one page. Focus on
> CSV, but we should mention OFX too (ledger-autosync is arguably best at
> this with its download feature).

FWIW, there's a generic OFX converter in LedgerHub as well.

https://bitbucket.org/blais/ledgerhub/src/261b5c29ddc37fd44fe7bba83ad906e41aab47dc/lib/python/ledgerhub/ingest/importers/ofx.py?at=default&fileviewer=file-view-default

(I'm planning to keep that importer in the LedgerHub revamp as a default and example, and will likely include a configurable CSV importer as well.)

> 2. exporting all data and reports as CSV
>
> 3. moving data between the ledger-likes (ledger, hledger, beancount...).
> Again, all tools and techniques gathered on one page. All existing
> formats should be listed. The output of "ledger print" is a sort of
> lowest common denominator, I propose we give it a name and decree that
> every tool should import this as a basic interchange format. And/or a
> standardised CSV representation of it, such as "hledger print -O csv"

Beancount has reports to convert from its syntax to Ledger and HLedger.

Defined here:

https://bitbucket.org/blais/beancount/src/d1d2ef0f1b6faf7bafdadc0f3a6ea20515df5ff5/src/python/beancount/reports/convert_reports.py?at=default&fileviewer=file-view-default

Converting from Beancount to Ledger in a non-lossy way is possible because of the nature of dated assertions, but it is not possible the other way around: you have to drop the file-based assertions.

Simon Michael

Feb 7, 2016, 11:44:23 AM

to bean...@googlegroups.com, ledger-cli, hle...@googlegroups.com

We have a lot of docs, in various states of freshness, specific to each
implementation. Also many informative blog and mail list posts. Much of
this is hard to find.

Reading Stefano's recent ledger list post, I think, not for the first
time, wouldn't it be great if we had all of this linked somewhere
central, curated, and presented beautifully, providing an easy on-ramp
and reference for newcomers and experts ? Actively maintained by the
community ?

If so, where would that somewhere be ? ledger-cli.org and/or its wiki is
the closest existing candidate, but it has never felt right to load that
up with non-Ledger stuff. I think it's valuable for each implementation
to have its own distinct site. I think a separate, well-named, highly
findable site, even a single page collecting all useful links and acting
as a portal to the ledgerverse, could be a win.

If you agree, what would you call it ? Martin, since you are retiring
your LedgerHub tool, would that name be available ?

Related to naming.. what do we call this whole topic, anyway ? Stefano
used the phrase "command-line accounting". But we have curses and web
GUIs too. "Plain-text accounting" ? Pretty soon we'll probably support
some non-text storage format. "Ledger clones" ? Too narrow. Aside: in
conversation, I use "ledger-likes" for things similar to but not
necessarily compatible with Ledger (ledger, hledger, beancount, abandon,
penny) and "*ledger" for very compatible ledger-likes (ledger, hledger).

Any thoughts ?

-Simon

On 2/4/16 9:41 AM, Simon Michael wrote:
> I think we could pick out a few common tasks to focus our
> tool-building/documenting efforts on. Eg:
>
> 1. importing bank data and CSV generally. All of the tools and basic
> generic workflows for this should be described on one page. Focus on
> CSV, but we should mention OFX too (ledger-autosync is arguably best at
> this with its download feature).
>

> 2. exporting all data and reports as CSV
>
> 3. moving data between the ledger-likes (ledger, hledger, beancount...).
> Again, all tools and techniques gathered on one page. All existing
> formats should be listed. The output of "ledger print" is a sort of
> lowest common denominator, I propose we give it a name and decree that
> every tool should import this as a basic interchange format. And/or a
> standardised CSV representation of it, such as "hledger print -O csv"
>

> 4. moving data from and to other accounting tools (gnucash, moneydance,
> excel, quick{en,books}, mobile account apps)
>
> 5. manual data entry. Editors and their modes, ledger entry, hledger add
> and other prompting tools, hledger-web, recurring entry scripts, etc.
>
> 6. a catalog of journal entries covering all common transactions
>

John Wiegley

Feb 7, 2016, 12:02:02 PM

to Simon Michael, bean...@googlegroups.com, ledger-cli, hle...@googlegroups.com

>>>>> Simon Michael <si...@joyful.com> writes:

> If so, where would that somewhere be ? ledger-cli.org and/or its wiki is the
> closest existing candidate, but it has never felt right to load that up with
> non-Ledger stuff. I think it's valuable for each implementation to have its
> own distinct site. I think a separate, well-named, highly findable site,
> even a single page collecting all useful links and acting as a portal to the
> ledgerverse, could be a win.

I'd be happy for us to move everything Ledger-related to an "All things
Ledger-verse" Wiki, where we differentiate features when necessary among the
various implementations.

> Related to naming.. what do we call this whole topic, anyway ?

How about functional accounting tools?

--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2

John Hendy

Feb 7, 2016, 12:32:42 PM

to ledge...@googlegroups.com, hle...@googlegroups.com, Beancount

On Feb 7, 2016 10:44 AM, "Simon Michael" <si...@joyful.com> wrote:

> If you agree, what would you call it ? Martin, since you are retiring
> your LedgerHub tool, would that name be available ?
>
>
> Related to naming.. what do we call this whole topic, anyway ? Stefano
> used the phrase "command-line accounting". But we have curses and web

For what it's worth, I'm super new and even though I knew what I was looking for... having the tool named the same as a common finance term makes it super tough to find relevant info.

Ever tried googling "ledger tutorial"? :)

In any case, my input would be to call this family of things something that existing users can find, as well as something that might make it onto the first page of Google results (if you'd like new people to find it as well).

John

Stefano Zacchiroli

Feb 7, 2016, 1:54:20 PM

to bean...@googlegroups.com, ledger-cli, hle...@googlegroups.com

On Sun, Feb 07, 2016 at 08:44:20AM -0800, Simon Michael wrote:
> Related to naming.. what do we call this whole topic, anyway ? Stefano
> used the phrase "command-line accounting". But we have curses and web
> GUIs too. "Plain-text accounting" ? Pretty soon we'll probably support
> some non-text storage format. "Ledger clones" ? Too narrow. Aside: in
> conversation, I use "ledger-likes" for things similar to but not
> necessarily compatible with Ledger (ledger, hledger, beancount, abandon,
> penny) and "*ledger" for very compatible ledger-likes (ledger, hledger).
>
> Any thoughts ?

I believe I've first seen the usage of "CLI accounting" to refer
generically to the ledger/hledger/beancount/etc. community in some of
Martin Blais' design documents for beancount.

I don't think it's a very appropriate term either (for the same reasons
you mentioned), and IIRC I briefly discussed naming with Martin back
then, although only in private mail. I do agree with both Martin and you
that we do need a generic expression to characterize this amazing
community. So I just went along with Martin's lead in my previous mail,
for the sake of simplicity :-)

Now, if we're back to the drawing board (and hence opening the
bikeshedding gates!), I think the most accurate would indeed be "plain
text accounting". Plain text is the most distinguishing trait of what
we're collectively doing here. We might end up having *additional* non
text-based storage formats, but having only those will profoundly change
the target public and, hence, the community itself I believe.

Other traits (and in particular CLI approaches) are less defining.

Cheers.

Martin Blais

Feb 7, 2016, 2:00:12 PM

to Beancount, ledger-cli, hle...@googlegroups.com

On Sun, Feb 7, 2016 at 11:44 AM, Simon Michael <si...@joyful.com> wrote:

> We have a lot of docs, in various states of freshness, specific to each
> implementation. Also many informative blog and mail list posts. Much of
> this is hard to find.

Is it?

> Reading Stefano's recent ledger list post, I think, not for the first
> time, wouldn't it be great if we had all of this linked somewhere
> central, curated, and presented beautifully, providing an easy on-ramp
> and reference for newcomers and experts ? Actively maintained by the
> community ?

I keep all relevant docs linked from a single place:

http://furius.ca/beancount/doc/index

Everything Beancount can be found there.

I think a few choice threads should be linked from somewhere too eventually.

> If so, where would that somewhere be ? ledger-cli.org and/or its wiki is
> the closest existing candidate, but it has never felt right to load that
> up with non-Ledger stuff. I think it's valuable for each implementation
> to have its own distinct site. I think a separate, well-named, highly
> findable site, even a single page collecting all useful links and acting
> as a portal to the ledgerverse, could be a win.
>
> If you agree, what would you call it ? Martin, since you are retiring
> your LedgerHub tool, would that name be available ?

It will be "available" (this just means I'll put a big fat notice on its homepage that it has been swallowed by Beancount), but I wouldn't recommend reusing it; that would just create more confusion. It's a bad idea IMHO.

> Related to naming.. what do we call this whole topic, anyway ? Stefano
> used the phrase "command-line accounting". But we have curses and web
> GUIs too. "Plain-text accounting" ? Pretty soon we'll probably support
> some non-text storage format. "Ledger clones" ? Too narrow. Aside: in
> conversation, I use "ledger-likes" for things similar to but not
> necessarily compatible with Ledger (ledger, hledger, beancount, abandon,
> penny) and "*ledger" for very compatible ledger-likes (ledger, hledger).

Command-line accounting is the most evocative IMO.

John Hendy

Feb 8, 2016, 9:08:34 AM

to ledge...@googlegroups.com, Beancount, hle...@googlegroups.com

On Sun, Feb 7, 2016 at 12:59 PM, Martin Blais <bl...@furius.ca> wrote:
> On Sun, Feb 7, 2016 at 11:44 AM, Simon Michael <si...@joyful.com> wrote:
>>
>> We have a lot of docs, in various states of freshness, specific to each
>> implementation. Also many informative blog and mail list posts. Much of
>> this is hard to find.
>
>
> Is it?
>
>
>> Reading Stefano's recent ledger list post, I think, not for the first
>> time, wouldn't it be great if we had all of this linked somewhere
>> central, curated, and presented beautifully, providing an easy on-ramp
>> and reference for newcomers and experts ? Actively maintained by the
>> community ?
>
>
> I keep all relevant docs linked from a single place:
> http://furius.ca/beancount/doc/index
> Everything Beancount can be found there.
>
> I think a few choice threads should be linked from somewhere too eventually.

I'm not a huge fan of Google Docs, but I *do* quite like your index
and then having more topic-specific pages. Ledger's docs have the TOC,
but I've found navigating that monolith page somewhat unappealing. I'm
not really sure why as I can still search the page... just something
about being somewhere in the sea of all that text just isn't my
favorite. I rather like Org-mode's documentation:
- http://orgmode.org/manual/

It's got a TOC, and sections are further broken down. This also makes
it easy to link someone to a bite-sized page for help, e.g.:
- http://orgmode.org/manual/Motion.html#Motion

Then there's the user-contributed wiki, Worg:
- http://orgmode.org/worg/

There are more "official" pages on using various features, as well as
user-updated link stores for things like "outside" tutorials:
- http://orgmode.org/worg/org-tutorials/index.html

Anyone can contribute, and even in my ~1 week on this mailing list
I've already seen that there are clearly different ideas of how to
accomplish various tasks. For example, probably 5 folks suggested ways
to handle inter-account transfers (like checking <-> credit card).
These could be put in a user-contributed wiki so that the information
isn't purely stored in a mailing list. The relationship seems to be
that the manual has more syntax/setup information, while the wiki is
more for tutorials and practical usage. If Ledger's docs were rewritten
this way, I think the "prose-y" stuff about double entry accounting might
live in the wiki, while things like `ledger -f file.dat [options]`
would be in the manual, along with valid file/transaction syntax,
internals, augmenting with scripts, etc.

Just some ideas from experience with another open source project!

John

Martin Blais

Feb 8, 2016, 9:42:13 AM

to hle...@googlegroups.com, ledger-cli, Beancount

On Mon, Feb 8, 2016 at 9:08 AM, John Hendy <jw.h...@gmail.com> wrote:

> On Sun, Feb 7, 2016 at 12:59 PM, Martin Blais <bl...@furius.ca> wrote:
>> On Sun, Feb 7, 2016 at 11:44 AM, Simon Michael <si...@joyful.com> wrote:
>>>
>>> We have a lot of docs, in various states of freshness, specific to each
>>> implementation. Also many informative blog and mail list posts. Much of
>>> this is hard to find.
>>
>>
>> Is it?
>>
>>
>>> Reading Stefano's recent ledger list post, I think, not for the first
>>> time, wouldn't it be great if we had all of this linked somewhere
>>> central, curated, and presented beautifully, providing an easy on-ramp
>>> and reference for newcomers and experts ? Actively maintained by the
>>> community ?
>>
>>
>> I keep all relevant docs linked from a single place:
>> http://furius.ca/beancount/doc/index
>> Everything Beancount can be found there.
>>
>> I think a few choice threads should be linked from somewhere too eventually.
>
> I'm not a huge fan of Google Docs, but I *do* quite like your index
> and then having more topic-specific pages.

What's wrong with Google Docs?

My experience with it is tremendously positive and enduring. The ability to receive comments and suggestions in-line by anyone has been a key contributing feature to increasing the quality and correctness of the text. And it looks terrific on all devices. I have instant access to all past revisions. I can download to various formats, including bake a PDF book. I can process them using the API. I wish it was easier to make the entire text conform to a particular style (separate style vs. contents) but that's mostly an author concern, not a reader. And in comparison to all my other projects, the number of user contributions to the docs has far, far exceeded that which I would have received had I written them in texinfo, ReST, markdown, LaTeX or any other format (and I'm no stranger to those).

What I've found is that people who don't like it do so mostly on dubious ideological grounds (the docs are open and downloadable, it's not a walled garden), and partly because it's highly unusual in the OSS community. Are you sure you're not just having an irrational response to something you're not used to seeing? Documentation is difficult to write, and most projects are vastly underdocumented. Google Docs makes it a _lot_ easier to write great docs, and to make corrections, together. It's perfect for open source projects, more should adopt it.

Have you seen my cookbook docs?

https://docs.google.com/document/d/1RaondTJCS_IUPBHFNdT8oqFKJjVJDsfsn6JEjBG04eA/edit#heading=h.ydun6pj2h2kq

John Wiegley

Feb 9, 2016, 8:01:50 PM

to Martin Blais, hle...@googlegroups.com, ledge...@googlegroups.com, Beancount

>>>>> Martin Blais <bl...@furius.ca> writes:

> What's wrong with Google Docs?

I do not trust it to stay around as long as Ledger will, and I wouldn't want to
have to scrape and convert all the text once it does disappear. Better to
pick an open, enduring format now and stick with it.

Martin Blais

Feb 9, 2016, 8:27:59 PM

to John Wiegley, Martin Blais, hle...@googlegroups.com, ledger-cli, Beancount

On Tue, Feb 9, 2016 at 7:55 PM, John Wiegley <jo...@newartisans.com> wrote:

> >>>>> Martin Blais <bl...@furius.ca> writes:
>
>> What's wrong with Google Docs?
>
> I do not trust it to stay around as long as Ledger will, and I wouldn't want to
> have to scrape and convert all the text once it does disappear. Better to
> pick an open, enduring format now and stick with it.

You speak as if these things are equivalent. Google Docs vs. text-based formats isn't a fair comparison: By using texinfo, you're foregoing daily (yes, daily, and I'm not exaggerating) feedback, corrections and suggestions on your documentation and in-context interaction with your users about which parts are confusing and where they suggest improvements to it--and I respond to them. It's the best experience I've ever had writing user documentation. It's an experience. It's a workflow. It's something else. These aren't the same thing.

About openness: You should have nothing to worry about, here's the open, free API:

https://developers.google.com/drive/v2/reference/

Here's a script that uses it to automatically download all the docs:

https://bitbucket.org/blais/beancount/src/tip/src/python/beancount/docs/download_docs.py

You can also click on a folder to download a zipped version of the contents if you're the owner. Takes a few seconds.

You can download in OpenDocument and even .txt formats. Conversion, should it be required, should be a breeze.

Nothing to worry about.

And besides, do you really believe that Google would just pull the plug on Docs without a long and loud warning and an archival feature?

I find the reaction from people in the OSS community interestingly puzzling and somewhat curmudgeonly. I sort-of understand it in a way: for many problems, for years, sticking with Linux and simple solutions has proved superior to many commercial solutions to many problems, at least in the small scope. And most people deal within the scope of small and medium. And we've all had the frustrating experience of working in some nasty commercial environments. And we're attached to those homemade solutions where we feel productive. But this is a case where I'm witnessing the OSS community unable to think outside the box. Docs is measurably better for shared collaborative editing than anything else out there. It's not the same thing at all.

Dominik Aumayr

Feb 10, 2016, 1:49:01 AM

to bean...@googlegroups.com, John Wiegley, Martin Blais, hle...@googlegroups.com, ledger-cli

As a member of the beancount community I can see how Google Docs has huge benefits in the practical world:

- WYSIWYG
- Editable by everyone with an account
- Changesets
- Inline-comments
- It feels like a document, not a website (although this is not always a benefit)

And I also get why people might want something else, having just started work on a new starting point for the beancount documentation (based on Sphinx; http://aumayr.github.io/beancount-docs-static/):

- Ideological reasons (Google is evil for [some|many] people in these communities)
- What if Google Docs goes away? Exporting the content is a feature, but then manually converting the content to fit a new system would be necessary anyway.
- Navigation is a pain IMHO (missing sidebar-menus, etc.)
- No syntax highlighting or API documentation (the second reason, after navigation, why I started the effort mentioned above)

Especially the navigation-part is a showstopper for Google Docs IMHO. But, as mentioned, the benefits are really helpful for getting lots and lots of feedback, because it's so easy to edit.

Implementing a new documentation system with WYSIWYG, Changesets, an Inline-comment-feature, but also easy navigation, search and syntax highlighting is an option, too.
I'd be willing to work on it, if consensus is reached that this would be the best option.

Stefano Zacchiroli

Feb 10, 2016, 2:29:47 AM

to ledge...@googlegroups.com, hle...@googlegroups.com, Beancount

On Tue, Feb 09, 2016 at 08:27:38PM -0500, Martin Blais wrote:
> I find the reaction from people in the OSS community interestingly
> puzzling and somewhat curmudgeonly. I sort-of understand it in a way:
> for many problems, for years, sticking with Linux and simple solutions
> has proved superior to many commercial solutions to many problems, at
> least in the small scope. And most people deal within the scope of
> small and medium. And we've all had the frustrating experience
> of working in some nasty commercial environments. And we're attached
> to those homemade solutions where we feel productive. But this is a
> case where I'm witnessing the OSS community unable to think outside
> the box. Docs is measurably better for shared collaborative editing
> than anything else out there. It's not the same thing at all.

You're overlooking at least the fact that using Google Docs forces
people to use non-free software, in the form of minified JavaScript
whose preferred form of modification is not available, onto any kind of
users including simple readers of documents hosted there. I personally
do not like being forced to use non-free software, especially when it is
run on my own computer.

You're free not to care about that. But I find it weird that you find it
weird that people in the free/open source software community might have a
problem with that. If not us, who?

If you've the ability to fix this, please do. If not, please at least
consider this as yet another argument against using Google Docs for
documentation targeted at the FOSS community.

Cheers.

Stefan Tunsch

Feb 10, 2016, 3:07:13 AM

to 'Charles Lehner' via Ledger, hle...@googlegroups.com, bean...@googlegroups.com

Why not publish documentation on readthedocs.org?

It is a solution similar to GitHub for code, but for documentation.

You can include your docs in your code repository written in Markdown or reStructuredText.

Readthedocs will pull it and compile it for people to read and search online.

On GitHub, you can even edit the docs online with its online editor.

Just browse the site and have a look at the kind of projects hosting their documentation there.

Regards, Stefan Tunsch

John Wiegley

Feb 10, 2016, 10:09:08 AM

to Martin Blais, hle...@googlegroups.com, ledger-cli, Beancount

Martin,

I don't believe your assertion that conversion "should be a breeze". Google is
not free/libre software. And Docs is not a system I can use while offline and
while on a plane, unless I remembered to download the page ahead of time (for
browsers that support offline editing). Other presentation concerns were also
brought up.

Since this will be a community docs site, my vote isn't a veto, but I have
voiced my concerns.

John Hendy

Feb 10, 2016, 9:43:08 PM

to ledge...@googlegroups.com, hle...@googlegroups.com, Beancount

On Mon, Feb 8, 2016 at 8:41 AM, Martin Blais <bl...@furius.ca> wrote:
> On Mon, Feb 8, 2016 at 9:08 AM, John Hendy <jw.h...@gmail.com> wrote:
>>
>> On Sun, Feb 7, 2016 at 12:59 PM, Martin Blais <bl...@furius.ca> wrote:
>> > On Sun, Feb 7, 2016 at 11:44 AM, Simon Michael <si...@joyful.com> wrote:
>> >>
>> >> We have a lot of docs, in various states of freshness, specific to each
>> >> implementation. Also many informative blog and mail list posts. Much of
>> >> this is hard to find.
>> >
>> >
>> > Is it?
>> >
>> >
>> >> Reading Stefano's recent ledger list post, I think, not for the first
>> >> time, wouldn't it be great if we had all of this linked somewhere
>> >> central, curated, and presented beautifully, providing an easy on-ramp
>> >> and reference for newcomers and experts ? Actively maintained by the
>> >> community ?
>> >
>> >
>> > I keep all relevant docs linked from a single place:
>> > http://furius.ca/beancount/doc/index
>> > Everything Beancount can be found there.
>> >
>> > I think a few choice threads should be linked from somewhere too
>> > eventually.
>>
>> I'm not a huge fan of Google Docs, but I *do* quite like your index
>> and then having more topic-specific pages.
>
>
> What's wrong with Google Docs?

Sorry for the delayed reply -- busy week!

> My experience with it is tremendously positive and enduring. The ability to
> receive comments and suggestions in-line by anyone has been a key
> contributing feature to increasing the quality and correctness of the text.
> And it looks terrific on all devices. I have instant access to all past
> revisions. I can download to various formats, including bake a PDF book. I
> can process them using the API. I wish it was easier to make the entire text
> conform to a particular style (separate style vs. contents) but that's
> mostly an author concern, not a reader. And in comparison to all my other
> projects, the number of user contributions to the docs has far, far exceeded
> that which I would have received had I written them in texinfo, ReST,
> markdown, LaTeX or any other format (and I'm no stranger to those).

Very good points, and ones I hadn't initially considered but
definitely buy. Org-mode's manual is in texi, and its wiki is in
Org-mode format, both managed via git. I admit that as simple as edits
are objectively, the simple act of pulling, changing, and pushing is
barrier enough that I don't act as often as I could. Some of it even
comes down to idiosyncrasies of texi that I didn't know (two spaces at
the end of every sentence... or is it line?). In any case, I can see
the user-contributed changes/suggestions thing for sure.

> What I've found is that people who don't like it do so mostly on dubious
> ideological grounds (the docs are open and downloadable, it's not a walled
> garden), and partly because it's highly unusual in the OSS community. Are
> you sure you're not just having an irrational response to something you're
> not used to seeing? Documentation is difficult to write, and most projects
> are vastly underdocumented. Google Docs makes it a _lot_ easier to write
> great docs, and to make corrections, together. It's perfect for open source
> projects, more should adopt it.

No, I'm not totally sure it's not just "different," and good question.
That said, some things come to mind.

- While I haven't used docs much for actual docs (usually just where
.doc[x] files people email me end up when downloaded on my phone),
I've used things like google drive and don't like the urls. Maybe
that's silly, but it's just cleaner. I just find something nice about
the link telling me *something* about the content. Compare:
--- Docs: https://docs.google.com/document/d/1dW2vIjaXVJAf9hr7GlZVe3fJOkM-MtlVjvCO1ZpNLmg/edit#heading=h.2ax1dztqboy7
--- Hypothetical: https://furius.ca/beancount/manual/beancount-vs-ledger

- Perhaps again silly, but I don't know why the documentation always
opens in editing/suggestion mode. I keep freaking out that I've bumped
a key and it's suggested something to you when I didn't even realize
it was doing so.

- Similar to links, Docs don't convey any sense of being related. I
guess I'm 3/3 of perhaps insignificant dislikes, but I just don't like
that there's no way to know two docs are related. Google Sites had
that (same domain).

- When I've used Drive to host some PDF and linked to it from my blog,
I *swear* the link changed if I changed the sharing settings on the
doc. Then I'd have to track down and update the gobbledy-gook links.

- On the subject of links, docs seems to assume the same thing you
think should not be assumed by ledger: user diligence/perfection. What
if you delete/move/rename a page? Docs assumes you will perfectly
track down every instance of that link and replace it with the new
one.[1] Other systems have automated ways of updating the TOC,
creating a hierarchy, etc. and the pages are all known so grep or sed
changes seem fairly straightforward. Your doc collection may be pretty
manageable at this point, but I can imagine a large number of
documents where link management would become pretty annoying.

- I'm coming from doing 90% of my document generation via Org-mode,
but I would rather type in markup and define systematic formatting
than constantly fiddle with italics, bold, numbering, etc. You already
hinted that this was a limitation of docs, so I don't think it's worth
saying more about.

- I really don't care that much about the OSS vs. not argument. I
don't pay for github repos... if they decided to shutdown, I'd be
stuck without hosting and have to move things. If Docs went down, it
sounds like there are ways to get your stuff, so I don't think that
argument is super strong. As long as the *format* isn't proprietary,
no big deal in my mind.

Anyway, I seriously meant that as a super in-passing comment (akin to
"I like Colgate more than Crest"). I really didn't mean to get into or
start a doc war! Wouldn't a community wiki have all of the advantages
of Google Docs as well as some further advantages others are
mentioning? It seems like what you like the most are:

- encouraging community edits (low barrier to a user suggesting/editing)
- edits digestible in some easy way (appear in-line, or similar)
- exportable to various formats
- revision tracking

Is that reasonably accurate?

Thanks!
John

[1] I wasn't aware of some of the features you mentioned in other
replies, so maybe it's easier than I know to quickly hunt down
changed/deleted links?

Martin Blais

Feb 12, 2016, 6:25:39 PM

to Beancount, John Wiegley, Martin Blais, hle...@googlegroups.com, ledger-cli

On Wed, Feb 10, 2016 at 1:48 AM, Dominik Aumayr <dom...@aumayr.name> wrote:

> As a member of the beancount community I can see how Google Docs has huge benefits in the practical world:
>
> - WYSIWYG
> - Editable by everyone with an account
> - Changesets
> - Inline-comments
> - It feels like a document, not a website (although this is not always a benefit)
>
> And I also get why people might want something else, having just started work on a new starting-point for the beancount documentation (based on Sphinx; http://aumayr.github.io/beancount-docs-static/):
>
> - Ideological reasons (Google is evil for [some|many] people in these communities)

That's a very difficult argument to make.

> - What if Google Docs goes away? Exporting the content is a feature, but then manually converting the content to fit a new system would be necessary anyway.
>
> - Navigation is a pain IMHO (missing sidebar-menus, etc.)

That could indeed be better, but other than the sidebar and the fact that it looks familiar to Python users (who arguably aren't necessarily the typical Beancount users), how is this:

http://aumayr.github.io/beancount-docs-static/users/index.html

substantially different or better than this?

furius.ca/beancount/doc/index

I'm not dissing on your effort BTW - I quite like it and I'll integrate it - but rather I'm using it to point out that they're both pages with links and that they're similar. They look different, but they pretty much have the same content. I understand it doesn't look familiar, but ask yourself: is it just a matter of people having to get used to something new, or is there any real substance to that difference?

If you were focused exclusively on the rendering, you'd be overlooking the most important part of this: it's a dynamic and collaborative experience. It's a living process. It makes it trivially easy for anyone, with or without an account, with or without programming skills, to update the content and collaborate with me on it, in-context. To pop in a correction for a wrong number (so many corrections so far!). To reword a phrase poorly constructed (thank you readers!). The experience is substantially distinct and better than a wiki as well, especially if you have non-technical readers. Comments and suggestions are treated differently than content, and integrated with a notification system via email. There's no awkward syntax to learn. Even technical users have a difficult time keeping wikis up-to-date! I'm witnessing it daily, both in the OSS world and in the commercial sphere. Maintaining up-to-date docs is REALLY difficult for engineers. Here ... well you just make the edit right there in front of your eyes as you notice something that needs a change and an email is automatically sent to the author.

Now, a reader shouldn't care what technology I use to write the documentation. That's the author's problem. A reader should worry about whether the documentation is readable, well-written, understandable, usable, available where they need it, indexed, cross-referenced, etc.

As a potential contributor, in theory there's the offline editing problem, but in practice, most of the docs are written when you're online anyhow. You can write the text in Emacs and import it later on when you reconnect if you're stuck on a flight or a train, as John mentions (how often does that happen anyway?). The actual typing is not what takes time at all--it's the thinking and the re-reading 20 times and re-writing and correcting and looking up definitions that takes time and effort. Cut-n-pasting from Emacs anywhere takes me one second. That's what I do for the few times I happen to write offline, and it's not a big hindrance.

But at the heart of this thread there is a subtle disconnect: I think most people on the list are focused on the ARTIFACT, and not on the PRODUCT. There is a similar problem in discussions around computer languages, where people are focused more on whether the language makes it easy for them to write the code over whether the language makes it easy to write "excellent software" without bugs; but... what matters is the product, not the code. No user will ever look at your app and say: "oh, that must be written in a really nice looking language."

Here, the task at hand is this: "TO PRODUCE GREAT DOCUMENTATION." That's the job description. The "artifact" is the tool that you use to write the software. It's markdown vs. reStructuredText vs. Texinfo vs. LaTeX. It's Emacs vs Vi vs WYSIWYG. The "product" is the final document that you read. It's the thing on your page. I maintain that what matters is the document, not the data format. No reader will ever look at crappy text and say "Oh, that's unclear, incomplete and confusing, but at least they made the effort to write it with LaTeX, that's really nice."

Part of the problem is that we nerds are attached to our tools. We've come to believe that our Open Source ways always provide the best outcome. The demise of Windows for much of the internet's piping has proven us right about clinging to some variant of good old UNIX for so many years. We're selfishly focused on our own experience making the docs, not on the resulting document. And in this instance our Open Source ways become just another kind of box that we need to think ourselves out of. I understand the other side of this argument, and the desire for writing it all in text files; I've been in love with that. I get the aesthetic. I'm trying to make a different point here, to break out of that mold, to make something new and hopefully better. And I've tried a different way, and my findings are that it's paying off tremendously, not in making my own experience better, but in producing a better result. I think the result speaks for itself.

So... what tool allows me (and you) to produce the best possible documentation? Make yourself open to all possibilities for a moment: What attributes of this tool are needed for us to write great documentation? At the top of the list is ease of publication and collaboration. Why? Because when it comes to writing, we're very very lazy and nothing beats clicking on the word right there and changing it in place and having the result published instantly, and integrating suggestions and corrections from readers with a single click. Whatever tool makes that process easiest is extremely valuable to an author.

> Implementing a new documentation system with WYSIWYG, Changesets, an Inline-comment-feature, but also easy navigation, search and syntax highlighting is an option, too.
> I'd be willing to work on it, if consensus is reached that this would be the best option.

Okay, just to be clear, I'm not saying that you have to use Docs for your own docs effort; I don't care about that. I'm not a big consensus seeker either (consensus yields average results). I'm saying that it's irrational for people to complain about Docs all the time - as happens time and time again, either on my list, in private email or on other lists. It's a misguided criticism of a tool that provides a profound improvement to a process that is particularly difficult for us: WRITING. Docs is a really great tool for OSS projects. If you care about the result, you should want to use it.

Ben Finney

Feb 12, 2016, 7:55:07 PM

to bean...@googlegroups.com, hle...@googlegroups.com, ledge...@googlegroups.com

Martin Blais <bl...@furius.ca> writes:

> On Wed, Feb 10, 2016 at 1:48 AM, Dominik Aumayr <dom...@aumayr.name> wrote:
>
> > And I also get why people might want something else […]:
> >
> > - Ideological reasons (Google is evil for [some|many] people in these
> > communities)
>
> That's a very difficult argument to make.

I don't agree with “evil”. A corporation is amoral, it doesn't have a
coherent personality and should never be regarded as a person.

What I do say is that it's unreasonable to pressure members of a
community to entrust their social and collaborative data – highly
valuable to the community – to a corporation that has no accountability
to that community.

This isn't ideology; it is a matter of trust. We have no good reason to
entrust the workflows that enable our collaboration, to Google.

> > - What if Google Docs goes away? Exporting the content is a feature,
> > but then manually converting the content to fit a new system would
> > be necessary anyway.

Martin, you glide right past this important point Dominik makes without
addressing it at all. That smacks of dismissing the core complaints and
attacking a straw man.

The freedom of a project entails the freedom to fork, if necessary.
Vendor lock-in for workflow tools makes it much harder to ensure the
tools continue to serve us well.

> If you were focused exclusively on the rendering, you'd be overlooking
> the most important part of this: it's a dynamic and collaborative
> experience.

That collaboration is extremely important, as you point out. It is too
valuable for the community to be beholden to any particular corporation.

> But at the heart of this thread there is a subtle disconnect: I think
> most people on the list are focused on the ARTIFACT, and not on the
> PRODUCT.

There is a disconnect: you are dismissing important issues of freedom of
our collaboration tools and independence from any particular vendor of
the tools.

> Here, the task at hand is this: "TO PRODUCE GREAT DOCUMENTATION."

If the end product was all we cared about, then I don't see why any of
us would care about software freedom at all.

On the contrary, the resistance to vendor lock-in tools like Google Docs
is because we don't want any of our social, collaborative processes to
be mediated by a particular corporation.

--
\ “A hundred times every day I remind myself that […] I must |
`\ exert myself in order to give in the same measure as I have |
_o__) received and am still receiving” —Albert Einstein |
Ben Finney

Martin Blais

Feb 12, 2016, 11:22:18 PM

to hle...@googlegroups.com, Beancount, ledger-cli

On Fri, Feb 12, 2016 at 7:49 PM, Ben Finney <ben+l...@benfinney.id.au> wrote:

> Martin Blais <bl...@furius.ca> writes:
>
> > On Wed, Feb 10, 2016 at 1:48 AM, Dominik Aumayr <dom...@aumayr.name> wrote:
> >
> > > And I also get why people might want something else […]:
> > >
> > > - Ideological reasons (Google is evil for [some|many] people in these
> > > communities)
> >
> > That's a very difficult argument to make.
>
> I don't agree with “evil”. A corporation is amoral, it doesn't have a
> coherent personality and should never be regarded as a person.
>
> What I do say is that it's unreasonable to pressure members of a
> community to entrust their social and collaborative data – highly
> valuable to the community – to a corporation that has no accountability
> to that community.
>
> This isn't ideology; it is a matter of trust. We have no good reason to
> entrust the workflows that enable our collaboration, to Google.

You don't entrust it.

You use it.

Until you decide you don't.

> > > - What if Google Docs goes away? Exporting the content is a feature,
> > > but then manually converting the content to fit a new system would
> > > be necessary anyway.
>
> Martin, you glide right past this important point Dominik makes without
> addressing it at all. That smacks of dismissing the core complaints and
> attacking a straw man.
>
> The freedom of a project entails the freedom to fork, if necessary.
> Vendor lock-in for workflow tools makes it much harder to ensure the

There's no vendor lock-in.

> tools continue to serve us well.
>
> > If you were focused exclusively on the rendering, you'd be overlooking
> > the most important part of this: it's a dynamic and collaborative
> > experience.
>
> That collaboration is extremely important, as you point out. It is too
> valuable for the community to be beholden to any particular corporation.
>
> > But at the heart of this thread there is a subtle disconnect: I think
> > most people on the list are focused on the ARTIFACT, and not on the
> > PRODUCT.
>
> There is a disconnect: you are dismissing important issues of freedom of
> our collaboration tools and independence from any particular vendor of
> the tools.
>
> > Here, the task at hand is this: "TO PRODUCE GREAT DOCUMENTATION."
>
> If the end product was all we cared about, then I don't see why any of
> us would care about software freedom at all.
>
> On the contrary, the resistance to vendor lock-in tools like Google Docs

There's no vendor lock-in.

I thought I already made that point:

You go to a doc or a folder full of docs. You right-click to "Download". 10 seconds later you have a zip file with all your docs, locally. You can export to .txt, RTF, OpenDocument, HTML and more. (Have you even tried it?) A hypothetical conversion from .txt to markdown of _all_ my docs would take no more than a disagreeable few hours of furious massaging of them using Emacs while listening to my favorite podcast. Once.

> is because we don't want any of our social, collaborative processes to
> be mediated by a particular corporation.

In the meantime you have a nonexistent, hypothetical "libre" wiki server which, once you get consensus over who will take care of it, will require ongoing maintenance by someone who will stop paying attention at some point, and then it will rot out of date within three months. I've been there before, maintained a few of those myself. In the meantime I have a real, live, ongoing conversation with my users in the documents, which are constantly lighting up with feedback, buttressed by hundreds of world-class engineers on their own cloud. For free.

But maybe you feel more free. And in any case, I don't think I'm going to change your mind tonight.

And I have some code to write now.

Good luck with the politics,

Martin Blais

Feb 13, 2016, 12:30:56 AM

to Martin Blais, hle...@googlegroups.com, Beancount, ledger-cli

On Fri, Feb 12, 2016 at 11:21 PM, Martin Blais <bl...@furius.ca> wrote:

> On Fri, Feb 12, 2016 at 7:49 PM, Ben Finney <ben+l...@benfinney.id.au> wrote:
> > Martin Blais <bl...@furius.ca> writes:
> >
> > > On Wed, Feb 10, 2016 at 1:48 AM, Dominik Aumayr <dom...@aumayr.name> wrote:
> > >
> > > > And I also get why people might want something else […]:
> > > >
> > > > - Ideological reasons (Google is evil for [some|many] people in these
> > > > communities)
> > >
> > > That's a very difficult argument to make.
> >
> > I don't agree with “evil”. A corporation is amoral, it doesn't have a
> > coherent personality and should never be regarded as a person.
> >
> > What I do say is that it's unreasonable to pressure members of a
> > community to entrust their social and collaborative data – highly
> > valuable to the community – to a corporation that has no accountability
> > to that community.

Finally: I'm not pressuring you to do anything. I'm just expressing to Dominik how I've found this tool to be a powerful enabler for my own free software project, which IMO is interesting by virtue of being unusual usage in the free software arena.

And just to be clear, these views represent only my _personal_ views and pertain only to my open source project, and do not represent the views of any employers I've had, past or present. I think it's important to clarify with this disclaimer, in case you might have interpreted this otherwise.

> > is because we don't want any of our social, collaborative processes to
> > be mediated by a particular corporation.
>
> In the meantime you have a nonexistent, hypothetical "libre" wiki server which, once you get consensus over who will take care of it, will require ongoing maintenance by someone who will stop paying attention at some point, and then it will rot out of date within three months. I've been there before, maintained a few of those myself. In the meantime I have a real, live, ongoing conversation with my users in the documents, which are constantly lighting up with feedback, buttressed by hundreds of world-class engineers on their own cloud. For free.
>
> But maybe you feel more free. And in any case, I don't think I'm going to change your mind tonight.
>
> And I have some code to write now.
> Good luck with the politics,

I'd like to apologize for this comment; perhaps I got a bit carried away. I understand that you and Stefano have stronger requirements than I do over the software you use; that's your right. I can appreciate this - I do select the GPL for a reason - but at the end of the day, we disagree on those values when it comes to usage; I'm a pragmatist and I'm very impatient, so I'll use what gets me moving over insisting on everything being free. It's okay to disagree.

Cheers,

Dominik Aumayr

Feb 13, 2016, 4:51:32 AM

to ledger-cli, Martin Blais, hle...@googlegroups.com, Beancount

Whups, it seems I poked a hornet's nest there.

> > - Ideological reasons (Google is evil for [some|many] people in these communities)
>
> That's a very difficult argument to make.

I didn't want to make an argument (think "opinion") about that, but rather an observation (think "poll") that many OSS developers have a problem with Google.

> That could indeed be better, but other than the sidebar and the fact that it looks familiar to Python users (who arguably aren't necessarily the typical Beancount users), how is this:
> http://aumayr.github.io/beancount-docs-static/users/index.html
> substantially different or better than this?
> furius.ca/beancount/doc/index

The reason for making my own "Index"-document for the beancount-documentation is quite a different one: I wanted to have a browse-able version of the great documentation you did put into the source code, not to ditch Google Docs because I don't like that it looks unfamiliar or something else. But now that I have my Sphinx-based page in front of me, the navigation and search features _do_ make it a better product in those terms.
Honestly, editing it is a pain compared to Google Docs.

As for "people to complain about Docs all the time": I didn't complain about the beancount docs being in Google Docs. In fact, I pointed out how it has "huge benefits" and what features I would miss.

> So... what tool allows me (and you) to produce the best possible documentation? Make yourself open to all possibilities for a moment: What attributes of this tool are needed for us to write great documentation?

The reason I wrote the email in the first place was having a discussion about these questions. And that's why I listed the pros and cons of Google Docs, as it was brought up as an option.

> I'm saying that it's irrational for people to complain about Docs all the time - as happens time and time again, either on my list, in private email or on other lists. It's a misguided criticism of a tool that provides a profound improvement to a process that is particularly difficult for us: WRITING. Docs is a really great tool for OSS projects. If you care about the result, you should want to use it.

Maybe it is "irrational" to complain about the _beancount_ documentation living in Google Docs, as this is already decided upon and maintained by you.

But saying that "if we care about the results, we should want to use it" goes further: This effort here is at the start - no word of documentation written yet - and we are discussing which way to go and which tools to use, "to produce great documentation". Discussing the virtues and drawbacks of any of the available tools should not cross a red line for anyone and be called "irrational".

I do wish to reiterate what I already said in my last email: The virtues of WYSIWYG, editable by anyone and inline-comments are immense and I honestly think that Martin got many useful suggestions and content fixes out of that, as he stated previously. This should not be overlooked and dismissed just because of the name "Google" in it.

Google Docs also has drawbacks (feature-wise and, for some people, ideologically) and as a software developer I propose that we come up with our own system, taking all the great features and learnings from Google Docs and fixing its drawbacks (which are great features of other tools out there).

Daniel Clemente

Feb 13, 2016, 5:01:00 PM

to bean...@googlegroups.com, hle...@googlegroups.com, ledge...@googlegroups.com

On Sat, 13 Feb 2016 11:49:29 +1100 Ben Finney wrote:

>
> Martin Blais <bl...@furius.ca> writes:
>
> > On Wed, Feb 10, 2016 at 1:48 AM, Dominik Aumayr <dom...@aumayr.name> wrote:
> >
> > > And I also get why people might want something else […]:
> > >
> > > - Ideological reasons (Google is evil for [some|many] people in these
> > > communities)
> >
> > That's a very difficult argument to make.
>
> I don't agree with “evil”. A corporation is amoral, it doesn't have a
> coherent personality and should never be regarded as a person.
>

Off-topic… Maybe it shouldn't be regarded as a person, but that's what happens: https://en.wikipedia.org/wiki/Corporate_personhood
That's a central theme of a documentary called „The Corporation“, which is linked from that page.

Back to Google Docs:

I stopped reading beancount docs after they were removed from the repository. The idea of collaborative editing in Google Docs is good, but in my case the implementation isn't usable (for many practical reasons) and the files always are too „far away“ from me, hidden through apps and APIs and libraries and JS and clouds.

I'd prefer if beancount were tool-neutral instead of discussing which is the best tool.

--
Daniel

Vikas Rawal

Feb 13, 2016, 10:28:10 PM

to ledge...@googlegroups.com, bean...@googlegroups.com, hle...@googlegroups.com

>
> I stopped reading beancount docs after they were removed from the repository. The idea of collaborative editing in Google Docs is good, but in my case the implementation isn't usable (for many practical reasons) and the files always are too „far away“ from me, hidden through apps and APIs and libraries and JS and clouds.

Seeing the docs on Google Drive put me off. It is so inconvenient to read them online. The documents are formatted as if for casual printing. They are neither meant to be read on-screen, nor professional enough for serious printing.

While I could see that beancount had a lot to offer, having docs on Google puts me off.

Vikas

Jacques Gagnon

Feb 13, 2016, 10:44:34 PM

to Beancount, ledge...@googlegroups.com, hle...@googlegroups.com, vikas...@gmail.com

I guess to each his own priority.

Back in October I was debating between starting to use either Ledger or hledger.

Somehow I came across Beancount in the process, and when I saw all the docs I was sold on it.

Finally a project with proper design spec, documentation and feedback loop.

While gdocs isn't the nicest platform for reading, I'm happy it enabled this amount of information to exist in the first place.

My 2 cents...

John Wiegley

Feb 14, 2016, 4:50:31 AM

to Martin Blais, Beancount, ledge...@googlegroups.com, hle...@googlegroups.com

>>>>> Martin Blais <bl...@furius.ca> writes:

> You can write the text in Emacs and import it later on when you reconnect if
> you're stuck on a flight or a train, as John mentions (how often does that
> happen anyway?).

"How often does that happen anyway?" Often. What's comfortable for your use
case is not the same as what everyone else is doing or experiencing.

I prefer to have docs maintained under Git, offline accessible, in an open
format that Emacs is able to edit. That could be Markdown, LaTeX, TeXinfo, or
any of the other free formats available. I can't think of even one advantage
Google Docs has to offer me, given the way I work on software projects: It's
not offline accessible, it uses its own UI, it puts me in the browser for
editing, and I can't use Git to examine history. I might as well be editing a
Microsoft Word document in a network mounted folder, as work sometimes makes
me do.

And all for what, because you say it will attract more contributors to the
documentation effort? I wouldn't take it on your word against those negatives,
while also losing the positives (for me) of the Markdown/Texinfo approach.

Chris Bennett

Feb 14, 2016, 7:37:38 AM2/14/16

to ledge...@googlegroups.com, Martin Blais, hle...@googlegroups.com, Beancount

> I find the reaction from people in the OSS community interestingly puzzling
> and somewhat curmudgeonly.

[..]

> But this is a case where I'm witnessing the OSS community unable to
> think outside the box.

How apt that this was circling a few days ago:
http://taskwarrior.org/docs/advice.html

Particularly:
"People will pick a fight with you about all your incidental choices.
Your issue tracker, your branching strategy, your version numbers, the
text editor you use, and so on."

I strongly prefer open solutions, terminal & text applications, but I
don't care that another open source developer has made their own
choice, and can understand the reasons for Martin preferring Google
Docs (documentation rots, so anything that provides the lowest barrier
of entry for end users to submit feedback, collaborate or contribute
new content is a good thing).

Regards,

Chris

Martin Blais

Feb 14, 2016, 6:00:18 PM2/14/16

to hle...@googlegroups.com, Martin Blais, Beancount, ledger-cli

On Sun, Feb 14, 2016 at 4:50 AM, John Wiegley <jo...@newartisans.com> wrote:

> >>>>> Martin Blais <bl...@furius.ca> writes:
>
> > You can write the text in Emacs and import it later on when you reconnect if
> > you're stuck on a flight or a train, as John mentions (how often does that
> > happen anyway?).
>
> "How often does that happen anyway?" Often. What's comfortable for your use
> case is not the same as what everyone else is doing or experiencing.
>
> I prefer to have docs maintained under Git, offline accessible, in an open
> format that Emacs is able to edit. That could be Markdown, LaTeX, TeXinfo, or
> any of the other free formats available. I can't think of even one advantage
> Google Docs has to offer me, given the way I work on software projects: It's
> not offline accessible, it uses its own UI, it puts me in the browser for
> editing, and I can't use Git to examine history. I might as well be editing a
> Microsoft Word document in a network mounted folder, as work sometimes makes
> me do.

The comments you make in this paragraph are all about _your_ experience, as the author of the text. Did you even read my message? (The part where I'm reframing R. Hickey's argument from "Simple made easy"?)

> And all for what, because you say it will attract more contributors to the
> documentation effort? I wouldn't take it on your word against those negatives,
> while also losing the positives (for me) of the Markdown/Texinfo approach.

Not "will", but "has". I'm not reporting a hypothetical from the future, I'm reporting actual, lived experience from the past. Witness it for yourself: Access any of my docs (e.g. http://furius.ca/beancount/doc/syntax) and click on the "Comments" button on the top-right to view a full history of accepted and rejected suggestions and their accompanying micro-threads, you'll see the activity there. Check it out, don't trust me. On the whole set of documents I get suggestions and fixes _every single day_. Don't you wish you had that level of interaction with your users?

Zack Williams

Feb 15, 2016, 11:14:38 AM2/15/16

to ledge...@googlegroups.com, Martin Blais, Beancount, hle...@googlegroups.com

On Sun, Feb 14, 2016 at 2:50 AM, John Wiegley <jo...@newartisans.com> wrote:
> I prefer to have docs maintained under Git, offline accessible, in an open
> format that Emacs is able to edit. That could be Markdown, LaTeX, TeXinfo, or
> any of the other free formats available.

I agree with this - the offline ability, combined with history and the
ability to grep for what you want is hugely important to me.

You can also run this kind of docs through a static site generator
(Jekyll or Hugo work great for this, possibly with a run through
Pandoc as needed), and generate very nicely structured, web-based
documentation sites.

> I can't think of even one advantage
> Google Docs has to offer me, given the way I work on software projects: It's
> not offline accessible, it uses its own UI, it puts me in the browser for
> editing, and I can't use Git to examine history. I might as well be editing a
> Microsoft Word document in a network mounted folder, as work sometimes makes
> me do.

Agreed. I would also add that Google Docs, and pretty much all word
processors are designed for <10 page unstructured documents, and
quickly fall down on any larger, structured documentation work.

I've found this Google Docs to Markdown conversion script to be useful
in the past: https://github.com/mangini/gdocs2md

- Zack

Zack Williams

Feb 15, 2016, 11:18:59 AM2/15/16

to ledge...@googlegroups.com, hle...@googlegroups.com, Martin Blais, Beancount

On Sun, Feb 14, 2016 at 3:59 PM, Martin Blais <bl...@furius.ca> wrote:
> Not "will", but "has". I'm not reporting a hypothetical from the future, I'm
> reporting actual, lived experience from the past. Witness it for yourself:
> Access any of my docs (e.g. http://furius.ca/beancount/doc/syntax) and click
> on the "Comments" button on the top-right to view a full history of accepted
> and rejected suggestions and their accompanying micro-threads, you'll see
> the activity there. Check it out, don't trust me. On the whole set of
> documents I get suggestions and fixes _every single day_. Don't you wish you
> had that level of interaction with your users?

I don't doubt that this is hugely helpful - if there was a way to
integrate it in an offline-friendly workflow, it would be even better.

- Zack

John Wiegley

Feb 15, 2016, 2:08:59 PM2/15/16

to Jacques Gagnon, Beancount, ledge...@googlegroups.com, hle...@googlegroups.com, vikas...@gmail.com

>>>>> Jacques Gagnon <darth...@gmail.com> writes:

> Finally a project with proper design spec, documentation and feedback loop.
> While gdocs isn't the nicest platform for reading, I'm happy it enabled this
> amount of information to exist in the first place.

To be fair, it's not really Google Docs that enabled it, but Martin's amazing
capacity for going into detail in this design space. I credit him as a human
being, and not the technology, for why beancount is so well and thoroughly
documented.

Stefano Zacchiroli

Feb 15, 2016, 2:32:11 PM2/15/16

to Beancount, ledge...@googlegroups.com, hle...@googlegroups.com

On Mon, Feb 15, 2016 at 11:06:39AM -0800, John Wiegley wrote:
> To be fair, it's not really Google Docs that enabled it, but Martin's amazing
> capacity for going into detail in this design space. I credit him as a human
> being, and not the technology, for why beancount is so well and thoroughly
> documented.

Yeah, absolutely, thanks for speaking my mind.

That said, I've no doubts myself that the ease of commenting/suggesting
changes that Google Docs offers played a role in the amount of
contribution Martin got. But it is very hard to factor out the initial
quality of the doc to properly measure the alleged "competitive
advantage" offered by Google Docs.

true...@gmail.com

Feb 17, 2016, 3:01:24 PM2/17/16

to Beancount, ledge...@googlegroups.com

Hi,

This looks like a great project. I am using GnuCash, but am unhappy about some things. Are you still looking for .csv files to test on? Should I send them to you directly?

Best wishes

Kai

On Tuesday, February 2, 2016 at 05:41:25 UTC+1, Martin Blais wrote:

On Mon, Feb 1, 2016 at 1:13 PM, John Hendy <jw.h...@gmail.com> wrote:
Greetings,

It's a fresh year and I've been seeing ledger come up on the Org-mode
mailing list for some time and decided to give it a try. I'm coming
from Moneydance and just wanted to get away from the tedious GUI
method of adding information, as well as have flexibility to generate
my own reports/visualizations with python or R, etc. [1]

Consider that I'm about a week into reading through docs here and
there during evenings. My first step was going to be importing a
downloaded .csv from my bank to get started. I'm still trying to
verify I get the terminology, so I'll use this from the manual:

From 5.1 Basic format:
```
This transaction has a date, a payee or description, a target account
(the first posting), and a source account (the second posting). Each
posting specifies what action is taken related to that account.
```

From 7.2.1.2 The convert command:
```
The fields ledger can recognize contain these case-insensitive strings
date, posted, code, payee or desc or description, amount, cost,total,
and note.
```

For my purposes, I import my finances primarily to "categorize" (what
I believe here is called adding an account) and assign a payee so that
I can track my spending against a budget. So, I'm surprised there's no
special column keyword I can add for "account". It appears that all I
can do is pass, say, `--account "assets:checking"` to have ledger know
it's against assets:checking? Is that correct?

From trying to google "import csv account ledger" or similar
variations, I've been surprised that the only tools to do something
like this appear to be interactive one-trans-at-a-time programs like
icsv2ledger and reckon (granted, they can learn or follow rules). I
could quickly go through my bank's .csv and add exp:food:dining,
exp:auto:fuel to my ~100 transactions a month and have those imported
just like the other column data.

Keep in mind that part of the process of importing (they like to call it "reconciling") involves
- Manually reviewing the transactions for correctness or fraud
- Merging new transactions with previous transactions imported from the other side (e.g. a payment from a bank account to pay off one's credit card will typically be imported from both the bank AND credit card accounts; you must merge the corresponding transactions together)
- Assigning the right category (you can automate this with a script I suppose; frankly it's not much work, I do all of mine manually with the help of auto-completion from Emacs, which is the most important feature IMO)
- Moving the resulting transactions to the right place in your file.
- Verifying balances visually, or inserting a balance directive which asserts what the final account balance should be (for correctness) after the new transactions.

If you do it often enough and you have editing chops, you get used to the dance and it's a breeze.
I think the fourth step can be hypothetically solved using heuristics.
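The merging step in the list above can be illustrated with a small Python sketch. It is only a heuristic under stated assumptions: the `Txn` record and the `find_transfer_pairs` helper are hypothetical names (not part of Ledger, Beancount, or LedgerHub), and two entries are treated as the same transfer when their amounts cancel out and their dates are at most a few days apart.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    """A simplified imported transaction (hypothetical structure)."""
    day: date
    account: str
    amount: Decimal  # signed: negative means money leaving the account
    narration: str


def find_transfer_pairs(side_a, side_b, max_days=3):
    """Pair transactions whose amounts cancel out within max_days of each other."""
    pairs = []
    unmatched_b = list(side_b)
    for a in side_a:
        for b in unmatched_b:
            close_in_time = abs(a.day - b.day) <= timedelta(days=max_days)
            if close_in_time and a.amount == -b.amount:
                pairs.append((a, b))
                unmatched_b.remove(b)  # each entry is matched at most once
                break
    return pairs


checking = [
    Txn(date(2016, 1, 28), "Assets:Checking", Decimal("-500.00"), "CARD PAYMENT"),
    Txn(date(2016, 1, 29), "Assets:Checking", Decimal("-42.10"), "GROCERY"),
]
credit_card = [
    Txn(date(2016, 1, 29), "Liabilities:CreditCard", Decimal("500.00"), "PAYMENT RECEIVED"),
]

pairs = find_transfer_pairs(checking, credit_card)
for a, b in pairs:
    print(f"merge: {a.day} {a.account} {a.amount} <-> {b.day} {b.account} {b.amount}")
```

Candidate pairs found this way still deserve a manual look, since two unrelated transactions can share an amount.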

I feel like I must be missing something with respect to getting the
from/to accounts added to the bank data.

Perhaps to take a step back...
- are the majority of folks writing their transactions by hand in ledger format?

Can't say about others, but for me I want to say that about half the importing is semi-automatic.
- Credit cards and banks import from downloads but I need to categorize manually (as described above), fairly good quality downloads.
- Investment accounts: buys are fully automated, but I need to manually edit sales in some accounts. Great quality of downloads.
- Payroll stubs and vesting and a few other things are provided only as PDFs and I don't bother trying to extract (though I've made some headway towards this, it's incomplete; it turns out fully automating table extraction from PDF isn't trivial. The best OSS solution is TabulaPDF by far but you still need to manually identify where the table is).
- Cash transactions: I have to enter those by hand. I only book non-food expenses as individual transactions directly, and for food maybe once every six months I'll count my wallet balance and insert one transaction per month to debit away the cash account toward food. If you do this, you end up with surprisingly few transactions to book manually, maybe a few per week. I suppose it could depend on lifestyle choices.

It takes me less than 1 hour/week to run through the active accounts, usually first thing Saturday morning when I get up. Most of the pain is logging with user/passwords into the various institutions and clicking the right buttons to generate the downloaded files. Extraction and filing is automated using importers I wrote against LedgerHub. Less active accounts are updated every quarter or when I feel like it.

- is there some better way to import bulk data (e.g. via ledger's
convert function) and post-edit once it's in ledger format? It seemed
a .csv in LO calc was pretty convenient vs. scrolling through a long
text file
- any other pointers along the above lines would be most welcome.

Check out LedgerHub for ideas.

Original design doc:
http://furius.ca/ledgerhub/doc/design

Post-mortem:
http://furius.ca/ledgerhub/doc/postmortem

The project is being killed right now, rewritten much better and simpler and migrated into the Beancount project; if you do end up looking at the code make sure you're checking out the "stable" branch, it's a bit of a riot on the default branch right now, it will be broken.

Essentially, I'm defining a config (in Python) as a list of "importer" objects and boil the process down to three steps:
1. Identify: Given a messy list of downloaded files (e.g. in ~/Downloads), automatically identify which importer is supposed to handle them
2. Extract: Extracting transactions and statement date from each file, if possible
3. File: Filing away the downloads to a directory hierarchy which mirrors the chart of accounts, for preservation, e.g. in a personal git repo.

You could think of adding
0. Fetch: Automatically download the files
but that's too hard. Personally I just don't have the stamina to implement this for myself. Given the nature of today's websites and the castles of JavaScript used to implement them, this would be a nightmare to implement for too little payoff. I love the idea of full automation, but I just don't have the time. Note that if you don't mind the nature of their business (they sell your data), you could potentially try to use Yodlee to pull much of it from a single place.
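The identify/extract/file steps above can be sketched roughly as follows. This is not LedgerHub's actual API: the `CsvImporter` class, its method names, and the filename-marker matching rule are all assumptions made for illustration, and the extraction step is a placeholder that returns raw CSV rows.

```python
import shutil
import tempfile
from pathlib import Path


class CsvImporter:
    """Hypothetical importer: claims files by a marker string in the filename."""

    def __init__(self, account, filename_marker):
        self.account = account  # e.g. "Assets:Checking"
        self.filename_marker = filename_marker

    def identify(self, path):
        return self.filename_marker in path.name

    def extract(self, path):
        # Placeholder: a real importer would parse rows into transactions.
        return [line for line in path.read_text().splitlines()[1:] if line.strip()]

    def file_away(self, path, archive_root):
        # Mirror the chart of accounts as a directory hierarchy.
        target_dir = archive_root.joinpath(*self.account.split(":"))
        target_dir.mkdir(parents=True, exist_ok=True)
        return Path(shutil.move(str(path), str(target_dir / path.name)))


def process(downloads, archive_root, importers):
    """Run identify, extract, and file over every file in the downloads directory."""
    results = {}
    for path in sorted(downloads.iterdir()):
        importer = next((i for i in importers if i.identify(path)), None)
        if importer is None:
            continue  # unknown file: leave it alone
        rows = importer.extract(path)
        filed = importer.file_away(path, archive_root)
        results[filed] = rows
    return results


# Demo in a temporary directory standing in for ~/Downloads.
root = Path(tempfile.mkdtemp())
downloads = root / "Downloads"
downloads.mkdir()
(downloads / "checking-2016-01.csv").write_text("date,amount\n2016-01-29,-42.10\n")
(downloads / "holiday.jpg").write_text("not a statement")

config = [CsvImporter("Assets:Checking", "checking")]  # the "list of importers"
results = process(downloads, root / "archive", config)
```

In this sketch the statement ends up under `archive/Assets/Checking/`, while the unrecognized file stays where it was.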

In any case, you can't really get away without writing at least some code--it's just not realistic, the inputs from different people vary too much. There's very little shared code out there (just basic code for CSV files, like the tools you mention), and too few users share the same accounts to generate the critical mass needed for reuse. A while back I created the LedgerHub project to host shared importer code and provide a framework for doing the above, but it never received many contributions, and honestly I didn't give it the care and attention to quality I should have. More importantly, regression testing for those importers is most easily carried out using actual downloaded files compared to a corresponding expected output, but these files don't share well (they contain lots of personal data) so one ends up with two repositories anyhow. And besides, there are several design decisions in some importers that may not please every user, in particular about how you choose your accounts for investments (there are degrees of freedom), so even sharing is not entirely an obvious win.

By the way, I've found that regression testing is the _key_ to maintaining your importer code, because those importers are often written against file formats with no official spec and unexpected surprises show up routinely (e.g. I have XML files with some unescaped "&" characters, which require a custom fix "just for that bank", for instance, lots of nasty surprises), so you really need to be able to reproduce your tests. I think I have to make at least _some_ fix to an importer about once/month, and that sinks maybe a half-hour (involves adding the new file which makes it break, fix the importer code, and potentially update the older expected files for changes).
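A minimal sketch of both ideas, assuming a made-up XML layout: a "just for that bank" fix that escapes bare "&" characters before parsing, and a regression check that compares the importer's output for a saved sample against a saved expected result. The `<payee>` element and all function names here are hypothetical; real bank exports will look different.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does not start a character or entity reference.
BARE_AMPERSAND = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(text):
    """Turn unescaped '&' into '&amp;' so the XML parser accepts the file."""
    return BARE_AMPERSAND.sub("&amp;", text)


def extract_payees(xml_text):
    """Hypothetical extractor: return the text of every <payee> element."""
    root = ET.fromstring(escape_bare_ampersands(xml_text))
    return [node.text for node in root.iter("payee")]


def check_regression(sample_text, expected):
    """Compare importer output for a saved sample against the saved expected output."""
    actual = extract_payees(sample_text)
    if actual != expected:
        raise AssertionError(f"importer output changed: {actual!r} != {expected!r}")
    return actual


# In practice the sample and expected output would live in files next to the importer.
sample = "<stmt><payee>A&P Market</payee><payee>Smith &amp; Sons</payee></stmt>"
expected = ["A&P Market", "Smith & Sons"]
payees = check_regression(sample, expected)
```

When a new download breaks the importer, the file (or an anonymized copy) joins the samples, the code is fixed, and any expected outputs that legitimately changed are regenerated.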

I hope this helps give some color to the process,

I tried to search the list for more of this sort of question, so
forgive me if I've missed something. Replying with links pointing me
in the right direction would be plenty sufficient if this has already
been discussed!

Thanks!
John

[1] http://moneydance.com/


Martin Blais

Feb 18, 2016, 12:25:26 AM2/18/16

to Beancount

-ledger-cli

CSV files to test on are always useful. I want to write a great example CSV parser for the LedgerHub rewrite.

If you'd like to share some data, best is if you can anonymize (e.g. mark with XXXXX) any account numbers or other PII, and if the files are really long AND regular, you can also truncate the end to minimize the amount of personal details.
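For anyone who wants to script this, here is a rough Python sketch of that kind of anonymization. It rests on one loud assumption: any run of 8 or more digits is an account or card number. It will therefore also mask compact dates such as 20160218, and it does nothing about names or free-text fields, so the result still needs a manual look. The function names are illustrative only.

```python
import csv
import io
import re

# Assumption: any run of 8 or more digits is an account or card number.
LONG_DIGIT_RUN = re.compile(r"\d{8,}")


def mask_digits(value, keep_last=4):
    """Replace long digit runs with X characters, keeping the last few digits."""
    def _mask(match):
        digits = match.group(0)
        cut = max(0, len(digits) - keep_last)
        return "X" * cut + digits[cut:]
    return LONG_DIGIT_RUN.sub(_mask, value)


def anonymize_csv(text, max_rows=None, delimiter=","):
    """Mask account-like numbers and optionally keep only the first max_rows data rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for index, row in enumerate(reader):
        if max_rows is not None and index > max_rows:  # row 0 is the header
            break
        writer.writerow([mask_digits(cell) for cell in row])
    return out.getvalue()


raw = (
    "date,account,payee,amount\n"
    "2016-02-01,1234567890123456,GROCERY,-42.10\n"
    "2016-02-02,1234567890123456,FUEL,-30.00\n"
    "2016-02-03,1234567890123456,RENT,-900.00\n"
)
cleaned = anonymize_csv(raw, max_rows=2)
print(cleaned)
```

Many European bank exports use `;` as the field delimiter, which is why the delimiter is a parameter here.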

Send to my personal address if you prefer not to share on the list (bl...@furius.ca).


Martin Blais

Feb 18, 2016, 12:25:39 AM2/18/16

to Martin Blais, Beancount

Forgot to say "Thanks!"

Kai Truempler

Feb 18, 2016, 2:15:54 PM2/18/16

to bean...@googlegroups.com

Dear Martin,

Please find the files attached. They are insufficiently anonymized, so please don't use them as they are for tutorials and such. I changed just some account numbers, so for example the inconsistency in the decimal divider with point and comma in the same file is from the original download.

I attached files from two credit cards and two checking accounts. One checking account is also in the MT940 format that some banks offer. Let me know if there is a need for translation.

For me personally to switch systems (from a user point of view), online banking integration (HBCI) is important, as almost all German banks offer that for at least their checking accounts.

Do you think aqbanking or something similar might be usefully integrated/connected to beancount?

Also, maybe this is of interest for you: https://github.com/hoffie/dkb-visa.

Best wishes

Kai


20160218-123456789-umsMT940.TXT

4998________1234.csv

12000001.csv

20160218-123456789-umsatz.csv

umsatz-4917________1234-20160218.csv

Martin Blais

Feb 19, 2016, 1:02:37 AM2/19/16

to Beancount

On Thu, Feb 18, 2016 at 2:15 PM, Kai Truempler <true...@gmail.com> wrote:

> Dear Martin,
>
> Please find the files attached. They are insufficiently anonymized, so please don't use them as they are for tutorials and such. I changed just some account numbers, so for example the inconsistency in the decimal divider with point and comma in the same file is from the original download.
>
> I attached files from two credit cards and two checking accounts. One checking account is also in the MT940 format that some banks offer. Let me know if there is a need for translation.

Thanks for sharing this with me. I'll save those and see if I can drive some auto-detection from the example csv importer I'll write for the new ledgerhub version, and in any case make sure it can handle those.
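One way an importer could cope with files that mix point and comma as the decimal divider is sketched below. The `parse_amount` helper and its rule are assumptions for illustration, not something the LedgerHub rewrite is known to do: the right-most separator counts as the decimal mark only when one or two digits follow it, so a value like "1.234" is read as one thousand two hundred thirty-four.

```python
from decimal import Decimal


def parse_amount(text):
    """Parse '1.234,56', '1,234.56', '12,5' or '12.5' into a Decimal.

    Heuristic (an assumption): the right-most '.' or ',' is the decimal
    separator when it is followed by one or two digits; every other
    separator is treated as a thousands separator.
    """
    cleaned = text.strip().replace(" ", "")
    last = max(cleaned.rfind("."), cleaned.rfind(","))
    if last == -1:
        return Decimal(cleaned)  # no separator at all
    integer_part, fraction = cleaned[:last], cleaned[last + 1:]
    integer_part = integer_part.replace(".", "").replace(",", "")
    if len(fraction) in (1, 2):
        return Decimal(f"{integer_part}.{fraction}")
    # Three or more trailing digits: assume a thousands separator.
    return Decimal(integer_part + fraction)


for raw in ["1.234,56", "1,234.56", "-12,5", "1.234", "42"]:
    print(raw, "->", parse_amount(raw))
```

The rule is ambiguous for amounts quoted to three decimal places, so a per-bank override would be safer wherever the format is known.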

> For me personally to switch systems (from a user point of view), online banking integration (HBCI) is important, as almost all German banks offer that for at least their checking accounts.
>
> Do you think aqbanking or something similar might be usefully integrated/connected to beancount?

I think it would be possible, but given the nature of the problem, I doubt there's much generic functionality to provide beyond a custom script that would fetch the data. I think it's beyond the scope of Beancount itself to include all possible fetching methods, since there's a lack of a widespread standard.
