Convert CSV Bank Statement to ledger

2,276 views
Skip to first unread message

Esben Stien

unread,
May 29, 2014, 5:18:59 PM5/29/14
to ledge...@googlegroups.com

I have the following bank statement from my bank:

"03.11.2013";"Foo";"04.11.2013";"1.638,00";""
"03.11.2013";"Bar";"04.11.2013";"";"492,93"

The fields are:

"Date";"Description";"Interest Date";"Out of Account";"In to Account"

, so the ins and outs are separated, the date format is fubar, the
separator is semicolon and the amount format is just as fubar.

I could script it into something more sane, but I wonder if

https://github.com/bhutley/bank-csv-to-ledger

..or another import package can work directly with this format?

--
Esben Stien is b0ef@e s a
http://www. s t n m
irc://irc. b - i . e/%23contact
sip:b0ef@ e e
jid:b0ef@ n n

Peter Ross

unread,
May 29, 2014, 5:54:24 PM5/29/14
to ledger-cli
On 30 May 2014 07:18, Esben Stien <b0...@esben-stien.name> wrote:
>
> I have the following bank statement from my bank:
>
> "03.11.2013";"Foo";"04.11.2013";"1.638,00";""
> "03.11.2013";"Bar";"04.11.2013";"";"492,93"
>
> The fields are:
>
> "Date";"Description";"Interest Date";"Out of Account";"In to Account"
>
> , so the ins and outs are separated, the date format is fubar, the
> separator is semicolon and the amount format is just as fubar.
>
> I could script it into something more sane, but I wonder if
>
> https://github.com/bhutley/bank-csv-to-ledger
>
> ..or another import package can work directly with this format?
>
Try

https://github.com/quentinsf/icsv2ledger

--csv-date-format STR for the date format, see
https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior
for how to format that string

--delimiter ;

--debit INT to state which column is for debits

--credit INT to state which column is for credits

--ledger-decimal-comma that it uses , rather . for the decimal separator

Jostein Berntsen

unread,
May 30, 2014, 3:55:00 AM5/30/14
to ledge...@googlegroups.com
On 29.05.14,23:18, Esben Stien wrote:
>
> I have the following bank statement from my bank:
>
> "03.11.2013";"Foo";"04.11.2013";"1.638,00";""
> "03.11.2013";"Bar";"04.11.2013";"";"492,93"
>
> The fields are:
>
> "Date";"Description";"Interest Date";"Out of Account";"In to Account"
>
> , so the ins and outs are separated, the date format is fubar, the
> separator is semicolon and the amount format is just as fubar.
>
> I could script it into something more sane, but I wonder if
>
> https://github.com/bhutley/bank-csv-to-ledger
>
> ..or another import package can work directly with this format?
>

I like to use reckon:

https://github.com/cantino/reckon

How to use:

http://blog.andrewcantino.com/blog/2010/11/06/command-line-accounting-with-ledger-and-reckon/


Jostein

Edwin van Leeuwen

unread,
May 30, 2014, 4:19:25 AM5/30/14
to ledge...@googlegroups.com
Reckon should parse your csv file correctly. If not please file a bug so
we can improve Reckon :)

Cheers, Edwin

--

Jostein Berntsen

unread,
May 30, 2014, 9:52:32 AM5/30/14
to ledge...@googlegroups.com
Thanks, Edwin. :) Here is an example on a reckon command I use for my bank
csv fle:

eckon -f drang2.csv -v --ignore-columns 2 --csv-separator ';' --currency NOK
--suffixed  --comma-separates-cents -l jbfinans.dat -o jbfinans.dat

Jostein


Jostein Berntsen

unread,
May 30, 2014, 9:55:18 AM5/30/14
to ledge...@googlegroups.com
Sorry, it should be this command for my bank csv file:

reckon -f drang2.csv -v --ignore-columns 2 --csv-separator ';' --currency NOK

Jostein Berntsen

unread,
Jun 2, 2014, 6:43:34 AM6/2/14
to ledge...@googlegroups.com

Stefano Zacchiroli

unread,
Jun 10, 2014, 5:31:17 AM6/10/14
to ledge...@googlegroups.com
On Fri, May 30, 2014 at 09:54:39AM +0200, Jostein Berntsen wrote:
> I like to use reckon:

Has anyone here used both reckon and icsv2ledger, and fancy posting a
brief comparison of the two?

I've just integrated icsv2ledger in my accounting work-flow. I've
managed to make it do what I want, but I've also found it a bit rough
around the edges --- I can elaborate more if people on this list are
interested.

If others have found reckon to be more flexible than icsv2ledger, I'll
be happy to reconsider my tool choice.

TIA,
Cheers.
--
Stefano Zacchiroli . . . . . . . za...@upsilon.cc . . . . o . . . o . o
Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
Former Debian Project Leader . . @zack on identi.ca . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »

Edwin van Leeuwen

unread,
Jun 10, 2014, 5:53:26 AM6/10/14
to ledge...@googlegroups.com
On Tue, 10 Jun 2014 11:31:14 +0200, Stefano Zacchiroli <za...@upsilon.cc> wrote:
> On Fri, May 30, 2014 at 09:54:39AM +0200, Jostein Berntsen wrote:
> > I like to use reckon:
>
> Has anyone here used both reckon and icsv2ledger, and fancy posting a
> brief comparison of the two?
>
> I've just integrated icsv2ledger in my accounting work-flow. I've
> managed to make it do what I want, but I've also found it a bit rough
> around the edges --- I can elaborate more if people on this list are
> interested.
>
> If others have found reckon to be more flexible than icsv2ledger, I'll
> be happy to reconsider my tool choice.

It has been some time since I used icsv2ledger, so this might be
slightly out of date. The main differences between the two are:

1) Reckon's scope is wider in that you can pass it your old ledger files and it
will "learn" from that. It will try to directly match payments
in the csv file to accounts you in your ledger file.

2) Reckon tries to automagically read your csv file, so you have to
specify less when you want to convert the csv file. (If it doesn't
correctly detect your csv format then feel free to file a bug report for
Reckon).

Cheers,

Edwin

>
> TIA,
> Cheers.
> --
> Stefano Zacchiroli . . . . . . . za...@upsilon.cc . . . . o . . . o . o
> Maître de conférences . . . . . http://upsilon.cc/zack . . . o . . . o o
> Former Debian Project Leader . . @zack on identi.ca . . o o o . . . o .
> « the first rule of tautology club is the first rule of tautology club »
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups "Ledger" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to ledger-cli+...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
>
--

John Wiegley

unread,
Jun 10, 2014, 2:01:20 PM6/10/14
to ledge...@googlegroups.com
>>>>> Stefano Zacchiroli <za...@upsilon.cc> writes:

> Has anyone here used both reckon and icsv2ledger, and fancy posting a brief
> comparison of the two?

Don't forget to include "ledger convert", the built-in CVS reader.

John

Martin Blais

unread,
Jun 10, 2014, 3:49:36 PM6/10/14
to ledger-cli
Maybe someone could write up a more in-depth comparison of these tools within the LedgerHub design doc:
https://docs.google.com/document/d/11u1sWv7H7Ykbc7ayS4M9V3yKqcuTY7LJ3n1tgnEN2Hk/

(If someone will write it up I'll include it.)






Hans Erik van Elburg

unread,
Jun 10, 2014, 5:14:13 PM6/10/14
to ledge...@googlegroups.com

I tried "reckon", but the learming promiss did not realize at all. The automatic recognition of columns did, but then I had to manually tell reckon how to post all transactions. This caused a lot of work (no autocompletion like in emacs). After this I ended up with a file that was not emacs ledger-mode compliant. Had  to re-edit all transactions in emacs again. So I think the whole process did not really save me time at all, on the contrary.

The "ledger convert" caused less trouble, the extra configuration to tell it what columns to convert is a one time treshold only.

I tend to only use these conversion mechanisms when there is a mismatch that I can not otherwise pin down.

BR,
Hans Erik

Op dinsdag 10 juni 2014 11:53:26 UTC+2 schreef Edwin:

Craig Earls

unread,
Jun 10, 2014, 8:35:57 PM6/10/14
to ledge...@googlegroups.com
I am curious what was not "ledger-mode compliant" about your file. If
there is something the ledger will work with and ledger-mode won't
then I have more work to do.
Craig, Corona De Tucson, AZ
enderw88.wordpress.com

Peter Ross

unread,
Jun 10, 2014, 8:56:00 PM6/10/14
to ledger-cli
On 10 June 2014 19:31, Stefano Zacchiroli wrote:
> On Fri, May 30, 2014 at 09:54:39AM +0200, Jostein Berntsen wrote:
>> I like to use reckon:
>
> Has anyone here used both reckon and icsv2ledger, and fancy posting a
> brief comparison of the two?
>
> I've just integrated icsv2ledger in my accounting work-flow. I've
> managed to make it do what I want, but I've also found it a bit rough
> around the edges --- I can elaborate more if people on this list are
> interested.
>
I'm interested as a very part-time developer of icsv2ledger.

It works well for what I want to do which is to enter 10-20
transactions a week and get the right payee and account most of the
time, and as a developer, writing a regex is not difficult for me to
do.

What I really like is the auto-completion, and what I find painful is
when the auto-completion doesn't work as expected.

Reckon sounds like a very interesting project, and I really like the
idea that it "learns" account name to use.

Someone correct me if I'm wrong, but it doesn't learn payee names, but
just uses the name as is from the csv file.

That personally would drive me nuts, as payee names here in Australia
contain a lot of extraneous information.

To me reckon sounds also a more friendly project for people who have
never heard of regexs.

Pete

Martin Blais

unread,
Jun 10, 2014, 11:23:18 PM6/10/14
to ledger-cli
On Tue, Jun 10, 2014 at 8:55 PM, Peter Ross <pe...@emailross.com> wrote:
On 10 June 2014 19:31, Stefano Zacchiroli wrote:
> On Fri, May 30, 2014 at 09:54:39AM +0200, Jostein Berntsen wrote:
>> I like to use reckon:
>
> Has anyone here used both reckon and icsv2ledger, and fancy posting a
> brief comparison of the two?
>
> I've just integrated icsv2ledger in my accounting work-flow. I've
> managed to make it do what I want, but I've also found it a bit rough
> around the edges --- I can elaborate more if people on this list are
> interested.
>
I'm interested as a very part-time developer of icsv2ledger.

It works well for what I want to do which is to enter 10-20
transactions a week and get the right payee and account most of the
time, and as a developer, writing a regex is not difficult for me to
do.

When you say, "get the right account," this assumes a lot of context, this assumes that the problem is that there is a missing account to guess, a modest subproblem that I call "categorization." For example, such as would be the case for importing credit card transaction history. The more general problem is that of importing any data from any file may not require solving this. About half of the importing that I personally do does not involve any kind of categorization, e.g. importing historical data from my trading account, and I find the rest to be so easy and to consume so little time that I don't even think of it as a real problem (if I'd say I spend more than 10 minutes every two weeks categorizing transactions I'd be exaggerating). It's a bite-size toy problem to play with to implement a little learning classifier on.

More important is avoiding errors, and to that extent, an explicit configuration is what I favor. Something like you find in this section of the Ledgerhub design doc:

For each source of document, you create an importer which has a fixed configuration of accounts that are specific to you.  The importer does no guessing and generates transactions to the accounts you instantiated it with. It's not clear that one could or would want to learn these.  Your configuration changes very little, and you do need to write a small script to configure your importers.

Also more important IMO is, given a file, categorizing where it comes from and what kind of importer code to run on it. I call this "identification" in the LedgerHub doc.




What I really like is the auto-completion, and what I find painful is
when the auto-completion doesn't work as expected.

Reckon sounds like a very interesting project, and I really like the
idea that it "learns" account name to use.

Someone correct me if I'm wrong, but it doesn't learn payee names, but
just uses the name as is from the csv file.

That personally would drive me nuts, as payee names here in Australia
contain a lot of extraneous information.

The problem of extracting meaningful payee names from the crap that lives in those downloadable files is an interesting one. It might be interesting to join efforts and create a corpus of such payee names to work from. In the US and Canada, many of the payee names have the city and name in them:

"BARNES&NOBLE BKSTRE #2000NEW YOR"
"BACO MERCAT              LOS AN" 
"SAQ23220 MONT-ROYAL OU -- MONTREAL QC" 


To me reckon sounds also a more friendly project for people who have
never heard of regexs.

It is a reasonable assumption to me that anybody using a custom computer language to do double-entry accounting will, at least to some extent, be a bit of a nerd.


Hans Erik van Elburg

unread,
Jun 11, 2014, 2:16:13 AM6/11/14
to ledge...@googlegroups.com
Here we go:
- using TAB (align) on a reckon generated posting, eats almost the whole posting, this seems to be caused by literal tab characters in the generated file
- i instructed reckon to use comma format, but the generated output contains decimal dots again which is incompatible with the euro format in my ledger files so i had to convert them back by hand

Used command:
reckon --contains-header --comma-separates-cents --currency '€' --account Bezit:Betaalrekening:XXXXXXX -f XXXXXXX.csv -l ../ledger.ledger -o XXX_output.dat

This generated posting:
2014/03/05    KN: 300031560; YYYYYYYYY; 798051; IC; Af; Incasso; ZAAILING WEBWINKELS 300038898 300031560 VOF ZAAILING?LEIDEN
    Uitgaven:Boodschappen:Eten:GV                    €39,67
    Bezit:Betaalrekening:XXXXXXXX                    -€39,67

is destroyed by ledger-mode after giving a <TAB> it becomes:
2014/03€39,6-€39,67 

Ledger-mode command "Clean-up buffer" destroys the whole file ;-)

I had to manually edit all postings to get rid of this. Of course once you found out it is the literal tab characters one could easily automate this. But still, it is inconvenient.

One could of course also say this is a quirk in ledger-mode, as ledger just normally processes these files.

BR,
Hans Erik

Op woensdag 11 juni 2014 02:35:57 UTC+2 schreef Craig Earls:

Edwin van Leeuwen

unread,
Jun 11, 2014, 4:20:46 AM6/11/14
to ledge...@googlegroups.com
On Tue, 10 Jun 2014 23:16:13 -0700, Hans Erik van Elburg <hanserik....@gmail.com> wrote:
> Here we go:
> - using TAB (align) on a reckon generated posting, eats almost the whole
> posting, this seems to be caused by literal tab characters in the generated
> file
> - i instructed reckon to use comma format, but the generated output
> contains decimal dots again which is incompatible with the euro format in
> my ledger files so i had to convert them back by hand

Just to clarify, you use comma separated values in the ledger file
itself? Reckon indeed uses the comma separated values only when reading
and then outputs in dot separated values. That should be a relative easy
fix. Only downside is that different people want different behaviour (I
need the current behaviour), so will need to think about if it can
easily detect this :). Will file a feature request to remind me to look
at that.

Edwin


--

Stefano Zacchiroli

unread,
Jun 11, 2014, 4:49:41 AM6/11/14
to ledge...@googlegroups.com
On Wed, Jun 11, 2014 at 10:55:59AM +1000, Peter Ross wrote:
> > I've just integrated icsv2ledger in my accounting work-flow. I've
> > managed to make it do what I want, but I've also found it a bit
> > rough around the edges --- I can elaborate more if people on this
> > list are interested.
> >
> I'm interested as a very part-time developer of icsv2ledger.

Cool, here we go then, starting from your comments:

> It works well for what I want to do which is to enter 10-20
> transactions a week and get the right payee and account most of the
> time, and as a developer, writing a regex is not difficult for me to
> do.
>
> What I really like is the auto-completion, and what I find painful is
> when the auto-completion doesn't work as expected.

Both agreed. icsv2ledger's auto completion is very handy. Same for the
history of past decisions, that are re-used to guide future decisions
and are conveniently stored in an editable format. These are great
pluses, which make me generally happy about icsv2ledger.

On the pet peeve front I have:

- icsv2ledger is able to learn account names from ledger files, but is
not able to learn tags or payees from ledger files. Tags/payees are
only learned from past icsv2ledger decisions. This might be due to the
fact that icsv2ledger only uses ledger as "ledger accounts FILE"; I
wonder if it shouldn't instead use something more heavy weight, such
as "ledger xml"

- icsv2ledger UI is kinda "scary". On the one hand, when processing a
lot of transactions I often fear that if something goes wrong (Ctrl-C,
or any other Python exception) I will lose all the work so far. This
hasn't yet happened to me but, as a mere user, I didn't get a feeling
an overall feeling of robustness --- it might be excessive paranoia on
my side, though :)

- OTOH, I'd also like to have an extra (maybe optional) review step
after processing each transaction, that allows me to inspect the
transaction generated by icsv2ledger and possibly go back to change my
decisions about it. In a sense, icsv2ledger seems very prone to
mistyping: if by mistake I type/complete the wrong payee/tag and hit
Enter, the next moment to review my wrong inputs will be the very end
of the process --- which might 15 minutes later, when I'll have forgot
the context. Also, by then icsv2ledger would have learned my wrong
decision, and I'll have to edit the history file to amend my mistake.
It really feels like an extra review/confirm step is missing.

- the interface for adding/removing tags is quite clunky when using tags
with values. To remove a tag like "foo: very long value" you have to
type "-foo: very long value", whereas I'd expect "-foo" to work.

- maybe outside the scope of icsv2ledger, but I've the need of doing
some more advanced processing on my CSV files. For instance, in all
transactions related to credit/debit cards, my bank includes in the
description something like "CARD 12345678", and I want to convert that
to valued tags like "Card: 12345678". That's trivial to do with a
regexp, but icsv2ledger doesn't support it. In a sense, what I lack is
the equivalent of icsv2ledger [FOO_addons] accounts, but with a more
fine-grained granularity than columns.

What I'm now doing is pre-processing my CSV files to add the relevant
columns using regexp-based matches, but it really feels like
icsv2ledger would be the right place for such a feature.

This is all I have for now.

I can't promise patches, but if you think it'd be useful/welcome I can
submit these as bug reports on GitHub.

Thanks for reading thus far :-)

Craig Earls

unread,
Jun 11, 2014, 9:02:56 AM6/11/14
to ledge...@googlegroups.com
I have entered this as a bug against ledger-mode. I should be able to
fix it this weekend.

On Tue, Jun 10, 2014 at 11:16 PM, Hans Erik van Elburg
Message has been deleted

Peter Ross

unread,
Jun 11, 2014, 6:28:10 PM6/11/14
to ledger-cli
It does learn payees and accounts from the file, it doesn't learn tags.

As I don't use tags it's unlikely that I will take the time to add it.

> - icsv2ledger UI is kinda "scary". On the one hand, when processing a
> lot of transactions I often fear that if something goes wrong (Ctrl-C,
> or any other Python exception) I will lose all the work so far. This
> hasn't yet happened to me but, as a mere user, I didn't get a feeling
> an overall feeling of robustness --- it might be excessive paranoia on
> my side, though :)
>
It writes a line to the mappings file each time it finishes processing
a transaction.

I always hit Ctrl-C just after I make a mistake processing a
transaction, because the mapping file will contain all the right data
up to that point I can rerun the tool and it will have remembered all
the correct mappings up to the point where I made a mistake.


> - OTOH, I'd also like to have an extra (maybe optional) review step
> after processing each transaction, that allows me to inspect the
> transaction generated by icsv2ledger and possibly go back to change my
> decisions about it. In a sense, icsv2ledger seems very prone to
> mistyping: if by mistake I type/complete the wrong payee/tag and hit
> Enter, the next moment to review my wrong inputs will be the very end
> of the process --- which might 15 minutes later, when I'll have forgot
> the context. Also, by then icsv2ledger would have learned my wrong
> decision, and I'll have to edit the history file to amend my mistake.
> It really feels like an extra review/confirm step is missing.
>
This is a good idea.


> - the interface for adding/removing tags is quite clunky when using tags
> with values. To remove a tag like "foo: very long value" you have to
> type "-foo: very long value", whereas I'd expect "-foo" to work.
>
Sounds very reasonable to me.

> - maybe outside the scope of icsv2ledger, but I've the need of doing
> some more advanced processing on my CSV files. For instance, in all
> transactions related to credit/debit cards, my bank includes in the
> description something like "CARD 12345678", and I want to convert that
> to valued tags like "Card: 12345678". That's trivial to do with a
> regexp, but icsv2ledger doesn't support it. In a sense, what I lack is
> the equivalent of icsv2ledger [FOO_addons] accounts, but with a more
> fine-grained granularity than columns.
>
Yes I think what is wanted is something like the mapping file being
able to use back references from the regex.

/INTERNET TRANSFER - *\([^-]*\) - PETER ROSS/,/\1/,Assets:Bank

So the line

INTERNET TRANSFER - FOOBAR INC - PETER ROSS would map to payee FOOBAR INC


> What I'm now doing is pre-processing my CSV files to add the relevant
> columns using regexp-based matches, but it really feels like
> icsv2ledger would be the right place for such a feature.
>
> This is all I have for now.
>
> I can't promise patches, but if you think it'd be useful/welcome I can
> submit these as bug reports on GitHub.
>
Please do.

It's unlikely that I will get to any of these anytime soon, but they
are all very reasonable suggestions.

Thanks for taking the time to explain them.

Pete

Stefano Zacchiroli

unread,
Jun 12, 2014, 6:01:01 AM6/12/14
to ledge...@googlegroups.com
On Thu, Jun 12, 2014 at 08:28:08AM +1000, Peter Ross wrote:
> > I can't promise patches, but if you think it'd be useful/welcome I can
> > submit these as bug reports on GitHub.
> >
> Please do.
>
> It's unlikely that I will get to any of these anytime soon, but they
> are all very reasonable suggestions.
>
> Thanks for taking the time to explain them.

Thank you for your feedback!

I've now submitted them at
https://github.com/quentinsf/icsv2ledger/issues?state=open

in...@ez-ledger.com

unread,
Jul 28, 2020, 9:35:22 AM7/28/20
to Ledger
www.ez-ledger.com will automate this process
Reply all
Reply to author
Forward
0 new messages