I've got some text files from my Chinese banks formatted according to
their own system. I'm planning on just using Python to slurp the file
in, but I'll need to write it out to ledger format. What's the best way
of doing that? I didn't see anything in the /python or /contrib folders
that look like ledger writers. Should I just use a string template?
Given that I'm conversant in python and bash, and not likely to be
learning any C/C++ anytime soon (though I can copy-and-paste with the
best of them), what's my best option for writing a ledger file from
python/bash data?
<e...@ericabrahamsen.net> wrote:
> I've got some text files from my Chinese banks formatted according to
> their own system. I'm planning on just using Python to slurp the file
> in, but I'll need to write it out to ledger format. What's the best way
> of doing that? I didn't see anything in the /python or /contrib folders
> that look like ledger writers. Should I just use a string template?
I personally separated the tasks - I have one script that goes from
input file (in my case, PDF's from banks) to a CSV file, then another
script that goes from CSV to ledger. There's a plethora of tools
that go from CSV to ledger.
> Given that I'm conversant in python and bash, and not likely to be
> learning any C/C++ anytime soon (though I can copy-and-paste with the
> best of them), what's my best option for writing a ledger file from
> python/bash data?
I'd figure out whatever templating library you like in Python if you
want to handle that portion of it.
If you want something to start with, I wrote my own in ruby (mainly
because it has the ERB templating engine built in):
https://github.com/zdw/rubycsv
I was using hledger for this, but needed to generate complex
multi-line entries, such as ones with purchase/tax/shipping broken out
from a single CSV line. These are common problems when dealing with
payment processors like Square and Paypal.
I personally use a homemade perl script based on pdftotext, which then parse the text output and write ledger syntax.
As suggested by Zack, I could have done PDF to CSV and then using a CSV to Ledger tool. But none satisfied my needs. Anyway, here is my personal notes about some CSV to Ledger tools. Pros are noted '+', Cons are noted '-'.
Included in John Wiegley's ledger, there is the "convert" command See ./test/baseline/cmd-convert.test - can not rename payee
It also has the advantage that it remembers the renaming of payee (and
account), so that if you have the exact same payee in the future it
automatically suggests your renaming.
It also has the advantage that it does tab completion on account names
and payees, so that you can type just a few letters for the payee and
press tab to get complete payee name.
I personally use this tool (and contributed the rename payee code) and
for me it works how I want.
On Tue, Sep 25 2012, Peter Ross wrote:
> On 25 September 2012 08:19, thierry wrote:
>> icsv2ledger, from Quentin Stafford-Fraser
>> https://github.com/quentinsf/icsv2ledger >> - code does not look to be unicode compliant
>> - can not rename payee
> You can rename the payee now.
> It also has the advantage that it remembers the renaming of payee (and
> account), so that if you have the exact same payee in the future it
> automatically suggests your renaming.
> It also has the advantage that it does tab completion on account names
> and payees, so that you can type just a few letters for the payee and
> press tab to get complete payee name.
> I personally use this tool (and contributed the rename payee code) and
> for me it works how I want.
Thanks for all the responses! Given that I've really only got the one
format to handle, and that I can see what icsv2ledger is doing in terms
of writing ledger entries from Python, I'll probably just skip the
intermediary CSV step and go straight to ledger.
Thanks Peter for this message. I re-discovered the tool, and finally switched to it! The 'i' of icsv2ledger ('i' meaning interactive) is what convinced me. I forked and pulled some changes in github.
I have others changes but they may break some existing functionalities. I write below what I have in mind, that may start a discussion about them: - I want to use icsv2ledger through unix pipes. That means ability to read from stdin (I think this is easy implementing this using fileinput python module), and ability to write to stdout. This latter will break existing way of working, as the default is "csv_filename" which csv extension is replaced with .ledger. I procrastinate on how to implement it: 3 different outputs can be used : stdout, csv_filename.ledger or newfilename. But the unix philosophy would teach me to use only stdout and newfilename. But this break the existing. And also, this may suppress the ability to treat several input files. Any advise welcomed. - I believe that --no-output-tags and --read-file can be moved to config file (instead of executable option)
On Tuesday, September 25, 2012 3:10:18 AM UTC+2, Peter Ross wrote:
> On 25 September 2012 08:19, thierry wrote: > > icsv2ledger, from Quentin Stafford-Fraser > > https://github.com/quentinsf/icsv2ledger > > - code does not look to be unicode compliant > > - can not rename payee
> You can rename the payee now.
> It also has the advantage that it remembers the renaming of payee (and > account), so that if you have the exact same payee in the future it > automatically suggests your renaming.
> It also has the advantage that it does tab completion on account names > and payees, so that you can type just a few letters for the payee and > press tab to get complete payee name.
> I personally use this tool (and contributed the rename payee code) and > for me it works how I want.
On 29 September 2012 22:37, thierry <thierry.dauco...@free.fr> wrote:
> Thanks Peter for this message. I re-discovered the tool, and finally
> switched to it! The 'i' of icsv2ledger ('i' meaning interactive) is what
> convinced me. I forked and pulled some changes in github.
> I have others changes but they may break some existing functionalities. I
> write below what I have in mind, that may start a discussion about them:
> - I want to use icsv2ledger through unix pipes. That means ability to read
> from stdin (I think this is easy implementing this using fileinput python
> module), and ability to write to stdout. This latter will break existing way
> of working, as the default is "csv_filename" which csv extension is replaced
> with .ledger. I procrastinate on how to implement it: 3 different outputs
> can be used : stdout, csv_filename.ledger or newfilename. But the unix
> philosophy would teach me to use only stdout and newfilename. But this break
> the existing. And also, this may suppress the ability to treat several input
> files. Any advise welcomed.
I think that the best option would be to use "-" as the filename which
represents stdin and when it's "-" then just send the output to stdout
rather than opening a file based on the input filename.
> - I believe that --no-output-tags and --read-file can be moved to config
> file (instead of executable option)
I have another change to propose to icsv2ledger. Why is there an account mapping file AND a payee mapping file? Why not merging into a single file? The two files have CSV format, first column is regexp, and second is account or payee. What bother me is that I have to change the regexp in both files. I would propose to have a single CSV file with 3 columns regexp,account,payee, (and if needed a variable in config file to use only one column or the other). Thanks for any feedback.
On Sunday, September 30, 2012 8:08:50 AM UTC+2, Peter Ross wrote:
> On 29 September 2012 22:37, thierry <thierry....@free.fr <javascript:>> > wrote: > > Thanks Peter for this message. I re-discovered the tool, and finally > > switched to it! The 'i' of icsv2ledger ('i' meaning interactive) is what > > convinced me. I forked and pulled some changes in github.
> > I have others changes but they may break some existing functionalities. > I > > write below what I have in mind, that may start a discussion about them: > > - I want to use icsv2ledger through unix pipes. That means ability to > read > > from stdin (I think this is easy implementing this using fileinput > python > > module), and ability to write to stdout. This latter will break existing > way > > of working, as the default is "csv_filename" which csv extension is > replaced > > with .ledger. I procrastinate on how to implement it: 3 different > outputs > > can be used : stdout, csv_filename.ledger or newfilename. But the unix > > philosophy would teach me to use only stdout and newfilename. But this > break > > the existing. And also, this may suppress the ability to treat several > input > > files. Any advise welcomed.
> I think that the best option would be to use "-" as the filename which > represents stdin and when it's "-" then just send the output to stdout > rather than opening a file based on the input filename.
> > - I believe that --no-output-tags and --read-file can be moved to config > > file (instead of executable option)
On Sun, Sep 30, 2012 at 12:31 PM, thierry <thierry.dauco...@free.fr> wrote:
> I have another change to propose to icsv2ledger. Why is there an account
> mapping file AND a payee mapping file? Why not merging into a single file?
The rubycsv code I wrote does this. Actually it has a generic
"tablematch" function that can do an arbitrarily large number of regex
mappings. And it can do multi-entry transactions. And it's not
ledger-specific - it can do CSV to anything if that's a plus for
anyone.
The only downside compared to any of the other CSV programs is that
it's not interactive, and you have to write the mapping tables as a
ruby array.
On Sun, Sep 30, 2012 at 10:29:17PM -0700, Zack Williams wrote:
> On Sun, Sep 30, 2012 at 12:31 PM, thierry <thierry.dauco...@free.fr> wrote:
> > I have another change to propose to icsv2ledger. Why is there an account
> > mapping file AND a payee mapping file? Why not merging into a single file?
> The rubycsv code I wrote does this. Actually it has a generic
> "tablematch" function that can do an arbitrarily large number of regex
> mappings. And it can do multi-entry transactions. And it's not
> ledger-specific - it can do CSV to anything if that's a plus for
> anyone.
> The only downside compared to any of the other CSV programs is that
> it's not interactive, and you have to write the mapping tables as a
> ruby array.
> - Zack
Not to beat a dead horse further, but CSV2Ledger (my perl variant)
does the same thing. Dynamic renaming based on payee, account
matching, even file matching using a YAML format.
It's a common wheel to reinvent.
Best of luck.
------------------------------------------------------------------
Russell Adams RLAd...@AdamsInfoServ.com
On Monday, 1 October 2012 07:24:29 UTC+1, Russell Adams wrote:
> Not to beat a dead horse further, but CSV2Ledger (my perl variant) > does the same thing. Dynamic renaming based on payee, account > matching, even file matching using a YAML format.
> I have a related question to the original poster. At the moment I use
CSV2Ledger to import csv files, but I suspect I'm doing it in a somewhat inefficient way. In particular:
- I have both US and UK accounts, and at the moment before I import the UK ones I create and run an emacs macro on them to convert the dates to US format. Probably an improvement would be to learn to write a script to do this for me automatically, but better would just be to have the importing program do that for me (I guess it looks like the built-in import can do this via --input-date-format, but I don't know if CSV2Ledger can do it).
on my csv file from my First Direct account, and then I go to emacs and change all the Liabilities:Other to what they should be. On the other hand, it would be ideal if the liability could be assigned automatically from the payee - I think 80%+ of my transactions have the same payee as previous ones and should be mapped to the same liability.
Now, at some point about a year ago, before there was more documentation for ledger 3.0, I had the impression that the builtin import function would try to do this automatically - is that right? On the other hand, from the manual on the convert command, it isn't even clear to me that it can assign the Asset name for all the entries from the same csv file (which I definitely need).
So - is there a way to do this, either by the importing program analysing the accounts.dat file, or by manually setting up some rules for converting from payee to liability?
I'm wondering if CSV2Ledger can do this, based on the above remark:
"Dynamic renaming based on payee, account matching, even file matching using a YAML format."
However I'm not sure how this works in practice - do I just need to read the comments in the perl file, learn what YAML is etc?
On Monday, October 1, 2012 8:24:29 AM UTC+2, Russell Adams wrote:
> On Sun, Sep 30, 2012 at 10:29:17PM -0700, Zack Williams wrote: > > On Sun, Sep 30, 2012 at 12:31 PM, thierry <thierry....@free.fr<javascript:>> > wrote: > > The only downside compared to any of the other CSV programs is that > > it's not interactive, and you have to write the mapping tables as a > > ruby array.
> Not to beat a dead horse further, but CSV2Ledger (my perl variant) > does the same thing. Dynamic renaming based on payee, account > matching, even file matching using a YAML format.
I was too using a personal perl script, doing the mapping from a table. But I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary, rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'. Also CSV2ledger is not unicode compliant, and I am not living in an ASCII world :-).
> On Monday, October 1, 2012 8:24:29 AM UTC+2, Russell Adams wrote:
>> On Sun, Sep 30, 2012 at 10:29:17PM -0700, Zack Williams wrote:
>> > On Sun, Sep 30, 2012 at 12:31 PM, thierry <thierry....@free.fr> wrote:
>> > The only downside compared to any of the other CSV programs is that
>> > it's not interactive, and you have to write the mapping tables as a
>> > ruby array.
>> Not to beat a dead horse further, but CSV2Ledger (my perl variant)
>> does the same thing. Dynamic renaming based on payee, account
>> matching, even file matching using a YAML format.
> I was too using a personal perl script, doing the mapping from a table. But
> I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary,
> rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'.
> Also CSV2ledger is not unicode compliant, and I am not living in an ASCII
> world :-).
It was the interactiveness that was the key feature for me as well. I
came from using MS Money, and so was used to it automatically
suggesting payees and accounts from the string supplied by the bank,
so I wanted something similar when I processed my download from the
bank.
> > I was too using a personal perl script, doing the mapping from a table. But
> > I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary,
> > rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'.
> > Also CSV2ledger is not unicode compliant, and I am not living in an ASCII
> > world :-).
> It was the interactiveness that was the key feature for me as well. I
> came from using MS Money, and so was used to it automatically
> suggesting payees and accounts from the string supplied by the bank,
> so I wanted something similar when I processed my download from the
> bank.
Interactive sounds like a nice feature! I hadn't tried that.
Normally I import hundreds of records... So my focus isn't on one at a
time data entry, I want bulk and standardized records.
I'm keen on John's new emacs mode that's letting me query my files for
items to clear / update (ie: my queue in my workflow). I can now see
one line per transaction, and hit enter on them to jump to the right file.
------------------------------------------------------------------
Russell Adams RLAd...@AdamsInfoServ.com
>> > I was too using a personal perl script, doing the mapping from a table. But
>> > I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary,
>> > rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'.
>> > Also CSV2ledger is not unicode compliant, and I am not living in an ASCII
>> > world :-).
>> It was the interactiveness that was the key feature for me as well. I
>> came from using MS Money, and so was used to it automatically
>> suggesting payees and accounts from the string supplied by the bank,
>> so I wanted something similar when I processed my download from the
>> bank.
> Interactive sounds like a nice feature! I hadn't tried that.
> Normally I import hundreds of records... So my focus isn't on one at a
> time data entry, I want bulk and standardized records.
icsv2ledger does have a mode where if it can match the payee
automatically then it doesn't prompt the user but just uses the data
immediately. In that case you only get prompted for payees that
you've never seen before.
I don't use that because I'm not processing so many records that it's
such a big deal.
However I still suggest use the tool which best fits your workflow,
and the joy of using open formats means we can all use the best tool
that fulfils our needs.
On Mon, Oct 1, 2012 at 2:40 PM, thierry <thierry.dauco...@free.fr> wrote:
> I was too using a personal perl script, doing the mapping from a table. But
> I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary,
> rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'.
> Also CSV2ledger is not unicode compliant, and I am not living in an ASCII
> world :-).
My take on this is that implementing an interactive interface wasn't
worth the added complexity of handling and storing user input (rubycsv
is only 70 lines per sloccount, less than 100 if you include a complex
template).
That's not to say that interactive it's not a nice feature - it was
just a lot more work and didn't match my workflow which has fairly
deep categories that I'd prefer not to mistype.
My solution is this - rubycsv has the concept of a default value to
assign if no better match is found in the table. This is handy in
many ways (unspecified currency being another), but one way I use it
is to assign "Unknown" as the category. The process works as
follows:
1. Run the CSV -> ledger conversion.
2. Run "ledger -f infile.lgr print Unknown" to show all unidentified
transactions.
3. Add appropriate entries to the match table via text editor.
4. Repeat steps 1-3 until no transactions are in the "Unknown" category.
On Tue, Oct 02 2012, Russell Adams wrote:
>> > I was too using a personal perl script, doing the mapping from a table. But
>> > I tried the 'i' of icsv2ledger, and felt in love with this 'i'. On contrary,
>> > rubycsv and CSV2ledger does not (yet?) implement this 'interactiveness'.
>> > Also CSV2ledger is not unicode compliant, and I am not living in an ASCII
>> > world :-).
>> It was the interactiveness that was the key feature for me as well. I
>> came from using MS Money, and so was used to it automatically
>> suggesting payees and accounts from the string supplied by the bank,
>> so I wanted something similar when I processed my download from the
>> bank.
> Interactive sounds like a nice feature! I hadn't tried that.
> Normally I import hundreds of records... So my focus isn't on one at a
> time data entry, I want bulk and standardized records.
> I'm keen on John's new emacs mode that's letting me query my files for
> items to clear / update (ie: my queue in my workflow). I can now see
> one line per transaction, and hit enter on them to jump to the right
> file.
I realized, with no small embarrassment, that I don't really need an
external tool, since I use emacs for maintaining my ledger files, and
for everything else. In the course of writing some elisp functions to
consume bank records I noticed the new version of ledger-mode, and am
considering using it -- do you have any pointers or things to watch out
for? I noticed strings in there that still point to directories on
John's computer :)
Sorry - I wrote this two days ago and it's been held up in moderation, I guess.
Since then I tried icsv2ledger, and managed to do most of the below (I'm sure csv2ledger can also do it, of course), so I now have two slightly different questions, one hopefully almost trivial.
- is there an easy way in all of this (perhaps in Ledger itself?) to change the sign of all of the transactions for some account? One of my credit cards uses the opposite sign convention to all my other accounts.
- how do people deal with avoiding duplicates? Several of these importing options produce MD5 hashes from the original line in the csv file, which should presumably make this straightforward.
With my new icsv2ledger setup, each account now produces a ledger file, and I then manually copy it into my main accounts file. Of course, I could write a script to append it to the main accounts file, but is there a better way to import the transactions, avoiding any duplicates, to deal with csv files with overlapping date ranges?
On 1 Oct 2012, at 11:01, Toby Gee <toby...@gmail.com> wrote:
> So - is there a way to do this, either by the importing program analysing the accounts.dat file, or by manually setting up some rules for converting from payee to liability?
> I'm wondering if CSV2Ledger can do this, based on the above remark:
> "Dynamic renaming based on payee, account matching, even file matching using a YAML format."
> However I'm not sure how this works in practice - do I just need to read the comments in the perl file, learn what YAML is etc?
On Wed, Oct 03, 2012 at 04:27:09PM +0100, Toby Gee wrote:
> Sorry - I wrote this two days ago and it's been held up in moderation, I guess.
> Since then I tried icsv2ledger, and managed to do most of the below
> (I'm sure csv2ledger can also do it, of course), so I now have two
> slightly different questions, one hopefully almost trivial.
> - is there an easy way in all of this (perhaps in Ledger itself?) to
> - change the sign of all of the transactions for some account? One
> - of my credit cards uses the opposite sign convention to all my
> - other accounts.
CSV2Ledger has a negate option you can set based on input filename or
on the command line.
> - how do people deal with avoiding duplicates? Several of these
> - importing options produce MD5 hashes from the original line in the
> - csv file, which should presumably make this straightforward.
CSV2Ledger takes an md5sum of the original csv line and stores both
with the generated txn as metadata. Those md5sums are used to dedup
new data. There's even a caching option for working with many large
files to speed up processing.
I rely heavily on this, because when I download my credit card data it
always overlaps.
> With my new icsv2ledger setup, each account now produces a ledger
> file, and I then manually copy it into my main accounts file. Of
> course, I could write a script to append it to the main accounts
> file, but is there a better way to import the transactions, avoiding
> any duplicates, to deal with csv files with overlapping date ranges?
I use the concept of a queue file (Queue.dat) that all my imports
goto. Once they are assigned a primary expense report, they are moved
by a separate script to an expense report specific file and that is
added to my include list.
> On 1 Oct 2012, at 11:01, Toby Gee <toby...@gmail.com> wrote:
> > So - is there a way to do this, either by the importing program
> > analysing the accounts.dat file, or by manually setting up some
> > rules for converting from payee to liability?
CSV2Ledger supports several rules layers:
- Preprocess: Arbitrary regexp replacements to perform on the CSV
data prior to converting. Useful for payee renaming.
- File level: Sets command line arguments by filename (ie: import
columns, negation, fuzzy date matching, receipt file tagging
parameters, default accounts, etc)
- Account matching by payee: Using a regexp by payee, assign source
and destination accounts.
> > I'm wondering if CSV2Ledger can do this, based on the above
> > remark:
> > "Dynamic renaming based on payee, account matching, even file
> > matching using a YAML format."
> > However I'm not sure how this works in practice - do I just need
> > to read the comments in the perl file, learn what YAML is etc?
Enjoy!
------------------------------------------------------------------
Russell Adams RLAd...@AdamsInfoServ.com
On 4 October 2012 01:27, Toby Gee <toby...@gmail.com> wrote:
> Sorry - I wrote this two days ago and it's been held up in moderation, I guess.
> Since then I tried icsv2ledger, and managed to do most of the below (I'm sure csv2ledger can also do it, of course), so I now have two slightly different questions, one hopefully almost trivial.
> - is there an easy way in all of this (perhaps in Ledger itself?) to change the sign of all of the transactions for some account? One of my credit cards uses the opposite sign convention to all my other accounts.
Not with icsv2ledger currently.
> - how do people deal with avoiding duplicates? Several of these importing options produce MD5 hashes from the original line in the csv file, which should presumably make this straightforward.
Yes but icsv2ledger doesn't do anything with these hashes.
I presume the idea was to run ledger looking for transactions tagged
with the hash, and if there is one then icsv2ledger should ignore the
transaction. Patches welcome.
> With my new icsv2ledger setup, each account now produces a ledger file, and I then manually copy it into my main accounts file. Of course, I could write a script to append it to the main accounts file, but is there a better way to import the transactions, avoiding any duplicates, to deal with csv files with overlapping date ranges?
Implement the above code and add it to icsv2ledger (if that's the tool
you wish to continue to use), otherwise it has to be a manual process,
or switch to another tool.
> - is there an easy way in all of this (perhaps in Ledger itself?) to change the sign of all of the transactions for some account? One of my credit cards uses the opposite sign convention to all my other accounts.
This feature is in many of the CSV to ledger conversion tools already.
> - how do people deal with avoiding duplicates? Several of these importing options produce MD5 hashes from the original line in the csv file, which should presumably make this straightforward.
One solution that's useful but not pure is the "transfer account
pattern" I wrote about on this list 2012-02-19.
Basically, if you have say a credit card and a checking account, and
pay one from the other, you create a 3rd "transfer" account that, when
both are balanced, has a zero balance.
> With my new icsv2ledger setup, each account now produces a ledger file, and I then manually copy it into my main accounts file. Of course, I could write a script to append it to the main accounts file, but is there a better way to import the transactions, avoiding any duplicates, to deal with csv files with overlapping date ranges?
For the first part of this, Ledger supports including other files via:
!include /path/to/file
I use this heavily. For the latter parts... you'll have to figure
out what to do. Personally, I don't download overlapping date ranges
(as I scrape end of month PDF statements), so it's not a problem.