Hello,
I am new to beancount / ledger and am currently thinking about how to do my importing. I have written an importer for the CSV statements from my bank. I have two questions:
+ I would like to automatically rename the payee of some frequently occurring transactions, such as grocery shopping, and assign them to accounts. Is there a canonical way to do that, or should I just hack it into the importer?
+ How can I detect duplicates when I try to import the same transaction twice?
Thanks,
Florian
On Sunday, 28 April 2019, 01:44:07 CEST, Martin Blais wrote:
> On Sat, Apr 27, 2019 at 6:28 PM Florian Lindner <mailin...@xgm.de> wrote:
>
> > Hello,
> >
> >
> > I am new into beancount / ledger and currently think about how to do my
> > importing. I have written an importer for the csv statements from my bank.
> > Two question I have:
> >
> >
> > + I would like to automatically rename the payee of some frequently
> > occurring transaction, such as shopping groceries and assign them to
> > accounts. Is there a canonical way do that or should I just hack it into
> > the importer?
> >
>
> You should build it into your importer.
> This task is simple enough there's really no need for the library to
> provide common code to do that.
> You can roll your own.
Ok, you're right, that's easy.
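A minimal sketch of what such importer-side rules could look like, assuming a simple table of regular expressions. The rule contents, the account names and the categorize() helper are made up for illustration and are not part of beancount:

```python
import re

# Hypothetical rules: (pattern on the bank's raw payee text, clean payee, account).
# The first matching rule wins, so put more specific patterns first.
RULES = [
    (re.compile(r"REWE|ALDI|LIDL", re.IGNORECASE), "Supermarket", "Expenses:Food:Groceries"),
    (re.compile(r"DB VERTRIEB", re.IGNORECASE), "Deutsche Bahn", "Expenses:Transport:Train"),
]


def categorize(raw_payee):
    """Return (payee, account) for a raw payee string from the CSV.

    If no rule matches, the raw payee is kept and the account is None,
    so the importer can leave the second posting for manual editing.
    """
    for pattern, payee, account in RULES:
        if pattern.search(raw_payee):
            return payee, account
    return raw_payee, None
```

The importer would call categorize() once per CSV row and use the result when it builds the transaction.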
> > + How can I detect duplicates when I try to import the same transaction
> > twice?
> >
>
> That's a more difficult question.
> You should implement your own duplication detection code.
> It's not entirely obvious how to do this for everybody; the definition of
> what's a duplicate depends on how much you manually massage your
> transactions.
> I haven't really tried very hard to generalize this well, so it's best you
> define your own code for that.
Some brainstorming:
+ When beancount/fava talk about duplicates, they seem to mostly mean duplicate transactions created by a transfer between a credit card and a checking account when the statements for both are imported.
+ Save the original CSV line as metadata "source-line:". Alternatively, build a unique tuple of (original payee, date, amount) and save that as metadata. For each entry to import, query beancount for an entry with matching metadata. Ledger does it like that when --rich-data is given: it computes a hash (called UUID) from the input line. Is there a specific name you would suggest for that metadata field? Fava mentions a __source__ key, but that seems to be removed before committing (https://github.com/beancount/fava/blob/master/fava/help/import.md).
+ Using the payee from beancount is not a good idea, as it has usually been modified manually.
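The hashing idea from the second point could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the choice of SHA-1 are arbitrary, and the set of already known IDs is assumed to have been collected beforehand from a metadata field (for example a hypothetical "source-id" key) on the existing ledger entries:

```python
import hashlib


def source_id(csv_line):
    """Hash a raw CSV line into a stable identifier.

    Whitespace is normalized first so that trivial formatting
    differences between two downloads do not change the hash.
    """
    normalized = " ".join(csv_line.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def split_new_and_duplicate(csv_lines, known_ids):
    """Split CSV lines into (new, duplicates) based on their hash.

    known_ids holds the identifiers already stored in the ledger.
    """
    seen = set(known_ids)
    new, duplicates = [], []
    for line in csv_lines:
        sid = source_id(line)
        if sid in seen:
            duplicates.append(line)
        else:
            new.append(line)
            seen.add(sid)
    return new, duplicates
```

One caveat of this approach: two genuinely separate transactions that produce identical CSV lines (same day, payee and amount) are indistinguishable, so the second one would be flagged as a duplicate.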
What are your thoughts?
Best,
Florian
Hi Florian,
For setting payees and accounts, you might also want to have a look at smart_importer:
https://github.com/beancount/smart_importer
It has some machine-learning-based approaches to automatically set payees and accounts.
For duplicate detection, there is actually some infrastructure in core beancount, and some more in smart_importer:
https://github.com/beancount/smart_importer/blob/master/smart_importer/detector.py
DuplicateDetector sets the __duplicate__ metadata based on a specified matching algorithm:
    apply_hooks(MyImporter(), [PredictPostings(), DuplicateDetector()]),

The default algorithm compares things like amounts, accounts and dates, but you can also customize it. For example, in some cases I actually get a reference number and want to match on that, so I store the reference number in the entry metadata as 'ref':

    class ReferenceDuplicatesComparator:
        def __call__(self, entry1, entry2):
            # Two entries are duplicates if both carry the same 'ref' value.
            return ('ref' in entry1.meta and 'ref' in entry2.meta
                    and entry1.meta['ref'] == entry2.meta['ref'])

    apply_hooks(MyImporter(), [PredictPostings(), DuplicateDetector(comparator=ReferenceDuplicatesComparator())]),
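To see what such a comparator does in isolation, here is a small self-contained sketch that runs without beancount or smart_importer installed. The Entry namedtuple is only a stand-in for a real beancount entry, and find_duplicates() is a hypothetical helper written for this example. It is not how DuplicateDetector is implemented:

```python
from collections import namedtuple

# Stand-in for a beancount entry; the comparator only looks at .meta.
Entry = namedtuple("Entry", ["date", "narration", "meta"])


class ReferenceDuplicatesComparator:
    """Treat two entries as duplicates if both carry the same 'ref'."""

    def __call__(self, entry1, entry2):
        return ("ref" in entry1.meta and "ref" in entry2.meta
                and entry1.meta["ref"] == entry2.meta["ref"])


def find_duplicates(new_entries, existing_entries, comparator):
    """Return the new entries that match any existing entry."""
    return [new for new in new_entries
            if any(comparator(new, old) for old in existing_entries)]


existing = [Entry("2019-04-27", "Groceries", {"ref": "A1"})]
incoming = [
    Entry("2019-04-27", "REWE SAGT DANKE", {"ref": "A1"}),  # same reference
    Entry("2019-04-28", "ALDI", {"ref": "B2"}),             # new reference
    Entry("2019-04-29", "Cash withdrawal", {}),             # no reference
]
duplicates = find_duplicates(incoming, existing, ReferenceDuplicatesComparator())
```

Here only the first incoming entry ends up in duplicates, even though its narration differs from the one already in the ledger. Entries without a 'ref' are never matched by this comparator.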
Regards,
Patrick