More than one input file for importers

Oon-Ee Ng

Dec 18, 2018, 10:31:37 PM

to bean...@googlegroups.com

An e-wallet I've just started to use provides two different PDF files. One lists all transactions with minimal details, the other lists a subset of those transactions with full details.

Background - the e-wallet is linked with an RFID system used for convenient toll payments at the myriad tolled highways in my locality. It can also be used for other transactions, so its main export lists all transactions, but shows the RFID transactions with minimal detail (basically just an ID, date, and time). The app also provides an export listing only the RFID usage, which provides more details (in particular which toll gate was used).

Ideally I'd like to merge/match transactions based on both files. This does not seem to be something supported 'as-is' in beancount. Here are my options, as I see them:

1. Write two importers and manually merge (this sucks).

2. Write two importers, with the one which handles general transactions just skipping all the RFID transactions. This would work, but I lose the ability to cross-check that no odd transactions turned up in one and not the other (the system is under pilot testing and I would not trust their output too much, duplicates etc. have been reported already).

3. Write one importer which only handles the general transactions file, and within that importer open/read the other file.

Any other suggestions, or is there something built-in which would help with this?
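
For illustration, the matching and cross-checking described in options 2 and 3 might look roughly like the sketch below. It assumes both PDFs have already been parsed into lists of dicts and that the same transaction ID appears in both exports. The field names (`id`, `rfid`, `toll_gate`) and the function `merge_rfid_details` are invented for this example and are not part of beancount.

```python
from collections import Counter


def merge_rfid_details(general_rows, rfid_rows):
    """Attach RFID details to general rows and report mismatches.

    Both arguments are lists of dicts carrying an 'id' key (assumed to be the
    transaction ID shared by the two exports). General rows may carry a
    boolean 'rfid' flag (also an assumption).
    Returns (merged_rows, problems).
    """
    problems = []

    # Duplicates have reportedly shown up, so flag repeated IDs in either file.
    for label, rows in (("general", general_rows), ("rfid", rfid_rows)):
        counts = Counter(row["id"] for row in rows)
        for txn_id, n in sorted(counts.items()):
            if n > 1:
                problems.append(f"{label}: id {txn_id} appears {n} times")

    details = {row["id"]: row for row in rfid_rows}
    matched = set()
    merged = []
    for row in general_rows:
        extra = details.get(row["id"])
        if extra is not None:
            matched.add(row["id"])
            # Detail fields are added; fields already on the general row win.
            row = {**extra, **row}
        elif row.get("rfid"):
            problems.append(f"general: RFID id {row['id']} missing from rfid file")
        merged.append(row)

    # RFID entries with no counterpart in the general file are suspicious.
    for txn_id in sorted(set(details) - matched):
        problems.append(f"rfid: id {txn_id} missing from general file")
    return merged, problems
```

The `problems` list is what option 2 would lose: anything that turns up in one file but not the other is reported instead of silently dropped.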

kuba jamro

Dec 24, 2018, 3:58:24 PM

to Beancount

How important is it for you to uniquely categorise the individual RFID transactions?

Would they not all just end up in Expenses:Auto:Tolls anyway?

Oon-Ee Ng

Dec 25, 2018, 3:00:17 PM

to bean...@googlegroups.com

From an accounting perspective, not important at all, since as you rightly pointed out they'd all be the same account.

From an audit perspective, quite important, as that's data which can be used for verification if/when there are double or missed charges.

I'd just do without them, but since the information is already there, and importers are one-time efforts (generally speaking) I figured more information is always better than less.

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To post to this group, send email to bean...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/9c882890-f286-43ac-8c73-a9de55af23fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

kuba jamro

Dec 25, 2018, 5:15:26 PM

to Beancount

That’s fair enough.

In that case I’d definitely go with your option 3, unless you expect the two-importer option to produce at least one importer that is generic enough to reuse elsewhere.

kuba jamro

Dec 25, 2018, 5:24:48 PM

to Beancount

And one more option for you. You could write one importer that handles the general file and then write a decorator/decorated class that handles the loading of RFID data while it parses the output of the general importer.
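
A rough sketch of that wrapper idea, under some loud assumptions: the wrapped importer is any object with `identify(file)` and `extract(file)`, entries are plain dicts with an `id` key rather than real beancount directives, and `RfidDetailWrapper` and `load_details` are hypothetical names used only here.

```python
class RfidDetailWrapper:
    """Wrap an importer and enrich what its extract() returns.

    `inner` is any object with identify(file) and extract(file);
    `load_details` is a callable returning {transaction_id: detail_dict}.
    Both are placeholders for illustration.
    """

    def __init__(self, inner, load_details):
        self.inner = inner
        self.load_details = load_details

    def __getattr__(self, name):
        # Anything not overridden here (identify, file_account, ...) is
        # delegated to the wrapped importer.
        return getattr(self.inner, name)

    def extract(self, file):
        details = self.load_details()
        enriched = []
        for entry in self.inner.extract(file):
            extra = details.get(entry.get("id"))
            if extra:
                # Copy rather than mutate the entry from the inner importer.
                entry = {**entry, **extra}
            enriched.append(entry)
        return enriched
```

With real beancount entries, the extra fields would more likely go into each entry's metadata than into the entry itself. The point of the wrapper is that the general importer stays unaware of the RFID file and remains reusable on its own.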

Martin Blais

Jan 6, 2019, 4:22:09 AM

to Beancount

Actually another case that is analogous to this is Amazon, which provides descriptions of the products you bought on a credit card.

One imports their credit card, but may want to have as a side-input an Amazon details file to provide more information on the transactions.

It's possible to write your own main program to call the ingest code.

It's here (read comments):

https://bitbucket.org/blais/beancount/src/tip/beancount/ingest/scripts_utils.py

It's a bit convoluted - this code is intended to support the old invocation (e.g. with bean-extract) and the new one (as a script, with subcommands), and I haven't tried to add custom args, so you may have to figure some stuff out, but that's how I'd do this.
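
One possible shape for such a main program is sketched below; it has not been tried against the ingest code. It assumes beancount v2, where `scripts_utils.ingest()` takes the list of importers and parses `sys.argv` itself, so a custom option has to be stripped out before handing over. The `--rfid-details` option and the helper `split_side_input` are made up for this example.

```python
import argparse
import sys


def split_side_input(argv):
    """Pull a hypothetical --rfid-details option out of argv.

    Returns (path_or_None, remaining_args); the remaining arguments are left
    untouched for the regular ingest command line.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--rfid-details", default=None)
    known, remaining = parser.parse_known_args(argv)
    return known.rfid_details, remaining


def main():
    rfid_path, remaining = split_side_input(sys.argv[1:])
    # Build the importer list here, handing rfid_path to whichever importer
    # needs the side input (left empty in this sketch).
    importers = []
    # Assumption: beancount v2, where ingest() parses sys.argv itself.
    from beancount.ingest import scripts_utils
    sys.argv = [sys.argv[0]] + remaining
    scripts_utils.ingest(importers)
```

Calling `main()` at the end of the import script would then accept the side-input path alongside the usual subcommands.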

Cheers,

Oon-Ee Ng

Jan 9, 2019, 4:08:26 PM

to bean...@googlegroups.com

So having actually gotten round to implementing option 3, I'm running again into the minor issue I mentioned in the thread "Nose tests for ingest - do they work OOTB?"[1]

In this case the issue is slightly more severe because in order for the tests to make sense, I do need the secondary file in a predictable location, and being in the same folder is the most predictable location available. I'm going to (after sending this email) work around this by checking for the secondary file type and just returning an empty result in the extract() function, but it may make more sense to just use identify() for testing.
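
The workaround described above could look something like the following sketch. The file-name suffixes, the class name `EwalletImporter`, and the `parse` placeholder are all assumptions for illustration; the only thing taken from the description is that the secondary file sits in the same folder and that `extract()` returns an empty result for it.

```python
import os


class EwalletImporter:
    """Sketch only: file-name patterns and parsing are placeholders."""

    GENERAL_SUFFIX = "_general.pdf"  # hypothetical naming of the two exports
    RFID_SUFFIX = "_rfid.pdf"

    def identify(self, file):
        # Claim both files so neither is reported as unrecognised.
        return file.name.endswith((self.GENERAL_SUFFIX, self.RFID_SUFFIX))

    def rfid_path_for(self, general_path):
        # The secondary file is expected next to the primary one.
        stem = general_path[: -len(self.GENERAL_SUFFIX)]
        return stem + self.RFID_SUFFIX

    def extract(self, file):
        if file.name.endswith(self.RFID_SUFFIX):
            # Secondary file: only read as a side input, never on its own.
            return []
        rfid_path = self.rfid_path_for(file.name)
        have_details = os.path.exists(rfid_path)
        return self.parse(file.name, rfid_path if have_details else None)

    def parse(self, general_path, rfid_path):
        # Placeholder for the real PDF parsing and matching.
        return [("parsed", general_path, rfid_path)]
```

Because `extract()` yields nothing for the secondary file, a test run over every file in the folder still produces a stable (empty) expected output for it.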

[1] - https://groups.google.com/forum/#!msg/beancount/6D2VEdpsWJc/kmGfyk5xBAAJ

Martin, thanks for the pointer to scripts_utils.py. It's something I'll keep in mind, but it doesn't make sense (time-wise) to go down this route just yet, since I have something working well enough for my needs.

Cheers,
