Best way to import transactions when scraping a bank website directly


timoth...@gmail.com

Sep 8, 2019, 4:29:15 PM
to Beancount
Hi,

*Heads-up: I am not a developer. I wrote my first line of Python 3 weeks ago for my first Beancount ledger. Sorry if my question is stupid.*

I am trying to import 10 years of history from one of my asset accounts (life insurance).
Unfortunately, the institution holding the account does not provide any usable digital statements (no PDF, CSV, OFX...).

I finally decided to learn BeautifulSoup, and I have built a crude script that browses through all the pages of my account and grabs the details of the 500 operations made over the last 10 years.
Luckily for me, the bank's login system is quite basic, and I had no difficulty building the script.
All the relevant information is now stored in one huge dict.

Now I need to create the corresponding transactions in Beancount.

The typical importer workflow is not an ideal fit, as there is no "file" to import.

I now see several options:
- Build an importer that is triggered by a dummy file and then runs my web scraper.
- Save the data from my dict to a file, then build a separate importer for that file.
- Create the Beancount entries directly, without going through the importer workflow. (Is there a Beancount module that could help?)

What would be the canonical way to do this?

Martin Blais

Sep 8, 2019, 5:08:51 PM
to Beancount
On Sun, Sep 8, 2019 at 4:29 PM <timoth...@gmail.com> wrote:
> Hi,
>
> *Heads-up: I am not a developer. I wrote my first line of Python 3 weeks ago for my first Beancount ledger. Sorry if my question is stupid.*

There are no stupid questions. (I mean that. Your participation is welcome here.)


> I am trying to import 10 years of history from one of my asset accounts (life insurance).
> Unfortunately, the institution holding the account does not provide any usable digital statements (no PDF, CSV, OFX...).
>
> I finally decided to learn BeautifulSoup, and I have built a crude script that browses through all the pages of my account and grabs the details of the 500 operations made over the last 10 years.
> Luckily for me, the bank's login system is quite basic, and I had no difficulty building the script.
> All the relevant information is now stored in one huge dict.
>
> Now I need to create the corresponding transactions in Beancount.
>
> The typical importer workflow is not an ideal fit, as there is no "file" to import.
>
> I now see several options:
> - Build an importer that is triggered by a dummy file and then runs my web scraper.
> - Save the data from my dict to a file, then build a separate importer for that file.
> - Create the Beancount entries directly, without going through the importer workflow. (Is there a Beancount module that could help?)
>
> What would be the canonical way to do this?

Option (1) is overkill IMO.
Option (2) is like (3); really, it is just a way to avoid scraping every time. That's worthwhile if scraping is slow or if you run into detection limits from the server.
If I were you, I would do (2/3): just create the Beancount entries once, print them, and move on.

Why? For many reasons:
- These insurance transactions are likely to be infrequent
- Their amounts are probably a small part of your total assets
- This is a very long-term concern
For those reasons, I don't think fully automating a regular import process for this is a great use of your time.

This is kind of like my PayPal account: I personally don't use it much, so I import it once a year.
Contrast that to your main credit card and checking account. Or perhaps your main active trading account.

So I would just build and print out the entries once now, cut-and-paste them into your file, and then run the script once or twice per year.
You've already done the hard part of the work; get your current snapshot of the data in and be content.
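
To illustrate the "build the entries once and print them" approach, here is a minimal sketch in plain Python that renders scraped operations as Beancount transaction text, ready to paste into a ledger. The shape of the dict, the field names (`date`, `label`, `amount`, `currency`), the sample operation, and both account names are assumptions for illustration; the thread does not show the real data.

```python
from datetime import date
from decimal import Decimal

# Hypothetical shape of the scraped data; the real dict is not shown in the thread.
operations = [
    {
        "date": date(2019, 3, 15),
        "label": "Premium payment",
        "amount": Decimal("500.00"),
        "currency": "EUR",
    },
]

# Placeholder account names; replace them with the ones from your own ledger.
ASSET_ACCOUNT = "Assets:Insurance:Life"
SOURCE_ACCOUNT = "Assets:Bank:Checking"


def format_transaction(op):
    """Render one scraped operation as Beancount transaction text."""
    # Replace double quotes so the narration stays a valid quoted string.
    narration = op["label"].replace('"', "'")
    return (
        f'{op["date"].isoformat()} * "{narration}"\n'
        f'  {ASSET_ACCOUNT}  {op["amount"]:.2f} {op["currency"]}\n'
        f'  {SOURCE_ACCOUNT}  {-op["amount"]:.2f} {op["currency"]}\n'
    )


def format_ledger(ops):
    """Render all operations, oldest first, separated by blank lines."""
    ordered = sorted(ops, key=lambda op: op["date"])
    return "\n".join(format_transaction(op) for op in ordered)


if __name__ == "__main__":
    print(format_ledger(operations))
```

Writing the text directly keeps the script free of dependencies. Running the printed output through `bean-check` after pasting it into the ledger should catch unbalanced or malformed entries.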

timoth...@gmail.com

Sep 9, 2019, 6:04:11 AM
to Beancount
Thanks for your detailed answer.
I will proceed with method number 2, and I will let you know if I run into difficulties.

Keep up the good work! Beancount is amazing!

Bruno Pellizzetti

Sep 22, 2019, 8:47:19 AM
to Beancount
Hey man, what's up?
I'm having the same issue.
Did your solution work?
Can you share it with us?
Regards!

timoth...@gmail.com

Sep 29, 2019, 8:37:55 AM
to Beancount
Hi!
Yes, I have achieved what I wanted.
It is shared here:
https://github.com/grostim/Beancount-myTools

So, here is my method:
I have a script called "generali.py". It connects to the Generali website and gathers all the new operations since the latest successful import.
Each operation is exported to a JSON file.

I then have a standard Beancount importer (named jsongenerali) that imports each of these operations into Beancount.

My code is very crude and probably full of bugs, but so far it works pretty well for my needs. I have imported several hundred operations successfully.
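
The export step described above can be sketched roughly as follows. This is not the code from the linked repository: the function name `export_new_operations`, the fields `id` and `date`, and the file naming scheme are all assumptions for illustration. It writes one JSON file per operation newer than the last successful import and returns the new cutoff date.

```python
import json
from datetime import date
from pathlib import Path


def export_new_operations(operations, out_dir, last_import):
    """Write one JSON file per operation dated after last_import.

    `operations` is a list of dicts with hypothetical "id" and "date"
    (ISO format, YYYY-MM-DD) keys. Returns the date of the newest
    operation written, or last_import unchanged if nothing was new.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    newest = last_import
    for op in operations:
        op_date = date.fromisoformat(op["date"])
        if op_date <= last_import:
            continue  # already exported in an earlier run
        # Hypothetical naming scheme: date plus the operation's identifier.
        path = out_dir / f'{op["date"]}-{op["id"]}.json'
        path.write_text(
            json.dumps(op, indent=2, ensure_ascii=False), encoding="utf-8"
        )
        newest = max(newest, op_date)
    return newest
```

One caveat with a date-only cutoff: an operation posted later on the same day as the last import would be skipped, so comparing operation identifiers may be more robust if the site provides them.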
