predict postings (e.g. smart_importer) hook for beangulp

Justus Pendleton

Mar 6, 2025, 2:09:11 AM
to Beancount
I searched previous posts but couldn't find anyone that had contributed a beangulp hook that mimics the "predict postings" thing from legacy smart_importer.

This took me way longer than expected since I have no idea what I'm doing but it's been working for me for a few days.

Stefano Zacchiroli

Mar 6, 2025, 5:04:35 AM
to bean...@googlegroups.com
On Wed, Mar 05, 2025 at 11:09:10PM -0800, Justus Pendleton wrote:
> I searched previous posts but couldn't find anyone that had contributed a
> beangulp hook that mimics the "predict postings" thing from legacy
> smart_importer.

In what sense is smart_importer "legacy"? Has it been declared abandoned
/ incompatible with v3 or ....?

Thanks!
--
Stefano Zacchiroli . za...@upsilon.cc . https://upsilon.cc/zack
Full professor of Computer Science
Télécom Paris, Polytechnic Institute of Paris
Co-founder & CSO Software Heritage
Mastodon: https://mastodon.xyz/@zacchiro

Patrick Ruckstuhl

Mar 6, 2025, 5:09:41 AM
to bean...@googlegroups.com
Hi,

No, it's not. It's running great with beancount 3, beangulp and fava since the last upgrade (I made the changes after fava supported it). There are still some potential improvements, such as dropping the smart_importer hooks in favor of standard beangulp hooks, but right now it works well.

Regards,
Patrick

Justus Pendleton

Mar 6, 2025, 6:06:24 AM
to Beancount
I consider "doesn't work with beangulp hooks" as "doesn't work with beangulp". YMMV.

Red S

Mar 9, 2025, 4:41:24 AM
to Beancount

Out of curiosity: I’m sure it’s just me, but I don’t understand what about smart_importer is not working with beangulp? smart_importer.apply_hooks is what I was using to apply smart_importer.PredictPostings, and that works equally well with beangulp as it did with v2’s ingest. <scratching head>??

Justus Pendleton

Mar 9, 2025, 4:45:49 PM
to Beancount
smart_importer.apply_hooks works by monkey patching an Importer's extract method. I don't think anyone would ever argue that monkey patching methods in Python is anything other than a method of last resort to use in the absence of a better defined API.
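
As a rough illustration of what monkey patching an extract method means, here is a toy sketch. It is not smart_importer's actual code: ToyImporter, apply_post_processing and the post_process callback are hypothetical names made up for this example.

```python
import types


class ToyImporter:
    """Hypothetical stand-in for an importer; not a real beangulp class."""

    def extract(self, filepath, existing):
        return [f"entry from {filepath}"]


def apply_post_processing(importer, post_process):
    """Replace extract on this one instance with a wrapped version."""
    # Capture the bound method before it is overwritten.
    original_extract = importer.extract

    def patched_extract(self, filepath, existing):
        entries = original_extract(filepath, existing)
        return post_process(entries)

    # The instance attribute now shadows the class method.
    importer.extract = types.MethodType(patched_extract, importer)
    return importer


importer = apply_post_processing(
    ToyImporter(), lambda entries: [e.upper() for e in entries]
)
result = importer.extract("bank.csv", [])
print(result)
```

The caller still sees an ordinary importer, so whatever drives the import has no idea the extra step exists and cannot reorder it.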

beangulp provides an API, which smart_importer.apply_hooks doesn't use.

smart_importer doesn't use beangulp's API because the "shape" of the API is different from the smart_importer implementation. smart_importer expects to be called with (importer: Importer, file: str, imported_entries: data.Directives, existing: data.Directives), but that's not what beangulp calls hooks with. It calls them with (extracted: list[tuple[import_file: str, imported_entries: data.Directives, import_account: str, importer: Importer]], existing: data.Directives).

That is: smart_importer works on a single account -- if you want to use it on multiple accounts you are expected to create multiple instances. But beangulp hooks are expected to work on multiple accounts -- if they need to only operate on some subset of accounts it is up to them to skip those during processing.
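
To make the difference in shape concrete, here is a minimal sketch of adapting a per-importer step into a hook that receives the whole extracted list and skips accounts it doesn't care about. Everything here is hypothetical (per_importer_step, as_beangulp_hook, the account names), and plain strings stand in for real directives so the example runs without beancount installed.

```python
def per_importer_step(importer, file, imported_entries, existing):
    """Hypothetical single-importer step using the older argument order."""
    return [entry + " (predicted)" for entry in imported_entries]


def as_beangulp_hook(step, accounts=None):
    """Adapt a per-importer step into a hook over the whole extracted list.

    accounts: optional set of account names to process; None means all.
    """

    def hook(extracted, existing):
        result = []
        for file, entries, account, importer in extracted:
            # Leave entries untouched for accounts we were not asked to handle.
            if accounts is None or account in accounts:
                entries = step(importer, file, entries, existing)
            result.append((file, entries, account, importer))
        return result

    return hook


# Plain strings stand in for directives; None stands in for the importer.
extracted = [
    ("a.csv", ["coffee"], "Assets:Checking", None),
    ("b.csv", ["rent"], "Assets:Savings", None),
]
hook = as_beangulp_hook(per_importer_step, accounts={"Assets:Checking"})
processed = hook(extracted, [])
print(processed)
```

The hook always returns one tuple per input tuple, so later hooks still see every account even when this one only touches a subset.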

Because smart_importer monkey patches the importer's extract instead of being a beangulp hook, it gets invoked at a different point in time than beangulp hooks. In particular, it gets invoked before beangulp's deduplication instead of after. And since it isn't passed in a list to beangulp.Ingest(importers, hooks), the user can't control when it runs relative to other hooks. That means you can't, for instance, run a "clean up my bank's weird naming of payees before running ML training" hook before running smart_importer.
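
The ordering point can be sketched as follows, assuming hooks are applied one after another in list order. run_hooks is a hypothetical stand-in for whatever beangulp does internally, and both hooks are toy functions operating on plain strings rather than real transactions.

```python
def clean_payees(extracted, existing):
    """Hypothetical cleanup hook: normalise payee strings the bank mangled."""
    return [
        (file, [e.strip().title() for e in entries], account, importer)
        for file, entries, account, importer in extracted
    ]


def tag_known_payees(extracted, existing):
    """Hypothetical stand-in for prediction; it only matches clean names."""
    known = {"Corner Cafe": "Expenses:Coffee"}
    return [
        (file, [(e, known.get(e, "Expenses:Unknown")) for e in entries],
         account, importer)
        for file, entries, account, importer in extracted
    ]


def run_hooks(hooks, extracted, existing):
    """Apply each hook in list order, feeding its output to the next one."""
    for hook in hooks:
        extracted = hook(extracted, existing)
    return extracted


extracted = [("bank.csv", ["  CORNER CAFE "], "Assets:Checking", None)]

# Cleanup first, then prediction: the payee is recognised.
with_cleanup = run_hooks([clean_payees, tag_known_payees], extracted, [])

# Prediction alone sees the raw bank string and falls back to the default.
without_cleanup = run_hooks([tag_known_payees], extracted, [])

print(with_cleanup[0][1])
print(without_cleanup[0][1])
```

With a real setup the list handed to run_hooks would be the hooks list passed to beangulp.Ingest(importers, hooks), which is what gives the user control over the order.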

Red S

Mar 10, 2025, 12:58:05 AM
to Beancount
Makes complete sense now, and I much appreciate the detailed answer, and beangulp-hooks as well!

kprab...@gmail.com

Mar 11, 2025, 8:42:54 AM
to Beancount
Hi Justus,

A few hours ago I tried using https://github.com/beancount/smart_importer with my beangulp-based importer (https://github.com/prabusw/beancount-importers-india/blob/master/prabu/import_prabu.py) to handle 1000+ postings from a bank account, and I could not get it working.

I came here to search, found your post, and https://github.com/hoostus/beangulp-hooks works perfectly.

I just made two changes to the above import_prabu.py file as shown below and it works great.
+from hoostus.beangulp.hooks import predict_posting
-hooks = [process_extracted_entries,]
+hooks = [predict_posting.simple_hook]

Thank you so much for creating this and sharing it here on the mailing list. I hope these changes get merged into https://github.com/beancount/smart_importer so that it supports beangulp. I'm still finding it difficult to understand the changes between v2 and v3, as I don't have a programming background.

Regards,
Prabu

kprab...@gmail.com

Mar 11, 2025, 9:29:27 AM
to Beancount
Hi,

Just an addendum to my earlier message. To get https://github.com/beancount/smart_importer to work with https://github.com/prabusw/beancount-importers-india/blob/master/prabu/import_prabu.py, the following changes have to be made:

+from smart_importer import apply_hooks, PredictPayees, PredictPostings
 
+smart_icici = icici.IciciBankImporter("Assets:IN:ICICIBank:Savings","XXXX")
+apply_hooks(smart_icici, [PredictPostings(), PredictPayees()])

 importers = [
-    icici.IciciBankImporter("Assets:IN:ICICIBank:Prabu","XXXX"),
+    smart_icici,
..
]

Basically, the individual importer has to be taken out of the list, assigned to a new name, have the hooks applied to it, and then be referenced by that new name in the importers list for it to work with smart_importer. I'm not sure I've understood this correctly, but it was quite difficult for me to understand initially. After using https://github.com/hoostus/beangulp-hooks, I could go back and get the above working.

Thanks again to everyone,
Prabu

Patrick Ruckstuhl

Mar 11, 2025, 1:58:56 PM
to bean...@googlegroups.com
Hi,

Yes, that's it. If someone is willing to create a PR to make smart_importer work with the beangulp hooks instead of the old way, I'm sure we can get it merged and released quickly. At the moment I'm busy with other stuff, so I don't have the capacity to look into doing the changes myself.

Regards,
Patrick

Patrick Ruckstuhl

May 23, 2025, 3:50:07 PM
to bean...@googlegroups.com

Hi,


The new version of smart_importer now has this functionality out of the box.


from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    MyBankImporter('whatever', 'config', 'is', 'needed'),
]

HOOKS = [
    PredictPostings().hook,
    PredictPayees().hook
]


Besides the implementation as a standard hook, there is also a new (hopefully much simpler and clearer) way to just wrap single importers.


from your_custom_importer import MyBankImporter
from smart_importer import PredictPayees, PredictPostings

CONFIG = [
    PredictPostings().wrap(
        PredictPayees().wrap(
            MyBankImporter('whatever', 'config', 'is', 'needed')
        )
    ),
]

HOOKS = [
]


See https://github.com/beancount/smart_importer/ for details.


If you're using smart_importer together with fava, please make sure to upgrade to the latest release (>= v1.30.3), since a change was required for fava to run the beangulp hooks, which take more arguments.


Regards,

Patrick

Justus Pendleton

May 24, 2025, 11:08:29 PM
to Beancount
Great work!