SimpleFIN and Multiple-account importers


Paul Walker

Jan 9, 2025, 3:28:16 PM
to Beancount
Hey all,

I'm curious what importer tricks anyone has for statements with multiple accounts. Aggregators like SimpleFIN (recently discovered, a great stand-in for banks dropping OFX support via ofxtools/ofxget) pull many unrelated accounts into one export file. The beangulp-required account function makes this seem like an antipattern ("which account?"). This also applies to some PDFs (like Fidelity, which groups all retirement/non-retirement accounts into a pair of PDFs), but I imagine many of those at least share a common base/parent account.

My current solution is to input a dict of all expected accounts, but again that is awkward for the self.account function (I don't actually use the "archive" workflow), and it is making me update my out_of_place deduplicator, which catches manually-created expenses on the wrong credit/debit card. It just doesn't isolate context and messes with the overall extract.

The alternative I've considered is to avoid multiple-account statements. SimpleFIN can get individual accounts; I believe that's in the OFX spec too. So then I'd just get account-specific extracts and initialize an importer for each. But then I remembered the likely more common, but more difficult to split, multi-account PDFs and thought I'd share and see if the community had other ideas.

Paul

Brian Lalor

Jan 9, 2025, 11:49:58 PM
to bean...@googlegroups.com
I’m interested in this, as well; I’ve signed up for a SimpleFIN account but haven’t actually used it in anger, yet. 

Do you have a sample config showing how your importer’s used? I’d like to copy some of your work. :-)

What’s your workflow? A separate script that downloads from SimpleFIN and dumps the JSON into an imports directory?

I also wish beangulp weren’t so tied to individual files; it makes writing an importer that works with non-file sources awkward, at best, and I think that’s reflected in the singular account restriction, as well…

— 
Brian Lalor (he/him)


David Avraamides

Jan 10, 2025, 3:29:27 AM
to Beancount
I've been experimenting with SimpleFin for a few weeks. I have one script which runs each morning and downloads the data for all my accounts, saving it directly in the JSON format returned by SimpleFin. I have a separate importer that uses a configuration file to define the different accounts. It's just a list of organization, SimpleFin account ID and Beancount account name. I have 6 institutions connected covering over 20 accounts.
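
A minimal sketch of such a configuration, assuming it is stored as JSON with the hypothetical keys `org`, `id` and `account` (the thread does not show the real file format or names), and of the lookup an importer might build from it:

```python
import json

# Hypothetical config: one entry per SimpleFIN account.
# The key names "org", "id" and "account" are assumptions for illustration.
CONFIG = """
[
  {"org": "Example Bank", "id": "ACT-abc123", "account": "Assets:Bank:Checking"},
  {"org": "Example Card", "id": "ACT-def456", "account": "Liabilities:Credit:Card"}
]
"""


def load_account_map(text):
    """Return {SimpleFIN account ID: Beancount account name}."""
    return {entry["id"]: entry["account"] for entry in json.loads(text)}


def beancount_account(account_map, simplefin_id):
    """Look up the Beancount account; None means the ID is not configured."""
    return account_map.get(simplefin_id)


account_map = load_account_map(CONFIG)
```

Returning None for an unconfigured ID lets the importer skip or report unknown accounts instead of guessing a name.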

But I'm not using this in production yet as I'm trying to write the equivalent of Red S's smart importer to predict the category account.

Paul Walker

Jan 10, 2025, 5:18:35 AM
to Beancount
I start with either ofxget or this simplefin shell script to download transactions. Simplefin couldn't be easier - pull access token url out of keyring (secret-tool), curl it, jq the json for human readability, echo the file because I'll forget where I left it.

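#!/bin/bash
# Fetch the last two weeks of SimpleFIN data (date -d requires GNU date).
set -euo pipefail
start_date="$(date -d '2 weeks ago' +%s)"
# One-time setup: secret-tool store --label "SimpleFIN" service SimpleFIN application Beancount
url="$(secret-tool lookup service SimpleFIN application Beancount)"
out=~/Downloads/simplefin.json
# Pretty-print the JSON; the explicit '.' filter also works on older jq versions.
curl --silent --fail "$url/accounts?start-date=$start_date" | jq . > "$out"
echo "$out"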

I get 2 weeks of context because I'm usually behind schedule and it feeds context for my out-of-place/reverse deduplication routine. Then this is the business end of my beangulp script. The simplefin importer takes a dict of anything that might be in the export, plus my simple "decorator" categorization class:

import beangulp
import beancount_utils.importers as ix
from beancount_utils.decorator import Decorator
decorator = Decorator.from_yaml('lib/decorations.yml')
ingest = beangulp.Ingest(
    importers=[
        ix.simplefin.Importer({
            "ACT-abc123": "Assets:Liquid:Wealthfront:Cash",
            "ACT-def456": "Liabilities:Credit:Chase",
        }, decorate=decorator.decorate),
    ]
)
ingest()

For automatic categorization, the decorator has been working really well for me. "decorations" are just objects with regular expressions to match payees. If it matches, tags/narration/add'l accounts, etc are added to the transaction. I do this after deduplication so manually changing target account doesn't interfere. I need to expand it for per-importer/account-specific matches. Simple config, quick to amend:

- payee: Walmart
  re: WALMART|WM SUPERCENTER|WAL-MART
  narration: Groceries
  target_account: 'Expenses:Food:Groceries'

- payee: Wells Fargo
  re: WF HOME MTG-AUTO PAY
  narration: Mortgage payment
  tags: [Autopay]

Paul

Daniele Nicolodi

Jan 10, 2025, 9:31:42 AM
to bean...@googlegroups.com
On 09/01/25 16:28, Paul Walker wrote:
> Hey all,
>
> I'm curious what importer tricks anyone has for statements with multiple
> accounts. Aggregators like SimpleFIN <https://beta-bridge.simplefin.org/>
> (recently discovered, a great stand-in for banks dropping
> ofx~ofxtools/ofxget <https://ofxtools.readthedocs.io/en/latest/>
> support) pull many unrelated accounts into one export file. The
> beangulp-required account function makes this seem antipattern ("which
> account?"). This also applies to some PDFs (like Fidelity which groups
> all retirement/non-retirement into a pair of PDFs), but I imagine many
> of those at least share a common base/parent account.

As you probably figured out, the account returned by importers following
the beangulp interface is only used for logging and for archival of the
imported files.

If you don't use the archival function, you can set the account to any
string you like. If you use the archival function, just set it to an
account-like string that defines the path in the documents directory
where you want the statements from which you are importing stored.

For importing from statements that contain information from multiple
accounts, I use two patterns, depending on how I want the transactions to
be grouped and how I want the statements to be archived (some of this
may depend on patches to beangulp that I haven't committed to the public
repository yet):

- multiple importers matching the same source file, each extracting
transactions pertinent to one account. The transactions are grouped
per-account in the ledger and the source statement can be copied in
multiple locations in the archival folder

- single importer outputting transactions for multiple accounts. The
transactions are intermixed in the ledger and the source statement is
archived in only one place.
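
The two patterns can be sketched without beangulp at all. This is a minimal illustration, assuming the SimpleFIN export is already parsed into a dict with an `accounts` list whose entries carry `id` and `transactions` (each with `posted` as a Unix timestamp, `amount` as a string, and `description`). The class and function names are hypothetical, and a real importer would wrap this logic in the beangulp importer interface and emit Beancount entries instead of tuples:

```python
from datetime import datetime, timezone
from decimal import Decimal


def iter_rows(export, account_map):
    """Yield (beancount_account, date, amount, description) for mapped accounts."""
    for acct in export["accounts"]:
        account = account_map.get(acct["id"])
        if account is None:
            continue  # account not handled by this extractor
        for txn in acct.get("transactions", []):
            date = datetime.fromtimestamp(txn["posted"], tz=timezone.utc).date()
            yield account, date, Decimal(txn["amount"]), txn["description"]


class SingleAccountExtractor:
    """Pattern 1: one instance per account, all reading the same file."""

    def __init__(self, simplefin_id, account):
        self.account_map = {simplefin_id: account}

    def extract(self, export):
        return list(iter_rows(export, self.account_map))


class MultiAccountExtractor:
    """Pattern 2: one instance emitting rows for every mapped account."""

    def __init__(self, account_map):
        self.account_map = dict(account_map)

    def extract(self, export):
        # Rows from different accounts end up intermixed, ordered by date.
        return sorted(iter_rows(export, self.account_map), key=lambda row: row[1])
```

With pattern 1 each instance's output stays grouped per account; with pattern 2 a single call returns everything in date order.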

> My current solution is to input a dict of all expected accounts
> <https://github.com/pwalkr/beancount-utils/blob/54c118f4a4d6a706691fa3442db523b5253e3287/beancount_utils/importers/simplefin.py#L37>,
> but again is awkward for the self.account
> <https://github.com/pwalkr/beancount-utils/blob/54c118f4a4d6a706691fa3442db523b5253e3287/beancount_utils/importers/simplefin.py#L28>
> function (I don't actually use "archive" workflow) and is making me
> update my out_of_place deduplicator
> <https://github.com/pwalkr/beancount-utils/blob/54c118f4a4d6a706691fa3442db523b5253e3287/beancount_utils/deduplicate.py#L6>
> which catches manually-created expenses on the wrong credit/debit card.
> It just doesn't isolate context and messes with the overall extract.

I don't understand what you mean and how beangulp could help solve it.

Cheers,
Dan

Chris Hasenpflug

Jan 10, 2025, 1:37:53 PM
to Beancount
Timely topic!  I've been playing around with SimpleFIN a bit as well and trying to get it integrated into my workflow.  I have the start of a python library and CLI that I'd like to share. Perhaps the snow day will give me an opportunity to polish it for publishing.

Paul Walker

Jan 10, 2025, 2:31:24 PM
to Beancount
On Friday, January 10, 2025 at 4:31:42 AM UTC-5 dan...@grinta.net wrote:
> - multiple importers matching the same source file, each extracting
> transactions pertinent to one account. The transactions are grouped
> per-account in the ledger and the source statement can be copied in
> multiple locations in the archival folder

This makes a lot of sense! Must be a newer feature; it's not working for me, but it's also been a minute since I've updated:

* .../Downloads/simplefin.json  ERROR
    beancount_utils.importers.simplefin.Importer
    beancount_utils.importers.simplefin.Importer
  Document identified by more than one importer.

I don't use archive, but if archive-as-copy implies an additional "destroy" command to do the actual cleanup, I would use that. I think a 1:1 importer instance per account makes sense, especially for brokerage-type accounts where you may want more flexibility in PnL/income/fee postings (e.g. between retirement and non-retirement taxable accounts).

Though for fun, I'm working on a truly single-file multiple-account import: Medical Claims. It's a mess/work in progress, but it imports the aggregate claims from my insurance and outputs transactions to per-provider payable accounts. This is much easier to reconcile than one big Liabilities:Medical:MyInsurance bucket without having to scrape every single MyChart (those in the US may know). Basically just provider -> payee and provider+patient -> account with simple re-match+dict mappings.

2024-10-12 * "Labcorp" "Paul"
  claim: "123456"
  patient: "PAUL"
  provider: "LABCORP HOLDINGS"
  Liabilities:Medical:Labcorp:Paul  -1.23 USD

^isn't at all 1:1 instance:account, but still works well with beangulp's deduplication and other paradigms.
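
A minimal sketch of that provider -> payee and provider+patient -> account mapping, assuming claims arrive as dicts with `provider` and `patient` keys. The `PROVIDERS` and `PATIENTS` tables and the `claim_to_posting` function are hypothetical names for illustration, not the actual importer:

```python
import re

# Hypothetical mappings: regex on the raw provider name -> short payee.
PROVIDERS = [
    (re.compile(r"LABCORP", re.IGNORECASE), "Labcorp"),
]
# Raw patient name as it appears on the claim -> account component.
PATIENTS = {"PAUL": "Paul"}


def claim_to_posting(claim):
    """Map a claim to (payee, account); None if the provider is unknown."""
    for pattern, payee in PROVIDERS:
        if pattern.search(claim["provider"]):
            # Fall back to title-casing patients missing from the table.
            patient = PATIENTS.get(claim["patient"], claim["patient"].title())
            return payee, f"Liabilities:Medical:{payee}:{patient}"
    return None
```

Returning None for an unmatched provider leaves it to the caller to decide whether to skip the claim or route it to a catch-all account.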

Paul

Red S

Jan 11, 2025, 8:43:59 AM
to Beancount
> But I'm not using this in production yet as I'm trying to write the equivalent of Red S's smart importer to predict the category account.

While I use smart_importer a lot and find it to be excellent at what it does, I had nothing to do with writing it. The authors and contributors can be found in the repo :).

BTW, curious, what aspect of it is not working for you? Seems to me like it should work out of the box for your use case.

David Avraamides

Jan 11, 2025, 12:56:16 PM
to Beancount
My bad - and apologies to the smart_importer contributors!

The issue I ran into is that the hook function call signature seems to have changed in beangulp v3. It's called with two args for the extracted and existing entries:

    # Invoke hooks.
    for func in ctx.hooks:
        extracted = func(extracted, existing_entries)

But smart_importer's EntryPredictor is expecting four args in __call__:

    def __call__(
            self,
            importer: Importer,
            file: str,
            imported_entries: data.Directives,
            existing_entries: data.Directives,
        ) -> data.Directives:

So I get an exception when trying to use it:

    TypeError: EntryPredictor.__call__() missing 2 required positional arguments: 'imported_entries' and 'existing_entries'

For now, I have derived my own class from PredictPostings and overridden __call__ to omit the importer and file arguments, but I just copied the implementation of __call__, which I don't like.
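
An alternative to copying `__call__` might be a small adapter around the unmodified hook. This is only a sketch: it assumes the `extracted` list passed to hooks holds `(filename, entries, account, importer)` tuples, which should be checked against the installed beangulp version, and `adapt_hook` is a hypothetical helper name:

```python
def adapt_hook(hook):
    """Wrap a four-argument hook so it can be called as func(extracted, existing).

    Assumes each item of `extracted` is a (filename, entries, account, importer)
    tuple; verify this against your beangulp version before relying on it.
    """
    def wrapper(extracted, existing_entries):
        result = []
        for filename, entries, account, importer in extracted:
            # Call the old-style hook once per extracted file.
            entries = hook(importer, filename, entries, existing_entries)
            result.append((filename, entries, account, importer))
        return result
    return wrapper
```

Under that assumption, a hook with the four-argument signature could be passed to the ingest configuration as `adapt_hook(my_hook)` without subclassing.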

Chris Hasenpflug

Jan 11, 2025, 6:33:13 PM
to Beancount
Hi all, I have published my initial work on a SimpleFIN python library with command line interface.  It is available on pypi (https://pypi.org/project/simplefin/) and Github (https://github.com/chrishas35/simplefin-python/).

This initial release has commands to convert a SimpleFIN setup token into an access token (which you must securely store for future use). Subsequent commands look for the access token as an environment variable (personally I use direnv with .env files). You can use the CLI to get your SimpleFIN Account IDs and then run the transactions command to get a table or json output of that account's transactions. I plan to build and release a generic beangulp importer based on the json output in the future.

As this library is not beancount specific, feel free to open discussion or issues on the github repo.

-C

Max Tower

Jan 12, 2025, 2:04:30 AM
to Beancount

How does SimpleFin connect to accounts? Does it "know" your passwords?

Chris Hasenpflug

Jan 12, 2025, 2:32:19 AM
to Beancount
SimpleFIN Bridge uses MX, a Plaid-like service, to access your accounts.

Paul Walker

Jan 12, 2025, 3:10:36 AM
to Beancount
On Saturday, January 11, 2025 at 9:32:19 PM UTC-5 ch...@hasenpflug.net wrote:
> SimpleFIN Bridge uses MX, a Plaid-like service, to access your accounts.
>
> On Saturday, January 11, 2025 at 8:04:30 PM UTC-6 mtb...@gmail.com wrote:
>> How does SimpleFin connect to accounts? Does it "know" your passwords?

It does use MX per its Privacy and Security pages. MX keeps either your passwords or special app-specific keys depending on the security of your banks. I was initially suspicious too given the lack of development in the simplefin org, but it makes sense that it's largely an interface between MX APIs and the SimpleFIN spec; not much to change. For more confidence, SimpleFin had a recent third-party security audit performed by SecurityMetrics. It's also the backend for a buckets budgeting app by the same devs, and that appears to have an active community ~ extra visibility/scrutiny.