Should I create links based on transaction IDs for duplicate detection?


Artur Matos

Jan 29, 2026, 6:24:06 PM
to Beancount
Hi everyone,

Really basic question, but I'm a bit confused about whether I should be using transaction IDs as links, with the express purpose of filtering duplicates. Most of my financial data has stable, unique transaction IDs, so I can easily output ^TRANSACTION1, ^TRANSACTION2, etc. per transaction.

Beangulp has same_link_operator which seems expressly designed for this purpose - use the links to find and filter out duplicate transactions upon importing.

But at the same time, the Beancount language syntax manual (https://beancount.github.io/docs/beancount_language_syntax.html#links) refers to links as a mechanism to group disparate transactions together, e.g. as part of the same invoice, and it doesn't mention anything about duplicate detection.

If I use transaction IDs as links I will end up with lots of different links (all unique) and I'm not sure if this has performance implications.

Am I overthinking it?

Thanks,

Artur


Martin Blais

Jan 29, 2026, 6:25:56 PM
to Beancount
They're equivalent semantically.
I use links for linking together a small number of related transactions and tags for larger groups of transactions e.g. a project, a trip, etc.


Artur Matos

Jan 29, 2026, 8:33:22 PM
to Beancount
Thanks. But then from that perspective, having pretty much a different link per transaction is not really an issue performance-wise? Almost all of my transactions will end up with their own unique link that doesn't connect to anything else. I guess alternatively I could use a 'transaction_id' metadata field, but using links seems easier and I can just rely on the same_link_operator to filter out duplicates when importing.
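
For illustration, the link-based idea could be sketched roughly as follows. This is not beangulp's implementation, just a minimal stand-in: `filter_by_link` is a hypothetical helper, and the entries are `SimpleNamespace` objects that mimic only the `links` attribute of a Beancount transaction (a set of strings without the leading `^`).

```python
from types import SimpleNamespace


def filter_by_link(imported, existing):
    """Split imported entries into (new, duplicates) based on shared links.

    Hypothetical helper: an imported entry counts as a duplicate when it
    shares at least one link with any entry already in the ledger.
    """
    known = set()
    for entry in existing:
        # Directives without links (or with links=None) contribute nothing.
        known |= set(getattr(entry, "links", None) or ())

    new, dups = [], []
    for entry in imported:
        links = set(getattr(entry, "links", None) or ())
        (dups if links & known else new).append(entry)
    return new, dups


# Stand-ins for ledger entries; only the attributes used above are modelled.
existing = [SimpleNamespace(narration="Coffee", links={"TRANSACTION1"})]
imported = [
    SimpleNamespace(narration="Coffee", links={"TRANSACTION1"}),
    SimpleNamespace(narration="Groceries", links={"TRANSACTION2"}),
]

new, dups = filter_by_link(imported, existing)
print("new:", [e.narration for e in new])
print("duplicates:", [e.narration for e in dups])
```

Because the known links are collected into a set once, each imported entry is checked with a set intersection rather than a scan over the whole ledger, so the check itself stays cheap even when every transaction carries its own unique link.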

Oscar Ale

Jan 30, 2026, 10:18:19 AM
to Beancount

I use metadata for this purpose, to deduplicate and reconcile transactions with my bank statements. If you use links, it can get messy whenever you're keeping track of multiple accounts. For example, if you transfer between two accounts, would you have two links? Which link corresponds to which account? By using metadata on the posting, you ensure the id corresponds to the account in question, not the whole transaction. For example:

2026-01-02 * "Chick-fil-A" "Eating out"
  Expenses:Food  16.81 USD
  Assets:Bank:Checking  -16.81 USD
    id: "ab778fab39asdbf3bfsa"

2026-01-03 * "Bank" "Transfer to checking"
  Assets:Bank:Checking  50.00 USD
    id: "bnaer0s8832ba08df"
  Assets:Bank:Savings  -50.00 USD
    id: "bab01328bak092323"

It may be a little more verbose than links, but I think it's more in line with the intended purpose of links and metadata, and it should be pretty easy to implement with an importer. If need be, you can write a quick Python script to check for duplicates against the transaction ids for each account.
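
A minimal sketch of such a duplicate check, under a few assumptions: the posting metadata key is `id` as in the example above, `find_duplicate_ids` and `_txn` are hypothetical names, and the entries are simple stand-in objects that expose only `postings`, `account`, and `meta` the way Beancount transactions and postings do.

```python
from collections import defaultdict
from types import SimpleNamespace


def find_duplicate_ids(entries, key="id"):
    """Return {(account, id): [entries]} for ids that occur more than once
    on the same account. Hypothetical helper, not part of Beancount."""
    seen = defaultdict(list)
    for entry in entries:
        # Non-transaction directives have no postings; skip them.
        for posting in getattr(entry, "postings", None) or []:
            meta = posting.meta or {}
            if key in meta:
                seen[(posting.account, meta[key])].append(entry)
    return {k: v for k, v in seen.items() if len(v) > 1}


def _txn(date, *postings):
    # Stand-in for a transaction: each (account, id) pair becomes a posting.
    return SimpleNamespace(
        date=date,
        postings=[
            SimpleNamespace(account=a, meta={"id": i} if i else None)
            for a, i in postings
        ],
    )


entries = [
    _txn("2026-01-02", ("Expenses:Food", None),
         ("Assets:Bank:Checking", "ab778fab39asdbf3bfsa")),
    _txn("2026-01-03", ("Assets:Bank:Checking", "bnaer0s8832ba08df"),
         ("Assets:Bank:Savings", "bab01328bak092323")),
    # The same bank line imported a second time.
    _txn("2026-01-02", ("Expenses:Food", None),
         ("Assets:Bank:Checking", "ab778fab39asdbf3bfsa")),
]

duplicates = find_duplicate_ids(entries)
for (account, txn_id), hits in duplicates.items():
    print(account, txn_id, [e.date for e in hits])
```

With a real ledger, the entries would come from `beancount.loader.load_file(path)`, which returns `(entries, errors, options_map)`. Keying on the pair of account and id is what keeps the two sides of a transfer from colliding with each other.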
