Filtering accounts by a subquery

100 views
Skip to first unread message

Daniel Schultz

unread,
Oct 24, 2024, 1:07:46 PM10/24/24
to Beancount
Hello all,

I'm generating a campaign finance report, which means I need to load any journal entries for accounts whose *total balance* exceeds $50.00

I figure this is not going to be possible via BQL directly but figured it cannot hurt to ask.

The query I would love to use is something along the lines of:

```
  SELECT
    account,
    date,
    position,
    getitem(open_meta(account), 'pac_name') AS pac_name,
    getitem(open_meta(account), 'address') AS address,
    getitem(open_meta(account), 'city') AS city,
    getitem(open_meta(account), 'state') AS state,
    getitem(open_meta(account), 'zip') AS zip,
    getitem(open_meta(account), 'country') AS country,
    getitem(open_meta(account), 'email') AS email,
    getitem(open_meta(account), 'phone') AS phone
  WHERE account ~ 'Income:Contributions'
  AND getitem(open_meta(account), 'pac_name') != ''
  AND account IN (
    SELECT account
    WHERE abs(sum(position)) > 50
  )
  ORDER BY date
```

This doesn't appear to work and I believe that is because subqueries are not actually a thing in BQL.  Are there any query tools that might help be achieve this goal?

Best,
 - Dan

Brian Lalor

unread,
Oct 26, 2024, 10:46:48 AM10/26/24
to bean...@googlegroups.com
What do these transactions look like in beancount syntax?

— 
Brian Lalor (he/him)

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/beancount/1a6a706f-c95c-4e5b-a607-5c94c3d43325n%40googlegroups.com.

Daniele Nicolodi

unread,
Oct 27, 2024, 1:02:16 PM10/27/24
to bean...@googlegroups.com
beanquery has some very limited support for subqueries. In particular,
it does not support subqueries that are not constants. However, it
didn't had support for subqueries on the right hand side of the IN
operator. Adding minimal support for it is relatively easy, so I did it,
see https://github.com/beancount/beanquery/pull/208. Expanding the
subquery support is in the works.

However, I guess that your attempt failed at an earlier stage of the
query interpretation: the subquery is invalid.

SELECT account WHERE abs(sum(position)) > 50

Because of the use of sum() this is aggregate query (the fact is
disguised by the use of the BQL auto-aggregate functionality) and it is
trying to filter the rows by the value of the aggregated column. This is
not a valid operation in SQL. The query would have to be rewritten in
this way:

SELECT account GROUP BY account HAVING abs(sum(position)) > 50

However, the condition for the HAVING clause is incorrect: sum(position)
returns an inventory, and the abs() of an inventory is still an
inventory, which cannot be compared with a number. You would need to
extract the amount of a single currency in the inventory, and get the
numerical part of the amount and discard the currency. I also thing the
use of abs() is not useful, as you are interested only in donors, thus
the inventory should always be positive. Therefore, the subquery becomes:

SELECT account
GROUP BY account
HAVING number(only('USD', sum(position))) > 50.0

However, this is computing a lot of inventories that are not really
useful for the outer query and making the set to be check by the IN
operator unnecessarily large. Adding a WHERE clause helps:

SELECT account
WHERE root(account, 2) = 'Income:Contributions'
GROUP BY account
HAVING number(only('USD', sum(position))) > 50.0

and you can drop the equivalent condition from the outer query.

Cheers,
Dan

Reply all
Reply to author
Forward
0 new messages