Postgres array containment query generates FROM-clause that causes failure

46 views
Skip to first unread message

Elmer de Looff

unread,
Aug 8, 2019, 9:48:25 AM8/8/19
to sqlalchemy
Hi,

I'm trying to create a query to check whether a small number of given keys are all present within a selection of a table. Postgres provides array types/functions for this to check sub/superset properties, which seem to do what I want. The query I'm trying to create is one of the following form:

SELECT :selection <@ ARRAY(
    SELECT id
    FROM city
    WHERE size_code = :size_code)

Wrapping this in a text clause, adding parameters and executing it works without a hitch:

raw_select = sa.text("""
    SELECT :selection <@ ARRAY(
        SELECT id
        FROM city
        WHERE size_code = :size_code)""")
parameterized = raw_select.params(
    selection=[3, 10, 18],
    size_code='M')
result = engine.execute(parameterized).scalar()

However, I'd like to avoid having textual SQL in my codebase as it's more sensitive to changes in names and generally more error-prone. I'm struggling converting this to a working Core expression, a spurious FROM-clause keeps being generated:

city_ids = sa.select([City.id]).where(City.size_code == 'M')
check = sa.select([
    array([3, 10, 18]).contained_by(sa.func.array(city_ids))])
engine.execute(check).scalar()

This results in a Syntax Error being thrown by Postgres:

sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) subquery in FROM must have an alias
LINE 4: FROM (SELECT city.id AS id 
             ^
HINT:  For example, FROM (SELECT ...) [AS] foo.

[SQL: SELECT ARRAY[%(param_1)s, %(param_2)s, %(param_3)s] <@ array((SELECT city.id 
FROM city 
WHERE city.size_code = %(size_code_1)s)) AS anon_1 
FROM (SELECT city.id AS id 
FROM city 
WHERE city.size_code = %(size_code_1)s)]
[parameters: {'param_1': 3, 'param_2': 10, 'param_3': 18, 'size_code_1': 'M'}]
(Background on this error at: http://sqlalche.me/e/f405)

The problem appears to be in the "_froms" list that is non-empty on the "check" query, but I can't seem to find a way of coercing SQLAlchemy into not generating that.

I've attached a minimal script to reproduce the problem. The table is described though will have to be created still; it need not contain any data, the problem is one of SQL syntax alone.
sqla_select_contained_by.py

Mike Bayer

unread,
Aug 8, 2019, 11:17:18 AM8/8/19
to noreply-spamdigest via sqlalchemy


On Thu, Aug 8, 2019, at 9:48 AM, Elmer de Looff wrote:
Hi,

I'm trying to create a query to check whether a small number of given keys are all present within a selection of a table. Postgres provides array types/functions for this to check sub/superset properties, which seem to do what I want. The query I'm trying to create is one of the following form:


thanks for the clear test case, which allows me to just make your code work. 

The ARRAY(select) is against a scalar list, so use as_scalar() so that the SELECT becomes a self-contained subquery:

    city_ids = sa.select([City.id]).where(City.size_code == "M").as_scalar()
--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.


Attachments:
  • sqla_select_contained_by.py

Elmer de Looff

unread,
Aug 8, 2019, 11:56:20 AM8/8/19
to sqlal...@googlegroups.com
Thanks for the quick response and solution!

Having the answer, it seems obvious enough, but getting to the solution from the problem was hard, despite excellent documentation, Not sure it deserves an FAQ entry as it might be a bit of an infrequent problem, but maybe that'll help the next person. I'd be happy to take a stab at an entry if you're interested.



--

Elmer

Mike Bayer

unread,
Aug 8, 2019, 1:05:06 PM8/8/19
to noreply-spamdigest via sqlalchemy


On Thu, Aug 8, 2019, at 11:56 AM, Elmer de Looff wrote:
Thanks for the quick response and solution!

Having the answer, it seems obvious enough, but getting to the solution from the problem was hard, despite excellent documentation, Not sure it deserves an FAQ entry as it might be a bit of an infrequent problem, but maybe that'll help the next person. I'd be happy to take a stab at an entry if you're interested.

PostgreSQL's special syntaxes are a much bigger issue than just this.     Many syntaxes are not yet supported and require recipes as listed at https://github.com/sqlalchemy/sqlalchemy/issues/3566 .      I would think the Postgresql dialect documentation needs an entire section dedicated to special PG patterns, preferably in tandem with the structures in issue 3566 being implemented as features.

also, the approach I just gave you with as_scalar() might not be generalizable as of yet , in that it's not clear what the approach is for multidimensional arrays - as_scalar() really means a SELECT that returns a single column / single row, and already we are somewhat repurposing it here to represent a SELECT that represents multiple rows.    This usage should be clarified.

Additionally, the use of func.array() works in this case but ideally you'd be using the sqlalchemy.dialects.postgresql.array() construct since this is not as much a SQL function as it is a first class datastructure in Postgresql.      Multidimensional support was added to this construct in https://github.com/sqlalchemy/sqlalchemy/issues/4756 but I don't think it as of yet supports arbitrary SQL expressions, which it should.

Basically, this whole area of SQLAlchemy is raw and under-developed.   If we document approaches that just happen to work right now, but aren't tested or supported or part of an overall strategy, then we are creating poor assumptions.     So I'd prefer we build out real patterns and have them tested before we document them.




Elmer de Looff

unread,
Aug 9, 2019, 5:30:15 AM8/9/19
to sqlal...@googlegroups.com
I've been milling over your comment regarding the func.array vs array approaches, and that makes a lot of sense. Currently any array(iterable) statement gets turned into an ARRAY[] literal in Postgres. Extending this to emit an ARRAY() constructor when array(query) is called seems like a reasonable approach. Turning this expression into a scalar one where necessary could be something that is done as part of that process. Going by Postgres documentation, this query should return a single column, but it seems reasonable to have the user be responsible for honoring that limitation and not SQLAlchemy.

Having array(expression) also result in working contains/contained_by methods would be a nice bonus, as these don't work for the func.array() approach.

I took a brief look at the code for array, but I'm not really sure how the whole flow from Python statement to emitted SQL goes. It does seems fairly wedded to the concept of being a literal array (looking at the superclass) for now, so my suggested extension might be a bit easier said than implemented.



--

Elmer

Elmer de Looff

unread,
Aug 12, 2019, 5:45:29 AM8/12/19
to sqlal...@googlegroups.com
Hi Mike,

I've been playing around with the Postgresql dialect array() type a bit more and have something that appears to work, at least for the small case that I'm trying to solve for myself. I'd like to check whether I'm on the right track with how I'm approaching this though so please find a patch (against the current master) attached, as well as a small update to the example script to use the array(selectable).contained_by(array(literal)) syntax I described in my previous message.

One thing that surprised me when doing this against the master branch was an error that Subquery no longer has an .as_scalar() method; but then removing as_scalar() altogether from that query yielded a correct query, so something (excellent) has changed there in 1.4
--

Elmer
array_contained_by_check.py
array-construction-from-selectable.patch

Mike Bayer

unread,
Aug 12, 2019, 9:36:22 AM8/12/19
to noreply-spamdigest via sqlalchemy
the changes set up for the 1.4 series are *extremely* significant, and are intended to push us towards a "SQLAlchemy 2.0" concept that will more clearly embrace the future of Python which includes Py3 only as well as static typing.

I am still holding off on completely blogging all of this because I'm still stuck in some architectural quagmires I'm trying to work through.  

anyway this contributes towards the notion that I'm trying to get the 1.4 / 2.0 concept fully realized before we get into many new and intricate PG syntaxes that at least in issue 3566 involve heavy FROM clause manipulation.


Attachments:
  • array_contained_by_check.py
  • array-construction-from-selectable.patch

Reply all
Reply to author
Forward
0 new messages