postgresql tuple as function argument

1,522 views
Skip to first unread message

Антонио Антуан

unread,
Apr 11, 2017, 9:04:54 AM4/11/17
to sqlalchemy
Hi
I want to build such query with sqlalchemy:
SELECT array_agg((column1, column2)) from table

Using psql it works perfectly and returns such result:
{"(col1_row1_value, col2_row1_value)", "(col1_row2_value, col2_row2_value)"...}

I tried several forms of SQLA-query:

>> from sqlalchemy.dialects.postgresql import ARRAY
>> from sqlalchemy.sql.elements import Tuple
>> ... func.array_agg(Tuple(Model.col1, Model.col2), type_=ARRAY(Unicode))...
...
TypeError: unhashable type: 'list'
>> ... func.array_agg(Tuple(Model.col1, Model.col2), type_=ARRAY(Unicode, as_tuple=True))...
...# returns value like this: ('{', '"', '(', 'c',...)
>> ... func.array_agg(Tuple(Model.col1, Model.col2), type_=ARRAY(Tuple, as_tuple=True))...
...
AttributeError: Neither 'Tuple' object nor 'Comparator' object has an attribute 'dialect_impl'

At first, I wanted to use `func.jsonb_object_agg(Model.col1, Model.col2)`, but it raises "unhashable type: dict"

Could you point to solution?

mike bayer

unread,
Apr 11, 2017, 9:43:20 AM4/11/17
to sqlal...@googlegroups.com
can't reproduce (though the ARRAY(unicode) type is not what psycopg2
returns, and there seems to be a difference in behavior between
sqlalchemy.ARRAY and sqlalchemy.dialects.postgresql.ARRAY).

please provide a complete example based on the below test script and
stack traces

also my PG database doesn't know about the jsonb_object_agg function

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class A(Base):
__tablename__ = 'a'
id = Column(Integer, primary_key=True)
x = Column(Unicode)
y = Column(Unicode)

e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)

s = Session(e)
s.add(A(x="x", y="y"))
s.commit()

row = s.query(func.array_agg(tuple_(A.x, A.y),
type_=ARRAY(Unicode))).scalar()
print row





On 04/11/2017 09:04 AM, Антонио Антуан wrote:
> Hi
> I want to build such query with sqlalchemy:
> |
> SELECT array_agg((column1,column2))fromtable
> |
>
> Using psql it works perfectly and returns such result:
> |
> {"(col1_row1_value, col2_row1_value)","(col1_row2_value,
> col2_row2_value)"...}
> |
>
> I tried several forms of SQLA-query:
>
> |>> from sqlalchemy.dialects.postgresql import ARRAY
>>> from sqlalchemy.sql.elements import Tuple
>>> ... func.array_agg(Tuple(Model.col1,Model.col2),type_=ARRAY(Unicode))...
> ...
> TypeError: unhashable type: 'list'
>>> ... func.array_agg(Tuple(Model.col1, Model.col2), type_=ARRAY(Unicode,
> as_tuple=True))...
> ...# returns value like this: ('{', '"', '(', 'c',...)
>>> ... func.array_agg(Tuple(Model.col1, Model.col2), type_=ARRAY(Tuple,
> as_tuple=True))...
> ...
> AttributeError: Neither 'Tuple' object nor 'Comparator' object has an
> attribute 'dialect_impl'
> |
>
> At first, I wanted to use /`func.jsonb_object_agg(Model.col1,
> Model.col2)`/, but it raises */"unhashable type: dict"/*
> */
> /*
> Could you point to solution?
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

mike bayer

unread,
Apr 11, 2017, 10:26:18 AM4/11/17
to sqlal...@googlegroups.com
I will say that the psycopg2 driver is not supporting this, however, it
isn't parsing out the tuple. Using postgresql.ARRAY we get an answer
like:

['{', '"', '(', 'x', ',', 'y', ')', '"', ',', '"', '(', 'x', ',', 'y',
')', '"', '}']

that is, the string coming back is being interpreted as an array by
SQLAlchemy, which here it is not. you might need to use some of the
psycopg2 extension points to support this like
http://initd.org/psycopg/docs/extensions.html#psycopg2.extensions.new_array_type.

Антонио Антуан

unread,
Apr 12, 2017, 9:19:54 AM4/12/17
to sqlalchemy
your imports can't provide ARRAY class, I've added: from sqlalchemy.dialects.postgresql import ARRAY. My version is 1.0.14.
Output:
/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:185: SAWarning: Unicode type received non-unicode bind param value 'y'. (this warning may be suppressed after 10 occurrences)
  (util.ellipses_string(value),))
/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:185: SAWarning: Unicode type received non-unicode bind param value 'x'. (this warning may be suppressed after 10 occurrences)
  (util.ellipses_string(value),))
['{', '"', '(', 'x', ',', 'y', ')', '"', '}']


And about jsonb_object_agg. I use PostgreSQL 9.5. If you use older version, you can try to replace it with json_object_agg.
I wrote such code:



from sqlalchemy import *
from sqlalchemy.dialects.postgresql import ARRAY

from hasoffers.core.model import Base
from hasoffers.core.model import Session



class A(Base):
__tablename__ = 'a' 
id = Column(Integer, primary_key=True, autoincrement=True)

x = Column(Unicode) 
y = Column(Unicode)


class B(Base):
__tablename__ = 'b'
id = Column(Integer, primary_key=True, autoincrement=True)
a_keys = Column(ARRAY(Integer))
col1 = Column(Unicode)
col2 = Column(Unicode)

Base.metadata.bind = Session.bind
Base.metadata.create_all()

Session.add(A(x="x", y="y"))
Session.add(A(x="f", y="j"))
Session.add(A(x="b", y="s"))
Session.add(B(a_keys=[1], col1="qaz", col2="qwe"))
Session.add(B(a_keys=[2,3], col1="zaq", col2="fds"))
Session.add(B(a_keys=[2,3], col1="gtr", col2="ascs"))
Session.commit()

for row in Session.query(A, func.jsonb_object_agg(B.col1, B.col2)).join(B, A.id == func.any(B.a_keys)).group_by(A.id):

print row


After execution I got such traceback:

Traceback (most recent call last):
  File "/home/anton/Projects/.../core/hasoffers/core/run/stuff.py", line 33, in <module>
    for row in Session.query(A, func.jsonb_object_agg(B.col1, B.col2)).join(B, A.id == func.any(B.a_keys)).group_by(A.id):
  File "/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 86, in instances
    util.raise_from_cause(err)
  File "/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/orm/loading.py", line 77, in instances
    rows = util.unique_list(rows, filter_fn)
  File "/home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/util/_collections.py", line 757, in unique_list
    if hashfunc(x) not in seen
TypeError: unhashable type: 'dict'




вторник, 11 апреля 2017 г., 16:43:20 UTC+3 пользователь Mike Bayer написал:

Антонио Антуан

unread,
Apr 12, 2017, 9:20:38 AM4/12/17
to sqlalchemy
Ok, I'l try it, thank you.

вторник, 11 апреля 2017 г., 17:26:18 UTC+3 пользователь Mike Bayer написал:

mike bayer

unread,
Apr 12, 2017, 9:36:34 AM4/12/17
to sqlal...@googlegroups.com


On 04/12/2017 09:19 AM, Антонио Антуан wrote:
> your imports can't provide ARRAY class, I've added: from
> sqlalchemy.dialects.postgresql import ARRAY. My version is 1.0.14.

for ARRAY + array_agg, explicit support for these things has improved in
the 1.1 series, and if you're doing this kind of stuff I'd recommend
moving to 1.1 so you are at least in better shape to get to 1.2, 1.3,
etc. as we continue to adjust the API to improve upon these things,
there's lots of related Postgresql syntaxes we are not yet on track to
explicitly support until at least 1.3:

http://docs.sqlalchemy.org/en/rel_1_1/changelog/migration_11.html#new-function-features-within-group-array-agg-and-set-aggregate-functions




> Output:
> |
> /home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:185:
> SAWarning: Unicode type received non-unicode bind param value 'y'. (this
> warning may be suppressed after 10 occurrences)
> (util.ellipses_string(value),))
> /home/anton/Projects/.venv/lib/python2.7/site-packages/sqlalchemy/sql/sqltypes.py:185:
> SAWarning: Unicode type received non-unicode bind param value 'x'. (this
> warning may be suppressed after 10 occurrences)
> (util.ellipses_string(value),))
> ['{', '"', '(', 'x', ',', 'y', ')', '"', '}']
> |


yes that is because psycopg2 is not recognizing the type of data.

However, we can improve upon the situation by adding a CAST, which seems
to send psycopg2 a little more information:

row = s.query(
cast(
func.array_agg(tuple_(A.x, A.y)), ARRAY(Unicode)
)
).scalar()
print row

then we get back:

[u'(x,y)']

which... is what you wanted? not sure. I think overall if you can
provide a CAST to the type you want, it will be passed to the psycopg2
driver so that it has a clue what it should do with the string data its
getting back.
> > an email to sqlalchemy+...@googlegroups.com <javascript:>
> > <mailto:sqlalchemy+...@googlegroups.com <javascript:>>.
> > To post to this group, send email to sqlal...@googlegroups.com
> <javascript:>
> > <mailto:sqlal...@googlegroups.com <javascript:>>.
> <https://groups.google.com/group/sqlalchemy>.
> > For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.

Lele Gaifax

unread,
Apr 14, 2017, 3:50:15 AM4/14/17
to sqlal...@googlegroups.com
mike bayer <mik...@zzzcomputing.com> writes:

> However, we can improve upon the situation by adding a CAST, which seems to
> send psycopg2 a little more information:
>
> row = s.query(
> cast(
> func.array_agg(tuple_(A.x, A.y)), ARRAY(Unicode)
> )
> ).scalar()
> print row

For one new application I'm writing I chose the "new" asyncpg[1] driver.

The major problem I hit was related to this CASTing need ([2] and [3]): due to
the way asyncpg talks to the database, in some cases it cannot guess the right
format because on the PG side things may be ambiguous (e.g. when an operator
is overloaded and works on different types of data, like the @> operator I
used in [3]).

To solve that I wrote a very simple custom PGDialect+PGCompiler duo to handle
a new "paramstyle", basically translating the following SA construct::

oneyearago = date.today() - timedelta(days=365)
q = sa.select([users.c.id]) \
.where(users.c.name == 'test') \
.where(users.c.validity.contains(oneyearago))

into this SQL statement::

SELECT auth.users.id
FROM auth.users
WHERE auth.users.name = $1::VARCHAR(64) AND (auth.users.validity @> $2::DATE)

Since it eventually boils down to overriding a single "private" method
(_apply_numbered_params), I wonder if it would be a welcome addition to
standard SA, maybe allowing to set "paramstyle" to an user's arbitrary
callable.

What do you think?

Thank you,
ciao, lele.

[1] https://github.com/MagicStack/asyncpg
[2] https://github.com/CanopyTax/asyncpgsa/issues/12
[3] https://github.com/MagicStack/asyncpg/issues/32
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
le...@metapensiero.it | -- Fortunato Depero, 1929.

mike bayer

unread,
Apr 14, 2017, 9:27:03 AM4/14/17
to sqlal...@googlegroups.com
I'd note that the above seems to *not* be casting "users.id" in the
SELECT. The problem we are having with the array/json/tuple etc. is
that the typing information is not in the result description, so that's
really the only place a CAST is sometimes needed when dealing with
psycopg2.

I'm not sure what asyncpgs' issue is with input types, there should not
be a reason to parse the SQL statement and look at operators, you have
the Python type of object coming in, you use that to determine the
general behavior to take with basic types like strings and dates.

Additionally, the Python DBAPI already supports a method for dealing
with the general problem of "what types should be applied to bound
parameters?", it's called setinputsizes:
https://www.python.org/dev/peps/pep-0249/#id26 . SQLAlchemy has
generalized support for this for DBAPIs that require it, which currently
includes cx_Oracle.

but that's pep-249, which asyncpg unfortunately simply disregards.

So, what you have above is very easy to accomplish with a simple
@compiles recipe that merely adds CAST to the rendering of bindparam():


from sqlalchemy import *
from sqlalchemy.orm import *
from datetime import date, timedelta
from sqlalchemy.sql.elements import BindParameter
from sqlalchemy.ext.compiler import compiles


@compiles(BindParameter)
def inject_cast(element, compiler, _already_wrapped=False, **kw):
if _already_wrapped:
return compiler.visit_bindparam(element, **kw)
else:
return compiler.process(
cast(element, element.type), _already_wrapped=True, **kw)


m = MetaData()

users = Table(
'users', m,
Column('id', Integer, primary_key=True),
Column('name', String(64)),
Column('validity', DateTime)
)


oneyearago = date.today() - timedelta(days=365)
q = select([users.c.id]) \
.where(users.c.name == 'test') \
.where(users.c.validity.contains(oneyearago))


print q


SELECT users.id
FROM users
WHERE users.name = CAST(:name_1 AS VARCHAR(64)) AND (users.validity LIKE
'%' + CAST(:validity_1 AS DATE) || '%')

Lele Gaifax

unread,
Apr 14, 2017, 11:02:39 AM4/14/17
to sqlal...@googlegroups.com
mike bayer <mik...@zzzcomputing.com> writes:

> I'm not sure what asyncpgs' issue is with input types, there should not be a
> reason to parse the SQL statement and look at operators, you have the Python
> type of object coming in, you use that to determine the general behavior to
> take with basic types like strings and dates.

asyncpg does not parse the SQL statement, and it tries hard to not have to
parse incoming data from the database, even, leaving all that to the PG
engine.

In my example, the column contains a daterange value, and the query I tried
used the @> operator (what the SA .contains() on such type renders to): but as
said in https://github.com/MagicStack/asyncpg/issues/32#issuecomment-254208760
PG has not enough data (you can read that as "has a bug" if you prefer) to
guess the right type, as the @> implements both "daterange contains scalar
date" and "daterange is a superset of another daterange". Explicitly casting
the input value solves the ambiguity.

> So, what you have above is very easy to accomplish with a simple @compiles recipe that merely
> adds CAST to the rendering of bindparam():

Yes, I know, but since I need to tweak the representation of positional
parameters (that is, $1 instead of :1), I had to touch the BIND_TEMPLATES
content. That's why I thought a more generic approach of allowing an
arbitrary, user controlled way to render the placeholders could be better.

ciao, lele.

Mike Bayer

unread,
Apr 14, 2017, 11:11:54 AM4/14/17
to sqlal...@googlegroups.com
Can you show me the model and some sample data ?  Is this the JSONB datatype ?

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscribe@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.

Lele Gaifax

unread,
Apr 14, 2017, 11:40:10 AM4/14/17
to sqlal...@googlegroups.com
Mike Bayer <mik...@zzzcomputing.com> writes:

> Can you show me the model and some sample data ? Is this the JSONB
> datatype ?

Sure, here it is:

import datetime

from sqlalchemy import create_engine, select, Table, Column, Integer, UnicodeText, MetaData
import sqlalchemy.dialects.postgresql as sapg

metadata = MetaData()
users = Table('users', metadata,
Column('id', Integer(), primary_key=True),
Column('name', UnicodeText()),
Column('active', sapg.DATERANGE()),
)

e = create_engine('postgresql://localhost/test')

metadata.create_all(e)

i = users.insert().values(name='myself', active='[2016-10-21,2020-10-22)')

e.execute(i)

q = select([users.c.id]) \
.where(users.c.name == 'myself') \
.where(users.c.active.contains(datetime.date.today()))

print(e.execute(q).fetchall())

This works as is with psycopg2 (maybe it was not clear that I was suggesting
an enhancement, not a fix :-).

Mike Bayer

unread,
Apr 14, 2017, 11:48:11 AM4/14/17
to sqlal...@googlegroups.com


On Apr 14, 2017 11:40 AM, "Lele Gaifax" <le...@metapensiero.it> wrote:
Mike Bayer <mik...@zzzcomputing.com> writes:

> Can you show me the model and some sample data ?  Is this the JSONB
> datatype ?

Sure, here it is:

  import datetime

  from sqlalchemy import create_engine, select, Table, Column, Integer, UnicodeText, MetaData
  import sqlalchemy.dialects.postgresql as sapg

  metadata = MetaData()
  users = Table('users', metadata,
      Column('id', Integer(), primary_key=True),
      Column('name', UnicodeText()),
      Column('active', sapg.DATERANGE()),
  )

  e = create_engine('postgresql://localhost/test')

  metadata.create_all(e)

  i = users.insert().values(name='myself', active='[2016-10-21,2020-10-22)')

  e.execute(i)

  q = select([users.c.id]) \
      .where(users.c.name == 'myself') \
      .where(users.c.active.contains(datetime.date.today()))

  print(e.execute(q).fetchall())

This works as is with psycopg2 (maybe it was not clear that I was suggesting
an enhancement, not a fix :-).

I know that, however (not at my computer at the moment) how does this case work w psycopg2 ?    




ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
le...@metapensiero.it  |                 -- Fortunato Depero, 1929.

Lele Gaifax

unread,
Apr 14, 2017, 12:45:29 PM4/14/17
to sqlal...@googlegroups.com
Mike Bayer <cla...@zzzcomputing.com> writes:

> I know that, however (not at my computer at the moment) how does this case
> work w psycopg2 ?

I think that psycopg2 does that on its own, that is, using a debug build, I
can see that the effective statement it sends to the engine is

SELECT users.id
FROM users
WHERE users.name = 'myself' AND (users.active @> '2017-04-14'::date)

Mike Bayer

unread,
Apr 14, 2017, 1:49:55 PM4/14/17
to sqlal...@googlegroups.com
Right.  So why can't asyncpg do that.

Lele Gaifax

unread,
Apr 14, 2017, 4:01:43 PM4/14/17
to sqlal...@googlegroups.com
Mike Bayer <cla...@zzzcomputing.com> writes:

> Right. So why can't asyncpg do that.

I bet because it does want to fiddle with the SQL statement parsing, or with
the formatting of parameter's values.

Lele Gaifax

unread,
Apr 14, 2017, 4:31:18 PM4/14/17
to sqlal...@googlegroups.com
Lele Gaifax <le...@metapensiero.it> writes:

> I bet because it does want to fiddle with the SQL statement parsing, or with
> the formatting of parameter's values.

ops: ... it does *not* want...

mike bayer

unread,
Apr 14, 2017, 5:58:33 PM4/14/17
to sqlal...@googlegroups.com
yeah IMO this is a bug in asyncpg but of course your custom dialect can
work around that.

Lele Gaifax

unread,
Apr 14, 2017, 6:03:51 PM4/14/17
to sqlal...@googlegroups.com
mike bayer <mik...@zzzcomputing.com> writes:

> yeah IMO this is a bug in asyncpg but of course your custom dialect can work
> around that.

Ok, thank you.
Reply all
Reply to author
Forward
0 new messages