Using Postgresql's crosstab tablefunc in SQLAlchemy

631 views
Skip to first unread message

Mali Akmanalp

unread,
Aug 24, 2012, 1:52:32 PM8/24/12
to sqlal...@googlegroups.com
Hi All,

I've been struggling recently to get postgres crosstabs [0] to work with SQLAlchemy. The interesting thing about these is that they require you to pass in a *string of a query* into the function, rather than an actual SQL expression, like so:

select * from crosstab('select column, row, count(foo) as 'value' from derp');
 

The closest thing I could come up with to this was to use the subquery function, doing:

sq = session.query(column, row, value).subquery()

for the inner part and then placing that into the outer func.crosstab(). But of course I don't want to manually turn the subquery into a string by putting it into quotes myself since there probably are a billion edge cases I could miss and there probably is a better way.

Any ideas?

Cheers,
~mali

Michael Bayer

unread,
Aug 24, 2012, 3:58:16 PM8/24/12
to sqlal...@googlegroups.com
On Aug 24, 2012, at 1:52 PM, Mali Akmanalp wrote:

Hi All,

I've been struggling recently to get postgres crosstabs [0] to work with SQLAlchemy. The interesting thing about these is that they require you to pass in a *string of a query* into the function, rather than an actual SQL expression, like so:

select * from crosstab('select column, row, count(foo) as 'value' from derp');
 

The closest thing I could come up with to this was to use the subquery function, doing:

sq = session.query(column, row, value).subquery()

for the inner part and then placing that into the outer func.crosstab(). But of course I don't want to manually turn the subquery into a string by putting it into quotes myself since there probably are a billion edge cases I could miss and there probably is a better way.


this is a new (and extremely awkward, wow) kind of SQL compilation arrangement.  Two ways to do it:

1. just use a string.  connection/session.execute("select * from ...").  this syntax is totally specific to Postgresql in any case.  It's unlikely you need lots of fluency here as the construct is so awkward to start with.

2. create a custom construct; see the docs at http://docs.sqlalchemy.org/en/rel_0_7/core/compiler.html.   Here is a prototype, keeping in mind I'm not deeply familiar with this construct, I'm just showing some techniques you can use to have the construction of it be fairly automated.  I'm assuming that bound parameters are out of the question, so I have a function "crosstab_param" here - you can also use bindparam() if you want bound values in the embedded SELECT:

from sqlalchemy.sql import ColumnElement, FromClause, column, literal_column
from sqlalchemy.ext.compiler import compiles

"""
SELECT *
FROM crosstab(
  'select rowid, attribute, value
   from ct
   where attribute = ''att2'' or attribute = ''att3''
   order by 1,2')
AS ct(row_name text, category_1 text, category_2 text, category_3 text);

"""

class crosstab(FromClause):
    def __init__(self, name, stmt, columns):
        self.name = name
        self.stmt = stmt
        self.columns = columns

    def _populate_column_collection(self):
        self._columns.update(
            column(name, type=type_)
            for name, type_ in self.names
        )

def crosstab_param(value):
    # we can't use bound parameters in crosstab?
    return literal_column("''%s''" % value)

@compiles(crosstab)
def visit_element(element, compiler, **kw):
    return """
        crosstab('%s order by 1, 2') AS %s(%s)
    """ % (
            compiler.visit_select(element.stmt),
            element.name,
            ", ".join(
                "%s %s" % (c.name, compiler.visit_typeclause(c))
                for c in element.c
            )
        )

from sqlalchemy import Table, Column, MetaData, Text, or_, select
m = MetaData()

ct = Table('ct', m, Column('rowid', Text),
                Column('attribute', Text),
                Column('value', Text))

stmt = select([ct]).where(
            or_(
                ct.c.attribute == crosstab_param('att2'),
                ct.c.attribute == crosstab_param('att3')
            )
        )
print select(['*']).select_from(crosstab("ct", stmt, ct.c))



Mali Akmanalp

unread,
Aug 29, 2012, 3:12:39 PM8/29/12
to sqlal...@googlegroups.com
Hey Michael,

First of all, thanks for the prompt answer and the pointer to the right direction!

I fiddled around with this, fixed and packaged it up, documented it, and created something hopefully useful to others:
https://github.com/makmanalp/sqlalchemy-crosstab-postgresql

- I found the $$ operator which removes the need for the awkward crosstab_param. This is useful because I have query bits and pieces that I already compose as functions (eg: get_year = lambda col: cast(func.date_part('year', func.to_timestamp(col)), Integer) etc) and I don't want to rewrite those using crosstab_param again.

- I support the second form of crosstab that takes another query that defines possible columns, which IMHO is cleaner.

- I auto-add the required order_bys

Thanks again and hope this helps in the future.

Cheers,
~mali

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.

Reply all
Reply to author
Forward
0 new messages