Mixing ORM and set-returning-functions (e.g. generate_series)

692 views
Skip to first unread message

Daniel Grace

unread,
Nov 7, 2013, 6:46:45 PM11/7/13
to sqlal...@googlegroups.com
I've been using the ORM exclusively for my project and I've run into a particular case that has me stumped:

I'm trying to find a way to generate SQL that essentially looks like this, using session.query() as a base:  (Running on PostgreSQL)

SELECT model.*, series.number
FROM model, generate_series(1,100) AS series(number)
WHERE some_where_clauses AND series.number some_condition

The closest solutions I've found all have one or more of the following problems:
  • They non-ORM methods (e.g. select() instead of query() and a lot of text())
  • They end up wrapping generate_series in a subselect -- SELECT .... FROM (SELECT generate_series(...)).  This is technically legal(*) 
  • They don't give me a way to properly alias both the "table" and its columns (in the case of session.query(...).select_from(sql.func.generate_series(...)))
Any ideas what I should be doing here?  FWIW, it seems like this similar situation could potentially crop up if using a RDBMS that supported parameterized views (Postgres doesn't, yet) -- though at least that situation would (probably) know what the output columns are from reflection.

-- Daniel

Michael Bayer

unread,
Nov 7, 2013, 9:18:57 PM11/7/13
to sqlal...@googlegroups.com
On Nov 7, 2013, at 6:46 PM, Daniel Grace <thisgen...@gmail.com> wrote:

I've been using the ORM exclusively for my project and I've run into a particular case that has me stumped:

I'm trying to find a way to generate SQL that essentially looks like this, using session.query() as a base:  (Running on PostgreSQL)

SELECT model.*, series.number
FROM model, generate_series(1,100) AS series(number)
WHERE some_where_clauses AND series.number some_condition

Ok well selecting from a function as a FROM clause is something that you need to set up with a special select() statement (see http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html#functions), though that syntax you have there where you’re adding in a “column definition list” ( a term I just read at http://www.postgresql.org/docs/9.0/static/sql-select.html, though what you have there doesn’t seem to look the same) isn’t really straightforward right now.

If i understand what that actually means, then you’d get that effect, using an extra subquery, like this:

from sqlalchemy import Column, Integer, select, func
from sqlalchemy.orm import Session
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.sql import column

Base = declarative_base()

class Model(Base):
    __tablename__ = 'model'
    id = Column(Integer, primary_key=True)

series = select([column('number')]).select_from(func.generate_series(1, 100))

sess = Session()
q = sess.query(Model, series.c.number).filter(Model.id > series.c.number)

print q


output:

SELECT model.id AS model_id, number AS number 
FROM model, (SELECT number 
FROM generate_series(:generate_series_1, :generate_series_2)) 
WHERE model.id > number


that’s basically equivalent.   If you want the identical syntax with that function, the @compiles construct would be a start, but you’d be probably making a custom FromClause subclass which is a little involved.










The closest solutions I've found all have one or more of the following problems:
  • They non-ORM methods (e.g. select() instead of query() and a lot of text())
  • They end up wrapping generate_series in a subselect -- SELECT .... FROM (SELECT generate_series(...)).  This is technically legal(*) 
  • They don't give me a way to properly alias both the "table" and its columns (in the case of session.query(...).select_from(sql.func.generate_series(...)))
Any ideas what I should be doing here?  FWIW, it seems like this similar situation could potentially crop up if using a RDBMS that supported parameterized views (Postgres doesn't, yet) -- though at least that situation would (probably) know what the output columns are from reflection.

-- Daniel

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

signature.asc

Michael Bayer

unread,
Nov 7, 2013, 9:22:02 PM11/7/13
to sqlal...@googlegroups.com

On Nov 7, 2013, at 9:18 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:

>
> If i understand what that actually means, then you’d get that effect, using an extra subquery, like this:
>
> from sqlalchemy import Column, Integer, select, func
> from sqlalchemy.orm import Session
> from sqlalchemy.ext.declarative import declarative_base
> from sqlalchemy.sql import column
>
> Base = declarative_base()
>
> class Model(Base):
> __tablename__ = 'model'
> id = Column(Integer, primary_key=True)
>
> series = select([column('number')]).select_from(func.generate_series(1, 100))
>
> sess = Session()
> q = sess.query(Model, series.c.number).filter(Model.id > series.c.number)
>
> print q
>
>
> output:
>
> SELECT model.id AS model_id, number AS number
> FROM model, (SELECT number
> FROM generate_series(:generate_series_1, :generate_series_2))
> WHERE model.id > number

err plus alias:

series = select([column('number')]).select_from(func.generate_series(1, 100)).alias('series')

sess = Session()
q = sess.query(Model, series.c.number).filter(Model.id > series.c.number)


SELECT model.id AS model_id, series.number AS series_number
FROM model, (SELECT number
FROM generate_series(:generate_series_1, :generate_series_2)) AS series
WHERE model.id > series.number
signature.asc

Daniel Grace

unread,
Nov 13, 2013, 6:12:32 PM11/13/13
to sqlal...@googlegroups.com
I know it's been a few days, but I've finally had time to actually get back to this


On Thursday, November 7, 2013 6:18:57 PM UTC-8, Michael Bayer wrote:
On Nov 7, 2013, at 6:46 PM, Daniel Grace <thisgen...@gmail.com> wrote:
[...]
that’s basically equivalent.   If you want the identical syntax with that function, the @compiles construct would be a start, but you’d be probably making a custom FromClause subclass which is a little involved.

I'd think that a generic form of this might actually be a way to go, but I'm not (yet) familiar enough with sqlalchemy's internals to make a lot of headway in designing any 'sane' solution.

It's worth noting PostgreSQL supports column-level aliasing on anything, not just a set-returning-function, such that the following is valid:

CREATE TABLE foo ( id SERIAL NOT NULL);
INSERT INTO foo ( id ) VALUES (1), (2), (3);
SELECT bar.baz FROM foo AS bar(baz);

That said, it's not cleanly usable in most cases -- you can't alias a column by name, so you have to know the exact order columns appear in (which you might not know if you didn't reflect nor create the table -- a column definition that doesn't appear in the model might be in the table, for instance).

That said, it'd seem like a generic "Set Returning Function" implementation would need to do the following:
* Subclass from GenericFunction to track data types of input values and whatever voodoo is required for bind parameters.  (I think this also gives us the behavior of "SELECT generate_series(...)" being treated like "SELECT FROM ... generate_series(...)..." for free)
* Subclass from FromClause or perhaps even Alias to track types and names of output values in self.c
* Have an alternate/extended 'alias' implementation capable of defining aliases on a per-column level, so AS alias becomes AS alias(column_alias, ...)

So usage might be something like:

series = sql.srfunc.generate_series(1, 10, output=[Column('value', Integer()])
foo = series.alias("foo").column_alias('value', 'v')
bar = foo.alias("bar")  # I'm assuming that aliasing an existing alias Does The Right Thing(tm), I've never tried it.

session.query(series.c.value) 
# SELECT generate_series.value FROM generate_series(1, 10) AS generate_series(value)

session.query(foo.c.v)
# SELECT foo.v FROM generate_series(1, 10) AS foo(v)

session.query(foo.c.v + bar.c.v).select_from(foo).join(bar, foo.c.v < foo.c.v.) 
# SELECT foo.v + bar.v FROM generate_series(1, 10) AS foo(v) JOIN generate_series(1, 10) AS bar(v) ON foo.v < bar.v;



Thoughts?

-- Daniel 

Michael Bayer

unread,
Nov 13, 2013, 7:48:47 PM11/13/13
to sqlal...@googlegroups.com
see attached for demo, should get you started


test.py
signature.asc

Daniel Grace

unread,
Nov 14, 2013, 2:50:20 PM11/14/13
to sqlal...@googlegroups.com
On Wednesday, November 13, 2013 4:48:47 PM UTC-8, Michael Bayer wrote:
[...]
see attached for demo, should get you started

Thanks!  That was a tremendous help.  I've attached the version I ended up working with, using your code as a base.

The main changes, besides some names, are:
  • Changed 'output' to 'columns', which makes more sense.
  • If 'columns' contains strings, produce ColumnClause(name) for each string as convenient shorthand.
  • Added column_map(self, iterable) as a shorthand to repeating column_alias(self, from_, to) multiple times.
  • Add anonymous aliasing support like other aliases have.
This still doesn't entirely support all of that Postgres considers legal syntax -- e.g. these are both legal but can't currently be represented:

SELECT generate_series FROM generate_series(1, 100)
SELECT number FROM generate_series(1, 100) AS number

I'd kind of like to figure out how to support that for just for completeness sake (my idea was to trigger that behavior if no columns were specified), but I think it's a bad convention and would never use it in my own SQL anyways -- so it's not a priority for me.

The more interesting problem, I think, is to be able to determine the output column type based on input -- generate_series() can accept timestamps (and timestamp with time zones) in addition to ints/bigints, and the return value will be the same as the input parameters.  That said, I'm not entirely certain what benefit SQLAlchemy knowing the output column type provides.

I'm also not entirely happy that sqlalchemy.sql.expression.Alias() doesn't handle this correctly, but I'm under the impression that Aliases (as opposed to lowercase-a aliases) shouldn't be constructed directly anyways.

-- Daniel

Daniel Grace

unread,
Nov 14, 2013, 2:51:23 PM11/14/13
to sqlal...@googlegroups.com
Gmail warns you before posting something with "I've attached" and no attachment, but apparently Google Groups does not.
functions.py
Reply all
Reply to author
Forward
0 new messages