nested subqueries in a hybrid expression?

2,198 views
Skip to first unread message

Brian Cherinka

unread,
Jul 29, 2016, 2:11:21 PM7/29/16
to sqlalchemy

I'm trying to build a hybrid property / expression in one of my SQLA base classes, and the expression side requires several subqueries and I cannot for the life of me figure out the correct syntax.  I've tried many different versions / syntaxes for the expression, using the SQL alchemy expression language, the ORM language but I can't get any of it work.  What am I missing here?  Here is my latest attempt. 

The setup is this.  I have a class called Cube, which is my base object.  For each object in that cube table, I need to grab an attribute from a table A and use that attribute to modify an array column from a second table B.  And I want to make this new array queryable, thus the hybridization.   

The equivalent raw SQL is 

select (select (array_agg(unwave.restw)) as restwarr from (select (unnest(w.wavelength)/(1+n.z)) as restw from mangadatadb.wavelength as w) as unwave) from mangadatadb.cube as c join mangasampledb.manga_target as m on m.pk=c.manga_target_pk join mangasampledb.manga_target_to_nsa as t on t.manga_target_pk=m.pk join mangasampledb.nsa as n on n.pk=t.nsa_pk join mangadatadb.wavelength as w on w.pk=c.wavelength_pk limit 5;


The instance side is quite easy.  The class side is proving difficult.   What is the proper way to write this?  I've read through the documentation countless times now and it hasn't helped much.  

class Cube
   
@hybrid_property
   
def restwave(self):
       
if self.target:
            redshift
= self.target.NSA_objects[0].z
            wave
= np.array(self.wavelength.wavelength)
            restwave
= wave/(1+redshift)
           
return restwave
       
else:
           
return None

   
@restwave.expression
   
def restwave(cls):
        session
= db.Session()
        nsaz
= session.query(sampledb.NSA.z.label('z')).\
            join
(sampledb.MangaTargetToNSA, sampledb.MangaTarget, Cube).\
            filter
(Cube.mangaid == cls.mangaid).subquery('nsaz', with_labels=True)
        unwave
= session.query((func.unnest(Wavelength.wavelength)/(1+nsaz.c.z)).label('restw')).subquery('unwave', with_labels=True)
        restwave
= session.query(func.array_agg(unwave.c.restw))

       
return restwave


Trying 
    session.query(datadb.Cube.restwave).filter(datadb.Cube.mangaid=='1-113520').all()

fails with 
    AttributeError: 'Query' object has no attribute 'is_clause_element'


I've also tried this. 

    @restwave.expression
   
def restwave(cls):
        unwave
= select([(func.unnest(Wavelength.wavelength)/(1+sampledb.NSA.z)).label('restw')])
        restwave
= select([func.array_agg(unwave.c.restw).label('restwarr')]).select_from(
           
Cube.__table__.join(Wavelength.wavelength).
            join
(sampledb.MangaTarget).
            join
(sampledb.MangaTargetToNSA).
            join
(sampledb.NSA)
           
)
       
return restwave


and this fails

In [6]: datadb.Cube.restwave
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
<ipython-input-6-16300d165395> in <module>()
----> 1 datadb.Cube.restwave


/Users/Brian/anaconda2/lib/python2.7/site-packages/sqlalchemy/ext/hybrid.pyc in __get__(self, instance, owner)
   
738     def __get__(self, instance, owner):
   
739         if instance is None:
--> 740             return self.expr(owner)
   
741         else:
   
742             return self.fget(instance)


/Users/Brian/Work/sdss/repo/sdss/sdss_python_module/marvin/python/sdss/internal/database/utah/mangadb/DataModelClasses.pyc in restwave(cls)
   
454         # unwave = select([(func.unnest(Wavelength.wavelength)/(1+sampledb.NSA.z)).label('restw')])
   
455         # restwave = select([func.array_agg(unwave.c.restw).label('restwarr')]).select_from(
--> 456         #     Cube.__table__.join(Wavelength.wavelength).
   
457         #     join(sampledb.MangaTarget).
   
458         #     join(sampledb.MangaTargetToNSA).


/Users/Brian/anaconda2/lib/python2.7/site-packages/sqlalchemy/sql/selectable.pyc in join(self, right, onclause, isouter)
   
350         """
    351
--> 352         return Join(self, right, onclause, isouter)
    353
    354     def outerjoin(self, right, onclause=None):


/Users/Brian/anaconda2/lib/python2.7/site-packages/sqlalchemy/sql/selectable.pyc in __init__(self, left, right, onclause, isouter)
    658         """

   
659         self.left = _interpret_as_from(left)
--> 660         self.right = _interpret_as_from(right).self_group()
   
661
   
662         if onclause is None:


/Users/Brian/anaconda2/lib/python2.7/site-packages/sqlalchemy/sql/selectable.pyc in _interpret_as_from(element)
     
47         return insp.selectable
     
48     except AttributeError:
---> 49         raise exc.ArgumentError("FROM expression expected")
     
50
     
51

ArgumentError: FROM expression expected



Mike Bayer

unread,
Jul 29, 2016, 3:13:17 PM7/29/16
to sqlal...@googlegroups.com
you might need to change more than this, but at least the fundamental
thing about @expression is that it has to return a column, not a Query
or a select(). On either one, calling as_scalar() will give you a
scalar subquery, e.g. a SELECT interpreted as a column.

Assuming there's still problems because once array_agg is involved,
things generally get crazy, send along a Wavelength, NSA and MangaNSA
model with that Cube and I can try putting it together.
> AttributeError:'Query'objecthas noattribute 'is_clause_element'
> |
>
>
> I've also tried this.
>
> |
> @restwave.expression
> defrestwave(cls):
> unwave
> =select([(func.unnest(Wavelength.wavelength)/(1+sampledb.NSA.z)).label('restw')])
> restwave
> =select([func.array_agg(unwave.c.restw).label('restwarr')]).select_from(
> Cube.__table__.join(Wavelength.wavelength).
> join(sampledb.MangaTarget).
> join(sampledb.MangaTargetToNSA).
> join(sampledb.NSA)
> )
> returnrestwave
>
> |
>
> and this fails
>
> |
> In[6]:datadb.Cube.restwave
> ---------------------------------------------------------------------------
> ArgumentError Traceback(most recent call last)
> --->49 raiseexc.ArgumentError("FROM expression expected")
> 50
> 51
>
> ArgumentError:FROM expression expected
> |
>
>
>
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.
Message has been deleted

Brian Cherinka

unread,
Jul 29, 2016, 5:29:45 PM7/29/16
to sqlalchemy

Oh interesting.  I didn't know that about the @expression.  I'll play around with the as_scalar() as well, and see if I can get something to work.  

class Wavelength(Base):
    __tablename__
= 'wavelength'
    __table_args__
= {'autoload': True, 'schema': 'mangadatadb', 'extend_existing': True}


    wavelength
= deferred(Column(ARRAY_D(Float, zero_indexes=True)))


The wavelength table has a single row and single column, which is an array.  

The other table of interest would look something like 

class NSA(Base):
    __tablename__
= 'nsa'
    __table_args__
= ({'autoload': True, 'schema': 'mangasampledb'})


    z
= Column(Float)


This table basically has a float column that corresponds to objects in the main cube (object) table. Each float value is used to modify the array in wavelength to a unique array for that object. 

The Cube class joins to NSA via two tables that are just intermediate linking tables for this purpose  Cube -> Table A -> Table AToB - > Table B (NSA)

class MangaTarget(Base):
    __tablename__
= 'manga_target'
    __table_args__
= {'autoload': True, 'schema': 'mangasampledb'}


class MangaTargetToNSA(Base):
    __tablename__
= 'manga_target_to_nsa'
    __table_args__
= (
       
ForeignKeyConstraint(['manga_target_pk'],
                             
['mangasampledb.manga_target.pk']),
       
ForeignKeyConstraint(['nsa_pk'], ['mangasampledb.nsa.pk']),
       
{'autoload': True, 'schema': 'mangasampledb'})


 The rest can probably be hacked together.   Let me know if you need anything else.  

Brian Cherinka

unread,
Jul 29, 2016, 5:35:52 PM7/29/16
to sqlalchemy
The @expression as column thing is a bit confusing since in the correlated subquery example in the hybrid attribute section, it looks like you are returning a select?  Does the .label() effectively turn it into a column?

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False)

    accounts = relationship("SavingsAccount", backref="owner")

    @hybrid_property
    def balance(self):
        return sum(acc.balance for acc in self.accounts)

    @balance.expression
    def balance(cls):
        return select([func.sum(SavingsAccount.balance)]).\
                where(SavingsAccount.user_id==cls.id).\
                label('total_balance')
Message has been deleted

Brian Cherinka

unread,
Aug 2, 2016, 2:45:32 PM8/2/16
to sqlalchemy
So I managed to get something to return using this definition of the @expression, however, I'm not quite there yet.    

    @hybrid_property
    
def restwave(self):
        
if self.target:
            redshift 
= self.target.NSA_objects[0].z
            wave 
= np.array(self.wavelength.wavelength)
            restwave 
= wave/(1+redshift)
            
return restwave
        
else:
            
return None

    
@restwave.expression
    
def restwave(cls): 
        session 
= db.Session()
        nsaz 
= session.query(sampledb.NSA.z.label('z')).\
            join
(sampledb.MangaTargetToNSA, sampledb.MangaTarget, Cube).\
            filter
(Cube.mangaid == cls.mangaid).subquery('nsaz', with_labels=True)
        unwave 
= session.query((func.unnest(Wavelength.wavelength)/(1+nsaz.c.z)).label('restw')).subquery('unwave', with_labels=True)

        restwave 
= session.query(func.array_agg(unwave.c.restw)).as_scalar()


        
return restwave

Using the example query, 

rwquery = session.query(datadb.Cube.restwave).filter(datadb.Cube.mangaid=='1-113520')
rw
= rwquery.first()

I am getting a modified wavelength array, but it's the wrong one, using the wrong ID.  For the ID 1-113520 I should be modifying the wavelength array by (1+0.016765) and instead it's just grabbing the very first value in the NSA.z column, which corresponds to (1+0.099954).  I think this is because my filter condition is not getting passed into the nsaz subquery, where it needs to go.   Do you know how I can pass filter condition parameters down into any subqueries I may have in me expression select statements?  Is that what .correlate() does?

My query looks like

In [24]: print rwquery
SELECT
(SELECT array_agg(unwave.restw) AS array_agg_1
FROM
(SELECT unnest(mangadatadb.wavelength.wavelength) / (%(z_1)s + nsaz.z) AS restw
FROM mangadatadb
.wavelength, (SELECT mangasampledb.nsa.z AS z
FROM mangasampledb
.nsa JOIN mangasampledb.manga_target_to_nsa ON mangasampledb.nsa.pk = mangasampledb.manga_target_to_nsa.nsa_pk JOIN mangasampledb.manga_target ON mangasampledb.manga_target.pk = mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangadatadb.cube ON mangasampledb.manga_target.pk = mangadatadb.cube.manga_target_pk
WHERE mangadatadb
.cube.mangaid = mangadatadb.cube.mangaid) AS nsaz) AS unwave) AS anon_1
FROM mangadatadb
.cube
WHERE mangadatadb
.cube.mangaid = %(mangaid_1)s




Mike Bayer

unread,
Aug 2, 2016, 3:57:10 PM8/2/16
to sqlal...@googlegroups.com
What I need is a complete .py file that sets up a *minimal* version of
*every* class required, then the Query object, then prints it. I'll
mangle it to do the right thing.

Like this:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class A(Base):
__tablename__ = 'a'
id = Column(Integer, primary_key=True)
bs = relationship("B")

class B(Base):
__tablename__ = 'b'
id = Column(Integer, primary_key=True)
a_id = Column(ForeignKey('a.id'))


s = Session()

q = s.query(A).join(B)

print q
> FROM mangadatadb.wavelength,(SELECT mangasampledb.nsa.z AS z
> FROM mangasampledb.nsa JOIN mangasampledb.manga_target_to_nsa ON
> mangasampledb.nsa.pk =mangasampledb.manga_target_to_nsa.nsa_pk JOIN
> mangasampledb.manga_target ON mangasampledb.manga_target.pk
> =mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangadatadb.cube
> ON mangasampledb.manga_target.pk =mangadatadb.cube.manga_target_pk
> WHERE mangadatadb.cube.mangaid =mangadatadb.cube.mangaid)AS nsaz)AS
> unwave)AS anon_1
> FROM mangadatadb.cube
> WHERE mangadatadb.cube.mangaid =%(mangaid_1)s
>
> |
>
>
>

Brian Cherinka

unread,
Aug 3, 2016, 9:53:29 AM8/3/16
to sqlalchemy
Awesome. Thanks.  Ok. I'll work on it again, and get back to you as soon as I can.  

Brian Cherinka

unread,
Aug 3, 2016, 11:57:34 AM8/3/16
to sqlalchemy
Ok. Here is my test file.  I tried to set it up as much as I could, but I don't normally set up my db and sessions this way, so you may have to hack a bit here and there to finish some setup.  My original setup has classes from two different schema.  I don't know if that makes any difference.  I've also added some lines to add objects into the tables that I think you can use to recreate my problem.   Let me know if you need anything more in this file. 
test_sqla_hybrid.py

Mike Bayer

unread,
Aug 3, 2016, 1:34:26 PM8/3/16
to sqlal...@googlegroups.com
OK first problem is, the SQL you showed me is:

select (
select (array_agg(unwave.restw)) as restwarr
from (
select (
unnest(w.wavelength)/(1+n.z)) as restw
from mangadatadb.wavelength as w
) as unwave
)
from mangadatadb.cube as c
join mangasampledb.manga_target as m on m.pk=c.manga_target_pk
join mangasampledb.manga_target_to_nsa as t on t.manga_target_pk=m.pk
join mangasampledb.nsa as n on n.pk=t.nsa_pk
join mangadatadb.wavelength as w on w.pk=c.wavelength_pk limit 5;


that has three SELECTs, a LIMIT, and no WHERE.

Then the code has three calls to session.query() inside "def restwave",
no LIMIT. Then theres *another* call to session.query(Cube.restwave)
with a filter() (e.g. a WHERE), and no LIMIT.

These don't match up. Can you show me the *complete* SQL you want at
the very end of the whole thing? I don't know how many levels of SELECT
you want (the code asks for four), and I don't know if you want the
LIMIT inside the subquery, and other things like that. That is, where
does "def restwave()" end and the outer query takes over.

Brian Cherinka

unread,
Aug 3, 2016, 2:00:22 PM8/3/16
to sqlalchemy
Ok.  Yeah, I have been trying many different ways of getting results.  The raw SQL that I'm trying to recreate in SQLA is this (for the restwave column only), which works in postgresql.  The limit was only there to do the filter the results.  You can ignore that limit.

manga=# select c.pk,c.mangaid,c.manga_target_pk, n.z, (select (array_agg(unwave.restw))[0:5] as restwave from (select (unnest(w.wavelength)/(1+n.z)) as restw from mangadatadb.wavelength as w) as unwave)

from mangadatadb.cube as c
    join mangasampledb
.manga_target as m on m.pk=c.manga_target_pk
    join mangasampledb
.manga_target_to_nsa as t on t.manga_target_pk=m.pk
    join mangasampledb
.nsa as n on n.pk=t.
nsa_pk
    join mangadatadb
.wavelength as w on w.pk=c.wavelength_pk;
  pk  
| mangaid | manga_target_pk |         z          |                                                                                    restwave
-------+---------+-----------------+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
11211 | 1-22286 |           30678 |  0.099954180419445 | {3292.49709827422,3293.25529747001,3294.01371862107,3294.7723617274,3295.53100483373}
 
7729 | 1-22286 |           30678 |  0.099954180419445 | {3292.49709827422,3293.25529747001,3294.01371862107,3294.7723617274,3295.53100483373}
 
11209 | 1-22298 |           15026 | 0.0614774264395237 | {3411.84452637247,3412.63020900144,3413.41612163118,3414.2022642617,3414.98840689221}
 
7727 | 1-22298 |           15026 | 0.0614774264395237 | {3411.84452637247,3412.63020900144,3413.41612163118,3414.2022642617,3414.98840689221}
 
11219 | 1-22301 |              35 |  0.105152934789658 | {3277.00884941768,3277.76348196558,3278.51833542465,3279.27340979488,3280.02848416512}
(5 rows)

In SQLA,  this code returns something but it is the wrong array for the given Cube specified in the filter condition

In [20]: rwquery = session.query(datadb.Cube.restwave).filter(datadb.Cube.mangaid=='1-113520')

In [21]: print rwquery
SELECT
(SELECT array_agg(unwave.restw) AS array_agg_1
FROM
(SELECT unnest(mangadatadb.wavelength.wavelength) / (%(z_1)s + nsaz.z) AS restw
FROM mangadatadb
.wavelength, (SELECT mangasampledb.nsa.z AS z
FROM mangasampledb
.nsa JOIN mangasampledb.manga_target_to_nsa ON mangasampledb.nsa.pk = mangasampledb.manga_target_to_nsa.nsa_pk JOIN mangasampledb.manga_target ON mangasampledb.manga_target.pk = mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangadatadb.cube ON mangasampledb.manga_target.pk = mangadatadb.cube.manga_target_pk
WHERE mangadatadb
.cube.mangaid = mangadatadb.cube.mangaid) AS nsaz) AS unwave) AS anon_1
FROM mangadatadb
.cube
WHERE mangadatadb
.cube.mangaid = %(mangaid_1)s

In [22]: rwave = rwquery.first()

In [23]: rwave[0][0:10]
Out[23]:
[3292.49709827422,
 
3293.25529747001,
 
3294.01371862107,
 
3294.7723617274,
 
3295.53100483373,
 
3296.28986989532,
 
3297.04895691218,
 
3297.80826588431,
 
3298.56779681171,
 
3299.32732773911]


I do rwquery.first() here instead of .one() or .all() because in my database, I actually have different versions of the same object, 6 versions, which I don't care about.  But that's fine, I can add additional filters later.  In the code setup I sent up, there is only one version of each cube object.   The above results gets produced with the @expression

    @restwave.expression
   
def restwave(cls):
        session
= db.Session()
        nsaz
= session.query(sampledb.NSA.z.label('z')).\
            join
(sampledb.MangaTargetToNSA, sampledb.MangaTarget, Cube).\
            filter
(Cube.mangaid == cls.mangaid).subquery('nsaz', with_labels=True)
        unwave
= session.query((func.unnest(Wavelength.wavelength)/(1+nsaz.c.z)).label('restw')).subquery('unwave', with_labels=True)
        restwave
= session.query(func.array_agg(unwave.c.restw)).as_scalar()

       
return restwave

I don't need any limits and I think I need 4 selects, the one in the query I run explicitly s.query(Cube.restwave)  and three inside the @expression.   When I grab the actual cube instance object and access restwave on the instance side, it works just fine. 

Mike Bayer

unread,
Aug 3, 2016, 2:22:43 PM8/3/16
to sqlal...@googlegroups.com
There is still much ambiguity here and inaccuracy (JOINs on the outside
or JOINs on the inside, the mappings have mistakes like foreign key to
"pk" but no "pk", mappings without primary keys, "autoload" makes no
sense as I don't have your tables, etc.), so I can only guess but
perhaps give you enough clues. It is highly unusual to have a string
of four JOINs inside of a column-based subquery, but when you say
s.query(Cube.restwave), that's what that means here.

import sqlalchemy
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, deferred, sessionmaker
from sqlalchemy.schema import Column, ForeignKey
from sqlalchemy import func
from sqlalchemy.dialects.postgresql import *
from sqlalchemy.types import Float, Integer, String
from sqlalchemy.ext.hybrid import hybrid_property


Base = declarative_base()
Session = sessionmaker()


class Target(Base):
__tablename__ = 'target'

pk = Column(Integer, primary_key=True)
mangaid = Column(String)


class Wavelength(Base):
__tablename__ = 'wavelength'

pk = Column(Integer, primary_key=True)
wavelength = deferred(Column(ARRAY(Float, zero_indexes=True)))


class NSA(Base):
__tablename__ = 'nsa'

pk = Column(Integer, primary_key=True)
z = Column(Float)


class TargetToNSA(Base):
__tablename__ = 'target_to_nsa'

id = Column(Integer, primary_key=True)
target_pk = Column(Integer, ForeignKey('target.pk'))
nsa_pk = Column(Integer, ForeignKey('nsa.pk'))


class Cube(Base):
__tablename__ = 'cube'

pk = Column(Integer, primary_key=True)
mangaid = Column(String)

target_pk = Column(Integer, ForeignKey('target.pk'))
target = relationship(Target, backref='cubes')

wavelength_pk = Column(Integer, ForeignKey('wavelength.pk'))
wavelength = relationship(Wavelength, backref='cubes')

@hybrid_property
def restwave(self):
raise NotImplementedError()

@restwave.expression
def restwave(cls):
s = Session()
restw = (func.unnest(Wavelength.wavelength) / (1 +
NSA.z)).label("restw")
unwave =
s.query(restw).select_from(Wavelength).correlate(NSA).subquery("unwave")

agg_unwave =
s.query(func.array_agg(unwave.c.restw)).label("restwarr")

joined = s.query(agg_unwave).select_from(
Cube
).join(Target, TargetToNSA, NSA, Wavelength)

return joined.as_scalar()


session = Session()


q = session.query(Cube.restwave).filter(Cube.mangaid == '1-113520')

print q





On 08/03/2016 02:00 PM, Brian Cherinka wrote:
> Ok. Yeah, I have been trying many different ways of getting results.
> The raw SQL that I'm trying to recreate in SQLA is this (for the
> restwave column only), which works in postgresql. The limit was only
> there to do the filter the results. You can ignore that limit.
>
> |
> manga=# select c.pk,c.mangaid,c.manga_target_pk, n.z, (select
> (array_agg(unwave.restw))[0:5] as restwave from (select
> (unnest(w.wavelength)/(1+n.z)) as restw from mangadatadb.wavelength as
> w) as unwave)
> frommangadatadb.cube asc
> join mangasampledb.manga_target asm on m.pk=c.manga_target_pk
> join mangasampledb.manga_target_to_nsa ast on t.manga_target_pk=m.pk
> join mangasampledb.nsa asn on n.pk=t.nsa_pk
> join mangadatadb.wavelength asw on w.pk=c.wavelength_pk;
> pk |mangaid |manga_target_pk | z |
> restwave
> -------+---------+-----------------+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 11211|1-22286|
> 30678| 0.099954180419445|{3292.49709827422,3293.25529747001,3294.01371862107,3294.7723617274,3295.53100483373}
> 7729|1-22286|
> 30678| 0.099954180419445|{3292.49709827422,3293.25529747001,3294.01371862107,3294.7723617274,3295.53100483373}
> 11209|1-22298|
> 15026|0.0614774264395237|{3411.84452637247,3412.63020900144,3413.41612163118,3414.2022642617,3414.98840689221}
> 7727|1-22298|
> 15026|0.0614774264395237|{3411.84452637247,3412.63020900144,3413.41612163118,3414.2022642617,3414.98840689221}
> 11219|1-22301|
> 35| 0.105152934789658|{3277.00884941768,3277.76348196558,3278.51833542465,3279.27340979488,3280.02848416512}
> (5rows)
> |
>
> In SQLA, this code returns something but it is the wrong array for the
> given Cube specified in the filter condition
>
> |
> In[20]:rwquery
> =session.query(datadb.Cube.restwave).filter(datadb.Cube.mangaid=='1-113520')
>
> In[21]:printrwquery
> SELECT (SELECT array_agg(unwave.restw)AS array_agg_1
> FROM (SELECT unnest(mangadatadb.wavelength.wavelength)/(%(z_1)s
> +nsaz.z)AS restw
> FROM mangadatadb.wavelength,(SELECT mangasampledb.nsa.z AS z
> FROM mangasampledb.nsa JOIN mangasampledb.manga_target_to_nsa ON
> mangasampledb.nsa.pk =mangasampledb.manga_target_to_nsa.nsa_pk JOIN
> mangasampledb.manga_target ON mangasampledb.manga_target.pk
> =mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangadatadb.cube
> ON mangasampledb.manga_target.pk =mangadatadb.cube.manga_target_pk
> WHERE mangadatadb.cube.mangaid =mangadatadb.cube.mangaid)AS nsaz)AS
> unwave)AS anon_1
> FROM mangadatadb.cube
> WHERE mangadatadb.cube.mangaid =%(mangaid_1)s
>
> In[22]:rwave =rwquery.first()
>
> In[23]:rwave[0][0:10]
> Out[23]:
> [3292.49709827422,
> 3293.25529747001,
> 3294.01371862107,
> 3294.7723617274,
> 3295.53100483373,
> 3296.28986989532,
> 3297.04895691218,
> 3297.80826588431,
> 3298.56779681171,
> 3299.32732773911]
>
> |
>
> I do rwquery.first() here instead of .one() or .all() because in my
> database, I actually have different versions of the same object, 6
> versions, which I don't care about. But that's fine, I can add
> additional filters later. In the code setup I sent up, there is only
> one version of each cube object. The above results gets produced with
> the @expression
>
> |
> @restwave.expression
> defrestwave(cls):
> session =db.Session()
> nsaz =session.query(sampledb.NSA.z.label('z')).\
> join(sampledb.MangaTargetToNSA,sampledb.MangaTarget,Cube).\
> filter(Cube.mangaid
> ==cls.mangaid).subquery('nsaz',with_labels=True)
> unwave
> =session.query((func.unnest(Wavelength.wavelength)/(1+nsaz.c.z)).label('restw')).subquery('unwave',with_labels=True)
> restwave =session.query(func.array_agg(unwave.c.restw)).as_scalar()
>
> returnrestwave
> |
>
> I don't need any limits and I think I need 4 selects, the one in the
> query I run explicitly s.query(Cube.restwave) and three inside the
> @expression. When I grab the actual cube instance object and access
> restwave on the instance side, it works just fine.
>

Brian Cherinka

unread,
Aug 4, 2016, 12:03:17 PM8/4/16
to sqlalchemy

Yeah, sorry about that.  Like I said, I don't normally build my classes this way, defining all the columns in the Base Class.  I define all my columns, primary and foreign keys, etc manually first, so my SQLA classes are minimally defined.  And I cobbled together pieces from my real code and your example to build that test code I sent you.  So I forgot some things.  I did say it may not be completely right, and you might have to hack it some.  I apologize.  Here is an attempt at cleaning it up. 

When I try to implement your @expression into my real code, it doesn't work.  I am getting an error about multiple results returned within the subquery. 

    @restwave.expression
   
def restwave(cls):
        s
= db.Session()
        restw
= (func.unnest(Wavelength.wavelength) / (1 + sampledb.NSA.z)).label("restw")
        unwave
= s.query(restw).select_from(Wavelength).correlate(sampledb.NSA).subquery("unwave")

        agg_unwave
= s.query(func.array_agg(unwave.c.restw)).label("restwarr")
        joined
= s.query(agg_unwave).select_from(
             
Cube

             
).join(sampledb.MangaTarget, sampledb.MangaTargetToNSA, sampledb.NSA, Wavelength)
       
return joined.as_scalar()

session.query(datadb.Cube.restwave).filter(datadb.Cube.mangaid=='1-113520').all()

ProgrammingError: (psycopg2.ProgrammingError) more than one row returned by a subquery used as an expression
 
[SQL: 'SELECT (SELECT (SELECT array_agg(unwave.restw) AS array_agg_1 \nFROM (SELECT unnest(mangadatadb.wavelength.wavelength) / (%(z_1)s + mangasampledb.nsa.z) AS restw \nFROM mangadatadb.wavelength) AS unwave) AS restwarr \nFROM mangadatadb.cube JOIN mangasampledb.manga_target ON mangasampledb.manga_target.pk = mangadatadb.cube.manga_target_pk JOIN mangasampledb.manga_target_to_nsa ON mangasampledb.manga_target.pk = mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangasampledb.nsa ON mangasampledb.nsa.pk = mangasampledb.manga_target_to_nsa.nsa_pk JOIN mangadatadb.wavelength ON mangadatadb.wavelength.pk = mangadatadb.cube.wavelength_pk) AS anon_1 \nFROM mangadatadb.cube \nWHERE mangadatadb.cube.mangaid = %(mangaid_1)s'] [parameters: {'z_1': 1, 'mangaid_1': '1-113520'}]


 I don't know if this is because I have multiple versions of the same object or not.  I tried adding in a selection on version, but the joins did not work out properly.  This NSA table does not need to be joined with the versions.  What I expect the above to return is exactly what the raw SQL returns.  A list of the restwave array for each version of the object with id = '1-113520'.  I thought SQLA was designed precisely to let users do what I'm trying, i.e. construct complex columns involving multiple selects, without having to physically add a new column into the db, or write a new SQL function in the db to call?  

session.query(datadb.Cube.restwave).join(datadb.Cube,datadb.PipelineInfo,datadb.PipelineVersion).filter(datadb.PipelineVersion=='v1_5_1',datadb.Cube.mangaid=='1-113520').all()

InvalidRequestError: Could not find a FROM clause to join from.  Tried joining to <class 'sdss.internal.database.utah.mangadb.DataModelClasses.Cube'>, but got: Can't find any foreign key relationships between 'nsa' and 'cube'.

test_sqla_hybrid.py

Mike Bayer

unread,
Aug 4, 2016, 12:58:53 PM8/4/16
to sqlal...@googlegroups.com
> bya subquery used asan expression
> [SQL:'SELECT (SELECT (SELECT array_agg(unwave.restw) AS array_agg_1
> \nFROM (SELECT unnest(mangadatadb.wavelength.wavelength) / (%(z_1)s +
> mangasampledb.nsa.z) AS restw \nFROM mangadatadb.wavelength) AS unwave)
> AS restwarr \nFROM mangadatadb.cube JOIN mangasampledb.manga_target ON
> mangasampledb.manga_target.pk = mangadatadb.cube.manga_target_pk JOIN
> mangasampledb.manga_target_to_nsa ON mangasampledb.manga_target.pk =
> mangasampledb.manga_target_to_nsa.manga_target_pk JOIN mangasampledb.nsa
> ON mangasampledb.nsa.pk = mangasampledb.manga_target_to_nsa.nsa_pk JOIN
> mangadatadb.wavelength ON mangadatadb.wavelength.pk =
> mangadatadb.cube.wavelength_pk) AS anon_1 \nFROM mangadatadb.cube
> \nWHERE mangadatadb.cube.mangaid =
> %(mangaid_1)s'][parameters:{'z_1':1,'mangaid_1':'1-113520'}]
>
> |
>
> I don't know if this is because I have multiple versions of the same
> object or not. I tried adding in a selection on version, but the joins
> did not work out properly. This NSA table does not need to be joined
> with the versions. What I expect the above to return is exactly what
> the raw SQL returns.

I've yet to see an unambiguous statement of what "the raw SQL" is. If
it is this:

select c.pk,c.mangaid,c.manga_target_pk, n.z,
(select (array_agg(unwave.restw))[0:5] as restwave from (select
(unnest(w.wavelength)/(1+n.z)) as restw from mangadatadb.wavelength as
w) as unwave)
from mangadatadb.cube as c
join mangasampledb.manga_target as m on m.pk=c.manga_target_pk
join mangasampledb.manga_target_to_nsa as t on t.manga_target_pk=m.pk
join mangasampledb.nsa as n on n.pk=t.nsa_pk
join mangadatadb.wavelength as w on w.pk=c.wavelength_pk;


then that does not, and cannot, correspond to the Python code you are
sending. the JOINs would not be inside of the @expression, you'd have
to write them out on the outside.

I've provided a complete example of how to render SQL extremely similar
to what you want, featuring techniques such as correlate() as well as
how to nest the queries appropriately. If at this point you aren't able
to manipulate the code to get what you want, then I'd advise not using a
hybrid for this query at all. It is extremely complicated in this
context, and unless you are deeply familiar with SQLAlchemy APIs, you
would just have a bunch of code that you can't effectively maintain.










A list of the restwave array for each version of
> the object with id = '1-113520'. I thought SQLA was designed precisely
> to let users do what I'm trying, i.e. construct complex columns
> involving multiple selects, without having to physically add a new
> column into the db, or write a new SQL function in the db to call?
>
> |
> session.query(datadb.Cube.restwave).join(datadb.Cube,datadb.PipelineInfo,datadb.PipelineVersion).filter(datadb.PipelineVersion=='v1_5_1',datadb.Cube.mangaid=='1-113520').all()
>
> InvalidRequestError: Could not find a FROM clause to join from. Tried
> joining to <class
> 'sdss.internal.database.utah.mangadb.DataModelClasses.Cube'>, but got:
> Can't find any foreign key relationships between 'nsa' and 'cube'.
>
> |
>
> On Wednesday, August 3, 2016 at 2:22:43 PM UTC-4, Mike Bayer wrote:
>
> There is still much ambiguity here and inaccuracy (JOINs on the outside
> or JOINs on the inside, the mappings have mistakes like foreign key to
> "pk" but no "pk", mappings without primary keys, "autoload" makes no
> sense as I don't have your tables, etc.), so I can only guess but
> perhaps give you enough clues. It is highly unusual to have a string
> of four JOINs inside of a column-based subquery, but when you say
> s.query(Cube.restwave), that's what that means here.
>

Brian Cherinka

unread,
Aug 4, 2016, 1:58:13 PM8/4/16
to sqlalchemy


I've yet to see an unambiguous statement of what "the raw SQL" is.   If
it is this:

select c.pk,c.mangaid,c.manga_target_pk, n.z,
(select (array_agg(unwave.restw))[0:5] as restwave from (select
(unnest(w.wavelength)/(1+n.z)) as restw from mangadatadb.wavelength as
w) as unwave)
from mangadatadb.cube as c
     join mangasampledb.manga_target as m on m.pk=c.manga_target_pk
     join mangasampledb.manga_target_to_nsa as t on t.manga_target_pk=m.pk
     join mangasampledb.nsa as n on n.pk=t.nsa_pk
     join mangadatadb.wavelength as w on w.pk=c.wavelength_pk;
 
then that does not, and cannot, correspond to the Python code you are
sending.  the JOINs would not be inside of the @expression, you'd have
to write them out on the outside. 
 
Yeah, that was the SQL that would be typed into psql.  I know that the Python implementation is incorrect.  I've been trying to sort that out.   


I've provided a complete example of how to render SQL extremely similar
to what you want, featuring techniques such as correlate() as well as
how to nest the queries appropriately.  If at this point you aren't able
to manipulate the code to get what you want, then I'd advise not using a
hybrid for this query at all. It is extremely complicated in this
context, and unless you are deeply familiar with SQLAlchemy APIs, you
would just have a bunch of code that you can't effectively maintain.

Ok..thanks for your help.  I appreciate it.  
 
Reply all
Reply to author
Forward
0 new messages