Feedback appreciated


Konsta Vesterinen

Aug 22, 2013, 3:37:36 PM
to sqlal...@googlegroups.com

Hi all,


First I want to say how much I appreciate SQLAlchemy. I think it is by far the best ORM available for any language. Thank you Mike for this wonderful piece of software. I can’t imagine how many hours you’ve put into this.


From the day I started coding I’ve always been enthusiastic about databases. In 2007 I created Doctrine, which is now the most popular ORM for PHP. Over the years I’ve switched to Python (I could’ve switched to Ruby, but didn’t like Rails’ ORM at all). Now I’m the CEO of a company called Fast Monkeys. We currently employ 15 people and develop solely with Python, SQLAlchemy and Flask. One of the key principles of our company is to contribute to our chosen open source technologies as much as possible. There are a couple of interesting projects I would like to hear some feedback on:



SQLAlchemy-Utils

https://github.com/kvesteri/sqlalchemy-utils


Provides a number of utilities for SQLAlchemy. Some highlights:

  • Batch fetching utilities (experimental at the moment)

    • We had a real-life scenario where join loading, lazy loading and subquery loading were all too slow.

    • I used this https://groups.google.com/forum/#!topic/sqlalchemy/vHQlm0U5f2k as an inspiration. I agree this is a bottomless hole but I’m nevertheless willing to try to make something useful for SQLAlchemy users. :)

  • A number of new datatypes



SQLAlchemy-Continuum

https://github.com/kvesteri/sqlalchemy-continuum


Hibernate Envers style versioning for SQLAlchemy declarative models.



WTForms-Alchemy

https://github.com/kvesteri/wtforms-alchemy


Easily create WTForms forms from SQLAlchemy declarative models.




The documentation is pretty poor for the first two projects. I will add more docs in the following weeks.

Michael Bayer

Aug 22, 2013, 6:52:41 PM
to sqlal...@googlegroups.com
On Aug 22, 2013, at 3:37 PM, Konsta Vesterinen <konsta.v...@gmail.com> wrote:

Hi all,

First I want to say how much I appreciate SQLAlchemy. I think it is by far the best ORM available for any language. Thank you Mike for this wonderful piece of software. I can’t imagine how many countless hours you’ve put into this.

From the day I started coding I’ve always been enthusiastic about databases. 2007 I created Doctrine, which is now the most popular ORM for PHP.


Wow, that is quite a compliment; I of course have heard of Doctrine and met many PHP users who use it extensively.   This is really amazing that you're A. using Python now, B. using SQLAlchemy and C. writing full-blown software for it, wow!

Over the years I’ve switched to Python (I could’ve switched to Ruby, but didn’t like Rails’ ORM at all). Now I’m the CEO of a company called Fast Monkeys. We employ 15 people currently and develop solely with Python, SQLAlchemy and Flask. One of the key principles of our company is to contribute to our chosen open source technologies as much as possible. There are couple of interesting projects I would like to hear some feedback:


SQLAlchemy-Utils

Provides number of things for SQLAlchemy. Some highlights:
  • Batch fetching utilities (experimental at the moment)
    • We had a real life scenario were join loading, lazy loading and subquery loading were all too slow.
    • I used this https://groups.google.com/forum/#!topic/sqlalchemy/vHQlm0U5f2k as an inspiration. I agree this is a bottomless hole but I’m nevertheless willing to try to make something useful for SQLAlchemy users. :)
  • Number of new datatypes


SQLAlchemy-Continuum

Hibernate Envers style versioning for SQLAlchemy declarative models.


WTForms-Alchemy

Easily create WTForms forms from SQLAlchemy declarative models.

These are all really interesting projects and I had a lot of thoughts looking at all of them just briefly.   I also wonder at what points within here should/can some of this be part of SQLA itself, or not.  Here's my notes:

wtforms:

1. in all cases, use inspect(cls) to get at a Mapper.  if on 0.7, use class_mapper(cls).   but it would be great if you could target 0.8 and up as a lot of functions were added for exactly these kinds of use cases (See http://docs.sqlalchemy.org/en/rel_0_8/faq.html#how-do-i-get-a-list-of-all-columns-relationships-mapped-attributes-etc-given-a-mapped-class)

2. ClassManager.values() is not terrible, but again isn't super "public".  you can use mapper.attrs as well as mapper.column_attrs and others as of 0.8.
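To make these two points concrete, here is a minimal sketch of the public inspection API (shown in its modern spelling, which corresponds to the 0.8-era calls above; the "User" model is just a stand-in):

```python
from sqlalchemy import Column, Integer, String, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))

# inspect() on a mapped class returns its Mapper; column_attrs (and
# mapper.attrs, mapper.relationships) are the public collections to
# iterate instead of ClassManager.values().
mapper = inspect(User)
column_names = [attr.key for attr in mapper.column_attrs]
```

A form generator can then walk column_names (or the full attribute objects) without touching any private API.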

versioning:

1. wow this is really ambitious.

2. unit_of_work.py: you can tell if a statement is an INSERT/DELETE usually by looking at context.isinsert/context.isdelete, and also the table name 
you can get from context.statement.table (something like that).  Similar things can be done where I see you're regexping the DELETE
later on.   Digging into the string is fine but once you're targeting the broad spectrum of scenarios, like users that are adding SQL comments and such to their SQL, backends that don't actually use SQL, you want to stick with inspecting the expression trees as much as possible.
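As a sketch of the expression-tree approach (table and statement names here are hypothetical; in the actual event hooks the same checks apply to the statement object carried by the execution context):

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.sql.dml import Delete, Insert

metadata = MetaData()
articles = Table("article", metadata, Column("id", Integer, primary_key=True))

# Inspect the statement objects themselves rather than regexping the
# rendered SQL string: isinstance() tells you the statement kind, and
# .table gives the target Table.
insert_stmt = articles.insert()
delete_stmt = articles.delete()
```

This keeps working when users add SQL comments or run against backends whose rendered SQL looks different.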

3. make schema object names configurable, i.e. "transaction_id"

4. This code looks great but I'd still be scared to use it, because versioning is such a particular thing, not to mention
interactions with other schema complexities.    But I don't say that to be discouraging, just to state how non-trivial a problem 
this is.   When i do versioning for real, there's always weird quirks and things 
specific to the app, which are easier to hardcode in my versioning code rather than having to configure a 3rd party library to do it.
it's why i kept it as just an "example" in SQLA itself, it's a huge job...   but if you can make this extension successful,
that'll be very impressive.   In the docs it would be nice if I could see immediately what happens to the SQL schema when I use this.

sqlalchemy_utils:

1. have coercion_listener configure itself?   coercion_listener.configure().  since it's global usually,
and you could always pass a target base class to configure() as an option.

2. ScalarListType vs. Postgresql ARRAY ?   same/better?  should SLT use ARRAY on a PG backend ?

3. operators for types!   I see these are mostly string storage but you can start adding special operations as
needed using TypeEngine.Comparator: http://docs.sqlalchemy.org/en/rel_0_8/core/types.html#types-operators  e.g. look at all the operations that pg.ARRAY has (contains, indexed access, concatenation, etc.). then you can make all these types *really* slick.
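A minimal sketch of that comparator pattern, following the types-operators docs linked above (the type and table names are hypothetical; this one just lower()s both sides of an equality comparison):

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, func

class CaseInsensitiveString(String):
    """String variant whose == comparison lower()s both sides."""

    class comparator_factory(String.comparator_factory):
        def __eq__(self, other):
            return func.lower(self.expr) == func.lower(other)

metadata = MetaData()
accounts = Table("account", metadata,
                 Column("id", Integer, primary_key=True),
                 Column("email", CaseInsensitiveString(255)))

# The comparison now renders with lower() applied on both sides.
clause = accounts.c.email == "Foo@Example.com"
sql = str(clause)
```

The same hook supports containment, indexed access, concatenation and so on, which is how pg.ARRAY gets its operator set.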

4a. batch_fetch - hmmmmmmmmmmmm.....  I see the idea is to avoid JOIN by just feeding the keys into an IN (caveat there: IN works well for small lists, but less so for large - Oracle at least limits their size to 1000, so when I have to use batch IN I will actually batch within the IN itself in groups of 500 or so).   You know you could build this as a loader strategy.  An API overhaul of that system
is coming up but the LoaderStrategy API shouldn't change much.   Then you could just say query.options(batch_load_all("a.b.c")) like
any other option.   LoaderStrategy isn't an API that people use often but it is extensible, and in 0.9 it's even nicer already, with more to come.   I recently posted about it on the development list, if you want to check out sqlalchemy-devel (it's a pretty dead list but I'd value your input).    I will note that the subquery loader strategy, which this is very similar to, was *very* hard to get working in all cases, compared to how super easy it was to get working for simple cases.   Because relationship() is *so* flexible, esp. with things like self-referential inheritance setups, loading gets very hard very quickly.
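The batching-within-IN idea is plain Python; a sketch of the key-chunking step (the 500 figure is the group size suggested above):

```python
from itertools import islice

def batched_keys(keys, size=500):
    """Split a key sequence into chunks so that each IN clause stays
    well under backend limits (Oracle caps an IN list at 1000)."""
    it = iter(keys)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            break
        yield chunk

# 1200 keys become three IN lists of 500, 500 and 200 elements,
# each fed to its own WHERE id IN (...) query, results merged in Python.
chunk_sizes = [len(c) for c in batched_keys(range(1200))]
```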

4b. does that IN do tuple lookups for composite primary keys?   you can get that by saying tuple_(*keys).in_(list_of_tuples).
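That tuple_ form looks like this (table and column names hypothetical):

```python
from sqlalchemy import Column, Integer, MetaData, Table, tuple_

metadata = MetaData()
order_lines = Table("order_line", metadata,
                    Column("order_id", Integer, primary_key=True),
                    Column("line_no", Integer, primary_key=True))

# Composite-primary-key lookup:
# (order_id, line_no) IN ((1, 1), (1, 2), (2, 1))
criterion = tuple_(order_lines.c.order_id,
                   order_lines.c.line_no).in_([(1, 1), (1, 2), (2, 1)])
sql = str(criterion)
```

Note that not every backend accepts tuple IN syntax, so a batch loader would need a fallback on those.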

5. sort_query - I think you can avoid the private attribute access if you use query.column_descriptions: http://docs.sqlalchemy.org/en/rel_0_8/orm/query.html?highlight=query.column_descriptions#sqlalchemy.orm.query.Query.column_descriptions I added that for a similar purpose as the one here.
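A small sketch of that accessor (the "Article" model is a stand-in; no database connection is needed just to inspect the SELECT list):

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Article(Base):
    __tablename__ = "article"
    id = Column(Integer, primary_key=True)
    title = Column(String(100))

session = Session()
query = session.query(Article.id, Article.title)

# Public, documented metadata for each column in the SELECT list;
# no private attribute access required for sort_query-style logic.
names = [desc["name"] for desc in query.column_descriptions]
```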

Overall: maybe you want to "brand" all these packages under a common package using namespace packages? (http://www.python.org/dev/peps/pep-0420/ is the future of it, you still need the explicit directives for now).   I do that for "dogpile.", e.g. "dogpile.cache", "dogpile.core", etc.  like monkeysql.wtforms, monkeysql.utils, monkeysql.versioning etc. (not sure of the role of the Monkey character in your canon...).  since your packages do have some dependencies between them.
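For reference, the pre-PEP-420 "explicit directive" is a one-line __init__.py repeated in each distribution (the monkeysql name is of course hypothetical, mirroring the dogpile.* layout):

```python
# monkeysql/__init__.py in each distribution (monkeysql.utils,
# monkeysql.wtforms, monkeysql.versioning, ...):
__path__ = __import__("pkgutil").extend_path(__path__, __name__)
```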

Thanks very much for supporting the project. Can we get Fast Monkeys and such up on http://www.sqlalchemy.org/organizations.html?     Looks great!


- mike






Konsta Vesterinen

Aug 23, 2013, 3:52:54 PM
to sqlal...@googlegroups.com


On Friday, August 23, 2013 1:52:41 AM UTC+3, Michael Bayer wrote:


wow that is quite a compliment, I of course have heard of Doctrine and met many PHP users who use it extensively.   This is really amazing that you're A. using Python now B. using SQLAlchemy and C. writing full blown software for it, wow !


Thanks!

 
These are all really interesting projects and I had a lot of thoughts looking at all of them just briefly.   I also wonder at what points within here should/can some of this be part of SQLA itself, or not.  Here's my notes:

wtforms:

1. in all cases, use inspect(cls) to get at a Mapper.  if on 0.7, use class_mapper(cls).   but it would be great if you could target 0.8 and up as a lot of functions were added for exactly these kinds of use cases (See http://docs.sqlalchemy.org/en/rel_0_8/faq.html#how-do-i-get-a-list-of-all-columns-relationships-mapped-attributes-etc-given-a-mapped-class)

2. ClassManager.values() is not terrible, but again isn't super "public".  you can use mapper.attrs as well as mapper.column_attrs and others as of 0.8.

These are both valid points and I fully agree with you.

 
versioning:

2. unit_of_work.py: you can tell if a statement is an INSERT/DELETE usually by looking at context.isinsert/context.isdelete, and also the table name 
you can get from context.statement.table (something like that).  Similar things can be done where I see you're regexping the DELETE
later on.   Digging into the string is fine but once you're targeting the broad spectrum of scenarios, like users that are adding SQL comments and such to their SQL, backends that don't actually use SQL, you want to stick with inspecting the expression trees as much as possible.

3. make schema object names configurable, i.e. "transaction_id"

Good points. I created issues for both.

 
4. This code looks great but I'd still be scared to use it, because versioning is such a particular thing, not to mention
interactions with other schema complexities.    But I don't say that to be discouraging, just to state how non-trivial a problem 
this is.   When i do versioning for real, there's always weird quirks and things 
specific to the app, which are easier to hardcode in my versioning code rather than having to configure a 3rd party library to do it.
it's why i kept it as just an "example" in SQLA itself, it's a huge job...   but if you can make this extension successful,
that'll be very impressive.   In the docs it would be nice if I could see immediately what happens to the SQL schema when I use this.

I will add this to the docs, and I agree the whole thing is a little bit scary. :) I think we can make it a great tool though. One of the things I don't like about Hibernate Envers is its API. With Continuum I tried to get ideas for the API from the best versioning libraries the Ruby world has (especially PaperTrail).

The schema Continuum generates is basically the same as the one Hibernate Envers generates (with slightly different naming conventions).

 
sqlalchemy_utils:

1. have coercion_listener configure itself?   coercion_listener.configure().  since it's global usually,
and you could always pass a target base class to configure() as an option.

Good idea.
 
2. ScalarListType vs. Postgresql ARRAY ?   same/better?  should SLT use ARRAY on a PG backend ?

Hmm, I'm not sure about this yet. It's definitely not better than using PostgreSQL ARRAY. ARRAY is better in many ways but it's PostgreSQL-specific. Maybe we could make ScalarListType use ARRAY on PostgreSQL by default (as you suggested).
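That per-backend fallback can be sketched with with_variant() (a hypothetical illustration, not ScalarListType's actual implementation): one logical type that renders as a native ARRAY on PostgreSQL and falls back to plain TEXT storage elsewhere.

```python
from sqlalchemy import String, Text
from sqlalchemy.dialects import postgresql, sqlite

# Generic storage is TEXT (delimited string); on PostgreSQL the same
# logical column compiles to a native VARCHAR[] array instead.
list_type = Text().with_variant(postgresql.ARRAY(String), "postgresql")

pg_ddl = list_type.compile(dialect=postgresql.dialect())
sqlite_ddl = list_type.compile(dialect=sqlite.dialect())
```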

 
3. operators for types!   I see these are mostly string storage but you can start adding special operations as
needed using TypeEngine.Comparator: http://docs.sqlalchemy.org/en/rel_0_8/core/types.html#types-operators  e.g. look at all the operations that pg.ARRAY has (contains, indexed access, concatenation, etc.). then you can make all these types *really* slick.

EmailType already uses CaseInsensitiveComparator. I will add more of these, as you suggested, once I figure out what kind of operators each type needs. :)
 
4a. batch_fetch - hmmmmmmmmmmmm.....  I see the idea is avoid JOIN by just feeding the keys into an IN (caveat there, IN works well for small lists, but less so for large - Oracle at least limits their size to 1000, when I have to use batch IN I will actually batch within the IN itself in groups of 500 or so).   You know you could build this as a loader strategy.  an API overhaul of that system
is coming up but the LoaderStrategy API shouldn't change much.   Then you could just say query.options(batch_load_all("a.b.c")) like
any other option.   LoaderStrategy isn't an API that people use often but it is extensible, and 0.9 it's even nicer already, with more to come.   I recently posted about it on the development list, if you want to check out sqlalchemy-devel (it's a pretty dead list but I'd value your input).    I will note that the subquery loader strategy, which this is very similar to,  was *very* hard to get working in all cases, compared to how it was super easy to get working for simple cases.   because relationship() is *so* flexible, esp. with things like self-referential inheritance setups, loading gets very hard very quick.

Thanks for clarifying the caveats of the IN operator. I didn't know Oracle had that limit. Adding batch fetch as a new LoaderStrategy would indeed be great! I will look into this.
 
4b. does that IN do tuple lookups for composite primary keys?   you can get that by saying tuple_(*keys).in_(list_of_tuples).

Not yet. I added an issue for this.
 
5. sort_query - I think you can avoid the private attribute access if you use query.column_descriptions: http://docs.sqlalchemy.org/en/rel_0_8/orm/query.html?highlight=query.column_descriptions#sqlalchemy.orm.query.Query.column_descriptions I added that for a similar purpose as the one here.

Thanks for pointing this out. Added an issue for this too.
 
Overall: maybe you want to "brand" all these packages under a common package using namespace packages? (http://www.python.org/dev/peps/pep-0420/ is the future of it, you still need the explicit directives for now).   I do that for "dogpile.", e.g. "dogpile.cache", "dogpile.core", etc.  like monkeysql.wtforms, monkeysql.utils, monkeysql.versioning etc. (not sure of the role of the Monkey character in your canon...).  since your packages do have some dependencies between them.

It would be great if some of these packages could become official SQLAlchemy plugins/extensions some day. It would also be great if those extensions could be magically registered as SQLAlchemy extensions in a Flask-esque way (e.g. from sqlalchemy.ext.continuum import VersioningManager).

I'd like to see a section on the SQLAlchemy website for 'official' extensions. If you feel an extension is mature and good enough, it could be put there to gain visibility. Currently we could put GeoAlchemy in there?

thanks very much for supporting the project, can we get FastMonkey and such up on http://www.sqlalchemy.org/organizations.html ?     looks great !


- mike


Sure! You could add this text:

'Fast Monkeys is a product development house based in Finland. We develop new web ventures using Python, Flask and SQLAlchemy.'

I'm really looking forward to all of this. Good stuff! :)

-Konsta
 

Michael Bayer

Sep 4, 2013, 12:34:03 PM
to sqlal...@googlegroups.com
On Aug 23, 2013, at 3:52 PM, Konsta Vesterinen <konsta.v...@gmail.com> wrote:


It would be great if some of these packages could become official SQLAlchemy plugins/extensions some day. It would also be great if those extensions could be magically registered as sqlalchemy extensions in Flask-esque way (eg. from sqlalchemy.ext.continuum import VersioningManager).

I'd like to see a section in SQLAlchemy website of 'official' extensions. If you feel some extension is mature and good enough it could be put there to gain good visibility. Currently we could put GeoAlchemy in there?


Well, we've talked about SQLAlchemy itself supporting some kind of namespace package, but so far I haven't been enthused about that.  For one, the namespace package spec seems to keep changing, so I'm not sure I'd want a huge ecosystem based around a somewhat moving spec (it's not as big a deal if you control all the projects directly).   For another, I get a strong impression that namespace packages make the external lib look like it's part of the library itself, especially to newbies, and the community of SQLAlchemy packages on PyPI is really large and of very variable quality. I'd rather have any namespace that starts with "sqlalchemy" be something that can be centrally curated, rather than allowing anyone to name their package "sqlalchemy.contrib".  Not to mention, if some very low quality package takes over a very common name, like "sqlalchemy.contrib.versioning", then everyone downloads that, while there are eight other great versioning systems that nobody finds out about because they don't have the cool namespace.     Then if the author of "versioning" just vanishes, "sqlalchemy.contrib.versioning" becomes permanent deadweight.



Sure! You could add this text:

'Fast Monkeys is a product development house based in Finland. We develop new web ventures using Python, Flask and SQLAlchemy.'



Seth P

Sep 28, 2016, 10:05:06 AM
to sqlalchemy
On Friday, August 23, 2013 at 3:52:54 PM UTC-4, Konsta Vesterinen wrote:


On Friday, August 23, 2013 1:52:41 AM UTC+3, Michael Bayer wrote:
 
2. ScalarListType vs. Postgresql ARRAY ?   same/better?  should SLT use ARRAY on a PG backend ?

Hmm I'm not sure about this yet. Its definately not better than using PostgreSQL ARRAY. ARRAY is better in many ways but its PostgreSQL specific. Maybe we could make ScalarListType use ARRAY on PostgreSQL by default (as you suggested).

I was wondering if there are any plans for SQLAlchemy to support Oracle's VARRAY column type? I've recently had the misfortune of having to use Oracle, and would like to have columns of the form sa.Column(sa.ARRAY(pt.String(8), dimensions=1)). I just looked at SQLAlchemy-Utils's ScalarListType (https://github.com/kvesteri/sqlalchemy-utils/blob/master/sqlalchemy_utils/types/scalar_list.py), but I think then I'd bump into Oracle's 4000-byte limit (https://docs.oracle.com/cd/B19306_01/server.102/b14237/limits001.htm#i287903). (It's not clear to me what Oracle's VARRAY length limit is when used as a column type, but that's a separate issue.) It's also not clear to me if cx_Oracle supports VARRAYs nicely.

Mike Bayer

Sep 28, 2016, 10:16:20 AM
to sqlal...@googlegroups.com


On 09/28/2016 10:05 AM, Seth P wrote:
> On Friday, August 23, 2013 at 3:52:54 PM UTC-4, Konsta Vesterinen wrote:
>
>
>
> On Friday, August 23, 2013 1:52:41 AM UTC+3, Michael Bayer wrote:
>
>
> 2. ScalarListType vs. Postgresql ARRAY ? same/better? should
> SLT use ARRAY on a PG backend ?
>
>
> Hmm I'm not sure about this yet. Its definately not better than
> using PostgreSQL ARRAY. ARRAY is better in many ways but its
> PostgreSQL specific. Maybe we could make ScalarListType use ARRAY on
> PostgreSQL by default (as you suggested).
>
>
> I was wondering if there are any plans for SQLAlchemy to support
> Oracle's VARRAY column type? I've recently had the misfortune of having
> to use Oracle, and would like to have columns of the
> formsa.Column(sa.ARRAY(pt.String(8), dimensions=1)). I just looked at
> SQLAlchemy-Utils's ScalarListType
> (https://github.com/kvesteri/sqlalchemy-utils/blob/master/sqlalchemy_utils/types/scalar_list.py),
> but I think then I'd bump into Oracle's 4000-byte limit
> (https://docs.oracle.com/cd/B19306_01/server.102/b14237/limits001.htm#i287903).
> (It's not clear to me what Oracle's VARRAY length limit is when used as
> a column type, but that's a separate issue.) It's also not clear to me
> if cx_Oracle supports VARRAYs nicely.

someone just asked about this and your last sentence is the main thing
to worry about first, getting cx_oracle support confirmed. Oracle's
OCI is vast, ancient, heavily encrusted, and even really simple things
become awkward in cx_Oracle due to this. I won't bash on cx_Oracle too
much because it just tries to expose OCI as much as possible, but there
are crap-tons of awkwardnesses, "just don't works", and other edge cases
in this area. So illustrating VARRAY round trip on cx_oracle is the
first step.





Seth P

Sep 28, 2016, 12:45:39 PM
to sqlalchemy

On Wednesday, September 28, 2016 at 10:16:20 AM UTC-4, Mike Bayer wrote:

So illustrating VARRAY round trip on cx_oracle is the first step.

It looks like cx_Oracle supports reading varrays, but supports writing them only as column literals, not as bound parameters. The following code tests a bunch of plausible formats for bound parameters, none of which work. This is consistent with https://sourceforge.net/p/cx-oracle/mailman/message/26769899/ .

 
if __name__ == '__main__':

    import logging
    logging.basicConfig(level='DEBUG')

    def format_args(args):
        return ', '.join(['"%s"' % arg if isinstance(arg, str) else str(arg)
                          for arg in args])

    import cx_Oracle
    with cx_Oracle.Connection(user, password, host) as connection:
        cursor = connection.cursor()
        for drop_object in ("TABLE tb_aaa", "TYPE tp_str_vec", "TYPE tp_num_vec"):
            try:
                cursor.execute("DROP " + drop_object)
            except cx_Oracle.DatabaseError:
                pass
        for execute_args in [("CREATE OR REPLACE TYPE tp_str_vec as VARRAY(3) of VARCHAR2(8)",),
                             ("CREATE OR REPLACE TYPE tp_num_vec as VARRAY(3) of NUMBER",),
                             ("CREATE TABLE tb_aaa ( strvec tp_str_vec, numvec tp_num_vec )",)]:
            logging.info(format_args(execute_args) + "\n")
            cursor.execute(*execute_args)

        for i, execute_args in enumerate([
                (("INSERT INTO tb_aaa VALUES ( tp_str_vec(%(strvec1)s), tp_num_vec(%(numvec1)s) )" %
                  {"strvec1": str(['A1', 'A2', 'A3'])[1:-1], "numvec1": str([1.1, 1.2, 1.3])[1:-1]}),),
                ("INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )",
                 {"strvec1": ['B1', 'B2', 'B3'], "numvec1": [2.1, 2.2, 2.3]}),
                ("INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )",
                 {"strvec1": str(['C1', 'C2', 'C3'])[1:-1], "numvec1": str([3.1, 3.2, 3.3])[1:-1]}),
                ("INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )",
                 {"strvec1": str(['D1', 'D2', 'D3']), "numvec1": str([4.1, 4.2, 4.3])}),
                ("INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )",
                 {"strvec1": '(%s)' % str(['E1', 'E2', 'E3'])[1:-1], "numvec1": '(%s)' % str([5.1, 5.2, 5.3])[1:-1]}),
                ("INSERT INTO tb_aaa VALUES ( :strvec1, :numvec1 )",
                 {"strvec1": ['F1', 'F2', 'F3'], "numvec1": [6.1, 6.2, 6.3]}),
                (("INSERT INTO tb_aaa VALUES ( (%(strvec1)s), (%(numvec1)s) )" %
                  {"strvec1": str(['G1', 'G2', 'G3'])[1:-1], "numvec1": str([7.1, 7.2, 7.3])[1:-1]}),),
                (("INSERT INTO tb_aaa VALUES ( %(strvec1)s, %(numvec1)s )" %
                  {"strvec1": str(['H1', 'H2', 'H3']), "numvec1": str([8.1, 8.2, 8.3])}),),
                ]):
            try:
                logging.info("INSERT #%d: %s" % (i, format_args(execute_args)))
                cursor.execute(*execute_args)
                logging.info("INSERT #%d succeeded.\n" % i)
            except cx_Oracle.DatabaseError as e:
                logging.info("INSERT #%d failed: " % i + str(e))

        cursor.execute("SELECT * FROM tb_aaa")
        result = cursor.fetchall()
        logging.info("SELECT returns:\n" + str(result))


INFO:root:"CREATE OR REPLACE TYPE tp_str_vec as VARRAY(3) of VARCHAR2(8)"

INFO:root:"CREATE OR REPLACE TYPE tp_num_vec as VARRAY(3) of NUMBER"

INFO:root:"CREATE TABLE tb_aaa ( strvec tp_str_vec, numvec tp_num_vec )"

INFO:root:INSERT #0: "INSERT INTO tb_aaa VALUES ( tp_str_vec('A1', 'A2', 'A3'), tp_num_vec(1.1, 1.2, 1.3) )"
INFO:root:INSERT #0 succeeded.

INFO:root:INSERT #1: "INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )", {'numvec1': [2.1, 2.2, 2.3], 'strvec1': ['B1', 'B2', 'B3']}
INFO:root:INSERT #1 failed: ORA-01484: arrays can only be bound to PL/SQL statements

INFO:root:INSERT #2: "INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )", {'numvec1': '3.1, 3.2, 3.3', 'strvec1': "'C1', 'C2', 'C3'"}
INFO:root:INSERT #2 failed: ORA-22814: attribute or element value is larger than specified in type

INFO:root:INSERT #3: "INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )", {'numvec1': '[4.1, 4.2, 4.3]', 'strvec1': "['D1', 'D2', 'D3']"}
INFO:root:INSERT #3 failed: ORA-22814: attribute or element value is larger than specified in type

INFO:root:INSERT #4: "INSERT INTO tb_aaa VALUES ( tp_str_vec(:strvec1), tp_num_vec(:numvec1) )", {'numvec1': '(5.1, 5.2, 5.3)', 'strvec1': "('E1', 'E2', 'E3')"}
INFO:root:INSERT #4 failed: ORA-22814: attribute or element value is larger than specified in type

INFO:root:INSERT #5: "INSERT INTO tb_aaa VALUES ( :strvec1, :numvec1 )", {'numvec1': [6.1, 6.2, 6.3], 'strvec1': ['F1', 'F2', 'F3']}
INFO:root:INSERT #5 failed: ORA-01484: arrays can only be bound to PL/SQL statements

INFO:root:INSERT #6: "INSERT INTO tb_aaa VALUES ( ('G1', 'G2', 'G3'), (7.1, 7.2, 7.3) )"
INFO:root:INSERT #6 failed: ORA-00907: missing right parenthesis

INFO:root:INSERT #7: "INSERT INTO tb_aaa VALUES ( ['H1', 'H2', 'H3'], [8.1, 8.2, 8.3] )"
INFO:root:INSERT #7 failed: ORA-00936: missing expression

INFO:root:SELECT returns:
[(['A1', 'A2', 'A3'], [1.1, 1.2, 1.3])]


Mike Bayer

Sep 28, 2016, 5:43:04 PM
to sqlal...@googlegroups.com


On 09/28/2016 12:45 PM, Seth P wrote:
>
> On Wednesday, September 28, 2016 at 10:16:20 AM UTC-4, Mike Bayer wrote:
>
>
> So illustrating VARRAY round trip on cx_oracle is the first step.
>
>
> It looks like cx_Oracle supports reading varrays, but supports writing
> them only as column literals, not as bound parameters. The following
> code tests a bunch of plausible formats for bound parameters, none of
> which work. This is consistent with
> https://sourceforge.net/p/cx-oracle/mailman/message/26769899/ .

looks incredibly difficult. I'm not really about to have the resources
to work with a type that awkward anytime soon, unfortunately. If it
could be made to be a drop-in for 1.1's ARRAY feature, that would be
helpful but it at least needs bound parameter support to be solid.
cx_Oracle should have some hooks to help with this.

Seth P

Sep 28, 2016, 6:48:27 PM
to sqlalchemy
On Wednesday, September 28, 2016 at 5:43:04 PM UTC-4, Mike Bayer wrote:
looks incredibly difficult.   I'm not really about to have the resources
to work with a type that awkward anytime soon, unfortunately.   If it
could be made to be a drop-in for 1.1's ARRAY feature, that would be
helpful but it at least needs bound parameter support to be solid.

Would it be possible to add read-only support? It looks like cx_Oracle returns selected varray values in a pretty straightforward form.
That would still be very useful (at least in my case, where I would be populating the database using SQL*Loader anyway).

Mike Bayer

Sep 29, 2016, 9:45:24 AM
to sqlal...@googlegroups.com
you can add your own types to do these things also, especially
read-only, just make any subclass of UserDefinedType and apply whatever
result-row handling is needed for how cx_Oracle is returning the data.

The hard part about types is the elaborate expression support (e.g. like
JSON foo ->> bar vs. foo -> bar in PG for example). Reading and
writing a value is not that hard and especially if the type is just
specific to what you need right now, you don't have the burden of making
sure your type works for all versions / flags / settings of Oracle /
cx_Oracle etc.

Seth P

Sep 30, 2016, 4:32:53 PM
to sqlalchemy


On Thursday, September 29, 2016 at 9:45:24 AM UTC-4, Mike Bayer wrote:
you can add your own types to do these things also, especially
read-only, just make any subclass of UserDefinedType and apply whatever
result-row handling is needed for how cx_Oracle is returning the data.

The hard part about types is the elaborate expression support (e.g. like
JSON foo ->> bar vs. foo -> bar in PG for example).   Reading and
writing a value is not that hard and especially if the type is just
specific to what you need right now, you don't have the burden of making
sure your type works for all versions / flags / settings of Oracle /
cx_Oracle etc.


The following seems to work fine for my purposes. (It uses alchy, https://github.com/dgilland/alchy).

Is there a way to specify that any column of type VARRAY should always be bound literally, so that one doesn't need to specify explicitly .compile(compile_kwargs={"literal_binds": True}) (and thereby literally binding all fields)?

Also, is there a way, inside VARRAY.__init__() or some other place that is called before table creation to specify the sa.event.listen(<table>, "before_create", self.create_ddl().execute_if(dialect='oracle'))?


import six
import sqlalchemy as sa


class VARRAY(sa.types.UserDefinedType):

    def __init__(self, type_name, size_limit, item_type, nullable=True, as_tuple=False):
        super(VARRAY, self).__init__()
        self.type_name = type_name
        self.size_limit = size_limit
        self.item_type = item_type
        self.nullable = nullable
        self.as_tuple = as_tuple

    def compile(self, dialect=None):
        return self.type_name

    def get_col_spec(self, **kw):
        return self.type_name

    def create_ddl(self, dialect=None, or_replace=True):
        sql = "CREATE "
        if or_replace:
            sql += "OR REPLACE "
        sql += "TYPE %(schema)s.{} AS VARRAY({}) OF {}".format(
            self.type_name, self.size_limit,
            self.item_type.compile(dialect=dialect))
        if not self.nullable:
            sql += " NOT NULL"
        return sa.DDL(sql)

    def process_literal_param(self, value, dialect):
        return "{}({})".format(self.type_name,
                               ','.join("NULL" if x is None else
                                        ("'%s'" % x) if isinstance(x, six.string_types) else str(x)
                                        for x in value))

    def literal_processor(self, dialect):
        def processor(value):
            return self.process_literal_param(value, dialect)
        return processor

    def process_result_value(self, value, dialect):
        if self.as_tuple:
            value = tuple(value)
        return value

    def result_processor(self, dialect, coltype):
        def processor(value):
            return self.process_result_value(value, dialect)
        return processor

    def copy(self):
        return VARRAY(self.type_name, self.size_limit, self.item_type,
                      nullable=self.nullable, as_tuple=self.as_tuple)




if __name__ == '__main__':

    uri = "oracle://user:password@host"

    import alchy
    import sqlalchemy.dialects.oracle as oc

    db = alchy.Manager(config={'SQLALCHEMY_DATABASE_URI': uri})

    class TestVarray(db.Model):
        __tablename__ = 'test_varray'
        __table_args__ = {'schema': 'myschema'}
        idx = sa.Column(sa.Integer, primary_key=True)
        label = sa.Column(sa.String(20), nullable=False)
        words = sa.Column(VARRAY("tp_test_varray_words", 3000, sa.String(8), nullable=True), nullable=False)
        numbers = sa.Column(VARRAY("tp_test_varray_numbers", 3000, oc.NUMBER(), nullable=True), nullable=False)

    sa.event.listen(TestVarray.__table__, "before_create",
                    TestVarray.column_attrs()['words'].columns[0].type.create_ddl().execute_if(dialect='oracle'))

    sa.event.listen(TestVarray.__table__, "before_create",
                    TestVarray.column_attrs()['numbers'].columns[0].type.create_ddl().execute_if(dialect='oracle'))

    db.drop_all()
    db.create_all()

    db.engine.execute(TestVarray.__table__.insert({'idx': 1,
                                                   'label': 'One',
                                                   'words': ['Once', 'upon', 'a', 'time'],
                                                   'numbers': [1.1, 1.2]}).
                      compile(compile_kwargs={"literal_binds": True}))
    db.engine.execute(TestVarray.__table__.insert({'idx': 2,
                                                   'label': 'Two',
                                                   'words': ['To', 'be', 'or', 'not'],
                                                   'numbers': [2.1, 2.2]}).
                      compile(compile_kwargs={"literal_binds": True}))

    print(TestVarray.query.all())
    print(db.session().query(TestVarray.label, TestVarray.words, TestVarray.numbers).all())


[<TestVarray(idx=1, label='One', words=['Once', 'upon', 'a', 'time'], numbers=[1.1, 1.2])>,
<TestVarray(idx=2, label='Two', words=['To', 'be', 'or', 'not'], numbers=[2.1, 2.2])>]
[('One', ['Once', 'upon', 'a', 'time'], [1.1, 1.2]),
('Two', ['To', 'be', 'or', 'not'], [2.1, 2.2])]
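The string-building in `process_literal_param` can be exercised on its own, without Oracle or SQLAlchemy in the picture; a minimal standalone sketch of the same formatting rules (the function name here is illustrative, not part of the code above):

```python
def render_varray_literal(type_name, values):
    """Build an Oracle VARRAY constructor literal the way
    process_literal_param does: NULL for None, quoted strings,
    bare numbers, joined by commas inside TYPE_NAME(...)."""
    def render(x):
        if x is None:
            return "NULL"
        if isinstance(x, str):
            return "'%s'" % x
        return str(x)
    return "{}({})".format(type_name, ','.join(render(x) for x in values))


print(render_varray_literal("tp_test_varray_words", ["Once", None, 3]))
# tp_test_varray_words('Once',NULL,3)
```

Note this sketch (like the original) does not escape embedded quotes, so a value such as `O'Brien` would produce broken SQL.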


Mike Bayer

unread,
Sep 30, 2016, 7:09:09 PM9/30/16
to sqlal...@googlegroups.com


On 09/30/2016 04:32 PM, Seth P wrote:
>
>
> On Thursday, September 29, 2016 at 9:45:24 AM UTC-4, Mike Bayer wrote:
>
> you can add your own types to do these things also, especially
> read-only, just make any subclass of UserDefinedType and apply whatever
> result-row handling is needed for how cx_Oracle is returning the data.
>
> The hard part about types is the elaborate expression support (e.g.
> like
> JSON foo ->> bar vs. foo -> bar in PG for example). Reading and
> writing a value is not that hard and especially if the type is just
> specific to what you need right now, you don't have the burden of
> making
> sure your type works for all versions / flags / settings of Oracle /
> cx_Oracle etc.
>
>
>
> The following seems to work fine for my purposes. (It uses alchy,
> https://github.com/dgilland/alchy).
>
> Is there a way to specify that any column of type VARRAY should always
> be bound literally, so that one doesn't need to specify explicitly
> .compile(compile_kwargs={"literal_binds": True}) (and thereby literally
> binding all fields)?

ha see you *are* doing the write side.

the bind_expression() hook is here to allow you to re-render the
expression. assuming value-bound bindparam() objects (e.g. not like
you'd get with an INSERT or UPDATE usually), the value should be present
and you can do this (had to work up a POC):

from sqlalchemy import *
from sqlalchemy.types import UserDefinedType


class T(UserDefinedType):

    def bind_expression(self, colexpr):
        return literal_column(colexpr.value)  # or whatever is needed here


t = table('t', column('x', T()))

print t.select().where(t.c.x == 'hi')




>
> Also, is there a way, inside VARRAY.__init__() or some other place that
> is called before table creation to specify the sa.event.listen(<table>,
> "before_create", self.create_ddl().execute_if(dialect='oracle'))?


look into adding SchemaType as a mixin, it signals to the owning Column
that it should receive events. You can then add the events to your
type itself like before_parent_attach which should fire for the Column.

spad...@gmail.com

unread,
Oct 1, 2016, 12:53:06 PM10/1/16
to sqlal...@googlegroups.com
I couldn't help but try to add the write side, even though for now I have no need for it.

I'll look at your suggestions below. Thanks.

Seth P

unread,
Oct 3, 2016, 5:21:34 PM10/3/16
to sqlalchemy




OK, I got it working pretty much as desired.

Adding

    def bind_expression(self, bindvalue):
        return sa.literal_column(self.process_literal_param(bindvalue.value, None), self)

makes insert and update statements work.

I also got the drop/create business working automatically by copying code from the Postgresql ENUM implementation, though it seems like an excessive amount of boilerplate. To keep the code as-is I had to monkey-patch OracleDialect to add a has_type() method -- any chance you'd want to add that for 1.1?

import six
import sqlalchemy as sa


# Monkey-patch OracleDialect to have a has_type() method
from sqlalchemy.dialects.oracle.base import OracleDialect

def has_type(self, connection, type_name, schema=None):
    if not schema:
        schema = self.default_schema_name
    cursor = connection.execute(
        sa.sql.text("SELECT type_name FROM all_types "
                    "WHERE type_name = :name AND owner = :schema_name"),
        name=self.denormalize_name(type_name),
        schema_name=self.denormalize_name(schema))
    return cursor.first() is not None

OracleDialect.has_type = has_type


class VARRAY(sa.types.UserDefinedType, sa.types.SchemaType):

    def __init__(self, name, size_limit, item_type, nullable=True, as_tuple=False,
                 inherit_schema=True, create_type=True, **kw):
        sa.types.UserDefinedType.__init__(self)
        sa.types.SchemaType.__init__(self, name=name, inherit_schema=inherit_schema, **kw)

        self.size_limit = size_limit
        self.item_type = item_type
        self.nullable = nullable
        self.as_tuple = as_tuple
        self.create_type = create_type

    def get_col_spec(self):
        return (self.schema + '.' + self.name) if self.schema else self.name

    def compile(self, dialect=None):
        return (self.schema + '.' + self.name) if self.schema else self.name

    def create(self, bind=None, checkfirst=False):
        if not checkfirst or \
                not bind.dialect.has_type(bind, self.name, schema=self.schema):
            sql = "CREATE TYPE {} AS VARRAY({}) OF {}".format(
                self.compile(dialect=bind.dialect),
                self.size_limit,
                self.item_type.compile(dialect=bind.dialect))
            if not self.nullable:
                sql += " NOT NULL"
            bind.execute(sql)

    def drop(self, bind=None, checkfirst=False):
        if not checkfirst or \
                bind.dialect.has_type(bind, self.name, schema=self.schema):
            bind.execute("DROP TYPE " + self.compile(dialect=bind.dialect))

    def _check_for_name_in_memos(self, checkfirst, kw):
        """Look in the 'ddl runner' for 'memos', then
        note our name in that collection.

        This is to ensure a particular named type is operated
        upon only once within any kind of create/drop
        sequence without relying upon "checkfirst".
        """
        if not self.create_type:
            return True
        if '_ddl_runner' in kw:
            ddl_runner = kw['_ddl_runner']
            if '_oc_varrays' in ddl_runner.memo:
                oc_varrays = ddl_runner.memo['_oc_varrays']
            else:
                oc_varrays = ddl_runner.memo['_oc_varrays'] = set()
            present = self.name in oc_varrays
            oc_varrays.add(self.name)
            return present
        else:
            return False

    def _on_table_create(self, target, bind, checkfirst, **kw):
        if checkfirst or (
                not self.metadata and
                not kw.get('_is_metadata_operation', False)) and \
                not self._check_for_name_in_memos(checkfirst, kw):
            self.create(bind=bind, checkfirst=checkfirst)

    def _on_table_drop(self, target, bind, checkfirst, **kw):
        if not self.metadata and \
                not kw.get('_is_metadata_operation', False) and \
                not self._check_for_name_in_memos(checkfirst, kw):
            self.drop(bind=bind, checkfirst=checkfirst)

    def _on_metadata_create(self, target, bind, checkfirst, **kw):
        if not self._check_for_name_in_memos(checkfirst, kw):
            self.create(bind=bind, checkfirst=checkfirst)

    def _on_metadata_drop(self, target, bind, checkfirst, **kw):
        if not self._check_for_name_in_memos(checkfirst, kw):
            self.drop(bind=bind, checkfirst=checkfirst)

    def process_literal_param(self, value, dialect):
        return "{}({})".format(
            self.compile(dialect=dialect),
            ','.join("NULL" if x is None else
                     ("'%s'" % x) if isinstance(x, six.string_types) else str(x)
                     for x in value))

    def literal_processor(self, dialect):
        def processor(value):
            return self.process_literal_param(value, dialect)
        return processor

    def bind_expression(self, bindvalue):
        return sa.literal_column(self.process_literal_param(bindvalue.value, None), self)

    def process_result_value(self, value, dialect):
        if self.as_tuple:
            value = tuple(value)
        return value

    def result_processor(self, dialect, coltype):
        def processor(value):
            return self.process_result_value(value, dialect)
        return processor

    def adapt(self, impltype, **kw):
        return sa.types.SchemaType.adapt(self, impltype,
                                         size_limit=self.size_limit,
                                         item_type=self.item_type,
                                         nullable=self.nullable,
                                         as_tuple=self.as_tuple,
                                         **kw)



if __name__ == '__main__':

    uri = "oracle://user:password@host"

    import alchy
    import sqlalchemy.dialects.oracle as oc

    db = alchy.Manager(config={'SQLALCHEMY_DATABASE_URI': uri, 'SQLALCHEMY_ECHO': True})

    class TestVarray(db.Model):
        __tablename__ = 'test_varray'
        __table_args__ = {'schema': 'barra'}

        idx = sa.Column(sa.Integer, primary_key=True)
        label = sa.Column(sa.String(20), nullable=False)
        words = sa.Column(VARRAY("tp_test_varray_words", 3000, sa.String(8),
                                 nullable=True, schema='barra'), nullable=False)
        numbers = sa.Column(VARRAY("tp_test_varray_numbers", 3000, oc.NUMBER(),
                                   nullable=True, inherit_schema=True), nullable=False)

    db.drop_all()
    db.create_all()

    db.engine.execute(TestVarray.__table__.insert({'idx': 1,
                                                   'label': 'One',
                                                   'words': ['Once', 'upon', 'a', 'time'],
                                                   'numbers': [1.1, 1.2]}).
                      compile(compile_kwargs={"literal_binds": True}))
    db.engine.execute(TestVarray.__table__.insert({'idx': 2,
                                                   'label': 'Two',
                                                   'words': ['To', 'be', 'or', 'not'],
                                                   'numbers': [2.1, 2.2]}))
    db.engine.execute(TestVarray.__table__.update().
                      where(TestVarray.__table__.c.idx == 1).
                      values(numbers=[1.1111, 1.2222]))

    print TestVarray.query.all()
    print db.session().query(TestVarray.label, TestVarray.words, TestVarray.numbers).all()

[<TestVarray(idx=1, label='One', words=['Once', 'upon', 'a', 'time'], numbers=[1.1111, 1.2222])>,
 <TestVarray(idx=2, label='Two', words=['To', 'be', 'or', 'not'], numbers=[2.1, 2.2])>]
[('One', ['Once', 'upon', 'a', 'time'], [1.1111, 1.2222]),
 ('Two', ['To', 'be', 'or', 'not'], [2.1, 2.2])]
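The memo bookkeeping in `_check_for_name_in_memos` above reduces to a set kept in the DDL runner's memo for the duration of one create/drop run. A standalone sketch of just that dedup logic (function and key names are illustrative, not SQLAlchemy API):

```python
def already_handled(memo, name, key='_oc_varrays'):
    """Record `name` in memo[key]; return True if it was already
    there, i.e. this named type was handled earlier in the same
    create/drop run and the DDL should be skipped."""
    seen = memo.setdefault(key, set())
    present = name in seen
    seen.add(name)
    return present


memo = {}
print(already_handled(memo, 'tp_test_varray_words'))  # False -> emit CREATE TYPE
print(already_handled(memo, 'tp_test_varray_words'))  # True  -> skip the duplicate
```

This is why two columns sharing one type name only produce a single CREATE TYPE per run, independently of `checkfirst`.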

Mike Bayer

unread,
Oct 3, 2016, 5:54:36 PM10/3/16
to sqlal...@googlegroups.com
I can see that a full implementation in SQLA would benefit from an
OracleDialect.has_type() method, but in the case here of an external
implementation, I don't see why has_type() has to be on the OracleDialect.
You've got the bind right there and you're calling other SQL on it
inline within your type.
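In other words, the ALL_TYPES probe can live on the type itself, using the bind it is already handed, with no dialect patching. A sketch of that shape, with a stub standing in for a real connection (everything here is illustrative, not SQLAlchemy API):

```python
# Query Seth's patch runs against Oracle's ALL_TYPES catalog view.
HAS_TYPE_SQL = ("SELECT type_name FROM all_types "
                "WHERE type_name = :name AND owner = :schema_name")


class StubConnection(object):
    """Stands in for a SQLAlchemy connection; 'executes' the catalog
    query against a fixed set of known type names."""
    def __init__(self, known_types):
        self.known = known_types

    def execute(self, sql, **params):
        return [(params['name'],)] if params['name'] in self.known else []


def type_exists(connection, type_name, schema):
    # Oracle stores unquoted identifiers upper-case, hence the
    # denormalization before comparing.
    rows = connection.execute(HAS_TYPE_SQL,
                              name=type_name.upper(),
                              schema_name=schema.upper())
    return bool(rows)


conn = StubConnection({'TP_TEST_VARRAY_WORDS'})
print(type_exists(conn, 'tp_test_varray_words', 'barra'))  # True
print(type_exists(conn, 'tp_other', 'barra'))              # False
```

With this shape, `create(bind, checkfirst=True)` can call `type_exists(bind, self.name, self.schema)` directly instead of going through a patched `bind.dialect.has_type()`.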

Seth P

unread,
Oct 3, 2016, 6:54:39 PM10/3/16
to sqlal...@googlegroups.com
You're right, of course. Adding has_type() to OracleDialect is more a matter of taste than a necessity.

Thanks again for all your help. I'm actually amazed at how well and transparently it works, given the cx_Oracle limitations.

