--
-Jason
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
Well, we just added the OID type in 0.9.5, so you at least have that.
The PG large object feature is very sparsely documented - in PG's docs, they only seem to document the C API (www.postgresql.org/docs/9.2/static/lo-interfaces.html); feel free to point me to better documentation on this.
As far as transparently embedding this into INSERT/UPDATE, my first thought is that this might be unwise, considering how far removed these functions are from any kind of mainstream usage in such statements - particularly if separate statements have to be called per value to get at OIDs or similar. That PG's docs barely mention this whole feature, much less any kind of regular SQL integration, is a red flag. PG's BYTEA type is already arbitrarily large, so there is probably not much interest in a type like this. If it's the "streaming" feature you're looking for, SQLA's usual approach, such as with Oracle's LOB, is to "pre-stream" it on the result-set side (necessary, because fetching a batch of rows requires it), and cx_Oracle doesn't provide much option to stream on the write side. I dealt a lot with "streaming" datatypes back in the day, but sometime in the early 2000s everyone just stopped using them.
As the feature involves SQL functions, I don't think you'd be dealing only with bind_processor(); the SQL functions themselves would probably be applied via SQL-level processing, see http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing. It's true these functions aren't given access to the compiler context, where you'd be able to discern INSERT from UPDATE, so I'd probably use two variants of the datatype, an InsertLOB and an UpdateLOB, then apply these to an INSERT or UPDATE statement at runtime, probably using a @compiles decorator (http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html); that is, take the existing LOB type and just swap it out for InsertLOB/UpdateLOB. Adding compiler context to TypeEngine is something we could look into as a feature as well, so that bind_expression() has this available somehow and switching types around wouldn't be needed.
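To make the SQL-level processing idea concrete, here is a minimal sketch (compile-only, no database needed) of a hypothetical InsertLOB type whose bind_expression() wraps the bound value in PG's server-side lo_from_bytea() function (available as of PG 9.4); the docs table and column names are made up for illustration:

```python
from sqlalchemy import Column, MetaData, Table, func
from sqlalchemy.dialects import postgresql
from sqlalchemy.types import UserDefinedType

class InsertLOB(UserDefinedType):
    """Hypothetical INSERT-flavored LOB type: the bound value is wrapped
    in lo_from_bytea() so the server allocates the large object."""

    def get_col_spec(self, **kw):
        return "OID"

    def bind_expression(self, bindvalue):
        # a first argument of 0 asks the server to pick a fresh OID
        return func.lo_from_bytea(0, bindvalue)

docs = Table("docs", MetaData(), Column("data", InsertLOB()))
stmt = docs.insert().values(data=b"payload")

# compiling against the PG dialect shows the parameter wrapped in
# lo_from_bytea(...) inside the VALUES clause
print(stmt.compile(dialect=postgresql.dialect()))
```

An UpdateLOB variant would differ only in what its bind_expression() emits (e.g. unlinking the old OID first), which is exactly why the two types need to be told apart at compile time.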
Building this would involve stringing together hooks that are not quite set up for this. However, for me to really assist here, I'd need to see exact examples of what INSERT, UPDATE, and anything else look like in conjunction with these functions.
That seems to fly in the face of the point of SQLA, although the integration difficulties are appreciated. Most advanced PostgreSQL drivers, across language bindings, have added support for this type, although none of them can hide that it's file-like.
PG's BYTEA is NOT arbitrarily large; it has quite real practical limits, and I've hit them regularly when storing compressed HDF5 documents in the database as part of a bench-testing framework. The theoretical limit is 1GB, but the practical limit is far lower (http://www.postgresql.org/message-id/CAFj8pRAcfKoiNp2uXeiZOd5k...@mail.gmail.com); see also http://www.postgresql.org/docs/9.3/static/storage-toast.html. I'm not looking for streaming, just retrieving and setting binary buffers. It'd be nice to translate them transparently to in-memory HDF5 Python objects (i.e. h5py).
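On the h5py point: a transparent translation layer would presumably hand the fetched buffer to h5py as an in-memory file (h5py.File accepts file-like objects as of h5py 2.9). A cheap, dependency-free sanity check on such a buffer is the fixed 8-byte HDF5 signature; to_memory_file below is a hypothetical helper sketching that step:

```python
import io

# The 8-byte magic signature at the start of every HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def to_memory_file(buf):
    """Wrap a fetched binary buffer for in-memory use, rejecting buffers
    that are not HDF5.  A real layer would then hand the result to
    h5py.File (file-like object support landed in h5py 2.9)."""
    if buf[:8] != HDF5_SIGNATURE:
        raise ValueError("buffer does not look like an HDF5 document")
    return io.BytesIO(buf)
```

This keeps the database-facing code dealing only in bytes, with the h5py wrapping applied at the edge.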
Separate datatypes are clearly not a good approach to this.
Everything starts with a transaction block.
Hand-crafted inserts use server-side lo_create(-1) (in SQL), which allocates a new large object, with a RETURNING clause to get the OID back to the client side in one go. Then I start using the lobject API on the returned OID.
In patches I've made to cppdb, in the bind of std::istream I call client-side lo_create, lo_open, a sequence of lo_writes, and finally lo_close. In retrospect this doesn't handle updates particularly well; maybe it could be used as-is in SQLAlchemy too, where an update just allocates a new large object and unreferences the old one.
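A rough sketch of that insert flow on the Python side, assuming psycopg2 and an already-open connection; insert_lob and the docs table/columns are hypothetical names, and the SQL mirrors the lo_create(-1) + RETURNING pattern described above:

```python
# Allocate a new large object server-side and hand its OID back via
# RETURNING, in the same statement that inserts the row.
INSERT_SQL = "INSERT INTO docs (name, data) VALUES (%s, lo_create(-1)) RETURNING data"

def insert_lob(conn, name, payload):
    """Insert a row, allocate a large object, and write `payload` into it
    through psycopg2's lobject API.  Must run inside a transaction."""
    cur = conn.cursor()
    try:
        cur.execute(INSERT_SQL, (name,))
        oid = cur.fetchone()[0]
    finally:
        cur.close()
    lob = conn.lobject(oid, mode="wb")  # open the large object for writing
    try:
        lob.write(payload)
    finally:
        lob.close()
    return oid
```

The caller is responsible for conn.commit(); large-object handles are only valid inside the transaction that opened them.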
Just in case this wasn't apparent, you certainly *can* use psycopg2's bindings within an otherwise SQLAlchemy app. Worst case, you can retrieve a raw psycopg2 connection using connection.raw_connection and do whatever you need. If you truly have some use for LOBs, SQLAlchemy isn't preventing you from using them; it's just not providing any niceties around them. The fact that these unusual use cases are never actually prohibited by SQLA further raises the bar for adding first-class support for them.
Python translation is very easy in SQLA; it's only when you have special needs for SQL syntaxes that special behaviors may be required. So far it sounds like the only blocking factor is that bind_sql needs to distinguish between INSERT and UPDATE. That's not a terribly tall order, though it is inconvenient in that the API would need a backwards-compatibility layer.
Let me clarify that these separate datatypes would be totally invisible to the user; the user would work with a single LOB type. Translation to the Insert/Update versions would not be explicit and would occur at the point at which the insert/update construct is compiled. At the moment this may be the best approach short of modifying the library (but then again, I can't say much about the approach because I have little idea what the SQL we're talking about looks like).
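Made concrete, the invisible swap might look like the following sketch (compile-only, assuming PG 9.4's lo_from_bytea(); LOB and InsertLOB are hypothetical names): a @compiles hook on Insert replaces the user-facing LOB type with the INSERT-flavored variant just before rendering.

```python
from sqlalchemy import Column, MetaData, Table, func
from sqlalchemy.dialects import postgresql
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import Insert
from sqlalchemy.types import UserDefinedType

class LOB(UserDefinedType):
    """The single type the user declares on the column."""
    def get_col_spec(self, **kw):
        return "OID"

class InsertLOB(LOB):
    """INSERT-flavored variant, swapped in invisibly at compile time."""
    def bind_expression(self, bindvalue):
        return func.lo_from_bytea(0, bindvalue)

@compiles(Insert)
def _swap_lob_types(element, compiler, **kw):
    # swap plain LOB columns for the INSERT variant before rendering;
    # a real implementation would do this non-destructively per compile
    for col in element.table.columns:
        if type(col.type) is LOB:
            col.type = InsertLOB()
    return compiler.visit_insert(element, **kw)

docs = Table("docs", MetaData(), Column("data", LOB()))
stmt = docs.insert().values(data=b"payload")
print(stmt.compile(dialect=postgresql.dialect()))
```

A production version would need to avoid mutating the Table's column types in place and would add a matching hook for Update.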
The psycopg2 DBAPI is implicitly in a transaction block at all times unless "autocommit" mode is set up, so there shouldn't be any problem here.
I'd like to see explicit SQL, preferably in the form of a psycopg2 script that illustrates all the operations you wish to support and specifically how they must interact with the database.
I wonder why the mechanics of these functions can't be more cleanly wrapped into server-side stored procedures. If they can be distilled into simple insert_lob() / update_lob() functions, that would potentially be a better separation of duties.
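For illustration, an untested sketch of what such server-side wrappers might look like, assuming PG 9.4's lo_from_bytea() and the built-in lo_unlink(); the names insert_lob/update_lob are hypothetical:

```python
# Hypothetical DDL for the server-side wrappers suggested above, held as
# Python strings so they could be issued through any driver.
INSERT_LOB_FN = """\
CREATE OR REPLACE FUNCTION insert_lob(data bytea) RETURNS oid AS $$
    SELECT lo_from_bytea(0, data);  -- 0 = let the server pick the OID
$$ LANGUAGE sql;
"""

UPDATE_LOB_FN = """\
CREATE OR REPLACE FUNCTION update_lob(old_oid oid, data bytea) RETURNS oid AS $$
BEGIN
    PERFORM lo_unlink(old_oid);      -- drop the old large object
    RETURN lo_from_bytea(0, data);   -- allocate and fill a new one
END;
$$ LANGUAGE plpgsql;
"""
```

With these in place, client-side SQL reduces to ordinary INSERT ... insert_lob(%s) and UPDATE ... update_lob(old, %s) expressions, which is much closer to what SQLA's type hooks expect.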
I've attached example usages. As I indicated in my prior email, right now I only do inserts/selects.