postgresql large objects


Jason Newton

Jun 25, 2014, 2:35:44 AM
to sqlal...@googlegroups.com
Hi,

I'm wondering how I might get at postgresql's large object type (lob).  It's not to be confused with the TOASTED bytea, which is currently limited to 1 GiB yet in practice tops out much lower (for me <400 MiB) - it's a special table + API designed to handle very large binary objects, from a few hundred MiBs to, more recently, TiBs. I don't see appropriate definitions anywhere and can't find any mention of it with sqlalchemy. psycopg2 has support for it and calls it lobject; it provides a file-like interface to the lob, which is a good mapping, since with libpq you use lo_creat, lo_seek, lo_write, and lo_read to work with these beasts.

I took a look at UserDefinedType, but its bind_processor doesn't distinguish between inserts and updates.  With inserts, you'd use an oid allocated from lo_creat in the transaction.  On updates, you'd use lo_trunc/lo_write.  As one more constraint, you must be in a transaction before any of these functions are usable.  To reference large objects, as they are explicitly out-of-table storage, the postgresql-specific oid type is used (which allows garbage collection, referential integrity checks, etc.).
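
For concreteness, here is roughly what psycopg2's lobject API looks like in isolation - a minimal sketch, assuming a throwaway database and placeholder connection parameters:

    import psycopg2

    conn = psycopg2.connect("dbname=test")
    with conn:  # the lobject calls below must run inside a transaction
        # create a new large object; oid=0 asks the server to assign one
        lob = conn.lobject(0, mode="wb")
        oid = lob.oid
        lob.write(b"\x00" * (1 << 20))  # write in as many chunks as needed
        lob.close()

        # reopen later by oid and read it back, seeking like a file
        lob = conn.lobject(oid, mode="rb")
        lob.seek(0)
        data = lob.read()
        lob.close()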

I'll also mention that other tables reference these large objects via oids, something like smart pointers in postgres.

It'd be great to plug large objects into sqlalchemy properly - but can it be done?

-Jason

Mike Bayer

Jun 25, 2014, 9:46:39 AM
to sqlal...@googlegroups.com
well we just added the OID type in 0.9.5, so you at least have that.

The PG LOB feature is very sparsely documented - on PG's docs, they only seem to document the C API (www.postgresql.org/docs/9.2/static/lo-interfaces.html), feel free to point me to better documentation on this.   

As far as transparently embedding this into INSERT/UPDATE, my first thought is that this might be unwise considering how far removed these functions are from any kind of mainstream usage in such statements - particularly if separate statements have to be called per value to get at OIDs or similar.    That PG's docs barely mention this whole feature much less any kind of regular SQL integration is a red flag.   PG's BYTEA type is already arbitrarily large so there is probably not much interest in a type like this.    If it's the "streaming" feature you're looking for, SQLA's usual approach such as that of Oracle's LOB is to "pre-stream" it on the result set side (necessary, because fetching a batch of rows requires it), and cx_Oracle doesn't provide too much option to stream on the write side.   I've dealt a lot with "streaming" datatypes back in the day but sometime in the early 2000's everyone just stopped using them.

As the feature involves SQL functions I don't think you'd be dealing only with bind_processor(), the SQL functions themselves would probably be via SQL-level processing, see http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.    It's true these functions aren't given access to the compiler context where you'd be able to discern INSERT from UPDATE, so I'd probably use two variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply these to an INSERT or UPDATE statement at runtime probably using a @compiles decorator - http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take the existing LOB type and just swap it out for InsertLOB/UpdateLOB.    Adding compiler context to TypeEngine is something we could look into as a feature as well so that bind_expression() has this available somehow and switching around types wouldn't be needed.
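
A very rough sketch of that type-swapping idea, with hypothetical names and no claim of completeness - the LOB type is what users would declare, and the @compiles hook is where the swap to the Insert/Update variants would have to happen:

    from sqlalchemy import func, types
    from sqlalchemy.ext.compiler import compiles
    from sqlalchemy.sql.expression import Insert

    class LOB(types.UserDefinedType):
        """The single type users would actually put on their tables."""
        def get_col_spec(self):
            return "OID"

    class InsertLOB(LOB):
        def bind_expression(self, bindvalue):
            # on INSERT, ignore the bound value and allocate a fresh
            # large object server-side
            return func.lo_creat(-1)

    class UpdateLOB(LOB):
        def bind_expression(self, bindvalue):
            # on UPDATE, the existing oid passes through unchanged;
            # the data itself is written separately
            return bindvalue

    @compiles(Insert)
    def _visit_insert(stmt, compiler, **kw):
        # here one would swap LOB out for InsertLOB on the statement
        # before delegating to the default compilation
        return compiler.visit_insert(stmt, **kw)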

building this would involve stringing together hooks that are not quite set up for this, however for me to really assist here I'd need to see exact examples of what INSERT, UPDATE and anything else looks like in conjunction with these functions.

I would note that psycopg2 itself also provides for extension types, including custom Connection and Cursor subclasses.   If a lot of things have to happen when these types are in play it might be something that can occur at that level, PG's type API is obviously a lot more PG specific.





Jason Newton

Jun 25, 2014, 2:53:52 PM
to sqlal...@googlegroups.com
Hi,

I've replied inline below.


On Wed, Jun 25, 2014 at 6:46 AM, Mike Bayer <mik...@zzzcomputing.com> wrote:

well we just added the OID type in 0.9.5, so you at least have that.

I came across the entry on the issue tracker a little bit after submitting.  As usual for me, its support was only added recently.


The PG LOB feature is very sparsely documented - on PG's docs, they only seem to document the C API (www.postgresql.org/docs/9.2/static/lo-interfaces.html), feel free to point me to better documentation on this.

There are also server-side functions; they operate almost exactly the same as the client-side API: http://www.postgresql.org/docs/9.3/static/lo-funcs.html . There is no better documentation than those two pages that I know of, but they were sufficient for me.


As far as transparently embedding this into INSERT/UPDATE, my first thought is that this might be unwise considering how far removed these functions are from any kind of mainstream usage in such statements - particularly if separate statements have to be called per value to get at OIDs or similar.    That PG's docs barely mention this whole feature much less any kind of regular SQL integration is a red flag.   PG's BYTEA type is already arbitrarily large so there is probably not much interest in a type like this.    If it's the "streaming" feature you're looking for, SQLA's usual approach such as that of Oracle's LOB is to "pre-stream" it on the result set side (necessary, because fetching a batch of rows requires it), and cx_Oracle doesn't provide too much option to stream on the write side.   I've dealt a lot with "streaming" datatypes back in the day but sometime in the early 2000's everyone just stopped using them.

Seems to fly in the face of the point of SQLA, although the integration difficulties are appreciated.  Most advanced postgresql drivers in any language's bindings have added support for this type, although none of them can hide that it's file-like.  PG's BYTEA is NOT arbitrarily large; it has quite real practical limits, and I've hit them regularly in storing compressed HDF5 documents in the database as part of a bench testing framework.  The theoretical limit is 1GB but this limit is far less in practice (http://www.postgresql.org/message-id/CAFj8pRAcfKoiNp2uXeiZOd5k...@mail.gmail.com , http://www.postgresql.org/docs/9.3/static/storage-toast.html ).  I'm not looking for streaming, just retrieving/setting binary buffers.  It'd be nice to translate those transparently to HDF5 python in-memory objects (i.e. h5py).
 

As the feature involves SQL functions I don't think you'd be dealing only with bind_processor(), the SQL functions themselves would probably be via SQL-level processing, see http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.    It's true these functions aren't given access to the compiler context where you'd be able to discern INSERT from UPDATE, so I'd probably use two variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply these to an INSERT or UPDATE statement at runtime probably using a @compiles decorator - http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take the existing LOB type and just swap it out for InsertLOB/UpdateLOB.    Adding compiler context to TypeEngine is something we could look into as a feature as well so that bind_expression() has this available somehow and switching around types wouldn't be needed.

Separate datatypes is clearly not a good approach to this.


building this would involve stringing together hooks that are not quite set up for this, however for me to really assist here I'd need to see exact examples of what INSERT, UPDATE and anything else looks like in conjunction with these functions.

Everything starts with a transaction block.  Hand-crafted inserts use the server-side lo_create(-1) (in SQL), which allocates a new large object, with a RETURNING clause to get the oid back to the client side in one go.  Then I start using the lobject API on the returned oid.   In patches I've made to cppdb, in the bind of std::istream I call the client-side lo_create, lo_open, a sequence of lo_writes, and finally lo_close.  In retrospect this doesn't handle updates particularly well, and maybe this could be used in sqlalchemy as-is too - an update just allocates a new large object and unreferences the old one.  This would leave it up to vacuumlo to GC the "updated" lobs.  In my workload, however, my lobs so far are immutable - they are results of simulations, and this is the way I've worked to date.  It probably won't stay that way forever.
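
For reference, the hand-crafted pattern looks something like this - a sketch, with the documents table and payload_oid column made up for illustration:

    import psycopg2

    conn = psycopg2.connect("dbname=test")
    with conn:  # everything below happens in one transaction
        cur = conn.cursor()
        # allocate the large object server-side and record its oid in
        # the row, getting the oid back via RETURNING in one round trip
        # (lo_creat(-1) is the documented server-side spelling)
        cur.execute(
            "INSERT INTO documents (name, payload_oid) "
            "VALUES (%s, lo_creat(-1)) RETURNING payload_oid",
            ("run-0042",),
        )
        oid = cur.fetchone()[0]
        # then stream the payload in through the lobject API
        lob = conn.lobject(oid, mode="wb")
        lob.write(b"...compressed HDF5 bytes...")
        lob.close()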

Mike Bayer

Jun 25, 2014, 3:40:12 PM
to sqlal...@googlegroups.com

On 6/25/14, 2:53 PM, Jason Newton wrote:
Seems to fly in the face at the point of SQLA although integration difficulties are appreciated.  Most advanced postgresql drivers in any language bindings have added support for this type although none of them can hide that it's file like. 
SQLA doesn't go out of its way for unusual, very database-specific features that up til now have demonstrated little to no real-world use (nobody's ever asked for this feature before, and googling about Postgresql LOBs turns up very little).  There are tons of examples of features like this across many different database backends.  If they are easy to add, we add them, or if they are easy to address via a recipe, we add the recipe to the wiki.

But if the behavior requires substantial changes to the core and dialect, and the ratio of complexity of impact to sparseness of need is really high, it's not worth it and actually kind of damaging to most users to complicate the library for use cases that are extremely rare and can just as well be addressed by dropping down to raw DBAPI code.  Complications/destabilizations/performance degradations hoisted onto the whole userbase for the benefit of a single feature that is virtually never needed are the wrong choice to make; I'm presented with this choice all the time and there's nearly always work to be done in extricating ill-conceived features and behaviors that went in too quickly.  I'm pretty confident that this feature won't require any of that, but that remains to be seen.

Just in case this wasn't apparent, you certainly *can* use psycopg2's bindings when you're in an otherwise SQLAlchemy app.  Worst case you can retrieve a raw psycopg2 connection using connection.raw_connection and do whatever you need.  If you truly have some use for LOBs, SQLAlchemy isn't preventing you from using it; it's just not providing any niceties around it.   The fact that these unusual use cases are never prohibited by SQLA further raises the bar to adding first class support for them.



PG's BYTEA is NOT arbitrarily large, it has quite real practical limits and I've hit them regularly in storing compressed HDF5 documents in the database as part of a bench testing framework.  The theoretical limit is 1GB but this limit is far less in practice (http://www.postgresql.org/message-id/CAFj8pRAcfKoiNp2uXeiZOd5k...@mail.gmail.com ) . http://www.postgresql.org/docs/9.3/static/storage-toast.html  I'm not looking for streaming, retrieve/set binary buffers.  It'd be nice to translate it transparently to HDF5 python in-memory objects (ie h5py).
Python translation is very easy in SQLA; it's just that if you have special needs for SQL syntax, that's where special behaviors may be needed.  So far it sounds like the only blocking factor is that bind_sql needs to distinguish between INSERT and UPDATE.  That's not a terribly tall order, though it is inconvenient in that the API would need a backwards-compatibility layer.
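
For the Python-side half, a TypeDecorator is all it takes; e.g. a sketch that transparently compresses, standing in for an HDF5 (de)serializer:

    import zlib
    from sqlalchemy import types

    class CompressedBinary(types.TypeDecorator):
        impl = types.LargeBinary

        def process_bind_param(self, value, dialect):
            # Python object -> bytes on the way in
            return zlib.compress(value) if value is not None else None

        def process_result_value(self, value, dialect):
            # bytes -> Python object on the way out
            return zlib.decompress(value) if value is not None else None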



 

As the feature involves SQL functions I don't think you'd be dealing only with bind_processor(), the SQL functions themselves would probably be via SQL-level processing, see http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.    It's true these functions aren't given access to the compiler context where you'd be able to discern INSERT from UPDATE, so I'd probably use two variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply these to an INSERT or UPDATE statement at runtime probably using a @compiles decorator - http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take the existing LOB type and just swap it out for InsertLOB/UpdateLOB.    Adding compiler context to TypeEngine is something we could look into as a feature as well so that bind_expression() has this available somehow and switching around types wouldn't be needed.

Separate datatypes is clearly not a good approach to this.

Let me clarify that these separate datatypes would be totally invisible to the user.   The user would work with a single LOB type.  Translation to Insert/Update versions would not be explicit and would occur at the point at which the insert/update construct is compiled.   At the moment this may be the best approach short of modifying the library (but then again I can't say much about the approach because I have little idea what the SQL we're talking about looks like).

 
Everything starts with a transaction block. 
the psycopg2 DBAPI is implicitly in a transaction block at all times unless "autocommit" mode is set up, so there shouldn't be any problem here.


Hand crafted inserts use server side lo_create(-1) (in sql) which allocates a new large object and returning clause to get the oid in one go back to the client side.  Then I start using the lobject api on the returned oid. 
I'd like to see explicit SQL, preferably in the form of a psycopg2 script that illustrates all the operations you wish to support and specifically how they must interact with the database.



In patches I've made to cppdb, in the bind of std::istream I call client side lo_create, lo_open, a sequence of lo_writes, and finally lo_close.  In retrospect this doesn't handle updates particularly well and maybe this could be used in sqlalchemy as is too - an update just allocates a new large object and unreferences the old one. 
I wonder why the mechanics of these functions can't be more cleanly wrapped into server-side stored procedures?    If they can be distilled into simple insert_lob() / update_lob() functions, that would potentially be a better separation of duties.
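
A hypothetical wrapper along those lines, sketched via psycopg2 - untested, insert_lob is a made-up name, and 131072 is the INV_WRITE flag; note this only suits payloads that fit in memory, since the data arrives as a single bytea:

    import psycopg2

    DDL = """
    CREATE OR REPLACE FUNCTION insert_lob(data bytea) RETURNS oid AS $$
    DECLARE
        loid oid := lo_create(0);             -- 0: server picks the oid
        fd integer := lo_open(loid, 131072);  -- 131072 = INV_WRITE
    BEGIN
        PERFORM lowrite(fd, data);
        PERFORM lo_close(fd);
        RETURN loid;
    END;
    $$ LANGUAGE plpgsql;
    """

    conn = psycopg2.connect("dbname=test")
    with conn:
        cur = conn.cursor()
        cur.execute(DDL)
        # allocate-and-write in a single statement
        cur.execute("SELECT insert_lob(%s)", (psycopg2.Binary(b"payload"),))
        oid = cur.fetchone()[0]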


Jason Newton

Jun 25, 2014, 5:36:25 PM
to sqlal...@googlegroups.com
Just in case this wasn't apparent, you certainly *can* use psycopg2's bindings when you're in an otherwise SQLAlchemy app.  Worst case you can retrieve a raw psycopg2 connection using connection.raw_connection and do whatever you need.  If you truly have some use for LOBs, SQLAlchemy isn't preventing you from using it, it's just not providing any niceties around it.   The fact that these unusual use cases are not ever prohibited by SQLA further raises the bar to adding first class support for them.

Yes, this much I knew would work; it's just clunkier than it could be.

 


PG's BYTEA is NOT arbitrarily large, it has quite real practical limits and I've hit them regularly in storing compressed HDF5 documents in the database as part of a bench testing framework.  The theoretical limit is 1GB but this limit is far less in practice (http://www.postgresql.org/message-id/CAFj8pRAcfKoiNp2uXeiZOd5k...@mail.gmail.com ) . http://www.postgresql.org/docs/9.3/static/storage-toast.html  I'm not looking for streaming, retrieve/set binary buffers.  It'd be nice to translate it transparently to HDF5 python in-memory objects (ie h5py).
Python translation is very easy in SQLA, its just if you have special needs for SQL syntaxes, that's where special behaviors may be needed.  So far it sounds like the only blocking factor is that bind_sql needs to distinguish between INSERT and UPDATE.  that's not a terribly tall order though it is inconvenient in that the API would need a backwards-compatibility layer.

Perhaps you can keep backwards compatibility by introspecting the callbacks to determine the arity of their arguments.
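
For instance, a shim along these lines (a sketch; the helper name is hypothetical):

    import inspect

    def call_processor(processor, value, context):
        # pass the extra context argument only to callbacks that
        # declare it, leaving old single-argument processors untouched
        spec = inspect.getargspec(processor)
        n_args = len(spec.args) - (1 if inspect.ismethod(processor) else 0)
        if n_args >= 2 or spec.varargs:
            return processor(value, context)
        return processor(value)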
 



 

As the feature involves SQL functions I don't think you'd be dealing only with bind_processor(), the SQL functions themselves would probably be via SQL-level processing, see http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#applying-sql-level-bind-result-processing.    It's true these functions aren't given access to the compiler context where you'd be able to discern INSERT from UPDATE, so I'd probably use two variants of datatype, an InsertLOB and UpdateLOB datatype, then I'd apply these to an INSERT or UPDATE statement at runtime probably using a @compiles decorator - http://docs.sqlalchemy.org/en/rel_0_9/core/compiler.html, that is, take the existing LOB type and just swap it out for InsertLOB/UpdateLOB.    Adding compiler context to TypeEngine is something we could look into as a feature as well so that bind_expression() has this available somehow and switching around types wouldn't be needed.

Separate datatypes is clearly not a good approach to this.

Let me clarify that these separate datatypes would be totally invisible to the user.   The user would work with a single LOB type.  Translation to Insert/Update versions would not be explicit and would occur at the point at which the insert/update construct is compiled.   At the moment this may be the best approach short of modifying the library (but then again I can't say much about the approach because i have little idea what the SQL we're talking about looks like).

Ah.  I thought it would be done in the declarative table descriptions.   I'm not sure, being relatively new to SQLA, what that would look like.


 
Everything starts with a transaction block. 
the psycopg2 DBAPI is implicitly in a transaction block at all times unless "autocommit" mode is set up, so there shouldn't be any problem here.

AFAIK it won't work with autocommit, but I don't have documentation to prove that; I just remember getting errors pertaining to it, I think in JDBC.
 

Hand crafted inserts use server side lo_create(-1) (in sql) which allocates a new large object and returning clause to get the oid in one go back to the client side.  Then I start using the lobject api on the returned oid. 
I'd like to see explicit SQL, preferably in the form of a psycopg2 script that illustrates all the operations you wish to support and specifically how they must interact with the database.


I've attached example usages.  As I indicated in my prior email, right now I only do inserts/selects.
 

In patches I've made to cppdb, in the bind of std::istream I call client side lo_create, lo_open, a sequence of lo_writes, and finally lo_close.  In retrospect this doesn't handle updates particularly well and maybe this could be used in sqlalchemy as is too - an update just allocates a new large object and unreferences the old one. 
I wonder why the mechanics of these functions can't be more cleanly wrapped into server-side stored procedures?    If they can be distilled into simple insert_lob() update_lob() functions that would potentially be a better separation of duties.


I think it's because only "small" lobs could rely on that - huge lobs that exceed working memory fit better with a file-like API.  It would be a nice convenience to have, perhaps.  These same "small" lobs (i.e. lobs that fit in working memory - keeping in mind modern commodity servers easily get 64GB) are the kind that I would expect SQLA to deal with, although if it bound file-like objects and buffers, that would allow both cases to be covered.  I don't think I'll be writing code that uses those stored procedures, but they could be nice conveniences.  If SQLA managed to support these lobs well, however, I don't think users would use them directly.
 

Attachment: lob_examples.py

Mike Bayer

Jun 25, 2014, 6:30:49 PM
to sqlal...@googlegroups.com

On 6/25/14, 5:35 PM, Jason Newton wrote:

 

Hand crafted inserts use server side lo_create(-1) (in sql) which allocates a new large object and returning clause to get the oid in one go back to the client side.  Then I start using the lobject api on the returned oid. 
I'd like to see explicit SQL, preferably in the form of a psycopg2 script that illustrates all the operations you wish to support and specifically how they must interact with the database.


I've attached example usages.  As I indicated in my prior email, right now I only do inserts/selects.
Here's my immediate thought about the INSERT (and the UPDATE) - the first way is using events, the second would attempt to move this system into something more native to the psycopg2 dialect:

Use the before_cursor_execute() and after_cursor_execute() events to get at this (a rough sketch tying the numbered steps together follows the list):

http://docs.sqlalchemy.org/en/latest/core/events.html#sqlalchemy.events.ConnectionEvents.before_cursor_execute
http://docs.sqlalchemy.org/en/latest/core/events.html#sqlalchemy.events.ConnectionEvents.after_cursor_execute

1. Create a new type called PGLOB or whatever you want.  For starters, this can just be a blank UserDefinedType subclass.

in before_cursor_execute:

2. determine if INSERT or UPDATE using context.isinsert, context.isupdate

3. examine the datatypes that SQLAlchemy is using here, by looking at context.compiled.binds.  The values here are BindParameter objects, you want to look in those for the ones that are of type PGLOB.

4. From context.compiled.binds, you have the names of the bound params with that type.  Search the "statement" for occurrences of each such bound parameter and replace them with "lo_creat(-1)" or whatever you need there.

5. Also append to the RETURNING clause those cols you need.

6. the statement as a return value will be used, if you set up the event with retval=True (see the docs).

in after_cursor_execute:

7. in after_cursor_execute - call fetchone() to get the RETURNING values.  Get the OID you care about, then do the work with conn.lobject and all that.  Hopefully this doesn't mess up the existing cursor state.

8.  now the tricky part.  SQLAlchemy needs that row if you're doing "implicit returning" to get at primary key values.   psycopg2's cursor seems to have a scroll() method that works for client side cursors.   I'd scroll it back one so that SQLA gets the state it expects.
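
Roughly, in code - untested, the statement rewriting is naive string replacement, and the RETURNING bookkeeping (steps 5 and 7) is only stubbed out:

    from sqlalchemy import create_engine, event, types

    class PGLOB(types.UserDefinedType):
        """Step 1: a blank marker type."""
        def get_col_spec(self):
            return "OID"

    engine = create_engine("postgresql+psycopg2://localhost/test")

    @event.listens_for(engine, "before_cursor_execute", retval=True)
    def _rewrite_lob_binds(conn, cursor, statement, parameters,
                           context, executemany):
        if context.isinsert or context.isupdate:          # step 2
            for name, bind in context.compiled.binds.items():
                if isinstance(bind.type, PGLOB):          # step 3
                    # step 4: swap the pyformat placeholder for a
                    # server-side allocation
                    statement = statement.replace(
                        "%%(%s)s" % name, "lo_creat(-1)")
                    # step 5: append the column to RETURNING here
        return statement, parameters                      # step 6

    @event.listens_for(engine, "after_cursor_execute")
    def _write_lob_payloads(conn, cursor, statement, parameters,
                            context, executemany):
        if not (context.isinsert or context.isupdate):
            return
        if not any(isinstance(b.type, PGLOB)
                   for b in context.compiled.binds.values()):
            return
        row = cursor.fetchone()    # step 7: grab the RETURNING row
        # ... pull the oid out of `row`, open it with
        # conn.connection.lobject(oid, "wb"), write the payload ...
        cursor.scroll(-1)          # step 8: rewind so SQLA sees the row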

Alternative system, more plugged in:

1. We would be creating new features on sqlalchemy/dialects/postgresql/psycopg2.py -> PGExecutionContext_psycopg2.  Similar hooks are available here which you can use to accomplish similar tasks;  you'd want to look at the pre_exec(), post_exec() and possibly post_insert() methods, and maybe even _fetch_implicit_returning().    If SQLA were to support this more natively, things would be happening at this level.   But again, I really wouldn't want all kinds of hardcoded expectations of LOB objects in this object taking up processing time for the vast majority of use cases that don't use LOBs, so the extra logic here should be contained within something that can easily be placed aside based on options or similar.
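
That is, a stub only - nothing here is implemented, but it shows where such logic would live:

    from sqlalchemy.dialects.postgresql.psycopg2 import (
        PGDialect_psycopg2, PGExecutionContext_psycopg2)

    class LOBExecutionContext(PGExecutionContext_psycopg2):
        def pre_exec(self):
            # statement rewriting for LOB binds would happen here
            super(LOBExecutionContext, self).pre_exec()

        def post_exec(self):
            # oid retrieval + lobject writes would happen here, behind
            # a cheap guard so statements without LOBs pay nothing
            super(LOBExecutionContext, self).post_exec()

    class LOBDialect(PGDialect_psycopg2):
        execution_ctx_cls = LOBExecutionContext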




Jonathan Vanasco

Jul 8, 2014, 7:27:27 PM
to sqlal...@googlegroups.com
Working on this level of integration is way over my head, but I'd love to see this happen and wanted to quickly +1. 

I just had to refactor some code to store User Submitted Media into Postgres for background processing.  While doing a bit of research, I learned that the streaming functionality is rather important even on "small" files.  Postgres has to materialize the entire BYTEA type in memory when selecting; when inserting, a 2MB binary file can account for an 8MB encoded string.  According to the pg-users list, this can often max out memory allocations and/or seriously degrade performance.