PostgreSQL hstore custom type?

488 views
Skip to first unread message

Kyle Schaffrick

unread,
Apr 7, 2010, 12:38:32 AM4/7/10
to sqlal...@googlegroups.com
Greetings,

I'm looking into using PostgreSQL's hstore type in a SQLAlchemy
project, and before I possibly reinvent the wheel I was wondering if
anyone has/knows of an implementation of an hstore custom type for SQLA?

I'm basically just interested in simply mapping a stand-alone attribute
containing a python dict onto an hstore column, I don't require
auto-magic storage of extra attributes directly added to the object
a la examples/vertical.

Mainly I want the hstore segregated into it's own attribute because I'd
like to be able to expose expression language support for hstore's
operators (?, ->, ||, and so forth) to manipulate them server-side.

Any pointers?

Thanks,
-Kyle

Michael Bayer

unread,
Apr 7, 2010, 10:15:14 AM4/7/10
to sqlal...@googlegroups.com

start with 0.6, as we've expanded the capability for types to directly
affect how various operators are rendered, as well as to drive the return
type of expressions, which in turn allows that expression to have a say in
its own usage in another expression, as well as affects result row
behavior.

if hstore represents a python value that is "mutable" (which a dict would
be), mixing in MutableType will cause the ORM to maintain a copy of the
original version of the attribute in order to check for changes.


>
> Thanks,
> -Kyle
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to
> sqlalchemy+...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/sqlalchemy?hl=en.
>
>

Kyle Schaffrick

unread,
Apr 7, 2010, 6:10:53 PM4/7/10
to sqlal...@googlegroups.com
On Wed, 7 Apr 2010 10:15:14 -0400
"Michael Bayer" <mik...@zzzcomputing.com> wrote:

> Kyle Schaffrick wrote:
> > Greetings,
> >
> > I'm looking into using PostgreSQL's hstore type in a SQLAlchemy
> > project, and before I possibly reinvent the wheel I was wondering if
> > anyone has/knows of an implementation of an hstore custom type for
> > SQLA?
> >
> > I'm basically just interested in simply mapping a stand-alone
> > attribute containing a python dict onto an hstore column, I don't
> > require auto-magic storage of extra attributes directly added to
> > the object a la examples/vertical.
> >
> > Mainly I want the hstore segregated into it's own attribute because
> > I'd like to be able to expose expression language support for
> > hstore's operators (?, ->, ||, and so forth) to manipulate them
> > server-side.
> >
> > Any pointers?
>
> start with 0.6, as we've expanded the capability for types to directly
> affect how various operators are rendered, as well as to drive the
> return type of expressions, which in turn allows that expression to
> have a say in its own usage in another expression, as well as affects
> result row behavior.
>

I had planned on using 0.6 just to get the benefit of the new
dialect+dbapi system.

I did see the @compiles() idiom in some example code, is this what
you're referring to? It looks like a great new feature.

Have there been any thoughts on user-extensible reflection? Some way
to say "when you reflect a DB column of type foo, use this custom type
class"? If I got the right impression from skimming, dispatching
reflected types is essentially a big cascade of if-statements at the
moment. I don't really know how many DBMS's in SA's arsenal have
extensible type systems, so not sure if it's worth the trouble. I
assume you can just override column information piecemeal without
having to give up reflection of columns with vanilla types.

> if hstore represents a python value that is "mutable" (which a dict
> would be), mixing in MutableType will cause the ORM to maintain a
> copy of the original version of the attribute in order to check for
> changes.
>

Good information, thanks. I can see why this would need to be handled
specially.

-Kyle

David Gardner

unread,
May 5, 2010, 6:01:08 PM5/5/10
to sqlal...@googlegroups.com
I was just considering something similar. Were you able to get far with
this?
--
David Gardner
Pipeline Tools Programmer
Jim Henson Creature Shop
dgar...@creatureshop.com

Kyle Schaffrick

unread,
May 5, 2010, 11:46:09 PM5/5/10
to sqlal...@googlegroups.com
On Wed, 5 May 2010 15:01:08 -0700
David Gardner <dgar...@creatureshop.com> wrote:

> I was just considering something similar. Were you able to get far
> with this?
>

Yes, actually I've got a pretty good start on it. The only obvious
thing missing right now is a Comparator implementation so that the
custom hstore operations are available to mapped classes and not just
to the SQL expression language.

I did want to have access to hstore operations outside of mapped
classes (i.e. just using SQL expression language) so I ended up
implementing it slightly differently than I had originally thought. It
required some hacks which I'm not sure how to do better.

Some thoughts...

1. While you can override existing operators that work on ColumnElements
without doing much funny business, if you want to add *new* operations
to it, the abstractions leak fairly badly. This seems to be because
operator definitions aren't delegated to the class representing the
type, which is suboptimal because I would think that the type of a
ColumnElement is what logically "defines" what operations are valid on
it. The result of this is that I have to create classes like
HStoreColumn, HStoreColumnElement, _HStoreDeleteFunction, and so on, so
that SQL expressions which are logically of type 'hstore' will have the
extended hstore operations available.

2. That expression operations on Foo.some_col and foo_table.c.some_col
take completely different paths in the implementation was slightly
surprising. I would have expected the former to be implemented in terms
of the latter, so that SQL expressions available on some column type are
automatically available on the descriptor of a class which maps to that
column. But I don't know, there might be good reasons for this. In any
case I'm trying to figure out how to write my Comparator for hstore
without repeating myself a lot.

Current implementation attached, with really hacky tests at the end :)
Comments welcome.

-Kyle
hstore.py

Michael Bayer

unread,
May 6, 2010, 10:39:28 AM5/6/10
to sqlal...@googlegroups.com

On May 5, 2010, at 11:46 PM, Kyle Schaffrick wrote:

> 1. While you can override existing operators that work on ColumnElements
> without doing much funny business, if you want to add *new* operations
> to it, the abstractions leak fairly badly. This seems to be because
> operator definitions aren't delegated to the class representing the
> type, which is suboptimal because I would think that the type of a
> ColumnElement is what logically "defines" what operations are valid on
> it. The result of this is that I have to create classes like
> HStoreColumn, HStoreColumnElement, _HStoreDeleteFunction, and so on, so
> that SQL expressions which are logically of type 'hstore' will have the
> extended hstore operations available.

yeah I can see how it could go that the entire "comparator" interface moves onto TypeEngine. There are obviously dozens of major design decisions which would have to occur for that to happen and there may also be tradeoffs.

>
> 2. That expression operations on Foo.some_col and foo_table.c.some_col
> take completely different paths in the implementation was slightly
> surprising. I would have expected the former to be implemented in terms
> of the latter, so that SQL expressions available on some column type are
> automatically available on the descriptor of a class which maps to that
> column. But I don't know, there might be good reasons for this. In any
> case I'm trying to figure out how to write my Comparator for hstore
> without repeating myself a lot.

Well consider that Foo.some_col does a lot more than a Column object does. Ultimately its a descriptor object from the attributes package, and this object is also used for non-column attributes as well. I.e. it would be less consistent for a mapped object to look like:

class Foo(object):

id = < column object>
related_stuff = <instrumented attribute>
id_synonym = <instrumented attribute>

we try to make the Column and the InstrumentedAttribute systems similar by both subclassing expression.ColumnOperators, which is where the column-oriented comparisons occur. Ultimately when you say Foo.some_col == 'foo', it is getting down to the __eq__() method on foo_table.c.some_col, so ORM comparators are certainly implemented in terms of the SQL constructs, just maybe theres more indirection than you would initially think necessary. In GOF speak the InstrumentedAttribute is a "decorator" (not in the Python sense) - an object that proxies requests to another, while adding other functionality.

Kyle Schaffrick

unread,
May 6, 2010, 8:49:47 PM5/6/10
to sqlal...@googlegroups.com
On Thu, 6 May 2010 10:39:28 -0400
Michael Bayer <mik...@zzzcomputing.com> wrote:

>
> On May 5, 2010, at 11:46 PM, Kyle Schaffrick wrote:
>
> > 1. While you can override existing operators that work on
> > ColumnElements without doing much funny business, if you want to
> > add *new* operations to it, the abstractions leak fairly badly.
> > This seems to be because operator definitions aren't delegated to
> > the class representing the type, which is suboptimal because I
> > would think that the type of a ColumnElement is what logically
> > "defines" what operations are valid on it. The result of this is
> > that I have to create classes like HStoreColumn,
> > HStoreColumnElement, _HStoreDeleteFunction, and so on, so that SQL
> > expressions which are logically of type 'hstore' will have the
> > extended hstore operations available.
>
> yeah I can see how it could go that the entire "comparator" interface
> moves onto TypeEngine. There are obviously dozens of major design
> decisions which would have to occur for that to happen and there may
> also be tradeoffs.
>

Yeah it would be an invasive refactoring, no arguing that. And I'm not
sure it would have to be specifically that the whole thing moves to
TypeEngine, just that the ColumnElement would allow the type to provide
a (possibly empty) set of extended or overridden operations or methods
which are valid on expressions of that type. Maybe with a nested class?
I dunno.

I did noticed that GeoAlchemy ran into this same problem too when
adding support for custom expression methods that apply GIS operators,
since I actually consulted their code to see how they solved it.

On the plus side it feels like things are moving in this direction
anyway, with _adapt_expression() now consulted to compute the effective
type of an operator expression. While this currently appears to be for
the sole purpose of selecting the proper bind_processor or
result_processor, it seems like it would be possible (not to be
confused with "easy") to use this inferred type data to control the
dispatch of expression operations as well.
Yeah I am definitely not lobbying for an inconsistency here where
InstrumentedAttribute for plain columns goes away, I feel pretty aware
of what it adds to the equation. I guess what I was thinking/expecting
was more like, SQL expressions involving InstrumentedAttributes would do
their thing by proxying the operation to the underlying Column
*instance*, such that in the case that it was some quack-alike object
(an HStoreColumn, or a GeometryColumn), it would Just Work.

If you're saying this is how it works currently then I'll shut up
now :) I tried to trace it but it was a lot to keep in my head and I
almost certainly missed something. I do know that it's not aware of any
custom methods on the underlying Column, since has_key(), dissoc(),
assoc() and so on do not work without a custom comparator. Since I
don't have a good idea how to make that work, don't take it as a "you
should fix this" -- last thing I want to do is criticize the design
without proposing a solution!

Anyway, thanks for the thoughts.

-Kyle

David Gardner

unread,
May 7, 2010, 12:01:05 PM5/7/10
to sqlal...@googlegroups.com
Thank you, I haven't started using hstore in my production environment
yet, but wanted to do some tests with it as a way for users to attach
arbitrary key/value metadata to nodes. Are you currently using a
Gin or Gist index on your hstore columns?
David Gardner
Pipeline Tools Programmer
Jim Henson Creature Shop
dgar...@creatureshop.com


Kyle Schaffrick

unread,
May 7, 2010, 6:55:53 PM5/7/10
to sqlal...@googlegroups.com
On Fri, 7 May 2010 09:01:05 -0700
David Gardner <dgar...@creatureshop.com> wrote:

> Thank you, I haven't started using hstore in my production environment
> yet, but wanted to do some tests with it as a way for users to attach
> arbitrary key/value metadata to nodes. Are you currently using a
> Gin or Gist index on your hstore columns?

This is basically what I'm doing with it, tracking arbitrary
metadata-ish things on various objects, although at this point it's for
internal use like flags and whatnot, but I expect it will extend to
user supplied data as well, in the fullness of time.

The application for which I wrote this is also not in production yet
either, so experimenting with indexes would of course be futile right
now.

That said, I'm not sure what the practical difference is between
choosing GiST or GiN on an hstore, they both support the same set of
operators: "contains" (@>), "has key" (?), and in 9.0 the new "has all
keys" (?&) and "has any key" (?|) operators.

Speaking of which I should probably add support for the crap-ton of new
hstore features that come out in 9.0:

http://developer.postgresql.org/pgdocs/postgres/hstore.html

FWIW, I do know that GiST support in PostGIS's geometry type is stupid
fast. On a different project we're getting like 10-40msec query times to
test for polygon intersection against a table of ~200k polygonal
geographic objects on a dirt-cheap SATA dedibox, even which it picks up
hundreds of matches. My expectations are high for hstore's index
performance :)

-Kyle

Uwe Seher

unread,
Mar 29, 2012, 4:01:28 AM3/29/12
to sqlal...@googlegroups.com
Hi!

I am exploring your hstore.py for some days now. At first, i fixed an error which occured when the hstore-field in the database was empty (see attachment). At next i found out, that keys in the dictictionary, that shall be written into the database must be strings and not unicode. Is this the correct handling of this?

Thank you for the work!

Uwe Seher
hstore.py

Michael Bayer

unread,
Mar 29, 2012, 10:18:03 AM3/29/12
to sqlal...@googlegroups.com
Hey this is very nice, a few comments:

1. MutableType is probably going away in 0.8, and even the presence of it causes the ORM to perform very poorly when objects use a MutableType.   We have sqlalchemy.ext.mutable now as the alternative.
2. was going to say something about HStoreColumn(HStoreElement, Column) -> HStoreElement(ColumnElement)/Column(ColumnElement), but then I tried out a quick diamond example in Python and I learned something I wasn't clear on earlier about multiple inheritance (Column stays on top of ColumnElement) so thanks for that !
3. Would HStoreColumn already imply the HStore type ?  If you give it a custom __init__, you'd probably also need a _constructor()  method which is used by Column to help when it needs to make copies of itself.



--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/U6JuhiAOJQMJ.

To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
<hstore.py>

Kyle Schaffrick

unread,
Apr 15, 2012, 8:32:32 AM4/15/12
to sqlal...@googlegroups.com
Wow, a blast from the past! :)  

Which version of SQLAlchemy does your new version work with? It's great that you are now able to get rid of the HStoreComparator, that really makes it easier to use. The version of SQLA I was using 2 years ago when I wrote it required the custom comparator to make the operators work on the ORM InstrumentedAttribute as well as on the Core Column. I suppose the new version doesn't require this duplication?

On your question, I think that PostgreSQL's hstore supports unicode just fine; the keys and values are of type "text". It is just a bug in my code that it does not work. I think the main issue would be to replace .encode('string_escape') in _serialize_hstore() and .decode('string_escape') in _parse_hstore() with something that will do the same job but work for both unicode and string, and to make sure the regexes are unicode safe (or replace them with a proper parser).

I'm glad you're finding it useful, hstore is very handy and having it work seamlessly in SQLA is even better!

--

Kyle Schaffrick

unread,
May 15, 2012, 3:24:40 AM5/15/12
to sqlal...@googlegroups.com
On Mon, Apr 30, 2012 at 8:52 AM, Adam Tauno Williams <awil...@whitemice.org> wrote:
On Sun, 2012-04-15 at 05:32 -0700, Kyle Schaffrick wrote:
> Wow, a blast from the past! :)

Yes.


> Which version of SQLAlchemy does your new version work with? It's
> great that you are now able to get rid of the HStoreComparator, that
> really makes it easier to use. The version of SQLA I was using 2 years
> ago when I wrote it required the custom comparator to make the
> operators work on the ORM InstrumentedAttribute as well as on the Core
> Column. I suppose the new version doesn't require this duplication?

I've gone back through the thread and I can't find a link to your work
[regarding the hstore implementation].  Can you post one?


Sure, my original version was the attachment here: http://groups.google.com/group/sqlalchemy/msg/6f7284341ab69164

Uwe's modified version is here, http://groups.google.com/group/sqlalchemy/msg/03250e20866ea253 -- it is almost certainly better since it removes the ugly custom comparator which I had to use before in order to make the InstrumentedAttribute have the same operations available in the ORM as the raw Column does. I assume it still works the same but is cleaner now that it's updated to work on a more recent SQLA version (although I haven't tried it yet myself).

Adam Tauno Williams

unread,
May 21, 2012, 11:25:29 AM5/21/12
to sqlal...@googlegroups.com
On Tue, 2012-05-15 at 00:24 -0700, Kyle Schaffrick wrote:
> On Mon, Apr 30, 2012 at 8:52 AM, Adam Tauno Williams

> Sure, my original version was the attachment
> here: http://groups.google.com/group/sqlalchemy/msg/6f7284341ab69164

FYI, for anyone looking for hstore.py in the future I've put a copy in
my SourceForge Hg repository (to have a not-in-list-archive copy).

<https://sourceforge.net/u/whitemice/whitemicehg/ci/b6fb4d699f513d53ac058dcbb932d47cad0ddafe/tree/Scraps/SQLAlchemy/hstore.py>

I'll probably hack on it a bit.


signature.asc
Reply all
Reply to author
Forward
0 new messages