is there a reason why lower() isn't a column operation ?

845 views
Skip to first unread message

Jonathan Vanasco

unread,
Aug 13, 2012, 1:44:24 PM8/13/12
to sqlalchemy
just wondering why i have to do :
sqlalchemy.sql.func.lower( class.column ) == string.lower()

instead of :
class.column.lower() = string.lower()

btw - i know in the archives people have mentioned doing an ilike
search, but a "WHERE lower(column) = 'string'" search will search
against a functioned index on postgres and I believe oracle.
considerably faster on large data sets.

Michael Bayer

unread,
Aug 13, 2012, 2:37:17 PM8/13/12
to sqlal...@googlegroups.com
There's hundreds of SQL functions that accept a single expression as an argument. SQLAlchemy's "func" system is designed so that in the vast majority of cases, these functions aren't explicit in SQLAlchemy itself, func.<NAME> just makes a new Function object with the name "<NAME>".

So promoting some subset of those functions to be directly present would require a clear rationale as to why those SQL functions need to be present on the Column. Since they certainly can't all be. Also adding two ways to do the same thing needs a really good reason.

The particular lower() comparison you have there isn't necessarily very generic. Depending on the collation behavior of the database, it may not be necessary, or may not be enough to produce a faithful comparison if multibyte characters are in play. For example, comparing unicode strings on MySQL (http://dev.mysql.com/doc/refman/5.0/en/charset-collate.html) might look like:

_latin1 'Müller' COLLATE latin1_german2_ci = k

that's a lot more involved than a simple lower() call. The real use case you're looking for here is "case insensitive comparison", perhaps an operator which just does "case insensitive compare" at once would be easier to use, and also easier to augment with custom behavior.

So there's a lot of questions to be asked about this use case, all of which is in light of that it is 100 times harder to remove a poorly considered feature than to add it.

As a total coincidence, I'm working on an extensible operator system for core right now. A user-defined "case insensitive compare" operation would be easy to define in this new system (and actually you could define one using comparator_factory at the ORM level right now anyway). So a comparison such as the above, which might have edge cases that can't be met generically, might be better approached as a user-applied recipe. That way you get the ease of use and full capability, without adding incomplete features to SQLA directly.






Jonathan Vanasco

unread,
Aug 14, 2012, 10:39:46 AM8/14/12
to sqlalchemy
well, that's a lot of good reasons!

Jonathan Vanasco

unread,
Aug 14, 2012, 10:46:02 AM8/14/12
to sqlalchemy
also, just to clarify - i didn't want a case insensitive compare but a
specific sql generated

just to give a quick illustration

With this table structure:

CREATE TABLE names (
id serial not null primary key ,
name varchar(255) not null
);
CREATE UNIQUE INDEX names_name_uidx ON names(lower(name));

Postgres and Oracle will optimize this search against the index:
"""SELECT * FROM names WHERE lower(name) = '%s'""" %
search.lower()

Doing an ilike or similar won't work -- while the effect is a "case
insensitive search", the search is actually case sensitive.

The nice things about function based indexes like this, is that you
can both preserve uniqueness on a string ( usernames, email addresses,
etc ) and also gain a little speed on your most often used lookups.

Michael Bayer

unread,
Aug 14, 2012, 10:49:10 AM8/14/12
to sqlal...@googlegroups.com
in 0.8 you'll be able to add whatever functions and operators to Column and friends that you'd like, specific to types. All three email threads I dealt with yesterday regarded the ability to associate new operators with types so that feature has been added as of yesterday, just needs more docs.



Reply all
Reply to author
Forward
0 new messages