textcolumn.contains() escaping, the lack of icontains(), and possible bug?

357 views
Skip to first unread message

Daniel Grace

unread,
May 13, 2013, 6:30:05 PM5/13/13
to sqlal...@googlegroups.com
So today I identified a small bug in my code and then, while trying to resolve it, came to a few realizations:

1. column.contains(str) does not escape characters in str such as % and _.  Presumably, column.startswith(str) and column.endswith(str) have the same behavior.

2. There is a distinct lack of column.icontains(str), though the current implementation means it's identical to column.ilike('%' + str + '%')

3. There is no builtin function (that I found, please correct me if I'm wrong!) for escaping a string being passed to any functions in this family.

While I think that column.like and column.ilike should definitely /not/ escape their argument (you know you're trying for a pattern match here, and that you're matching against a pattern), I think that the .contains/.startswith/.endswith family of functions probably should perform this escaping transparently.  Between DBAPI 2.0, SQLAlchemy and parameterized querying I don't need to worry about escaping input, so why should I have to pay attention to that detail when using .contains?  Also, case insensitive versions of the above would probably be useful.

That said, a proper fix might be complicated since it could inadvertently break existing code that relies on the current behavior of .contains()

-- Daniel

Michael Bayer

unread,
May 13, 2013, 6:44:38 PM5/13/13
to sqlal...@googlegroups.com
On May 13, 2013, at 6:30 PM, Daniel Grace <thisgen...@gmail.com> wrote:

So today I identified a small bug in my code and then, while trying to resolve it, came to a few realizations:

1. column.contains(str) does not escape characters in str such as % and _.  Presumably, column.startswith(str) and column.endswith(str) have the same behavior.

this will be called "autoescape" and is ticket 2694: http://www.sqlalchemy.org/trac/ticket/2694 .    if someone wants to work on a patch for this it would be v. helpful.   It's a little late to turn on the escaping for all users now as it would break existing workarounds.



2. There is a distinct lack of column.icontains(str), though the current implementation means it's identical to column.ilike('%' + str + '%')

since we do have "ilike()" as an operator "icontains()" would be appropriate at this point (also startswith,endswith).



3. There is no builtin function (that I found, please correct me if I'm wrong!) for escaping a string being passed to any functions in this family.

will be 2694


While I think that column.like and column.ilike should definitely /not/ escape their argument (you know you're trying for a pattern match here, and that you're matching against a pattern), I think that the .contains/.startswith/.endswith family of functions probably should perform this escaping transparently.  Between DBAPI 2.0, SQLAlchemy and parameterized querying I don't need to worry about escaping input, so why should I have to pay attention to that detail when using .contains?  Also, case insensitive versions of the above would probably be useful.

That said, a proper fix might be complicated since it could inadvertently break existing code that relies on the current behavior of .contains()


-- Daniel


--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Daniel Grace

unread,
May 13, 2013, 6:58:45 PM5/13/13
to sqlal...@googlegroups.com
Good to hear!

I took a look at #2694 and it seems that using column.contains(other, autoescape=True)  might get wordy fairly quick when -- at least in new applications -- it would be a handy default.  While it's probably not particularly feasible, it'd be handy if the default for autoescape could somehow be set on a engine/metadata/etc level.

-- Daniel

Bobby Impollonia

unread,
May 16, 2013, 6:22:18 PM5/16/13
to sqlal...@googlegroups.com
Sounds like a useful feature.

Regarding case sensitivity, perhaps it would better if each of these methods (even like() and contains()) took a keyword argument along the lines of col.endswith('foo', case_sensitive=False) rather than adding extra methods with weird names like iendswith.

Michael Bayer

unread,
May 16, 2013, 7:49:49 PM5/16/13
to sqlal...@googlegroups.com
On May 16, 2013, at 6:22 PM, Bobby Impollonia <bob...@gmail.com> wrote:

Sounds like a useful feature.

Regarding case sensitivity, perhaps it would better if each of these methods (even like() and contains()) took a keyword argument along the lines of col.endswith('foo', case_sensitive=False) rather than adding extra methods with weird names like iendswith.

yes I've thought of that, though we already have "ilike()" for "like()", as this is a well known Postgresql operator, so I fear this might introduce some inconsistency. We'd really need to add "case_sensitive" to like() as well and everywhere, and just have "ilike()" be a synonym for "like(..., case_sensitive=False)".

Additionally, case_sensitive might need to default to "None" - in its absence, technically we don't know if the underlying database is doing case sensitive comparison or not.  Setting it to "True" means, "we will use known case-insensitive techniques for the target backend".   

But then what do we do on a backend that doesn't have "ilike"? do we raise an error?   what if some backend already has case-insensitive collation set up ?

Jonathan Vanasco

unread,
Sep 27, 2013, 12:39:19 PM9/27/13
to sqlal...@googlegroups.com
Any progress on considering:

* icontains
* istartswith
* iendswith

I ran into this again ( i had posted a similar request about a year or so ago )

re "But then what do we do on a backend that doesn't have "ilike"? do we raise an error? "

would this be possible:

   .filter( User.name.icontains('ADAM') )
   
   supports ilike -- """ WHERE name ilike '%ADAM%' """
   no ilike -- """ WHERE LOWER(name) like LOWER('%ADAM%') """ [ or """ WHERE LOWER(name) like '%adam%' """ ]


at least i think most backends support LOWER on evaluation.

Michael Bayer

unread,
Sep 27, 2013, 2:19:58 PM9/27/13
to sqlal...@googlegroups.com
yeah im not sure why I asked that, i wonder if the current ilike() does that anyway.

I'm +1 on "case_sensitive=None" by default, it should accept False, but i think perhaps it *shouldn't* accept "True" since we can't really control that.
signature.asc

Bobby Impollonia

unread,
Oct 5, 2013, 12:30:19 AM10/5/13
to sqlal...@googlegroups.com
Converting strings to lower case and comparing them is not the same as a true case-insensitive comparison. Python 3.3 adds a str.casefold method for this reason. The docs for that method give a good explanation of the distinction:

"""Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. For example, the German lowercase letter 'ß' is equivalent to "ss". Since it is already lowercase, lower() would do nothing to 'ß'; casefold() converts it to "ss".

The casefolding algorithm is described in section 3.13 of the Unicode Standard."""

Michael Bayer

unread,
Oct 19, 2013, 2:09:56 AM10/19/13
to sqlal...@googlegroups.com
I've also found http://www.w3.org/International/wiki/Case_folding as a great source of info of why SQLAlchemy itself shouldn't get into trying to provide case insensitive comparisons using lower() or similar. However, I have added a recipe for those who want to implement a simple lower()-based system, or one that uses some other SQL function, so that people can build the comparisons that suit their needs, that is at: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/StringComparisonFilter .
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
> To post to this group, send email to sqlal...@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
signature.asc
Reply all
Reply to author
Forward
0 new messages