Use regexp in like

1,127 views
Skip to first unread message

Michael Hipp

unread,
Sep 14, 2010, 3:23:31 PM9/14/10
to sqlal...@googlegroups.com
Is it possible to use a regexp in a like() clause? Or some other way to achieve
something similar?

Thanks,
Michael

Michael Hipp

unread,
Sep 17, 2010, 9:14:07 AM9/17/10
to sqlal...@googlegroups.com
On 9/14/2010 2:23 PM, Michael Hipp wrote:
> Is it possible to use a regexp in a like() clause? Or some other way to achieve
> something similar?

Can anyone suggest an approach to search a field with a regexp?

Thanks,
Michael

Michael Bayer

unread,
Sep 17, 2010, 10:58:57 AM9/17/10
to sqlal...@googlegroups.com

if you were using Postgresql, you could use somecolumn.op("~")(someregexp)

http://www.postgresql.org/docs/8.4/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP

cant speak for other platforms though you'd have to consult their documentation.

>
> Thanks,
> Michael
>
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

Michael Bayer

unread,
Sep 17, 2010, 11:12:30 AM9/17/10
to sqlal...@googlegroups.com

On Sep 17, 2010, at 10:58 AM, Michael Bayer wrote:

>
> On Sep 17, 2010, at 9:14 AM, Michael Hipp wrote:
>
>> On 9/14/2010 2:23 PM, Michael Hipp wrote:
>>> Is it possible to use a regexp in a like() clause? Or some other way to achieve
>>> something similar?
>>
>> Can anyone suggest an approach to search a field with a regexp?
>
> if you were using Postgresql, you could use somecolumn.op("~")(someregexp)
>
> http://www.postgresql.org/docs/8.4/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP
>
> cant speak for other platforms though you'd have to consult their documentation.


PG also supports MATCH since I see we have some unit tests for that, i.e. column.match(other). I'm not seeing offhand in PG's docs what it interprets the "MATCH" operator as, i.e. is it a "~", "SIMILAR TO", not sure.

I guess the reasons I've never had interest in regexp matching in databases are:

1. its always bad to search through tables without being able to use indexes
2. if you're needing to dig into text, it suggests the atoms of that text should be represented individually in their own column (i.e. normalize)
3. no really, I'm doing flat out full text searching on documents and don't want to reinvent. Well then I'd use a full blown text extension (http://www.postgresql.org/docs/8.3/static/textsearch.html) or a separate search engine.

A.M.

unread,
Sep 17, 2010, 11:29:47 AM9/17/10
to sqlal...@googlegroups.com

On Sep 17, 2010, at 11:12 AM, Michael Bayer wrote:

>
> On Sep 17, 2010, at 10:58 AM, Michael Bayer wrote:
>
>>
>> On Sep 17, 2010, at 9:14 AM, Michael Hipp wrote:
>>
>>> On 9/14/2010 2:23 PM, Michael Hipp wrote:
>>>> Is it possible to use a regexp in a like() clause? Or some other way to achieve
>>>> something similar?
>>>
>>> Can anyone suggest an approach to search a field with a regexp?
>>
>> if you were using Postgresql, you could use somecolumn.op("~")(someregexp)
>>
>> http://www.postgresql.org/docs/8.4/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP
>>
>> cant speak for other platforms though you'd have to consult their documentation.
>
>
> PG also supports MATCH since I see we have some unit tests for that, i.e. column.match(other). I'm not seeing offhand in PG's docs what it interprets the "MATCH" operator as, i.e. is it a "~", "SIMILAR TO", not sure.
>
> I guess the reasons I've never had interest in regexp matching in databases are:
>
> 1. its always bad to search through tables without being able to use indexes
> 2. if you're needing to dig into text, it suggests the atoms of that text should be represented individually in their own column (i.e. normalize)
> 3. no really, I'm doing flat out full text searching on documents and don't want to reinvent. Well then I'd use a full blown text extension (http://www.postgresql.org/docs/8.3/static/textsearch.html) or a separate search engine.

The SQLAlchemy "match" operator compiles to PostgreSQL full-text search (@@ to_tsquery).
test/dialect/test_postgresql.py

Cheers,
M

Grimsqueaker

unread,
Sep 20, 2010, 9:13:17 AM9/20/10
to sqlalchemy
Might be useful:

http://www.regular-expressions.info/mysql.html
http://www.regular-expressions.info/oracle.html

I used the Oracle version like this:
func.regexp_replace(BillUnion.calldestination, '^\+([0-9]{2})
(0[2-578])?([0-9]*)$', '\\1\\3', 1).label('calldestination')

It is really slow though... :(


On Sep 17, 5:29 pm, "A.M." <age...@themactionfaction.com> wrote:
> On Sep 17, 2010, at 11:12 AM, Michael Bayer wrote:
>
>
>
>
>
>
>
> > On Sep 17, 2010, at 10:58 AM, Michael Bayer wrote:
>
> >> On Sep 17, 2010, at 9:14 AM, Michael Hipp wrote:
>
> >>> On 9/14/2010 2:23 PM, Michael Hipp wrote:
> >>>> Is it possible to use a regexp in a like() clause? Or some other way to achieve
> >>>> something similar?
>
> >>> Can anyone suggest an approach to search a field with a regexp?
>
> >> if you were using Postgresql, you could use   somecolumn.op("~")(someregexp)
>
> >>http://www.postgresql.org/docs/8.4/interactive/functions-matching.htm...

Michael Hipp

unread,
Sep 20, 2010, 10:38:15 AM9/20/10
to sqlal...@googlegroups.com
On 9/17/2010 10:12 AM, Michael Bayer wrote:
>
> On Sep 17, 2010, at 10:58 AM, Michael Bayer wrote:
>
>>
>> On Sep 17, 2010, at 9:14 AM, Michael Hipp wrote:
>>
>>> On 9/14/2010 2:23 PM, Michael Hipp wrote:
>>>> Is it possible to use a regexp in a like() clause? Or some other way to achieve
>>>> something similar?
>>>
>>> Can anyone suggest an approach to search a field with a regexp?
>>
>> if you were using Postgresql, you could use somecolumn.op("~")(someregexp)
>>
>> http://www.postgresql.org/docs/8.4/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP
>>
>> cant speak for other platforms though you'd have to consult their documentation.
>
>
> PG also supports MATCH since I see we have some unit tests for that, i.e. column.match(other). I'm not seeing offhand in PG's docs what it interprets the "MATCH" operator as, i.e. is it a "~", "SIMILAR TO", not sure.
>
> I guess the reasons I've never had interest in regexp matching in databases are:
>
> 1. its always bad to search through tables without being able to use indexes
> 2. if you're needing to dig into text, it suggests the atoms of that text should be represented individually in their own column (i.e. normalize)
> 3. no really, I'm doing flat out full text searching on documents and don't want to reinvent. Well then I'd use a full blown text extension (http://www.postgresql.org/docs/8.3/static/textsearch.html) or a separate search engine.

Thanks for the good suggestions on this. My need is pretty simple. I have a
column that contains values like this:

A100
AA309
B101

I need something to find all the rows that look like Axxx while excluding those
that look like AAxxx. The LIKE operator doesn't seem to be able to do that. I
am using PostgreSQL so the ~ operator may be my best bet.

Scratch that ... found this message:
http://www.mail-archive.com/sqlal...@googlegroups.com/msg18598.html
which says I should be able to do a 'SIMILAR TO' construct which is perhaps
somewhat more lightweight than a full regexp.

Thanks,
Michael

Michael Hipp

unread,
Sep 20, 2010, 11:09:33 AM9/20/10
to sqlal...@googlegroups.com
On 9/20/2010 9:38 AM, Michael Hipp wrote:
> Scratch that ... found this message:
> http://www.mail-archive.com/sqlal...@googlegroups.com/msg18598.html
> which says I should be able to do a 'SIMILAR TO' construct which is perhaps
> somewhat more lightweight than a full regexp.

Can someone show me what I'm doing wrong here.

letter = 'A[0-9]+'
q = self.session.query(Car).filter_by(hist=False) \
.filter(Car.lane.op('SIMILAR TO') (letter)) \
.order_by(Car.lane)

I'm trying to match something that looks like 'A100'. But it produces a syntax
error:

OperationalError: (OperationalError) near "SIMILAR": syntax error
...snip...
WHERE cars.hist = ? AND (cars.lane SIMILAR TO ?) ORDER BY cars.lane' (False,
'A[0-9]+')

Thanks,
Michael

Michael Hipp

unread,
Sep 20, 2010, 11:54:14 AM9/20/10
to sqlal...@googlegroups.com

Strange. I couldn't see what was actually wrong with that SQL, so I ran it
directly against pg and it works fine. Is it possibly a quoting problem?

Michael

Michael Bayer

unread,
Sep 20, 2010, 11:57:08 AM9/20/10
to sqlal...@googlegroups.com

is this pg8000 ? the "?" as binds are suspect. that would be my guess.

Michael Hipp

unread,
Sep 20, 2010, 1:17:14 PM9/20/10
to sqlal...@googlegroups.com
On 9/20/2010 10:57 AM, Michael Bayer wrote:
>
> On Sep 20, 2010, at 11:54 AM, Michael Hipp wrote:
>
>> On 9/20/2010 10:09 AM, Michael Hipp wrote:
>>> On 9/20/2010 9:38 AM, Michael Hipp wrote:
>>>> Scratch that ... found this message:
>>>> http://www.mail-archive.com/sqlal...@googlegroups.com/msg18598.html
>>>> which says I should be able to do a 'SIMILAR TO' construct which is perhaps
>>>> somewhat more lightweight than a full regexp.
>>>
>>> Can someone show me what I'm doing wrong here.
>>>
>>> letter = 'A[0-9]+'
>>> q = self.session.query(Car).filter_by(hist=False) \
>>> .filter(Car.lane.op('SIMILAR TO') (letter)) \
>>> .order_by(Car.lane)
>>>
>>> I'm trying to match something that looks like 'A100'. But it produces a syntax
>>> error:
>>>
>>> OperationalError: (OperationalError) near "SIMILAR": syntax error
>>> ...snip...
>>> WHERE cars.hist = ? AND (cars.lane SIMILAR TO ?) ORDER BY cars.lane' (False,
>>> 'A[0-9]+')
>>
>> Strange. I couldn't see what was actually wrong with that SQL, so I ran it directly against pg and it works fine. Is it possibly a quoting problem?
>
> is this pg8000 ? the "?" as binds are suspect. that would be my guess.

I've never used pg8000; this is psycopg2:

>>> import psycopg2
>>> psycopg2.__version__
'2.2.1 (dt dec ext pq3)'

Any help?

Thanks,
Michael

Michael Hipp

unread,
Sep 21, 2010, 8:12:38 AM9/21/10
to sqlal...@googlegroups.com

Can anyone offer me some suggestions here? Is this a bug?

Thanks,
Michael

Michael Bayer

unread,
Sep 21, 2010, 8:23:54 AM9/21/10
to sqlal...@googlegroups.com

I'm not sure how the ? is being used for a bind param. the psycopg2 dialect uses %(name)s format. SIMILAR TO works fine and you can see %(name)s format is used:

from sqlalchemy import *
from sqlalchemy.sql import column

e = create_engine('postgresql://scott:tiger@localhost/test', echo=True)

e.execute(select([literal_column("'lane'").op('SIMILAR TO')('car')]))

2010-09-21 08:19:49,339 INFO sqlalchemy.engine.base.Engine.0x...1570 select version()
2010-09-21 08:19:49,339 INFO sqlalchemy.engine.base.Engine.0x...1570 {}
2010-09-21 08:19:49,341 INFO sqlalchemy.engine.base.Engine.0x...1570 select current_schema()
2010-09-21 08:19:49,341 INFO sqlalchemy.engine.base.Engine.0x...1570 {}
2010-09-21 08:19:49,343 INFO sqlalchemy.engine.base.Engine.0x...1570 SELECT 'lane' SIMILAR TO %('lane'_1)s AS anon_1
2010-09-21 08:19:49,343 INFO sqlalchemy.engine.base.Engine.0x...1570 {"'lane'_1": 'car'}

I get the same result with psycopg2 2.0.12 and 2.2.2 .

Michael Hipp

unread,
Sep 21, 2010, 9:01:30 AM9/21/10
to sqlal...@googlegroups.com

Sigh. I'm a moron.

I'm getting this problem when running unit tests. And it took me a day and a
half to remember that I'm doing that against an SQLite in-memory database, not
pg. Which means the SIMILAR TO op is never going to work, I suppose.

So now I have to figure out how to run unit tests against postgresql or else
live with code having no test coverage. Neither are very attractive.

I apologize for taking your time. Thanks for trying to help.

Michael

Sven A. Schmidt

unread,
Sep 21, 2010, 12:54:07 PM9/21/10
to sqlalchemy
Michael,

I hope I'm not misunderstanding what your trying to achieve, but isn't
a combination of like and not like want you want to do here? As in:

create table test (
t varchar2(255)
);

insert into test values ('AA123');
insert into test values ('A0123');

select * from test where t like 'A%' and t not like 'AA%';

-> A0123

Or do you want to match "Any character, but only once" rather than "A"
explicitly? The regex you posted ('A[0-9]+) looks like it's really
'A', you're looking for (unless that's just simplified for testing
purposes.

-sas

On Sep 20, 4:38 pm, Michael Hipp <Mich...@Hipp.com> wrote:
> On 9/17/2010 10:12 AM, Michael Bayer wrote:
>
>
>
>
>
>
>
> > On Sep 17, 2010, at 10:58 AM, Michael Bayer wrote:
>
> >> On Sep 17, 2010, at 9:14 AM, Michael Hipp wrote:
>
> >>> On 9/14/2010 2:23 PM, Michael Hipp wrote:
> >>>> Is it possible to use a regexp in a like() clause? Or some other way to achieve
> >>>> something similar?
>
> >>> Can anyone suggest an approach to search a field with a regexp?
>
> >> if you were using Postgresql, you could use   somecolumn.op("~")(someregexp)
>
> >>http://www.postgresql.org/docs/8.4/interactive/functions-matching.htm...

Michael Hipp

unread,
Sep 21, 2010, 1:08:38 PM9/21/10
to sqlal...@googlegroups.com
On 9/21/2010 11:54 AM, Sven A. Schmidt wrote:
> Michael,
>
> I hope I'm not misunderstanding what your trying to achieve, but isn't
> a combination of like and not like want you want to do here? As in:
>
> create table test (
> t varchar2(255)
> );
>
> insert into test values ('AA123');
> insert into test values ('A0123');
>
> select * from test where t like 'A%' and t not like 'AA%';

Thanks, I hadn't thought of doing that. But I don't think it will work for my
case because in addition to rejecting AA% it must also reject AB%, AC%, etc.

I wish LIKE had some way to distinguish letters and numbers.

Thanks,
Michael

Reply all
Reply to author
Forward
0 new messages