query with contains ....

77 views
Skip to first unread message

Dave S

unread,
Dec 29, 2017, 3:24:10 AM12/29/17
to web2py-users
I figured out how to iterate through request.vars to do a query with multiple .contains().  Now I want to refine that further.  Is there a convenient way, when the field being queried is a "string list", to have the contains match only on word boundaries, or do I have accept partial matches and do further filtering on the results (the ROWS object).

By "string list", I mean something similar to how list fields are implemented for sqlite3 backends:  the field itself is a string, and the list elements are bounded by a separator.  I'm using ',' instead of "|", but you may consider that a quirk for now.  If I'm searching for "bit", I don't want "bitter" in the results.

If I have to do further filtering, does filter() work on ROWS objects?

BTW, I may later on replace this field with an official list-type field.  So of course I want deluxe solution that will continue work once I get around to that.

/dps



Anthony

unread,
Dec 29, 2017, 9:16:04 AM12/29/17
to web2py-users
You could do something like:

query = (db.mytable.myfield.like('mytext,%') |    # First item in list
         db
.mytable.myfield.like('%,mytext,%') |  # Neither first nor last item
         db
.mytable.myfield.like('%,mytext') |    # Last item in list
         db
.mytable.myfield.like('mytext'))       # Single item (no list)

If you make sure to start and end every list (even single item lists) with a separater (e.g., ",a,b,c," instead of "a,b,c"), then you can simplify it to just:

db.mytable.myfield.like('%,mytext,%')

The above is how the DAL handles list:-type fields.

Another alternative is a regexp search:

db.mytable.myfield.regexp(r'\bmytext\b')

The above will search for "mytext" between any word boundaries -- if the list items themselves might include word boundaries (e.g., spaces), then you'll need a more precise regular expression. 

Be careful about allowing arbitrary user input for a regexp query (see caution here: https://www.postgresql.org/docs/current/static/functions-matching.html).

Anthony

Dave S

unread,
Dec 29, 2017, 3:50:24 PM12/29/17
to web2py-users


On Friday, December 29, 2017 at 6:16:04 AM UTC-8, Anthony wrote:
You could do something like:

query = (db.mytable.myfield.like('mytext,%') |    # First item in list
         db
.mytable.myfield.like('%,mytext,%') |  # Neither first nor last item
         db
.mytable.myfield.like('%,mytext') |    # Last item in list
         db
.mytable.myfield.like('mytext'))       # Single item (no list)

If you make sure to start and end every list (even single item lists) with a separater (e.g., ",a,b,c," instead of "a,b,c"), then you can simplify it to just:

db.mytable.myfield.like('%,mytext,%')

The above is how the DAL handles list:-type fields.

Another alternative is a regexp search:

db.mytable.myfield.regexp(r'\bmytext\b')

The above will search for "mytext" between any word boundaries -- if the list items themselves might include word boundaries (e.g., spaces), then you'll need a more precise regular expression. 

Be careful about allowing arbitrary user input for a regexp query (see caution here: https://www.postgresql.org/docs/current/static/functions-matching.html).

Anthony

Thank you.  I can be picky about what characters are allowed in the request, and the intended use won't be harmed by the restriction, so I may go with the regex.  (And a python regex to check the user input.)


/dps
 

Dave S

unread,
Jan 20, 2018, 3:22:36 AM1/20/18
to web2py-users


On Friday, December 29, 2017 at 12:50:24 PM UTC-8, Dave S wrote:


On Friday, December 29, 2017 at 6:16:04 AM UTC-8, Anthony wrote:
You could do something like:

query = (db.mytable.myfield.like('mytext,%') |    # First item in list
         db
.mytable.myfield.like('%,mytext,%') |  # Neither first nor last item
         db
.mytable.myfield.like('%,mytext') |    # Last item in list
         db
.mytable.myfield.like('mytext'))       # Single item (no list)

If you make sure to start and end every list (even single item lists) with a separater (e.g., ",a,b,c," instead of "a,b,c"), then you can simplify it to just:

db.mytable.myfield.like('%,mytext,%')

The above is how the DAL handles list:-type fields.

Another alternative is a regexp search:

db.mytable.myfield.regexp(r'\bmytext\b')

The above will search for "mytext" between any word boundaries -- if the list items themselves might include word boundaries (e.g., spaces), then you'll need a more precise regular expression. 

Be careful about allowing arbitrary user input for a regexp query (see caution here: https://www.postgresql.org/docs/current/static/functions-matching.html).

Anthony

Thank you.  I can be picky about what characters are allowed in the request, and the intended use won't be harmed by the restriction, so I may go with the regex.  (And a python regex to check the user input.)


/dps



It's probably worth a couple notes about the regex syntax.  It depends, of course, on what support the DB engine has for regular expressions.  So far, I've only looked at this for Sqlite3 -- which doesn't have its own regex support.  Instead, the application provides a function (a callback, I'd call it) for that purpose.  If you're Googling for more information, you'll probably see that the usual package is PERLRE, and you might use the references for that to try to figure out how to compose your more esoteric r.e.s.   But in the web2py environment,, it's the DAL adapter for Sqlite3 that provides the function, and it just uses the Python standard library routines.  Very similar, but not quite identical, Look here for details

The two packages agree on how to match a string that doesn't contain a pattern xyz:

v = "xyz"
db
.mytable.tags.regexp(r"(?!" + v + r")")

Lots of people here probably had this figured out already, but if this post saves an hour for some small child whose assignment is due tomorrow ....

/dps

Reply all
Reply to author
Forward
0 new messages