Combine $or with $regex


Tiffany

May 17, 2012, 6:01:34 PM
to mongod...@googlegroups.com
Hello:

We need to query something like
rsid in ["*rs980*", "*rs755*", ...]

I understand this will always work
{$or: [{"rsid": {$regex: "rs980"}}, {"rsid": {$regex: "rs755"}}, ...]}

However, is there any way to simplify this a little, so that I do not need to
repeat rsid and $regex for every possible value?

We also need to support negation, like
rsid not in ["*rs980*", "*rs755*", ...]
Does anyone know how I should approach the problem?

Thank you very much!

Tyler Brock

May 18, 2012, 5:47:02 PM
to mongod...@googlegroups.com
Hey there!

Good question,

I would suggest redesigning the way the data is structured, since indexing will not help or be efficient for these types of queries (non-rooted regular expressions and queries which are negations).

However, if you want to continue to use this sort of query pattern, you can make the expressions easier to read using the "//" regular expression literal syntax:

Let's say you are querying for documents where rsid matches rs980:

You could issue the query: db.collection.find({rsid: /rs980/})

Additionally, you could always make the regular expression more complex to cover both possibilities.

For example, to find documents that have rsid matching either rs980 or rs755, you could write:

db.collection.find({rsid: /rs980|rs755/})

In regular expressions the pipe symbol "|" means "OR".
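
If the list of values is long or comes from user input, the alternation can be built programmatically. The following is only a rough sketch in plain JavaScript (usable in the mongo shell); the helper names `escapeRegExp`, `anyOf`, `containsAny` and `containsNone` are made up for illustration and are not part of MongoDB. It also covers the negated case by wrapping the regex object in `$not`.

```javascript
// Escape regex metacharacters so each term is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build one alternation pattern, e.g. ["rs980", "rs755"] -> /rs980|rs755/
function anyOf(terms) {
  return new RegExp(terms.map(escapeRegExp).join("|"));
}

// Query document for "field contains any of the terms".
function containsAny(field, terms) {
  return { [field]: anyOf(terms) };
}

// Query document for "field contains none of the terms".
// $not is given a regex object here rather than a $regex expression.
function containsNone(field, terms) {
  return { [field]: { $not: anyOf(terms) } };
}

const terms = ["rs980", "rs755"];
const include = containsAny("rsid", terms);
const exclude = containsNone("rsid", terms);

// In the mongo shell these would be passed to find(), for example:
//   db.collection.find(include)
//   db.collection.find(exclude)
```

Keep in mind that `$not` also matches documents that have no rsid field at all, and that neither query form avoids the indexing problem mentioned above.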

-Tyler

Tiffany

May 18, 2012, 6:51:02 PM
to mongod...@googlegroups.com
Hello:

Thank you very much for your suggestions! It does feel like it runs
faster than before inside the mongodb shell; I should try it in Java next.

However, I was told that if we used Lucene-like prefixes, we could skip the
regular expression stuff, and it would run a lot faster, with a Google-like
auto-completion feature. I have searched online, but I could not
find relevant documentation on Lucene-style auto-completion in mongodb.
Could you point me in the right direction?

Thank you so much!

Tiffany

May 18, 2012, 6:59:34 PM
to mongod...@googlegroups.com
Hello:

It looks like the mongodb shell recognizes
db.collection.find({rsid: /rs980|rs755/})
and there were 3 rows returned.

However, it does not recognize
db.collection.find({rsid: "/rs980|rs755/"})

In Java, I used
DBObject query = new BasicDBObject("rsid", "/rs980|rs755/");
dbtable.find(query);
and it returned nothing. I suspect it treated "/rs980|rs755/" as a
string instead of a regular expression. I have also tried
DBObject query = new BasicDBObject("rsid", new
BasicDBObject("$regex", "/rs980|rs755/"));
but that does not help either.

Could you tell me what I did wrong? Thanks!

Eliot Horowitz

May 18, 2012, 9:09:26 PM
to mongod...@googlegroups.com
Try:
new BasicDBObject("$regex", "rs980|rs755")

The slashes are just how you create a regex literal in JavaScript; they are not part of the pattern.
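
To see why the string with slashes returned nothing, here is a small plain JavaScript sketch (no database involved; the sample value `rs7553` is made up). It compares a regex literal, a pattern built from a string, and a pattern string that wrongly keeps the slashes.

```javascript
// A regex literal and a RegExp built from a string hold the same pattern.
const literal = /rs980|rs755/;
const fromString = new RegExp("rs980|rs755");

// Keeping the slashes inside the string makes them part of the pattern,
// so a value would have to contain "/rs980" or "rs755/" to match.
const withSlashes = new RegExp("/rs980|rs755/");

const sample = "rs7553"; // hypothetical rsid value

console.log(literal.source === fromString.source); // true
console.log(fromString.test(sample));              // true
console.log(withSlashes.test(sample));             // false
```

The same reasoning applies to the pattern string passed as "$regex" from Java: it should contain only the pattern itself.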

Tiffany

May 21, 2012, 4:16:49 PM
to mongod...@googlegroups.com
I have tried your suggestion. It worked! However, whenever regex is
involved, it gets quite slow. And if there are multiple $regex
conditions together, it is very slow. Is there any way to improve the
performance?

Does mongodb support any kind of Google-like auto-completion, so that when
a user starts to type something, it would list the potential candidates
in a drop-down?

Thanks a lot for your help!

Sam Millman

May 21, 2012, 4:20:48 PM
to mongod...@googlegroups.com
Not really. Regex can be notoriously slow if you don't optimise the paths, and you also have the problem of a full table scan, so you have two things working against you there.
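
One case that can avoid the full scan is a rooted (prefix) expression, the opposite of the non-rooted patterns mentioned earlier in the thread: MongoDB can use an index on the field for a case-sensitive regex anchored with ^. The sketch below assumes that prefix matching is acceptable for your lookups and that an index exists on rsid; the helper name `prefixQuery` and the sample values are made up, and the array filter only stands in for the database to show which values the pattern selects.

```javascript
// Escape regex metacharacters so the typed prefix is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Anchored, case-sensitive prefix match, e.g. "rs98" -> /^rs98/
function prefixQuery(field, prefix) {
  return { [field]: new RegExp("^" + escapeRegExp(prefix)) };
}

// In-memory stand-in for the collection (hypothetical values).
const rsids = ["rs980", "rs9801", "rs755", "xrs980"];

const query = prefixQuery("rsid", "rs98");
const matches = rsids.filter((id) => query.rsid.test(id));
console.log(matches); // [ 'rs980', 'rs9801' ]

// In the mongo shell, assuming an index on rsid, this might look like:
//   db.collection.find(query).limit(10)
```

Note that "xrs980" is not returned, so this only helps if users type the beginning of the value.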


"Does mongodb support any kind of Google-like auto-completion, so that when
a user starts to type something, it would list the potential candidates
in a drop-down?"

I don't think any db supports that, since it is something done on the client side from results. In fact, Google uses a dictionary (I have forgotten where they source the dictionary now), not search results, for auto-completion, since using search results would just be hash, and using a dictionary is actually sometimes better since it motivates people to search in full FTS, making lighter work for Google in general.