MongoDB query with multiple search terms (regexes) in C# with 10gen's driver?

Марјан Николовски

Feb 10, 2012, 9:08:24 AM
to mongodb-user
Suppose we have these Blog documents:

{ Name: 'Blog1', Tags: ['testing', 'visual-studio', '2010', 'c#'] }
{ Name: 'Blog2', Tags: ['parallel', 'microsoft', 'c#'] }
From the console we can run the following to find all blog posts that
contain any of the provided tags:

db.BlogPost.find({ 'Tags' : { '$regex' : ['/^Test/', '/^microsoft/',
'/^visual/', '/^studio/', '/^c#/'] } });
How can we write the same query with 10gen's C# driver? Is there an
alternative if it cannot be written with the driver?

Query.Match supports only one regex. Can we give it multiple
regexes, or should we combine them like this?

Query.Or(Query.Match("Test"), Query.Match("Micro"),
Query.Match("Visual"))
I've managed to do it with

{ "$or" : [{ "Tags" : /^programm/i }, { "Tags" : /^microsoft/i },
{ "Tags" : /^visual/i }, { "Tags" : /^studio/i }, { "Tags" : /^assert/i },
{ "Tags" : /^2010/i }, { "Tags" : /^c#/i }] }
But something tells me that this is an ugly hack that may cause
performance issues. What do you think, guys?

Robert Stam

Feb 10, 2012, 10:00:57 AM
to mongod...@googlegroups.com
Your first mongo shell example query is not valid (although I'm not sure why you don't get an error message).

Using $or is not a hack, it's the correct way to test against multiple regular expressions.
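
A minimal sketch of building such an $or query document from a list of tag prefixes, written in plain JavaScript so it runs without a database. The helper names escapeRegex and buildOrQuery are hypothetical and are not part of any driver; the resulting object has the same shape as the $or query in the original post.

```javascript
// Escape characters that are special in a regular expression,
// so that a tag such as "c++" is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build { $or: [ { <field>: /^prefix/i }, ... ] } from a list of prefixes.
// `field` and `prefixes` are hypothetical parameter names.
function buildOrQuery(field, prefixes) {
  return {
    $or: prefixes.map((prefix) => ({
      [field]: new RegExp("^" + escapeRegex(prefix), "i"),
    })),
  };
}

const orQuery = buildOrQuery("Tags", ["test", "microsoft", "visual", "studio", "c#"]);
console.log(orQuery);
```

Each clause holds exactly one regular expression, which is the form the server documents; the list of terms can then grow without hand-writing each clause.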

Марјан Николовски

Feb 10, 2012, 10:27:39 AM
to mongodb-user
Actually, it is valid and it returns results.
You can see the command spec in the MongoDB API documentation.

I guess that this is the only way to solve it with the MongoDB C# driver.

Thanks for your help!

Robert Stam

Feb 10, 2012, 10:36:20 AM
to mongod...@googlegroups.com
I don't think it is valid. I just don't know why you don't get an error message. The results it returns are incorrect.

Here's a simplified example showing that it returns the wrong results. The query is intended to return all documents that have a t value starting with "x" or "y", but it obviously returns documents that don't match:

> db.test.remove()
> db.test.insert({t:"a"})
> db.test.insert({t:"b"})
> db.test.find({t:{$regex:[/^x/,/^y/]}})
{ "_id" : ObjectId("4f3530db0105ec90fee42bbd"), "t" : "a" }
{ "_id" : ObjectId("4f3530df0105ec90fee42bbe"), "t" : "b" }
>

Have you seen an example in the documentation that shows a $regex element whose value is an array of regexes? If so, where?
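
Since the shell accepts the array form without complaint, one option is to catch it on the client before the query is sent. The following is only a sketch in plain JavaScript under that assumption: assertValidRegexConditions is a hypothetical helper, it inspects top-level field conditions only, and it is not something the shell or any driver provides.

```javascript
// True when `value` is acceptable as a $regex operand: a string or a RegExp.
function isValidRegexOperand(value) {
  return typeof value === "string" || value instanceof RegExp;
}

// Throw if any top-level field condition uses $regex with an unsupported
// value, such as an array of regexes. Nested operators are not inspected.
function assertValidRegexConditions(query) {
  for (const [field, condition] of Object.entries(query)) {
    const isOperatorObject =
      condition !== null &&
      typeof condition === "object" &&
      !(condition instanceof RegExp) &&
      !Array.isArray(condition);
    if (isOperatorObject && "$regex" in condition && !isValidRegexOperand(condition.$regex)) {
      throw new TypeError(`$regex for field "${field}" must be a string or a RegExp`);
    }
  }
}

// A single regex passes the check silently.
assertValidRegexConditions({ t: { $regex: /^x/ } });

// The array form from the example above is rejected.
try {
  assertValidRegexConditions({ t: { $regex: [/^x/, /^y/] } });
} catch (err) {
  console.log(err.message);
}
```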

Марјан Николовски

Feb 10, 2012, 10:48:27 AM
to mongodb-user
On the MongoDB API site, the specification for $regex shows a comparison
with a single regex, not with an array of regexes.
I guess I saw the array form somewhere on Stack Overflow.

Sam Millman

Feb 10, 2012, 11:02:23 AM
to mongod...@googlegroups.com
Yeah, many people take Stack Overflow as always having the right answer, but it doesn't. You must specify each regex separately. To apply multiple regexes to the same field, I use the $or operator.

Sam Millman

Feb 10, 2012, 11:04:38 AM
to mongod...@googlegroups.com
Though if you are good with regexes, I believe you can apply an "or" inside the regex itself with the "|" symbol. I am not a pro at regexes, but I do know it is possible.

Robert Stam

Feb 10, 2012, 11:15:20 AM
to mongod...@googlegroups.com
Good point that you can use "|" inside the regular expression to match multiple alternatives, like this:

{ "Tags" : /^programm|^microsoft|^visual|^studio|^assert|^2010|^c#/i }

Since Tags is an array the regular expression will be tested against all the entries in the array and all it takes is for one entry to match for the document to be included in the results.
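
A rough illustration of that behaviour in plain JavaScript, with no database involved. The helpers buildAlternation and matchesAnyTag are hypothetical, Blog3 is an invented document added only to show a non-match, and the sketch uses the grouped form ^(?:a|b), which matches the same strings as ^a|^b.

```javascript
// Escape characters that are special in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Join the prefixes into one anchored, case-insensitive pattern: /^(?:a|b|c)/i
function buildAlternation(prefixes) {
  return new RegExp("^(?:" + prefixes.map(escapeRegex).join("|") + ")", "i");
}

// Mimic a regex condition on an array field: the document matches
// as soon as at least one element of Tags matches the pattern.
function matchesAnyTag(doc, pattern) {
  return doc.Tags.some((tag) => pattern.test(tag));
}

const blogs = [
  { Name: "Blog1", Tags: ["testing", "visual-studio", "2010", "c#"] },
  { Name: "Blog2", Tags: ["parallel", "microsoft", "c#"] },
  { Name: "Blog3", Tags: ["gardening"] }, // hypothetical non-matching document
];

const pattern = buildAlternation([
  "programm", "microsoft", "visual", "studio", "assert", "2010", "c#",
]);

const matched = blogs.filter((blog) => matchesAnyTag(blog, pattern)).map((blog) => blog.Name);
console.log(matched.join(", ")); // prints: Blog1, Blog2
```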

Марјан Николовски

Feb 10, 2012, 11:28:01 AM
to mongodb-user
What would be the best practice: expanding the query with $or, or
grouping the alternatives in a single regex?

Robert Stam

Feb 10, 2012, 11:32:03 AM
to mongod...@googlegroups.com
They are both equally correct.

I would benchmark both approaches with your data and your regular expressions and see which performs better. Use a combination of timing the queries and explain() to determine which is better for your use case.

Robert Stam

Feb 10, 2012, 11:58:34 AM
to mongod...@googlegroups.com
FYI, I've created a server JIRA requesting that an error message be returned by the server when the value for $regex in a query is not a regular expression or a string. See:

https://jira.mongodb.org/browse/SERVER-4928

Марјан Николовски

Feb 10, 2012, 12:04:46 PM
to mongodb-user
Excellent!

On Feb 10, 5:58 pm, Robert Stam <rob...@10gen.com> wrote:
> FYI, I've created a server JIRA requesting that an error message be
> returned by the server when the value for $regex in a query is not a
> regular expression or a string. See:
>
> https://jira.mongodb.org/browse/SERVER-4928