case insensitive and exact match queries


Daniel Steigerwald

Jul 29, 2010, 4:31:24 PM
to ravendb
.Where("AuthorId:" + user.Id + " AND Text:*" + searchedText + "*")

Is it possible to make the searchedText search case insensitive and
match an exact phrase?

Ayende Rahien

Jul 29, 2010, 4:40:29 PM
to rav...@googlegroups.com
By default searches are case sensitive.
This query should do it for you:
AuthorId:authors/129 AND Text:"a full phrase"
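
A minimal sketch of building that clause from user input, assuming the phrase may itself contain quotes or backslashes that need escaping. The class and method names are hypothetical and not part of the RavenDB client. Case-insensitive matching also assumes the Text field is indexed with an analyzer that lowercases terms, as Lucene's StandardAnalyzer does.

```csharp
using System;

static class PhraseClauseDemo
{
    // Hypothetical helper: wraps a value in quotes to form a Lucene phrase
    // clause, escaping backslashes and double quotes inside the phrase.
    public static string Phrase(string field, string value)
    {
        string escaped = value.Replace("\\", "\\\\").Replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    static void Main()
    {
        string where = "AuthorId:authors/129 AND " + Phrase("Text", "a full phrase");
        Console.WriteLine(where); // AuthorId:authors/129 AND Text:"a full phrase"
    }
}
```

The resulting string can be passed to .Where(...) in the same way as the hand-written query.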

Daniel Steigerwald

Jul 29, 2010, 4:52:08 PM
to ravendb
I suppose you meant to write: by default, searches are case
INsensitive. They are case sensitive for Query(.. where contains..

Btw: Query is difficult for newbies to use, since it is both
undocumented and only half working.
A few basic LuceneQuery examples would be immensely useful.

Btw2: I want a "contains" with an exact match, like:
text.indexOf("I am exact match") > -1

Ayende Rahien

Jul 29, 2010, 5:23:51 PM
to rav...@googlegroups.com
Query is just LINQ, and we try to make it as LINQ-like as possible, given the underlying implementation.
LuceneQuery is Lucene syntax, and that is pretty well documented.

Can you explain a bit more about what problems you ran into?

Daniel Steigerwald

Jul 29, 2010, 5:33:31 PM
to ravendb
Sure.. I am still experimenting.. it seems like some strange
whitespace bug.

.Where("AuthorId:" + user.Id + " AND Text:*naučit *")
.Where("AuthorId:" + user.Id + " AND Text:*naučit *")

Two identical queries, but only one works ;)
The problem is in that whitespace at the end.

Maybe it's nbsp...

Ayende Rahien

Jul 29, 2010, 5:38:44 PM
to rav...@googlegroups.com
I am sorry, but can you show me what the actual query _is_? (The actual string, after all the formatting.)

Daniel Steigerwald

Jul 29, 2010, 5:40:29 PM
to ravendb
Damn, it was a non-breaking space ;) Sorry :)

This fixed it:

string cellText = "String with non breaking spaces.";
cellText = Regex.Replace(cellText, @"\u00A0", " ");

.Where(Regex.Replace("AuthorId:" + user.Id + " AND Text:*naučit *",
@"\u00A0", " "))
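
A minimal sketch of the same cleanup without a regular expression, assuming only U+00A0 needs replacing. The helper name NormalizeSpaces is hypothetical. The Main method shows why the two queries looked identical but compared as different strings.

```csharp
using System;

static class WhitespaceCleanup
{
    // Replaces every U+00A0 (non-breaking space) with an ordinary space.
    public static string NormalizeSpaces(string text)
    {
        return text.Replace('\u00A0', ' ');
    }

    static void Main()
    {
        // Renders the same as "Text:*naučit *", but ends the word with U+00A0.
        string pasted = "Text:*nau\u010Dit\u00A0*";
        string cleaned = NormalizeSpaces(pasted);

        Console.WriteLine(pasted == "Text:*nau\u010Dit *");  // False
        Console.WriteLine(cleaned == "Text:*nau\u010Dit *"); // True
    }
}
```

Applying this to the searched text before concatenating it into the query keeps the clean-up in one place.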

Daniel Steigerwald

Jul 29, 2010, 5:43:37 PM
to ravendb
Ayende, what would be the right solution?
1) Remove non-breaking spaces in the index definition (I don't like this)
2) Remove non-breaking spaces from the text somehow
3) Does Lucene have an option for dealing with non-breaking spaces?

Ayende Rahien

Jul 29, 2010, 5:47:23 PM
to rav...@googlegroups.com
Write an analyzer for Lucene that translates non-breaking spaces to spaces.
You can probably just define the StandardAnalyzer with the nbsp as a stop symbol.

Daniel Steigerwald

Jul 29, 2010, 6:27:25 PM
to ravendb
Damn, I was happy too soon. So, yet another question.
Is it possible to mix exact match and wildcards?

What I want: to search for a phrase, i.e. several words joined with
spaces, anywhere in the whole Text property.

Is this correct? Because it doesn't work.

.Where("AuthorId:" + user.Id + " AND Text:*\"tohle m\"*")

Daniel Steigerwald

Jul 29, 2010, 6:30:43 PM
to ravendb
Ohh "Lucene supports single and multiple character wildcard searches
within single terms (not within phrase queries)."

So is it impossible to search for whole phrases in text?

Daniel Steigerwald

Jul 29, 2010, 6:43:04 PM
to ravendb
I see light at the end of the tunnel:

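string searchedText = "end of the tunnel";

// Start with the author filter, then require every word to appear in Text.
string where = "AuthorId:" + user.Id;
if (searchedText != string.Empty)
{
    // Skip empty chunks so repeated spaces don't produce "Text:**".
    var chunks = searchedText.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    foreach (var chunk in chunks)
        where += " AND Text:*" + chunk + "*";
}
return MvcApplication.RavenSession.LuceneQuery<Matter>("Matters/Stream")
    .Where(where)
    .OrderByDescending(m => m.Updated)
    .ToArray();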

Is this

Tobias Grimm

Jul 29, 2010, 6:59:36 PM
to rav...@googlegroups.com
On Thursday, 29.07.2010, at 15:43 -0700, Daniel Steigerwald wrote:

> where += " AND Text:*" + chunk + "*";

Just a side note: Lucene searches with leading wildcards might be
expensive and should be used with care.

Tobias

Daniel Steigerwald

Jul 29, 2010, 7:27:28 PM
to ravendb
Thank you for pointing that out.
I can't believe that something like: foo.contains('ohh bla bla ohh')
is not supported for text searching.

Ayende Rahien

Jul 29, 2010, 9:03:28 PM
to rav...@googlegroups.com
Huh?

MyField:"blah blah blah" 

Daniel Steigerwald

Jul 29, 2010, 9:08:21 PM
to ravendb
I would like to search for "text with spaces" in an arbitrary document
property. Nothing special, if I may say so.

Ayende Rahien

Jul 29, 2010, 9:11:46 PM
to rav...@googlegroups.com
Here is how you do that:

Content:"text with spaces" 

Daniel Steigerwald

Jul 29, 2010, 9:17:45 PM
to ravendb
MyField:"blah blah blah" is -> MyField == "blah blah blah"
I need MyField.contains("blah blah blah")

Ayende Rahien

Jul 29, 2010, 9:18:53 PM
to rav...@googlegroups.com
Nope, not in Lucene
It means contains
We actually have to work hard to be able to do ==
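
The distinction can be illustrated with a simplified model. This is only a sketch, not Lucene itself: Tokenize is a hypothetical stand-in for an analyzer (it just lowercases and splits on whitespace), and all names are made up for illustration.

```csharp
using System;
using System.Linq;

static class PhraseVsExact
{
    // Simplified stand-in for an analyzer: lowercase, then split on whitespace.
    static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ', '\t', '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Analyzed field: a phrase matches when its tokens occur consecutively.
    public static bool PhraseMatch(string fieldValue, string phrase)
    {
        string[] doc = Tokenize(fieldValue);
        string[] query = Tokenize(phrase);
        if (query.Length == 0) return false;

        for (int start = 0; start + query.Length <= doc.Length; start++)
        {
            if (doc.Skip(start).Take(query.Length).SequenceEqual(query))
                return true;
        }
        return false;
    }

    // Not-analyzed field: the whole value is a single term, so only == matches.
    public static bool ExactMatch(string fieldValue, string value)
    {
        return fieldValue == value;
    }

    static void Main()
    {
        string field = "Well, blah blah blah is all he said";
        Console.WriteLine(PhraseMatch(field, "Blah blah blah")); // True
        Console.WriteLine(ExactMatch(field, "blah blah blah"));  // False
    }
}
```

In this model a phrase matches whole tokens only, so a fragment that cuts a word in half, such as "tohle m", does not behave like a substring search.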

Tobi

Jul 30, 2010, 5:30:53 AM
to rav...@googlegroups.com
On 30.07.2010 at 03:18, Ayende Rahien wrote:

> It means contains
> We actually have to work hard to be able to do ==

Which means:

Creating an Index with a NotAnalyzed field and doing queries like this:

[[MyField:blah blah blah]]

When coming from a SQL database, Lucene is IMHO the second biggest
thing to learn when switching to RavenDB, after the non-relational model.

But Lucene is very well documented, and playing around with Luke for a
while helped me a lot to wrap my mind around it.

http://code.google.com/p/luke/

Tobias

Ayende Rahien

Jul 30, 2010, 5:43:31 AM
to rav...@googlegroups.com
This might be a good topic for a FAQ: how to do various queries in relational form and in Raven form.

2010/7/30 Tobi <lista...@e-tobi.net>

Daniel Steigerwald

Jul 31, 2010, 8:39:46 AM
to ravendb
I am just curious: how can leading wildcards be more expensive than
trailing ones?
A leading wildcard is just a trailing wildcard on flipped text.
Text: abcdefg
Leading search: *bc

If it is expensive, does this help?

Flipped text: gfedcba
Ending search: cb*



On 30 Jul, 00:59, Tobias Grimm <listacco...@e-tobi.net> wrote:

Ayende Rahien

Jul 31, 2010, 8:48:25 AM
to rav...@googlegroups.com
Daniel,
Yes & no.

A leading wildcard search means having to scan the entire index, while a trailing wildcard at least gives you a place to start scanning.
The problem is that Lucene doesn't maintain a reversed text field. You could maintain one yourself if you wanted to, but Lucene doesn't do it for you (it would waste space).
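
A rough sketch of the reversed-field idea, using a sorted in-memory list of terms as a stand-in for the index's term dictionary. All names here are hypothetical and nothing below is Lucene or RavenDB API. A real setup would store the reversed terms at indexing time rather than rebuilding them on every search.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ReversedTerms
{
    static string Reverse(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Prefix search on a sorted list: binary search to the first candidate,
    // then read forward only while the prefix still matches.
    static List<string> WithPrefix(List<string> sorted, string prefix)
    {
        int i = sorted.BinarySearch(prefix, StringComparer.Ordinal);
        if (i < 0) i = ~i; // not found: ~i is the insertion point
        var hits = new List<string>();
        while (i < sorted.Count && sorted[i].StartsWith(prefix, StringComparison.Ordinal))
            hits.Add(sorted[i++]);
        return hits;
    }

    // A "*suffix" search answered as a "prefix*" search over reversed terms.
    public static List<string> EndingWith(IEnumerable<string> terms, string suffix)
    {
        var reversed = terms.Select(Reverse).ToList();
        reversed.Sort(StringComparer.Ordinal);
        return WithPrefix(reversed, Reverse(suffix)).Select(Reverse).ToList();
    }

    static void Main()
    {
        var terms = new[] { "tunnel", "funnel", "light", "end" };
        Console.WriteLine(string.Join(", ", EndingWith(terms, "nnel"))); // funnel, tunnel
    }
}
```

This only covers a purely leading wildcard. A query with wildcards on both sides, such as *chunk*, still has no fixed starting point in either ordering.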