Lucene Query Parser: Escape Underscore

mador...@gmail.com

Mar 10, 2014, 10:48:55 AM

to rav...@googlegroups.com

Hey,

When I parse a query with the Lucene query parser, I get the following:

QueryParser parser = new QueryParser....

parser.parse("Type:Ni_Sa_Test") => Type:"ni sa test"

How can I escape the underscore?

I want my result to be => Type:ni_sa_test

Moreover, I saw something strange when parsing "Type:Ni_Sa4_Test" => Type:Ni_Sa_Test

Why do I get the expected result when my query contains numbers, while the underscore is removed when it contains only letters?

Thanks

Itamar Syn-Hershko

Mar 10, 2014, 10:50:46 AM

to rav...@googlegroups.com

Use the KeywordAnalyzer on the Type field (or simply set the field to not be Analyzed)

--

Itamar Syn-Hershko

http://code972.com | @synhershko

Freelance Developer & Consultant

Author of RavenDB in Action

mador...@gmail.com

Mar 10, 2014, 11:24:25 AM

to rav...@googlegroups.com

Hey Itamar,

I can't use the KeywordAnalyzer because I have my own analyzer.

What do you mean by setting the field to not be analyzed?

I have the following query:

string query = "Yellow && Type:Cars_BMW_Test"

After queryParser.parse I get the following:

+Content:Yellow +Type:"Cars BMW Test"

How can I parse one field but not the other, and still query on both of them?

thanks :-)
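
To picture what "one field analyzed, the other not" means, here is a minimal sketch in plain Java. It is not Lucene code: the class PerFieldSketch and its methods are hypothetical, and the tokenizing function is only a rough stand-in for an analyzer that splits on underscores. Lucene itself provides PerFieldAnalyzerWrapper for choosing a different analyzer per field, which is roughly the idea modelled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Toy model of per-field analysis. NOT Lucene code; all names are hypothetical.
class PerFieldSketch {

    // Stand-in for a tokenizing analyzer: lowercases the text, then splits on
    // anything that is not a letter or digit, so '_' acts as a separator.
    static List<String> tokenizing(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Stand-in for keyword-style analysis: the whole value is a single token,
    // left exactly as written (case included).
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Use a specific analysis function for one field only.
    void register(String field, Function<String, List<String>> analyzer) {
        perField.put(field, analyzer);
    }

    // Fields without a registered function use the fallback.
    List<String> analyze(String field, String text) {
        return perField.getOrDefault(field, fallback).apply(text);
    }

    public static void main(String[] args) {
        PerFieldSketch analysis = new PerFieldSketch(PerFieldSketch::tokenizing);
        analysis.register("Type", PerFieldSketch::keyword);

        System.out.println(analysis.analyze("Content", "Yellow"));     // [yellow]
        System.out.println(analysis.analyze("Type", "Cars_BMW_Test")); // [Cars_BMW_Test]
    }
}
```

Note that the keyword-style path keeps the original casing, so in this model the indexed value and the query value have to be cased the same way.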

Itamar Syn-Hershko

Mar 10, 2014, 11:26:41 AM

to rav...@googlegroups.com

Your analyzer is tokenizing on underscore, hence what you are seeing. I suggest you read a bit on analyzed fields before using custom analyzers.
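
This is also why escaping does not change the result: the underscore is not a query-syntax character, so there is nothing for the parser to escape, and the term still reaches the analyzer as-is. The sketch below is plain Java, not the Lucene API. The class EscapeSketch is hypothetical, and its list of special characters is an assumption modelled on the classic Lucene query syntax.

```java
// Toy escaping routine. NOT the Lucene API; names and character list are assumptions.
class EscapeSketch {

    // Characters treated as query syntax in this sketch:
    // \ + - ! ( ) : ^ [ ] " { } ~ * ? | & /
    static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefix every query-syntax character with a backslash; leave the rest alone.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Ni_Sa_Test"));      // Ni_Sa_Test (unchanged)
        System.out.println(escape("Type:Ni_Sa_Test")); // Type\:Ni_Sa_Test
    }
}
```

Only the colon is touched; the underscores pass through untouched, so whatever happens to them happens later, in the analyzer.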

Itamar Syn-Hershko

Mar 10, 2014, 11:27:37 AM

to rav...@googlegroups.com

I'm also not sure why you are using the query parser directly???

Oren Eini (Ayende Rahien)

Mar 10, 2014, 11:28:52 AM

to ravendb

Your analyzer does what, exactly?

mador...@gmail.com

Mar 10, 2014, 11:41:00 AM

to rav...@googlegroups.com

My analyzer extends the standard analyzer and replaces Hebrew final letters with their regular forms (sorry if I made you laugh, I don't know how to describe it better in English :-)

I'm using the query parser because I'm building a BooleanQuery that combines my different queries with Occur.SHOULD.

Each query is the output of the query parser.

Is that explained well enough?
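
For readers unfamiliar with the normalization described above, here is a minimal sketch of folding Hebrew final letters into their regular forms. It is plain Java, not the poster's analyzer: the class FinalLetterSketch and its method are hypothetical, and in a real analyzer this step would normally run per token inside a token filter.

```java
// Toy normalization of Hebrew final letters. Hypothetical helper, not the
// poster's actual analyzer and not a Lucene class.
class FinalLetterSketch {

    // Map each final form to its regular form; every other character is kept.
    static char fold(char c) {
        switch (c) {
            case '\u05DA': return '\u05DB'; // final kaf   -> kaf
            case '\u05DD': return '\u05DE'; // final mem   -> mem
            case '\u05DF': return '\u05E0'; // final nun   -> nun
            case '\u05E3': return '\u05E4'; // final pe    -> pe
            case '\u05E5': return '\u05E6'; // final tsadi -> tsadi
            default:       return c;
        }
    }

    static String normalizeFinalLetters(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            sb.append(fold(text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "shalom" ends with a final mem, which becomes a regular mem.
        String word = "\u05E9\u05DC\u05D5\u05DD";
        System.out.println(normalizeFinalLetters(word));
    }
}
```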

Itamar Syn-Hershko

Mar 10, 2014, 11:59:39 AM

to rav...@googlegroups.com

If you are trying to allow for proper Hebrew search, take a look at this: http://code972.com/hebmorph . Extending the StandardAnalyzer doesn't take into account Hebrew acronyms, for example.

Unless you are using the QueryParser on the server side, it won't have much effect and will probably cause a lot of confusion later on. I suggest you look at LuceneQuery.BeginClause and similar to do that, and use a proper analyzer for your use case (StandardAnalyzer tokenizes on underscore, for example).

mador...@gmail.com

Mar 10, 2014, 12:09:22 PM

to rav...@googlegroups.com

Thank you Itamar,

I'll check your suggestions.