NGramAnalyzer is not working

Nima Ha

Sep 14, 2014, 9:48:00 AM
to rav...@googlegroups.com
I wrote a test just like this gist, but it returns nothing. Here's the test:

        [Fact]
        public void NGramAnalyzerTest()
        {
            using (var store = NewDocumentStore(runInMemory:false,dataDir:"Data"))
            {

                var p1 = new Person { Age = 18, Name = "John" };
                var p2 = new Person { Age = 21, Name = "Joe" };
                var p3 = new Person { Age = 27, Name = "Andy" };
                var p4 = new Person { Age = 31, Name = "Linda" };
                var p5 = new Person { Age = 45, Name = "Laura"};

                using (var session = store.OpenSession())
                {
                    session.Store(p1);
                    session.Store(p2);
                    session.Store(p3);
                    session.Store(p4);
                    session.Store(p5);
                    session.SaveChanges();
                }

                store.DatabaseCommands.PutIndex("PersonByName", new IndexDefinition
                {
                    Map = "from person in docs.Persons select new { person.Name }",
                    Indexes = {{"Name", FieldIndexing.Analyzed}},
                    Analyzers = { { "Name", typeof(NGramAnalyzer).AssemblyQualifiedName } }
                },true);

                using (var session = store.OpenSession())
                {
                    var query =
                        session.Query<Person>("PersonByName")
                            .Customize(x => x.WaitForNonStaleResults())
                            .OrderBy(p=>p.Name)
                            .Search(p => p.Name, "lin");
                    Assert.Equal(1,query.ToList().Count);
                }
                
            }
        }

I expect this query to find "Linda".

Itamar Syn-Hershko

Sep 14, 2014, 9:50:39 AM
to rav...@googlegroups.com
The NGram analyzer is probably configured for a gram size other than 3. For this use case you want a prefix query and a normal word-boundary analyzer.
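
To make the gram-size point concrete, here is a minimal sketch of which tokens an n-gram tokenizer would emit for "Linda" at different gram sizes. The `Grams` helper is hypothetical and is not part of Lucene or RavenDB; the real NGramAnalyzer's configured sizes are not shown in this thread, and whether a search matches also depends on how the query text itself is analyzed.

```csharp
using System;
using System.Collections.Generic;

public static class NGramSketch
{
    // Hypothetical helper, not a Lucene or RavenDB API: returns every
    // lowercased substring of `text` whose length lies in [minGram, maxGram].
    public static List<string> Grams(string text, int minGram, int maxGram)
    {
        var lower = text.ToLowerInvariant();
        var grams = new List<string>();
        for (var size = minGram; size <= maxGram; size++)
        {
            for (var start = 0; start + size <= lower.Length; start++)
            {
                grams.Add(lower.Substring(start, size));
            }
        }
        return grams;
    }

    public static void Main()
    {
        // Gram size 3: "lin" is one of the emitted tokens.
        Console.WriteLine(string.Join(", ", Grams("Linda", 3, 3)));
        // Gram size 2: no 3-character token is emitted, so the raw term "lin" is absent.
        Console.WriteLine(string.Join(", ", Grams("Linda", 2, 2)));
    }
}
```

With a gram size of 3 the token list contains "lin"; with a gram size of 2 it only contains two-character tokens, so an unanalyzed three-character term has nothing to match.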

--

Itamar Syn-Hershko
http://code972.com | @synhershko
Freelance Developer & Consultant

--
You received this message because you are subscribed to the Google Groups "RavenDB - 2nd generation document database" group.

Nima Ha

Sep 14, 2014, 10:00:28 AM
to rav...@googlegroups.com
I am a newcomer to Lucene.
Would you please be more specific? Here's what I actually want to do. We have some documents that contain addresses, and I want to perform a full-text search on these addresses (with partial and complete words). How can I achieve this using Lucene?
Thanks a lot.

Itamar Syn-Hershko

Sep 14, 2014, 10:05:05 AM
to rav...@googlegroups.com
You only need NGrams if you want partial-word matches, i.e. matching text in the middle of a word. For your use case prefix matches are sufficient: use StandardAnalyzer and issue a prefix query using a wildcard, e.g. FieldName:lin*

Nima Ha

Sep 14, 2014, 10:11:10 AM
to rav...@googlegroups.com
OK, I changed the code as follows:

            using (var store = NewDocumentStore(runInMemory:false,dataDir:"Data"))
            {

                var p1 = new Person { Age = 12, Name = "John" };
                var p2 = new Person { Age = 16, Name = "Joe" };
                var p3 = new Person { Age = 7, Name = "Andy" };
                var p4 = new Person { Age = 11, Name = "Linda" };
                var p5 = new Person { Age = 5, Name = "Laura"};

                using (var session = store.OpenSession())
                {
                    session.Store(p1);
                    session.Store(p2);
                    session.Store(p3);
                    session.Store(p4);
                    session.Store(p5);
                    session.SaveChanges();
                }

                store.DatabaseCommands.PutIndex("PersonByName", new IndexDefinition
                {
                    Map = "from person in docs.Persons select new { person.Name }",
                    Indexes = {{"Name", FieldIndexing.Analyzed}},
                    Analyzers = { { "Name", typeof(StandardAnalyzer).AssemblyQualifiedName } }
                },true);

                using (var session = store.OpenSession())
                {
                    var query =
                        session.Query<Person>("PersonByName")
                            .Customize(x => x.WaitForNonStaleResults())
                            .OrderBy(p=>p.Name)
                            .Search(p => p.Name, "Li*",escapeQueryOptions: EscapeQueryOptions.AllowPostfixWildcard);
                    Assert.Equal(1,query.ToList().Count);
                }
                
            }

As you can see, I am using StandardAnalyzer and "Li*", and I also pass EscapeQueryOptions.AllowPostfixWildcard to tell Lucene not to escape the postfix wildcard, but I am still getting no results. What could be wrong with my code?

Oren Eini (Ayende Rahien)

Sep 15, 2014, 1:23:16 AM
to ravendb
In the Map expression,

docs.Persons

should be:

docs.People
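
RavenDB's default convention pluralizes the entity name, so Person documents end up in the People collection and a map over docs.Persons indexes nothing. One way to avoid typing the collection name by hand is a strongly typed index definition. The following is a minimal sketch, assuming the RavenDB 2.x/3.0 client API in use at the time of this thread; the shape of the Person class (in particular the Id property) is an assumption, since the original class is not shown.

```csharp
using System.Linq;
using Lucene.Net.Analysis.Standard;
using Raven.Abstractions.Indexing;
using Raven.Client.Indexes;

// Assumed shape of the Person class used in the tests above.
public class Person
{
    public string Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonByName : AbstractIndexCreationTask<Person>
{
    public PersonByName()
    {
        // `people` is bound to whatever collection RavenDB derives for Person,
        // so the collection name never appears as a string literal.
        Map = people => from person in people
                        select new { person.Name };

        // Same field options as the string-based definition in the question.
        Index(x => x.Name, FieldIndexing.Analyzed);
        Analyze(x => x.Name, typeof(StandardAnalyzer).AssemblyQualifiedName);
    }
}
```

Under these assumptions, new PersonByName().Execute(store) would register the index, and session.Query<Person, PersonByName>() would query it in place of the string index name.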



Oren Eini
CEO

Mobile: +972-52-548-6969
Office: +972-4-622-7811
Fax: +972-153-4622-7811

Nima Ha

Sep 15, 2014, 1:55:22 AM
to rav...@googlegroups.com
Thanks :) Looks like RavenDB is smarter than me ;)