Full Text Search Fuzzy Operator

Matías Parodi

Nov 28, 2013, 8:57:06 PM
to sta...@clarkparsia.com
Hello,

The documentation says that Stardog supports the full-text-search operator ~ with weights like in foo~0.8.

I have "induct" in the database, and I'm trying to query using "induccion~0.4", which gives 0 results.

If I didn't get it wrong, "induccion~0.4" should match anything within an edit distance of len(induccion)*(1-0.4)=5.4. So why is it not returning "induct" (I actually tried with all ways in 0-1)?

Am I doing something wrong or is it a bug?

Thanks,
Matt

Matías Parodi

Nov 28, 2013, 9:01:56 PM
to sta...@clarkparsia.com
I clearly meant "weights", not "ways" :)

Mike Grove

Dec 2, 2013, 10:22:44 AM
to stardog
Hard to say without seeing a complete working example that demonstrates the behavior. The full-text searches are a hybrid between what Lucene supports and the query engine, so Lucene could be coming up with that result, but if it doesn't join with the rest of the query it gets filtered out.

Cheers,

Mike
 


--
-- --
You received this message because you are subscribed to the C&P "Stardog" group.
To post to this group, send email to sta...@clarkparsia.com
To unsubscribe from this group, send email to
stardog+u...@clarkparsia.com
For more options, visit this group at
http://groups.google.com/a/clarkparsia.com/group/stardog?hl=en

Matías Parodi

Dec 4, 2013, 5:45:01 AM
to sta...@clarkparsia.com
I see what you mean, here is an example:

SELECT ?text
WHERE {
  GRAPH ?g {
    ?text <http://jena.hpl.hp.com/ARQ/property#textMatch> ( "induct" 0.0 ).
  }
}

Returns "induct".

SELECT *
WHERE {
  GRAPH ?g {
    ?text <http://jena.hpl.hp.com/ARQ/property#textMatch> ( "induccion~0.4" 0.0 ).
  }
}

Returns nothing.

The distance between "induct" and "induccion" is 4 or 5 (depending on the cost of replacing a character), which is less than length(induccion)*(1-0.4)=5.4.
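
The "4 or 5" figure can be checked with a small edit-distance routine. The sketch below is not part of Stardog or Lucene; `edit_distance` and its `substitution_cost` parameter are hypothetical names introduced only to illustrate the two cost models mentioned above (a replacement counted as one edit, or as a delete plus an insert).

```python
def edit_distance(a: str, b: str, substitution_cost: int = 1) -> int:
    """Levenshtein distance with unit insert/delete cost.

    substitution_cost is a hypothetical knob: 1 treats a replaced character
    as a single edit, 2 treats it as a delete followed by an insert.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else substitution_cost
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


term, stored = "induccion", "induct"
print(edit_distance(term, stored, substitution_cost=1))  # 4
print(edit_distance(term, stored, substitution_cost=2))  # 5
```

Both values are below the 5.4 bound computed from the query term's length, which is why a match was expected under that reading of the weight.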

Cheers,
Matt

Mike Grove

Dec 4, 2013, 7:14:32 AM
to stardog
Can you provide the data?

Cheers,

Mike

Matías Parodi

Dec 4, 2013, 7:20:46 AM
to sta...@clarkparsia.com
Here is an example:

---

:keyword
    rdf:type owl:DatatypeProperty;
    rdfs:range xsd:string
.

# To try querying "induccion~0.4"
:Foo :keyword "induct".

# To try querying "computer~0.4"
:Bar :keyword "computer".
:Yada :keyword "computers".
---

Matt



Mike Grove

Dec 9, 2013, 10:17:11 AM
to stardog
I think this is working as expected. I verified that we're not pruning anything we shouldn't, and that Lucene is returning a FuzzyQuery.

When I search for 'computer~0.4', I get both :Bar and :Yada. I don't get results for 'induccion~0.4', and I think that's because "induct" is too far away from it. 'inducci~0.4' returns results, so the fuzzy matching is working.
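
One way to see why "induct" could be "too far away" is to model how Lucene may score fuzzy matches. The sketch below rests on two assumptions that the thread does not confirm for Stardog: (a) the classic Lucene scoring, where similarity is 1 minus the edit distance divided by the length of the shorter term, and (b) the behavior of newer Lucene releases, which convert the weight to a number of edits and cap it at 2. The function names are hypothetical and nothing here calls Lucene itself.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def classic_similarity(query_term: str, indexed_term: str) -> float:
    # ASSUMPTION (a): similarity is normalized by the SHORTER term's length,
    # with no required prefix. Terms are assumed non-empty.
    shorter = min(len(query_term), len(indexed_term))
    return 1.0 - levenshtein(query_term, indexed_term) / shorter


def capped_max_edits(query_term: str, min_similarity: float, cap: int = 2) -> int:
    # ASSUMPTION (b): the weight becomes an edit budget, capped at `cap` edits.
    return min(int((1.0 - min_similarity) * len(query_term)), cap)


MIN_SIMILARITY = 0.4
CASES = [
    ("induccion", "induct"),
    ("inducci", "induct"),
    ("computer", "computer"),
    ("computer", "computers"),
]

for query_term, indexed_term in CASES:
    distance = levenshtein(query_term, indexed_term)
    similarity = classic_similarity(query_term, indexed_term)
    match_a = similarity > MIN_SIMILARITY
    match_b = distance <= capped_max_edits(query_term, MIN_SIMILARITY)
    print(f"{query_term}~{MIN_SIMILARITY} vs {indexed_term}: distance={distance}, "
          f"similarity={similarity:.3f}, match(a)={match_a}, match(b)={match_b}")
```

Under either assumption, "induccion" against "induct" fails (distance 4, similarity 1 - 4/6, roughly 0.33), while "inducci" against "induct" and "computer" against "computers" both pass, which is consistent with the results reported above. The key difference from the 5.4 estimate is that the bound would be derived from the shorter term (6 characters) or from a fixed edit cap, not from the 9-character query term.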

Cheers,

Mike
