Lucene Index from SPARQL

Matt Goldberg

Nov 4, 2020, 5:17:11 PM
to TopBraid Suite Users
Hello- 

Is there a mechanism like a magic property to query the freetext index from SPARQL? I didn't see one mentioned in the TopBraid documentation. I've used some other systems that have this feature.

Thanks.

Holger Knublauch

Nov 4, 2020, 6:04:54 PM
to topbrai...@googlegroups.com

Hi Matt,

Yes, there are functions in the textindex namespace (http://topbraid.org/textindex#).

Open the file \server.topbraidlive.org\web\2018\textindex.ui.ttlx in TBC to see their declarations.

The name of the index for EDG vocabularies is "teamwork" - see the Text Indices admin page.

An example call would be

    ("teamwork" rdfs:comment "hello") textindex:query (?subject ?score ?literal ?graph)

If you have follow-up questions, please ask here.

Holger

--
You received this message because you are subscribed to the Google Groups "TopBraid Suite Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to topbraid-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/topbraid-users/ac42f5bc-323f-4254-8027-20e166d9e264n%40googlegroups.com.

Matt Goldberg

Nov 5, 2020, 9:12:05 AM
to TopBraid Suite Users
That magic property requires a specific property whose values are searched, which is not quite what I want at the moment. After digging some more, I found smf:luceneQuery, which appears to return results similar to Search the EDG. Are there any caveats with using this magic property?

Holger Knublauch

Nov 5, 2020, 6:49:55 PM
to topbrai...@googlegroups.com

Hi Matt,

smf:luceneQuery is in fact the magic property that Search the EDG uses. Without knowing more about your specific use case, though, I would say 'use with caution': the magic property is really meant for use within SWP and has some quirks.

For example, the user must define a limit, facets are kind of a pain to pass through, target graphs must opt in, working copies are out of scope, and lastly all graphs in the index are queried - the only way to narrow the scope is by providing a graph facet.

But generally speaking, this should work. The parameters are: search term, facets, offset, limit, and advanced syntax (when true, the magic property uses the input verbatim, essentially allowing Lucene operators to be used).

SELECT ?result ?score ?total
WHERE {
   ("term" ?facetFilters 0 100 false) smf:luceneQuery (?result ?score ?total)
}
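
For the Lucene-operator case, a minimal sketch of the same call with the fifth argument set to true, so that the search string is passed through verbatim. The search string is only an illustration and assumes standard Lucene query syntax; the smf prefix IRI is an assumption in case it is not already declared in your environment, and ?facetFilters is left unbound as in the query above.

```sparql
# Assumed prefix IRI for the smf namespace
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>

SELECT ?result ?score ?total
WHERE {
    # Arguments: search term, facets, offset, limit, advanced syntax.
    # With the last argument set to true, the operators in the string are kept as typed.
    ("hello AND NOT world" ?facetFilters 0 100 true) smf:luceneQuery (?result ?score ?total)
}
```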

If the user needs facets, we'd have to work up a more thorough example, which would involve temp graphs.

So the alternative would be to use the textindex:query magic property while iterating over the properties of interest.
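
A rough sketch of that alternative, using one UNION branch per property. The choice of rdfs:label and rdfs:comment and the search string are only illustrations; the index name "teamwork" and the argument order follow the textindex:query example earlier in this thread.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX textindex: <http://topbraid.org/textindex#>

SELECT ?subject ?score ?literal ?graph
WHERE {
    {
        # Matches in rdfs:label values
        ("teamwork" rdfs:label "hello") textindex:query (?subject ?score ?literal ?graph)
    }
    UNION
    {
        # Matches in rdfs:comment values
        ("teamwork" rdfs:comment "hello") textindex:query (?subject ?score ?literal ?graph)
    }
}
```

Each additional property of interest would get its own branch. Whether scores from separate branches are directly comparable is not something this sketch assumes.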

Holger

Matt Goldberg

Nov 6, 2020, 1:06:59 PM
to TopBraid Suite Users
Thanks for the heads up. I'll continue experimenting.

Matt Goldberg

Nov 10, 2020, 12:27:21 PM
to TopBraid Suite Users
Hello-

Is it true that the textindex:query magic property does not support Lucene operators?

Thanks.

Holger Knublauch

Nov 10, 2020, 5:24:56 PM
to topbrai...@googlegroups.com

I guess that's true. Because of the "normal" requirements this feature serves, it takes the input string and post-processes it beyond recognition: for example, it injects the names of the target graphs, joins multiple words with AND, and wraps each term in * ... *. However, you can use OR between words.
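
For instance, a sketch of a call that relies on OR between words. The two search words and the choice of rdfs:label are only illustrations, and the exact Lucene query that the function builds from this input is not shown here.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX textindex: <http://topbraid.org/textindex#>

SELECT ?subject ?score ?literal ?graph
WHERE {
    # OR between the words is kept; other operators may be rewritten by the post-processing
    ("teamwork" rdfs:label "hello OR world") textindex:query (?subject ?score ?literal ?graph)
}
```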

What operators in particular do you need?

Holger

Matt Goldberg

Nov 10, 2020, 7:20:12 PM
to TopBraid Suite Users
There are a few use cases we're looking at. At the moment, the primary one is a proof of concept for a global search that federates a free-text query. We have some other systems that also support Lucene, so the idea is that at least the basic operators (*, ?, OR, AND, NOT) would work, to provide moderately consistent performance across systems. I wrote an SWP service that uses smf:luceneQuery, which works nicely in terms of the query itself, but it does run into the challenges you mentioned above.

Matt Goldberg

Aug 2, 2021, 12:14:37 PM
to TopBraid Suite Users
Hello-

I had been successfully using smf:luceneQuery before upgrading to EDG 7. If I run the example query you provided earlier in this thread

SELECT ?result ?score ?total
WHERE {
   ("term" ?facetFilters 0 100 false) smf:luceneQuery (?result ?score ?total)
}

in EDG or TBC 6.3.2, it works fine. If I run the same query in EDG/TBC 7.0.3, I get the following error:
Failed to execute SPARQL request: java.lang.IndexOutOfBoundsException: Index 3 out of bounds for length 3

Any ideas why this might be happening? 
Thanks.

Holger Knublauch

Aug 2, 2021, 8:23:23 PM
to topbrai...@googlegroups.com

I suspect you now need to provide a fourth variable on the right-hand side, e.g.

    ("term" ?facetFilters 0 100 false) smf:luceneQuery (?result ?score ?total ?homeGraph)
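
Written out as a complete query, this would look roughly as follows. The prefix IRI for smf is an assumption in case it is not predeclared in your environment, and the meaning of the fourth variable is only inferred from its name.

```sparql
# Assumed prefix IRI for the smf namespace
PREFIX smf: <http://topbraid.org/sparqlmotionfunctions#>

SELECT ?result ?score ?total ?homeGraph
WHERE {
    # Four result variables on the right-hand side instead of three
    ("term" ?facetFilters 0 100 false) smf:luceneQuery (?result ?score ?total ?homeGraph)
}
```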

Holger

Matt Goldberg

Aug 3, 2021, 8:34:43 AM
to topbrai...@googlegroups.com
That works, thanks.

What is the correct syntax for the facet filters? I haven't been able to find any examples anywhere.

Holger Knublauch

Aug 3, 2021, 11:48:24 PM
to topbrai...@googlegroups.com

Hi Matt,

I need to ask a colleague and we'll get back to you. Meanwhile, looking at the source code, it seems that ?facetFilters can be a resource that is the subject of

    http://topbraid.org/saf/appliedfacet

and you can see low-level details of how those values are constructed from saf.ui.ttlx -> saf:buildAppliedFacetList

Holger

Holger Knublauch

Aug 4, 2021, 9:15:01 PM
to topbrai...@googlegroups.com

Matt,

I have checked and we consider this API to be rather internal and it's undocumented for a reason :)

Would you be able to use GraphQL for what you want to achieve? What is the use case for which you need those facets?

Holger

Matt Goldberg

Aug 4, 2021, 9:24:10 PM
to topbrai...@googlegroups.com
I've been looking into GraphQL more recently, so it's a possibility. When I originally asked this question, we had other systems that were using Lucene, so we were looking for a quick way to federate a Lucene query to multiple systems and aggregate the results, and that magic property worked perfectly well for this.

As for now, it was mostly just curiosity about how it worked.

Thanks for the feedback.

