Query data & versioning data at the same time

Rubén Navarro Piris

Apr 8, 2016, 8:24:54 AM

to Stardog

Hi!

The Java API allows sending SPARQL queries either to a database or to that database's revision history, but only separately. Is there a way to send a single SPARQL query against both datasets at the same time?
(An example use case: find all resources of type <TYPE> that were modified last week. Here the 'all resources of type <TYPE>' information would be in the database, and the 'were modified last week' information would be in the revision history.)

Thanks in advance!

Ruben

Zachary Whitley

Apr 8, 2016, 9:18:03 AM

to Stardog

You might be able to use the SERVICE keyword [1], but I'm not sure whether the version info is exposed as a generic SPARQL endpoint. I wouldn't be surprised if it is; I just don't know what the URL might be. I'll take a look and see if I can find anything.

[1] http://docs.stardog.com/#_federated_queries

--
-- --
You received this message because you are subscribed to the C&P "Stardog" group.
To post to this group, send email to sta...@clarkparsia.com
To unsubscribe from this group, send email to
stardog+u...@clarkparsia.com
For more options, visit this group at
http://groups.google.com/a/clarkparsia.com/group/stardog?hl=en

Zachary Whitley

Apr 8, 2016, 9:25:26 AM

to Stardog

Found it. It's documented at http://docs.stardog.com/#_query_version_metadata and the endpoint should be

/{db}/vcs/query
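
A minimal sketch of how the two suggestions above could be combined: the main query runs against the database and a SERVICE block targets the `/{db}/vcs/query` endpoint. It only builds the query string, so it does not depend on any Stardog client classes. The server URL `http://localhost:5820`, the database name `mydb`, the type `ex:Document`, and the `ex:modified` property linking a version to a resource are all placeholders made up for illustration; the real version vocabulary has to be taken from the docs linked above.

```java
/** Builds a SPARQL query that joins a local pattern with a remote SERVICE block. */
final class FederatedQueryBuilder {

    private FederatedQueryBuilder() {}

    /** Returns {serverUrl}/{db}/vcs/query, tolerating a trailing slash on serverUrl. */
    static String vcsEndpoint(String serverUrl, String db) {
        String base = serverUrl.endsWith("/")
                ? serverUrl.substring(0, serverUrl.length() - 1)
                : serverUrl;
        return base + "/" + db + "/vcs/query";
    }

    /** Wraps remotePattern in a SERVICE clause and joins it with localPattern. */
    static String build(String prefixes, String localPattern,
                        String endpoint, String remotePattern) {
        return prefixes + "\n"
                + "SELECT * WHERE {\n"
                + "  " + localPattern + "\n"
                + "  SERVICE <" + endpoint + "> {\n"
                + "    " + remotePattern + "\n"
                + "  }\n"
                + "}";
    }

    public static void main(String[] args) {
        String prefixes =
                "PREFIX prov: <http://www.w3.org/ns/prov#>\n"
              + "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
              + "PREFIX ex: <http://example.org/>";

        // Assumed server URL and database name.
        String endpoint = vcsEndpoint("http://localhost:5820", "mydb");

        // ex:Document and ex:modified are hypothetical placeholders,
        // not part of any documented Stardog vocabulary.
        String local = "?resource a ex:Document .";
        String remote =
                "?version ex:modified ?resource ; prov:generatedAtTime ?date . "
              + "FILTER(?date >= \"2016-04-01T00:00:00Z\"^^xsd:dateTime)";

        System.out.println(build(prefixes, local, endpoint, remote));
    }
}
```

The resulting string can be sent through whatever query API is already in use. The shared variable `?resource` is what joins the two result sets.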

Rubén Navarro Piris

Apr 8, 2016, 9:52:22 AM

to sta...@clarkparsia.com

Thanks for the info, Zachary!

I'll take a look at that possibility, but I have two concerns:

- authentication: the query is sent by an authenticated user, and the remote query (the one in the SERVICE clause) should run as that same user. Does the SERVICE clause support authentication, or only public endpoints?

- scalability: let's assume that the end result is small, but the result of the inner query is large (e.g. the number of instances of type <TYPE> is huge, but modifications are rare). In this case the query plan will be very inefficient, since a lot of data has to be loaded over HTTP and then merged with the result in the Stardog server. If there were a way of querying across different databases (sort of an SQL bridge mechanism), at least between a database and its associated versioning database, the query plan would most likely be much more efficient, since the merge would happen internally.

What do you think?

Cheers!

Ruben

Rubén Navarro Piris

Zachary Whitley

Apr 8, 2016, 9:58:27 AM

to Stardog

On Fri, Apr 8, 2016 at 9:52 AM, Rubén Navarro Piris <ruben.nav...@gmail.com> wrote:

> Thanks for the info, Zachary!
>
> I'll take a look at that possibility, but I have two concerns:
>
> - authentication: the query is sent by an authenticated user, and the remote query (the one in the SERVICE clause) should run as that same user. Does the SERVICE clause support authentication, or only public endpoints?

It does. You need to create a services.sdpass file in $STARDOG_HOME. See http://docs.stardog.com/#_http_authentication for the format.

> - scalability: let's assume that the end result is small, but the result of the inner query is large (e.g. the number of instances of type <TYPE> is huge, but modifications are rare). In this case the query plan will be very inefficient, since a lot of data has to be loaded over HTTP and then merged with the result in the Stardog server. If there were a way of querying across different databases (sort of an SQL bridge mechanism), at least between a database and its associated versioning database, the query plan would most likely be much more efficient, since the merge would happen internally.
>
> What do you think?

That's a valid concern, and you'll need to keep it in mind whenever you use the SERVICE keyword. If it becomes an issue, you might want to take a look at FedX [1], which tries to apply some query optimization to the distributed query.

[1] https://www.fluidops.com/en/company/training/open_source
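
For the specific case described in the question (many instances of the type, few modifications), one possible client-side workaround is to query the revision history first and then pass the small set of modified resources into the database query as a SPARQL 1.1 VALUES block, so only the small side ever travels over the wire. This is a sketch of that idea, not something suggested in the thread or taken from the Stardog docs; the class and method names are made up for illustration, and the IRIs are assumed to come from the result of the first query.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Builds a VALUES-restricted query from a list of already known IRIs. */
final class ValuesClause {

    private ValuesClause() {}

    /** Renders "VALUES ?variable { <iri1> <iri2> ... }". */
    static String forIris(String variable, List<String> iris) {
        String rows = iris.stream()
                .map(ValuesClause::bracket)
                .collect(Collectors.joining(" "));
        return "VALUES ?" + variable + " { " + rows + " }";
    }

    /** Selects only those of the given resources that have the given type. */
    static String restrictByType(String typeIri, List<String> modifiedIris) {
        return "SELECT ?resource WHERE {\n"
                + "  " + forIris("resource", modifiedIris) + "\n"
                + "  ?resource a " + bracket(typeIri) + " .\n"
                + "}";
    }

    /** Wraps an IRI in angle brackets, rejecting characters that would break the query. */
    private static String bracket(String iri) {
        if (iri.matches(".*[<>\"\\s].*")) {
            throw new IllegalArgumentException("Not a plain IRI: " + iri);
        }
        return "<" + iri + ">";
    }

    public static void main(String[] args) {
        // Hypothetical IRIs standing in for the result of the revision-history query.
        List<String> modified = List.of("http://example.org/a", "http://example.org/b");
        System.out.println(restrictByType("http://example.org/Document", modified));
    }
}
```

This costs two round trips instead of one, and it only pays off while the list of modified resources stays small enough to embed in a query.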
