virtual graphs performance


PaulZH

Jun 6, 2016, 2:40:21 AM
to Stardog

I am running the following query on a virtual graph.

./stardog query CRAB "SELECT * {GRAPH <virtual://CRAB> {?x a <http://www.w3.org/ns/locn#Address>}} LIMIT 5"
+-------------------------------------------+
|                     x                     |
+-------------------------------------------+
| http://id.vlaanderen.be/id/adres/84401    |
| http://id.vlaanderen.be/id/adres/84402    |
| http://id.vlaanderen.be/id/adres/84403    |
| http://id.vlaanderen.be/id/adres/84404    |
| http://id.vlaanderen.be/id/adres/84405    |
+-------------------------------------------+

Query returned 5 results in 00:02:30.072


Any suggestions or tips to speed up this type of query?


Other queries are fine.


./stardog query CRAB "SELECT * {GRAPH <virtual://CRAB> {<http://id.vlaanderen.be/id/adres/84405> ?p ?v.}} LIMIT 5"
+-------------------------------------+--------------+
|                  p                  |      v       |
+-------------------------------------+--------------+
| http://purl.org/dc/terms/identifier | 84405        |
| rdf:type                            | locn:Address |
+-------------------------------------+--------------+

Query returned 2 results in 00:00:00.091



Evren Sirin

Jun 6, 2016, 2:26:04 PM
to Stardog
This is probably because solution modifiers in SPARQL (including
LIMIT) are currently not pushed into SQL. If you have a selective
query this is not a problem, but otherwise the SQL query might be
expensive. We'll be fixing this issue soon.

Best,
Evren
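
To make the point about LIMIT not being pushed into SQL concrete, here is a minimal sketch using Python's built-in sqlite3 module. It does not show Stardog's internals; the table name `address` and the row count are assumptions made purely for illustration. It only contrasts fetching every row and trimming afterwards with letting the database apply the LIMIT itself.

```python
import sqlite3

# Hypothetical stand-in for the relational table behind the virtual graph.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO address (id) VALUES (?)",
    [(i,) for i in range(84401, 94401)],
)

def limit_applied_after_sql(conn, n):
    """LIMIT not pushed down: the database returns every row, the caller keeps n."""
    rows = conn.execute("SELECT id FROM address ORDER BY id").fetchall()
    return len(rows), [r[0] for r in rows[:n]]

def limit_pushed_into_sql(conn, n):
    """LIMIT pushed down: the database stops after producing n rows."""
    rows = conn.execute(
        "SELECT id FROM address ORDER BY id LIMIT ?", (n,)
    ).fetchall()
    return len(rows), [r[0] for r in rows]

fetched_late, first_late = limit_applied_after_sql(conn, 5)
fetched_pushed, first_pushed = limit_pushed_into_sql(conn, 5)
print("rows transferred without pushdown:", fetched_late)
print("rows transferred with pushdown:", fetched_pushed)
```

Both functions return the same five ids, but the first one moves the whole table out of the database to get them, which is the kind of cost that would explain a slow non-selective query.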

PaulZH

Jun 8, 2016, 3:24:53 AM
to Stardog
Hi Evren,

Good to hear.

Another type of query we have some concerns about is the following:

./stardog query CRAB "SELECT DISTINCT ?type {GRAPH <virtual://CRAB> {?x a ?type.}}"
+------------------------------------+
|                type                |
+------------------------------------+
| http://dbpedia.org/ontology/City   |
| locn:Address                       |
| http://dbpedia.org/ontology/Street |
+------------------------------------+

Query returned 3 results in 00:03:25.851

Evren Sirin

Jun 8, 2016, 5:02:07 PM
to Stardog
Do you have a large number of mappings? Also what SQL backend are you using?

Best,
Evren

Paul Hermans

Jun 9, 2016, 3:43:47 AM
to sta...@clarkparsia.com
Hi Evren,

Number of mappings: not that many
(I can send you the R2RML file offline).
The SQL backend: Microsoft SQL Server on the MS Azure cloud (private)
(If useful, I can give you access to it).

Paul




nata...@gmail.com

Jun 14, 2016, 4:31:05 AM
to Stardog
Hey Paul,

What you are doing here is essentially querying all virtual mappings and reporting the URI and type for each.

Because you are doing this on a relational database, you are selecting all "rows" for "every mapping" (table).
This is kind of cool for an RDF store but not for an RDBMS.

In my experience you need to design your mappings really carefully (step by step) and keep track of the SQL that is generated for your database.
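
As a rough illustration of why {?x a ?type} touches all rows of every mapped table, here is a minimal Python sketch. The table names and the street and city IRI prefixes are hypothetical; only the class IRIs and the address prefix appear in this thread. The SQL it builds is a naive translation for illustration, not the SQL Stardog actually generates.

```python
import sqlite3

# Hypothetical mappings: table name -> (IRI prefix, class IRI).
# Table names and the example.org prefixes are invented for illustration.
MAPPINGS = {
    "address": ("http://id.vlaanderen.be/id/adres/", "http://www.w3.org/ns/locn#Address"),
    "street": ("http://example.org/id/street/", "http://dbpedia.org/ontology/Street"),
    "city": ("http://example.org/id/city/", "http://dbpedia.org/ontology/City"),
}

def sql_for_unbound_type(mappings):
    """Naive translation of {?x a ?type}: one full scan per mapped table."""
    branches = [
        f"SELECT '{prefix}' || id AS x, '{cls}' AS rdf_type FROM {table}"
        for table, (prefix, cls) in mappings.items()
    ]
    return "\nUNION ALL\n".join(branches)

# Small in-memory database with 1000 rows per hypothetical table.
conn = sqlite3.connect(":memory:")
for table in MAPPINGS:
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY)")
    conn.executemany(
        f"INSERT INTO {table} (id) VALUES (?)", [(i,) for i in range(1000)]
    )

sql = sql_for_unbound_type(MAPPINGS)
rows_scanned = conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]
distinct_types = {
    row[0] for row in conn.execute(f"SELECT DISTINCT rdf_type FROM ({sql})")
}
print(sql)
print("rows produced by the union:", rows_scanned)
print("distinct types:", len(distinct_types))
```

In this sketch the union produces a row for every record in every table even though the final answer is just three class IRIs, since the type is a constant per mapping and never needed the row data in the first place.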

Zachary Whitley

Jun 14, 2016, 6:24:18 AM
to sta...@clarkparsia.com





You can see what SQL is being generated by changing the log level. Stardog uses log4j2. I'm not sure exactly what logger you need to target for the generated SQL.


Evren Sirin

Jun 15, 2016, 12:01:25 AM
to Stardog
Queries of the form {?s ?p ?o} and {?s a ?type} are hard, as Natan
points out, and the current SQL generation is not as optimized as it
could be in these cases. We have a ticket to improve such queries, but
asking for distinct types should not be as bad. We'll take a look at this.

Inspecting the generated SQL queries is a good way to understand what
is going on. In version 4.1.1, released earlier today, we changed the
output of the query explain command to show the generated SQL query for
virtual graphs, so you don't need to deal with loggers any more.

Best,
Evren

nata...@gmail.com

Jun 15, 2016, 2:33:56 AM
to Stardog
Another tip is to consider disabling URI escaping (#2869), if possible of course, which can improve performance by a factor of 10.

N

Zachary Whitley

Jun 15, 2016, 6:28:33 AM
to sta...@clarkparsia.com
Can you give a little explanation of what URI escaping refers to and what the option name is? I don't think it made its way into the docs.

Natan Cox

Jun 15, 2016, 6:59:29 AM
to Stardog
OK. I have only used this feature in Ontop. More details can be found here: https://github.com/ontop/ontop/wiki/Ontop-Preferences.

Basically, you have to set the property org.obda.owlreformulationplatform.sqlGenerateReplace to false. This can only be done if the URI you are building will not contain any of !@#$&*[](),;:?=+'/ or a space. For example, primary key fields holding numbers or UUID strings will not contain such characters, so in those cases you can safely disable sqlGenerateReplace.

Note: The Stardog 4.1 release notes mention that the feature is configurable, but I don't know how either. See http://docs.stardog.com/release-notes/.


N


Zachary Whitley

Jun 15, 2016, 8:02:01 AM
to sta...@clarkparsia.com
Thanks. So it means skipping URL encoding. I seem to remember Ontop generating some interesting queries because it was doing URL encoding in SQL and calling a lot of nested string functions. I guess that's because there isn't a standard urlencode function in SQL.

Thanks for the tip.
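
To show what "a lot of nested string functions" can look like, here is a minimal Python sketch that builds a percent-encoding expression out of nested REPLACE calls, one per character from the list mentioned earlier in the thread. The function name, table, and column are hypothetical, and this is not claimed to be the SQL that Ontop or Stardog emit; it only illustrates the shape of the work the database must do for every row when encoding is left on.

```python
import sqlite3
from urllib.parse import quote

# Space plus the characters listed earlier in the thread.
RESERVED = " !@#$&*[](),;:?=+'/"

def nested_replace_sql(column, chars=RESERVED):
    """Wrap `column` in one REPLACE call per character to percent-encode it."""
    expr = column
    for ch in chars:
        encoded = f"%{ord(ch):02X}"      # e.g. ' ' -> %20, '/' -> %2F
        literal = ch.replace("'", "''")  # escape the single quote for SQL
        expr = f"REPLACE({expr}, '{literal}', '{encoded}')"
    return expr

expr = nested_replace_sql("k")
print("nested REPLACE calls:", expr.count("REPLACE("))

# Run it on a hypothetical key column to check it matches Python's own encoder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k TEXT)")
conn.execute("INSERT INTO t (k) VALUES (?)", ("a b/c",))
encoded_in_sql = conn.execute(f"SELECT {expr} FROM t").fetchone()[0]
print(encoded_in_sql == quote("a b/c", safe=""))
```

With numeric or UUID keys none of these replacements ever fires, so every call is wasted work per row, and wrapping the key column in functions may also keep the database from using an index on it.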

Natan Cox

Jun 15, 2016, 8:11:23 AM
to Stardog
That's the one! Enjoy.

N

Evren Sirin

Jun 15, 2016, 9:21:55 AM
to Stardog
You can set percent.encode=false in the virtual graph options to
disable percent encoding for IRIs.

Best,
Evren