Hi,
I've got a synthetic dataset consisting of 10M triples in the following pattern:
and the FOAF vocabulary. I query the database with RDFS reasoning, and here are the completion times (from the command line) that I get for each query over 10 repetitions:
Query 1: SELECT (COUNT(*) as ?all) WHERE {?person rdf:type foaf:Person}
Time outputs:
Query 2: SELECT (COUNT(DISTINCT ?person) as ?all) WHERE {?person rdf:type foaf:Person}
Time outputs:
I have two questions regarding these numbers, if you'd be able to assist:
- Is this expected? Should the "distinct" in the query make such a difference (more than 2x)?
- Is there some sort of caching happening after the first query has run? There is a noticeable difference between the completion time of the first run and the subsequent ones in Query 1, though not so much in Query 2, it seems.
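In case it helps with reproducing this, here is a minimal sketch of how the first (cold) run could be separated from the later (warm) ones when timing repetitions. It is only an illustration: `run_query`, `time_repetitions` and `cold_vs_warm` are hypothetical names, and `run_query` is a placeholder for whatever actually invokes the query, not the real command-line call.

```python
import time
from statistics import mean


def time_repetitions(run_query, repetitions=10):
    """Call run_query `repetitions` times and return wall-clock seconds per call."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def cold_vs_warm(timings):
    """Return (first run, mean of the remaining runs)."""
    if len(timings) < 2:
        raise ValueError("need at least two runs to compare cold and warm")
    return timings[0], mean(timings[1:])


def run_query():
    # Hypothetical placeholder: replace with the real query invocation.
    sum(range(10000))


if __name__ == "__main__":
    cold, warm = cold_vs_warm(time_repetitions(run_query, repetitions=10))
    print(f"first run: {cold:.6f}s, mean of later runs: {warm:.6f}s")
```

A large gap between the first run and the mean of the later runs would be consistent with some kind of warm-up or caching, though it would not by itself show where that happens.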
Thanks a lot,
Valentino