Databricks & distinct & order by


Johan Van Noten

Feb 7, 2023, 5:37:13 AM
to ontop4obda
As already mentioned here about two years ago:
some platforms complain when ORDER BY is used in combination with DISTINCT.

My situation:
  • Using Databricks (one of these platforms)
  • My SPARQL query uses only ORDER BY (no DISTINCT)
  • Ontop 5.0.1 inserts a DISTINCT on the query
  • As a consequence, the query fails

My question:
It is unclear to me whether any of the configuration settings would be able to avoid this behavior. Am I overlooking some of these settings?

Thanks,
Johan

Benjamin Cogrel

Feb 7, 2023, 7:30:15 AM
to Johan Van Noten, ontop4obda
Hi Johan,

Is the ORDER BY condition projected by your SPARQL query? What is the error message?

The DISTINCTs are most likely inserted because of missing unique constraints, which need to be specified manually with Databricks.
This insertion can be disabled with the following option: ontop.cardinalityMode = LOOSE (https://ontop-vkg.org/guide/advanced/configuration).
However, beware that aggregations may give wrong results if you have denormalized data.

The DISTINCTs are inserted to guarantee that the virtual graph does not contain duplicated triples, as an RDF graph should be a set of triples.
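
To illustrate why dropping the DISTINCTs can affect aggregations, here is a minimal Python sketch. It does not use Ontop or Databricks; the table, the `worksIn` property and the helper functions are all hypothetical and only mimic how a denormalized table can yield duplicated triples when no unique constraint is known.

```python
from collections import Counter

# Hypothetical denormalized table: one row per (employee, project) pair,
# so the employee's department is repeated on every row.
rows = [
    {"employee": "alice", "department": "sales", "project": "p1"},
    {"employee": "alice", "department": "sales", "project": "p2"},
    {"employee": "bob", "department": "sales", "project": "p1"},
]


def works_in_triples(rows):
    """Map each row to an (employee, worksIn, department) triple, keeping duplicates."""
    return [(r["employee"], "worksIn", r["department"]) for r in rows]


def count_members(triples):
    """Count triples per department, similar to a COUNT grouped by department."""
    return Counter(obj for _, _, obj in triples)


bag = works_in_triples(rows)  # no DISTINCT: duplicated triples survive
graph = set(bag)              # with DISTINCT: a proper set of triples

bag_counts = count_members(bag)
set_counts = count_members(graph)
print(bag_counts["sales"], set_counts["sales"])
```

In this sketch the count without deduplication is inflated because alice's department is repeated once per project. Declaring the relevant unique constraints, where they actually hold, should avoid both the DISTINCT and the wrong count.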

We need to double-check, but issue 411 has probably been fixed already.

Best,
Benjamin
