[Avoid checking Relational Database]

138 views
Skip to first unread message

Lee Huy

unread,
Oct 29, 2024, 3:58:57 AM10/29/24
to ontop4obda

Hi Ontop Team,


I have OneCase that:

I have 2 databases, 1 is Relational Database(RD) and 2 is Graph Database(GD).

I use Ontop for feature Translate SparQL from GD to SQL in RD.

So Source Code of Team here: https://github.com/ontop/ontop , i know that it must have connection with RD and checking, 

compare metadata GD with RD. After that it have final SQL.

And my question is, Can i remove the Step connection, checking compare with RD, i mean remove all step relate to RD.

I only need SQL generated from GD, i dont need to check again with RD.

So Team have solution for that, or have some way to custom source code?

Thanks for support.

Benjamin Cogrel

unread,
Oct 31, 2024, 3:57:42 AM10/31/24
to Lee Huy, ontop4obda
Hi Lee,

Thanks for your question, it is a setting we would like to better support in the future.

You can extract the DB metadata upfront and serialize it using the command ontop extract-db-metadata (see the CLI page [1]).
You can then pass it as an argument of the command ontop endpoint and if your mapping doesn't have complex SQL expressions that are not parsed by Ontop's SQL parser, it should not need a connection to prepare the endpoint.

As for getting the SQL query, please pass the parameter `--dev` to the ontop endpoint command. Then you can send your SPARQL query to the endpoint `/ontop/reformulate` and receive the generated SQL query.

I have not double-checked but hopefully the SQL connection should not be needed here.

One limitation to note is that the generated SQL query doesn't match the signature of the SPARQL query as an extra post-processing step is expected for converting the SQL results into SPARQL results.This post-processing usually performs transformations like building IRIs from templates and so on.

We are thinking of implementing an option for getting the SPARQL query fully translated into a SQL query with the same signature and corresponding data types. This is not possible for every SPARQL query in general, only for those that are strongly typed (i.e. having only one data type per column).

Please let us know how it works for you.

Best,
Benjamin


--
Please follow our guidelines on how to report a bug https://ontop-vkg.org/community/contributing/bug-report
---
You received this message because you are subscribed to the Google Groups "ontop4obda" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ontop4obda+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/ontop4obda/6b8da1af-14b0-4126-ba52-fbc2dbb70958n%40googlegroups.com.

Lee Huy

unread,
Nov 1, 2024, 3:19:55 AM11/1/24
to ontop4obda
Hi Benjamin,

Thanks for your replying,

Many thanks for your recommendation.
I understand your point and try with your guide, it can generate SQL but need to dummy jdbc because of jdbc is required in code.
the command ontop extract-db-metadata generated a dbmetadata file which is structure of database, and when i run:
ontop endpoint --mapping r2rml.ttl --db-metadata db-metadata.json --dev ,
basically it can avoid checking connection but still need to compare or mapping with db-metadata.
Its almost perfect because i dont want to check or generate db-metadata because in  r2rml.ttl have enough metadata in it.
So Anyway, i just run, for example like this command:  ontop endpoint --mapping r2rml.ttl --dev. No need any related to database. 
Do you have any idea?

Thanks alot,
Best Regards,
Lee

Vào lúc 14:57:42 UTC+7 ngày Thứ Năm, 31 tháng 10, 2024, benjami...@ontopic.ai đã viết:

Benjamin Cogrel

unread,
Nov 2, 2024, 6:36:05 AM11/2/24
to Lee Huy, ontop4obda
Hi Lee,

Regarding "because in r2rml.ttl have enough metadata in it", this metadata is not enough for Ontop. It needs at least to know about column datatypes. It also extracts valuable info about constraints (e.g. primary keys and foreign keys).

If you don't want to have any database connection checked, please consider generating the DB metadata JSON file with your own script.

The JDBC URL is required by Ontop, but you can provide a dummy value that doesn't point to a real database. 
Please provide the `jdbc.driver` entry in the properties file so Ontop can know which SQL dialect to use and doesn't try to extract it from the JDBC URL (which would require having the JDBC driver installed and may reject your dummy URL).

Best,
Benjamin


On Fri, Nov 1, 2024 at 8:20 AM Lee Huy <lehu...@gmail.com> wrote:
Hi Benjamin,

Thanks for your replying,

Many thanks for your recommendation.
I understand your point and try with your guide, it can generate SQL but need to dummy jdbc because of jdbc is required in code.
the command ontop extract-db-metadata generated a dbmetadata file which is structure of database, and when i run:
ontop endpoint --mapping r2rml.ttl --db-metadata db-metadata.json --dev ,
basically it can avoid checking connection but still need to compare or mapping with db-metadata.
Its almost perfect because i dont want to check or generate db-metadata because in  r2rml.ttl have enough metadata in it.
So Anyway, i just run, for example like this command:  ontop endpoint --mapping r2rml.ttl --dev. No need any related to database. 
Do you have any idea?

Thanks alot,
Best Regards,
Lee

Vào lúc 14:57:42 UTC+7 ngày Thứ Năm, 31 tháng 10, 2024, benjami...@ontopic.ai đã viết:
Hi Lee,

Einar Clementz

unread,
Nov 4, 2024, 1:50:10 AM11/4/24
to ontop4obda

Hi.

Using extract-db-metadata to fetch metadata look good.

I have earlier tried out bootstrap function. Against Snowflake is has issues.

Will extract-db-metadata use other commands to Snowflake than bootstrap?

 

For Snowflake bootstrap function issue “show objects in account”. That override whatever narrower filter used in connection string, like db=abc, schema=c123.

See this conversation. https://groups.google.com/g/ontop4obda/c/a1ihaHtCVfE/m/yEGD1BgpAAAJ

I must copy database objects from Snowflake to a local SQL-server to make bootstrap function in practice. 


Thanks. 

Kind regards.
Einar Clementz. 

Lee Huy

unread,
Nov 4, 2024, 10:40:48 PM11/4/24
to ontop4obda
Hi Benjamin,

Thanks for supporting, i will base on your recommendation to keep checking this.
Regards, 
Lee

Vào lúc 17:36:05 UTC+7 ngày Thứ Bảy, 2 tháng 11, 2024, benjami...@ontopic.ai đã viết:

Benjamin Cogrel

unread,
Nov 11, 2024, 12:30:03 PM11/11/24
to Einar Clementz, ontop4obda
Hi Einar,

It is indeed closely related to the feature of request of controlling the scope of the mapping bootstrapper.
I just opened an issue to track it: https://github.com/ontop/ontop/issues/830

Best,
Benjamin

Lee Huy

unread,
Dec 9, 2024, 9:03:35 PM12/9/24
to ontop4obda
Hi Benjamin,

I'm using this function to transform from SPARQL to SQL:
reformulateIntoNativeQuery(sparql)

And when i receive the result of SQL, in ALIAS column of SELECT command will be 'col1m12', 'col1m61',...
I know The ALIAS column will be generate random 1m to avoid duplicate column.
But if now i accept the duplicate column and want to remove value '1m' behind the column. How can i do?
Do i custom ontop source code and build again or any engine that can do this step?
I try to proceed the result of this function but cant cover all cases.

Can you help me with this case?

Thanks alot,
Regards.
Vào lúc 00:30:03 UTC+7 ngày Thứ Ba, 12 tháng 11, 2024, benjami...@ontopic.ai đã viết:

Benjamin Cogrel

unread,
Dec 12, 2024, 2:51:06 AM12/12/24
to Lee Huy, ontop4obda
Hi Lee,

What you are looking for is a future feature that I described in my first email as "getting the SPARQL query fully translated into a SQL query with the same signature and corresponding data types".
It still has to be implemented.

Until then, the post-processing step remains present, as the SQL values need to be translated into SPARQL/RDF values. Because of that, the SQL variables cannot have the same names as the SPARQL ones (hence the presence of aliases).

Aliases cannot be avoided with the current implementation. However, you can minimize having transformations being post-processed by adding `ontop.avoidPostProcessing=true` to your properties file.

To eliminate these aliases, you need to customize Ontop's code. The most straightforward "hacky" way is to alter the implementation of SQLGeneratorImpl.generateSourceQuery().

Best,
Benjamin


Lee Huy

unread,
Dec 12, 2024, 10:52:21 PM12/12/24
to ontop4obda
Hi Benjamin,

That "hacky" way is insane, i customize Ontop's code and it seem works really good to eliminate these aliases. 
But some case with Aggregation Functions Like COUNT, SUM,.. it only alias with no column name. For example i have case:

SELECT (COUNT(?employee) AS ?employeeCount)
WHERE {
    ?territory a ex:Territory ;
               ex:belongsemployee ?employee .
}

And SQL i receive: SELECT COUNT(*) AS "v2" FROM (SELECT DISTINCT v2."employee_id" AS "employee_id" .... I try to handle this but seem hard. You have any another "hacky" way to handle this?
Btw, many thanks for your solution

Best Regards,
HuyLee
Vào lúc 14:51:06 UTC+7 ngày Thứ Năm, 12 tháng 12, 2024, benjami...@ontopic.ai đã viết:

Benjamin Cogrel

unread,
Dec 13, 2024, 3:44:09 AM12/13/24
to Lee Huy, ontop4obda
Hi Lee,

You can analyse the substitution of the post-processing node (a ConstructionNode) and find more renamings like v2 -> employeeCount .

Best,
Benjamin


Reply all
Reply to author
Forward
0 new messages