Docker image for ontop

Martynas Jusevičius

Jul 18, 2018, 5:49:32 AM
to ontop4obda
Hi,

I haven't been able to use ontop yet, but the project looks promising. My experience is limited to D2RQ, years ago.

We are looking into an OBDA component for our architecture, and ontop fits the bill. However, what seems to be lacking is a Docker image.
I am aware of https://github.com/ontop/docker-ontop. I don't think it does enough, but maybe could be used as a base image.

On a high level, what we need is pretty simple: given an RDBMS, provide SPARQL access over it.
So the input should be a JDBC URL of the DB, and the output a functional SPARQL endpoint URL.

Most importantly, the process should be fully automated, so that there would be no need for the user to switch to Protege to get the mappings or something like that.
Mappings should be generated automatically by default, or optionally mounted from a file.
Run the image with the DB parameter, and the SPARQL endpoint is up - everything else happens behind the scenes.

Do you think that is currently achievable? What would be the steps?

From my point of view, the first question is: given DB connection details, how do I generate default mappings from the command line?
I hoped I could do this by bootstrapping as mentioned here https://github.com/ontop/ontop/wiki/OntopCLI#ontop-mapping, but after executing ontop bootstrap I got lost, because it takes a properties file, and I could not find its format anywhere.

Help is much appreciated.


Martynas

Martynas Jusevičius

Jul 23, 2018, 2:59:22 AM
to ontop4obda
So no one can offer any advice on this? IMO Docker is crucial in today's infrastructure...

Benjamin Cogrel

Jul 23, 2018, 11:47:18 AM
to ontop4obda
Hi Martynas,

First of all, welcome to the mailing list, and sorry for the late reply: some of us were traveling last week and we wanted to discuss it among ourselves first.

We think that a Docker image for Ontop that is easy to configure through environment variables, secrets [1] and files is indeed a very good and achievable idea.

Regarding mapping bootstrapping, we think it makes sense to propose it as an option, but not as a default behavior because bootstrapping is a *non-standard* way to use Ontop which, in our view, has a quite limited practical value in such a setting.

Currently in version 3, the standard place for the DB credentials is in the properties file, see for instance
https://github.com/ontop/ontop-api-examples/blob/version3/src/main/resources/example/books/exampleBooks.properties
(sorry, we have not yet documented the "format" and the general purpose of the properties file; that's something we need to do).
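
For readers looking for the format: it is a plain Java properties file. A minimal sketch of generating one from environment variables follows. The jdbc.* key names are assumed to match the linked example file and should be checked against it; the ONTOP_JDBC_* variable names and the function name are hypothetical.

```shell
#!/bin/sh
# Sketch: write an Ontop JDBC properties file from environment variables.
# Assumption: the jdbc.* key names match the example file linked above.
# The ONTOP_JDBC_* names are illustrative, not an established convention.
# Caveat: backslashes in values would need escaping in a properties file.
write_jdbc_properties() {
    out_file="$1"
    {
        printf 'jdbc.url=%s\n' "${ONTOP_JDBC_URL:?ONTOP_JDBC_URL is required}"
        printf 'jdbc.driver=%s\n' "${ONTOP_JDBC_DRIVER:?ONTOP_JDBC_DRIVER is required}"
        printf 'jdbc.user=%s\n' "${ONTOP_JDBC_USER:-}"
        printf 'jdbc.password=%s\n' "${ONTOP_JDBC_PASSWORD:-}"
    } > "$out_file"
}
```

Calling write_jdbc_properties jdbc.properties would produce a file that can then be handed to the CLI through its properties-file option.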

However it seems actually a better idea to give the credentials as Docker secrets and perhaps the "ontop bootstrap" command should also be revisited to accept them as parameters.

For this Docker image, it is probably better not to use the "classical" RDF4J war files (the Workbench and the SPARQL endpoint) since there is no point here in configuring new repositories through a Web interface. Instead, we would go for a lightweight approach (à la micro-service) with a standalone application embedding its own web server. Actually, Riccardo Tommasini already did a first nice prototype in this direction [2].

So, to sum up, a first plan could be:
 1. Create a new command line based on Spring Boot for launching a lightweight Ontop SPARQL endpoint with a mapping file.
 2. Refactor the "ontop bootstrap" command so as to make the properties file optional and to accept the DB credentials as parameters.
 3. Create a Docker image extracting the credentials from the "secret files" (optional) and using these two commands.

What do you think?

Best,
Benjamin and Guohui


[1] https://docs.docker.com/engine/swarm/secrets
[2] https://github.com/riccardotommasini/sse

Martynas Jusevičius

Jul 29, 2018, 3:58:06 PM
to ontop4obda
Hi,

now I am late with my reply :) I am on vacation.

So what is bootstrapping exactly? And why is it non-standard? I thought it was a way to generate the default mappings for a DB, which can then be customized.
A tangential question: why does ontop require additional mappings, and why is standard R2RML not enough? And wouldn't it make sense to use RDF syntax for those mappings as well?

Yes, parameters for all values are essential. They become ENV in the Dockerfile.
I don't know much about Docker, but it looks like they "are only available to swarm services, not to standalone containers". So I wouldn't worry about that right now.
Can't DB credentials also be parameters? Like here: https://hub.docker.com/_/mysql/ (Environment Variables)
Alternatively, files that are required but should not be built into the image can easily be mounted using VOLUME.

Can RDF4J server be deployed as a SPARQL endpoint only, without UI? Why would that not be sufficient? I would have expected plain RDF4J to be enough, but you seem to say an additional project is required. Jena's equivalent would be Fuseki: https://jena.apache.org/documentation/serving_data/

If there would be a need to serve the data as Linked Data, we have actually developed an open-source tool which does that from a SPARQL endpoint: https://github.com/AtomGraph/Processor

My suggestion would be to try to get something minimal running, even in a hacky way, with what there is right now, or something that can be implemented with minimal effort. Rather than start completely new developments.
We just need to turn those steps into a Dockerfile and/or entrypoint script :)

I think also it is helpful to think about scaling and automation in advance. Say if I'm a researcher and I want to map a single database, then it's not a problem for me to use Protege to generate the mappings, edit config files etc. But if you think about a platform which is spinning up containers to run many ontop instances, with DB config being entered by users in the platform UI, then both the DB config and the mapping should happen behind the scenes and be available through programmatic APIs, or even better, through machine-readable declarative syntax, for which RDF is the best choice IMO.

Martynas

Martynas Jusevičius

Jul 29, 2018, 3:59:02 PM
to ontop4obda
I don't know much about Docker *secrets*

Benjamin Cogrel

Jul 30, 2018, 9:49:36 AM
to ontop4obda
Hi Martynas,

"Non-standard" is perhaps not the best phrasing for bootstrapping, because it actually follows the Direct Mapping W3C recommendation… What we meant was more something like "unusual" or "uncommon" way to use Ontop. It is indeed a legitimate way to generate default mapping assertions before editing them, but it is something that is supposed to happen during the "mapping edition/design" phase, not during the deployment. I think it is important to keep the distinction between these two phases clear, and that's one of the reasons I am not in favor of using bootstrapping as a default behavior.

Using standard R2RML or our own native mapping language is really a matter of taste, nothing more. These are two alternative syntaxes, so choose the one you prefer. Whichever syntax you choose, most of the mapping processing remains the same and tries to stick to the R2RML specification. Technically speaking, R2RML is slightly more expressive than our native mapping language, so you can understand the latter as the fragment of R2RML we currently support ;-) For instance, we don't yet support named graphs and inverse expressions.

Thanks for the feedback about Docker secrets, it is indeed probably better to use ENV variables for the credentials, as Docker secrets seem to be coupled to Docker Swarm and would perhaps cause integration issues with alternative systems such as Kubernetes.

Yes, RDF4J server can be deployed as a SPARQL endpoint only, without UI, but it needs to be configured. AFAIK the configuration has to be done at runtime, using for instance the RDF4J console [1] or the workbench. In theory, one could use the console, but I am a bit concerned about the fact that the "create" command seems to run in an interactive fashion. Alternatively, one could make a SPARQL query that would insert the configuration of the Ontop repository into the appropriate system graph. I think that is what Workbench is doing.

Ok, one could give such a "minimal" approach a try. The main advantage is that it may not require a new beta release of Ontop. However, in the mid-term, I would prefer a stateless, lightweight and perhaps more extensible solution like the one we proposed last week.

I agree with the main lines of the workflow you described. It corresponds more or less to what I had in mind.

[1] http://docs.rdf4j.org/server-workbench-console/

Best,
Benjamin

Martynas Jusevičius

Jul 31, 2018, 11:49:55 AM
to Benjamin Cogrel, ontop4obda
Benjamin,

looks like we're mostly on the same page.

One thing still not so clear is the "phases" you mention. Consider the
platform scenario I had described.
The user enters JDBC details for his/her database. The platform
starts an ontop container, passes the connection details, and
hopefully a connection can be made.
But then what? The platform could show the default mapping to the user
and allow customization etc. -- but where should it obtain it if it
hasn't been generated yet?

There are no phases in a Dockerfile AFAIK. You run the image and the
service is up. You could expose an endpoint over HTTP that would
generate mappings when invoked. But I struggle to see why it would not
be a good idea to do that during deployment (i.e. in the Dockerfile)
already, and have a default, but functional service right away
(without the need to invoke anything else).

I see the mapping as an optional parameter to the container. If it's
provided, ontop uses that; if not, it generates a default one and uses
that. Or am I missing something here?
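
The optional-mapping behaviour described above could be sketched in an entrypoint script roughly as follows. This is only an illustration: the function name and the mount path in the usage comment are hypothetical, and the actual bootstrap call is left out because its options depend on the Ontop version.

```shell
#!/bin/sh
# Sketch: decide whether to use a mounted mapping or bootstrap a default one.
# /opt/ontop/mapping.obda in the usage comment is an assumed mount point.
mapping_mode() {
    mapping_file="$1"
    if [ -s "$mapping_file" ]; then
        echo "provided"    # a non-empty mapping file was mounted
    else
        echo "bootstrap"   # nothing usable supplied: generate a default mapping
    fi
}

# Usage inside a hypothetical entrypoint:
#   case "$(mapping_mode /opt/ontop/mapping.obda)" in
#       bootstrap) ... run the bootstrapper, then start the endpoint ... ;;
#       provided)  ... start the endpoint with the mounted file ... ;;
#   esac
```

Treating an empty file like a missing one is a design choice of this sketch: an empty mount is more likely a configuration mistake than an intentional empty mapping.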

I could try making the initial Dockerfile, but as a user of Jena, I
would need help with RDF4J.

Benjamin Cogrel

Aug 2, 2018, 4:56:59 PM
to ontop4obda
Hi Martynas,

In my view, the Ontop docker image is intended to be used for deploying
a SPARQL endpoint based on a mapping file; it is just a special case that
such a mapping could be generated on-the-fly (using the bootstrapping
feature). Therefore, the docker image has to do with the deployment
phase, not the mapping design phase.

Mapping design is an important step that you can rarely avoid. The only
way to avoid it is to use the bootstrapper, but the result has low value
as such. Actually the main use case for Ontop is when the relational
database is hard to query because it has a too complex schema and a lot
of "magic numbers" that are hard to interpret. If you bootstrap it, all
this complexity will reappear in the virtual RDF graph and you would
have gained almost nothing. A good mapping is something that gives a
high-level view over the data and precisely maintains some distance from
the relational schema. Currently, I think most people write their
mappings manually from scratch and do not use the bootstrapper as a
first step.

To come back to your platform scenario, after giving the JDBC
credentials, I see 3 solutions:
  1. The user uploads the mapping file (that he already designed in a
separate tool or manually)
  2. The platform redirects the user to its embedded mapping editor
(this obviously would require a lot of work)
  3. The mapping is bootstrapped (worst solution in my view)
Then, once the platform has a mapping and all the other information
(including perhaps an OWL2QL ontology), a docker image of Ontop can be
instantiated.
Of course, the third solution can be slightly simplified by letting the
docker image of Ontop bootstrap the mapping itself.

I understand the temptation of having a functional SPARQL endpoint after
just a few clicks and no effort, but I think we should resist it. The
mapping should not be treated as something optional (even if technically
we could) but as something central which definitely deserves the
attention of the user. Bootstrapping should only be presented as a
special case.

I hope my concern about using Ontop by default without a mapping is
getting clearer :)

Sure, if you need help with RDF4J, feel free to ask.
I am currently on vacation so I cannot guarantee to be very responsive
in the next two weeks.

Best,
Benjamin

Martynas Jusevičius

Aug 7, 2018, 8:39:32 AM
to ontop4obda
Benjamin,

I understand the importance of mappings and their design.

I am not suggesting that the bootstrapped mapping should be the final mapping. No, but it should do exactly what its name suggests: bootstrap the mapping, so that it can serve as the initial input, which can then be re-designed and morphed into the final mapping.

From your options, #2 is clearly the preferred one, looking from the perspective of the platform developer. We would not want users leaving our ecosystem, or having to install additional software, or even download/upload files if they contain something that could be generated instead.
If the mappings are in RDF, as is the case with R2RML, the basic editor might not be as complex as you might think.

But I see #2 working in combination with #3. That is, #3 would provide the default mapping for the editor. Otherwise, what would you see when you open the editor for the first time? How would the platform/editor obtain any idea of how the DB is structured? By directly reading the DB schema? This is something we would try to avoid at all costs, if instead we can retrieve the same info from the default mapping which ontop can provide.

And hopefully this functionality could be combined into a single Docker image. There would be issues such as probably having to restart the container when the mapping changes, but let's leave that for later.

I also hope that my point is becoming clearer :)

Benjamin Cogrel

Aug 8, 2018, 5:10:10 AM
to ontop...@googlegroups.com
Hi Martynas,

Yes, thanks, it is getting clearer.

Actually for #3, if the bootstrapped mapping is intended to be made available to the internal mapping editor so as to be improved, the platform should be in charge of bootstrapping the mapping, not the Docker image. Of course, the platform can use Ontop to bootstrap it, but in my view through its Java API (Maven dependency) or CLI, not through a Docker image dedicated to the deployment of a SPARQL endpoint.

I think the Docker image should have a single purpose and would find it a bit "hacky" to use it for getting a bootstrapped mapping file. Or perhaps, if having a Java dependency is a problem, one could design another Docker image with a specific WebAPI for bootstrapping a mapping.

Note that the code base of Ontop is immutable, so you have to create a new instance every time you change the mapping. Therefore, restarting the Docker image seems to be good enough, I don't have clear ideas how to make it better.

Fair enough, one can develop a basic R2RML editor for the platform, but one should still think about the interconnection with external advanced mapping editors because mappings can be quite large and complex. A mapping file import could do the job for a while.

Benjamin

Martynas Jusevičius

Aug 8, 2018, 9:29:25 AM
to Benjamin Cogrel, ontop4obda
Benjamin,

having a Java dependency kind of defeats the purpose of
containerization. The platform should know as little as possible about
ontop beyond that it is a SPARQL-compatible datasource. In this case,
also that a mapping has to be supplied to instantiate it.

I think it makes sense to view ontop as a large function that, given a
mapping argument, returns a SPARQL endpoint. Which means it can
have a second variant where the argument is omitted, and therefore
auto-generated. Both cases can be implemented by mounting the mapping
file as VOLUME. You'll see :)

Right now I am just trying out ontop as I go, and then will codify the
steps in the Dockerfile. I want to look at the mappings first because
I'm not familiar with them, so naturally I'm bootstrapping them first.

What I've tried so far:

$ ./ontop bootstrap -p jdbc.properties -m mapping.obda -t ontology.owl -b https://linkeddatahub/atomgraph/ontop/
Error: Could not find or load main class it.unibz.inf.ontop.cli.Ontop

Maybe this is due to me using bash on Windows. I will try on Windows
Subsystem for Linux (Ubuntu). The .bat file worked fine on the other hand:

ontop.bat bootstrap -p jdbc.properties -m mapping.obda -t ontology.owl -b https://linkeddatahub/atomgraph/ontop/

But not the conversion to R2RML:

λ ontop mapping to-r2rml -i mapping.obda -o mapping.ttl -t ontology.owl
it.unibz.inf.ontop.exception.InvalidMappingExceptionWithIndicator:
The syntax of the mapping is invalid (and therefore cannot be
processed). Problems:

MappingId = 'MAPPING-ID1'
Line 10: Invalid target: 'BNODE({id}, {************}, {************},
{************}, {************}, {************}, {************},
{************}, {************}, {************}, {************},
{************}) a <https://linkeddatahub/atomgraph/ontop/************>
; <https://linkeddatahub/atomgraph/ontop/************#id>
{id}^^xsd:integer ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:integer ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:boolean ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:dateTime ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:dateTime ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string ;
<https://linkeddatahub/atomgraph/ontop/************#************>
{************}^^xsd:string .'
Debug information
Problem parsing in OBDA document.
Could not load OBDA model. Either a suitable parser could not be
found, or parsing failed. See parser logs below for explanation.
The following parsers were tried:
1) TurtleOBDASQLParser

Details:
--------------------------------------------------------------------------------
Parser: TurtleOBDASQLParser
Syntax error location: column 0
extraneous input 'BNODE' expecting {'@prefix', '@PREFIX', '@base',
'@BASE', STRING_WITH_CURLY_BRACKET, IRIREF_EXT, IRIREF, PREFIXED_NAME,
PREFIXED_NAME_EXT, BLANK_NODE_LABEL, ANON}



at it.unibz.inf.ontop.spec.mapping.parser.impl.OntopNativeMappingParser.load(OntopNativeMappingParser.java:200)
at it.unibz.inf.ontop.spec.mapping.parser.impl.OntopNativeMappingParser.parse(OntopNativeMappingParser.java:104)
at it.unibz.inf.ontop.injection.impl.OntopMappingSQLConfigurationImpl.loadPPMapping(OntopMappingSQLConfigurationImpl.java:117)
at it.unibz.inf.ontop.injection.impl.OntopMappingSQLAllConfigurationImpl.loadPPMapping(OntopMappingSQLAllConfigurationImpl.java:57)
at it.unibz.inf.ontop.injection.OntopMappingSQLConfiguration.loadProvidedPPMapping(OntopMappingSQLConfiguration.java:26)
at it.unibz.inf.ontop.cli.OntopOBDAToR2RML.run(OntopOBDAToR2RML.java:79)
at it.unibz.inf.ontop.cli.Ontop.main(Ontop.java:18)

I've masked the column names as I cannot share the schema details at
this point, unfortunately. But I hope it's enough to indicate the
issue.


Martynas


Guohui Xiao

Aug 8, 2018, 9:51:50 AM
to mart...@atomgraph.com, Benjamin Cogrel, ontop4obda
Hi Martynas,

We never tried "bash on Windows". This is most likely a problem related to classpath.

The problem with converting to R2RML is related to BNode and it has been fixed recently (see https://github.com/ontop/ontop/issues/258). This fix will be included in the next release. Or you can compile Ontop from the version3 branch.

Cheers,

Guohui



Guohui Xiao, PhD
Assistant Professor with a fixed-term contract
KRDB - Faculty of Computer Science        
Free University of Bozen-Bolzano
Piazza Domenicani, 3                
I-39100 Bolzano, Italy    

http://www.ghxiao.org


Benjamin Cogrel

Aug 8, 2018, 10:27:22 AM
to ontop4obda
Martynas,

I think I have a good intuition about the "hack" you want to do ;-)
I would say that it is ok for the moment, I understand how it goes into
your framework.

However, later on we will have to come back to these questions.
I agree that the platform should know as little as possible about Ontop
but still it has to know about the mapping, in particular if it includes
a mapping editor. And in my view, bootstrapping is a feature of the
mapping editor and should have nothing to do with the SPARQL endpoint
deployment. The dependency I mentioned has to do with the mapping editor
implementation (it can implement its own bootstrapper or reuse an
existing library such as Ontop).

Benjamin

Guohui Xiao

Aug 8, 2018, 12:46:14 PM
to mart...@atomgraph.com, ontop4obda, Guohui Xiao
Hi Martynas,

The instructions for building Ontop are here:

https://github.com/ontop/ontop/wiki/Build

Regards,

Guohui


On Wed, 8 Aug 2018 at 16:57, Martynas Jusevičius <mart...@atomgraph.com> wrote:
Guohui,

what would be the exact command to build from source (version3)? I
guess something like 'mvn clean install'. Something I can execute in
shell.

If you could combine it with the git checkout command for the branch,
it would be perfect ;)

Mariano Rodriguez Muro

Aug 8, 2018, 1:03:48 PM
to Guohui Xiao, mart...@atomgraph.com, ontop...@googlegroups.com
Guys, just wanted to comment that it's really exciting to see this discussion happening. The use cases for this kind of image are all over the place (I've seen them in tons of projects I've been involved in at IBM). A good, stable Docker solution would be a HUGE boost to the project's accessibility. +1 on this work, Martynas et al. :)

Cheers,
Mariano 

Martynas Jusevičius

Aug 8, 2018, 2:26:22 PM
to ontop4obda
Thank me when the image is functional :D But good to receive encouragement.

I see one obstacle so far: the JDBC drivers. I take it they are not included in any distribution? Which makes sense.

But the container has to obtain them somehow, in an automated fashion. I would hope Maven central repository could be used for that, as it hosts JARs, for example:

But if the JDBC properties specify a class name such as com.mysql.jdbc.Driver, I am not exactly sure how to map it to the JAR URL. I don't see any indication in the POM.

A workaround could be to include a class name/JAR file mapping for the popular drivers. Ideas?
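
A minimal sketch of the class name to JAR lookup suggested above. The table is hand-maintained and covers only three common drivers, the function name is hypothetical, and the driver version has to be supplied by the caller because a class name does not determine it. The URL follows the standard Maven repository layout (group path, artifact, version).

```shell
#!/bin/sh
# Sketch: map a JDBC driver class name to a Maven Central JAR URL.
# The lookup table is illustrative and incomplete; extend it as needed.
driver_jar_url() {
    driver_class="$1"
    version="$2"
    case "$driver_class" in
        com.mysql.jdbc.Driver|com.mysql.cj.jdbc.Driver)
            group="mysql"; artifact="mysql-connector-java" ;;
        org.postgresql.Driver)
            group="org.postgresql"; artifact="postgresql" ;;
        org.h2.Driver)
            group="com.h2database"; artifact="h2" ;;
        *)
            echo "unknown JDBC driver class: $driver_class" >&2
            return 1 ;;
    esac
    # Maven layout: dots in the group id become path separators
    group_path=$(printf '%s' "$group" | tr '.' '/')
    echo "https://repo.maven.apache.org/maven2/$group_path/$artifact/$version/$artifact-$version.jar"
}
```

An entrypoint could then download the result with a tool such as curl, with the version coming from another (hypothetical) parameter like DRIVER_VERSION.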

Martynas Jusevičius

Aug 8, 2018, 2:45:08 PM
to ontop4obda
I also need to know which version of Java ontop requires, to be able to choose the base image:


Martynas Jusevičius

Aug 8, 2018, 3:12:21 PM
to ontop4obda
Nevermind this, found the Java version :)

Martynas Jusevičius

Aug 8, 2018, 5:24:09 PM
to ontop4obda
That's how far I got tonight:

It installs the build environment, clones the code and executes the build script, but the build fails to download some of the artifacts (jar and war files) from Maven. The two errors below are from 2 different runs:
[ERROR] Failed to execute goal on project ontop-mapping-r2rml: Could not resolve dependencies for project it.unibz.inf.ontop:ontop-mapping-r2rml:jar:3.0.0-beta-3-SNAPSHOT: Could not transfer artifact org.apache.jena:jena-osgi:jar:3.1.1 from/to central (https://repo.maven.apache.org/maven2): Failed to transfer file: https://repo.maven.apache.org/maven2/org/apache/jena/jena-osgi/3.1.1/jena-osgi-3.1.1.jar. Return code is: 503 , ReasonPhrase:first byte timeout. -> [Help 1]

[ERROR] Failed to execute goal on project ontop-rdf4j-workbench: Could not resolve dependencies for project it.unibz.inf.ontop:ontop-rdf4j-workbench:war:3.0.0-beta-3-SNAPSHOT: Could not transfer artifact org.eclipse.rdf4j:rdf4j-http-workbench:war:2.2.2 from/to central (https://repo.maven.apache.org/maven2): Failed to transfer file: https://repo.maven.apache.org/maven2/org/eclipse/rdf4j/rdf4j-http-workbench/2.2.2/rdf4j-http-workbench-2.2.2.war. Return code is: 503 , ReasonPhrase:first byte timeout. -> [Help 1]

Looks like a networking problem and not a Docker issue, but not sure how to solve it?
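
HTTP 503 signals a temporarily unavailable server, so simply re-running the failed step is a reasonable first attempt. A minimal sketch of a retry wrapper, assuming the failures really are transient; the function name and the RETRY_DELAY variable are hypothetical.

```shell
#!/bin/sh
# Sketch: run a command up to N times, pausing between failed attempts.
# RETRY_DELAY (seconds, default 5) is an illustrative knob.
retry() {
    max_attempts="$1"
    shift
    attempt=1
    while :; do
        "$@" && return 0                       # success: stop retrying
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}
```

In a Dockerfile or build script this could wrap the build step, for example retry 3 followed by the build command. Maven keeps successfully downloaded artifacts in its local repository, so a second run should only fetch what is still missing.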

Docker image can be built by executing:

  docker build -t ontop/ontop .

The Dockerfile has to be in the current dir. Please do try :)

Note that I've forked the repo and created a new branch version3-docker.

corman...@gmail.com

Aug 9, 2018, 10:26:07 AM
to ontop4obda
Hello Martynas,

I just built a Docker image from your Dockerfile without encountering any problem (just needed to apt-get install zip, in addition to git and maven).
So I guess it was indeed a network issue.

Note that if your purpose is just to set up a Docker image for an Ontop SPARQL endpoint, you don't need to build Ontop while building the Docker image (i.e. you don't need to run the build-release.sh script in the Dockerfile).
Unless you plan to rebuild your image regularly, with the latest version of Ontop each time.

Now in order to set up an endpoint, the easiest solution is probably to follow one of these two tutorials:
http://github.com/ontop/ontop/wiki/RDF4J-SPARQL-endpoint-installation
The tutorials ask you to download some files (either Ontop with an embedded server, or some .war files).
We update these files with each release.
But if you prefer the latest (pre-release) version of Ontop, you can also build them by yourself: just build Ontop (with the build-release.sh script, as you did already), and the files will be located in build/distribution.

I just tried to deploy an endpoint within a Docker container, following the first tutorial (i.e. with embedded jetty server).
It seems to work as expected.
Just make sure to "publish" (in the Docker terminology) the container port to the corresponding host port (8080 by default).
Then you get access to the RDF4J GUI, which allows you to create an Ontop repository.
There you can specify your mapping and property files.

We are currently working on a Spring Boot alternative.
The main difference in terms of interface is that the RDF4J GUI phase will not be required anymore, i.e. the mapping and properties (and optionally the ontology) will be given as command line arguments.

Finally, you may take inspiration from this example to write the Dockerfile (or was it already mentioned in this long discussion?)
http://github.com/ontop/docker-ontop

Cheers,
Julien

Martynas Jusevičius

Aug 9, 2018, 2:56:56 PM
to ontop4obda
Hi Julien,

yes I realized later on that I don't need to build in the Dockerfile :) I will COPY the already built artifacts instead.

I wanted a fresh build anyway since there was some bnode-related bugfix that is still not released. But my build doesn't fully complete:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 41.560 s
[INFO] Finished at: 2018-08-09T18:36:48+00:00
[INFO] Final Memory: 51M/1357M
[INFO] ------------------------------------------------------------------------
cp: cannot stat 'target/it.unibz.inf.ontop.protege-3.0.0-beta-3-SNAPSHOT'$'\r''.jar': No such file or directory
cp: cannot stat 'it.unibz.inf.ontop.protege-3.0.0-beta-3-SNAPSHOT'$'\r''.jar': No such file or directory
        zip warning: missing end signature--probably not a zip file (did you
        zip warning: remember to use binary mode when you transferred it?)
        zip warning: (if you are trying to read a damaged archive try -F)

zip error: Zip file structure invalid (Protege-5.2.0-platform-independent.zip)
mv: cannot move 'Protege-5.2.0-platform-independent.zip' to 'ontop-protege-bundle-3.0.0-beta-3-SNAPSHOT'$'\r''.zip': No such file or directory

=========================================
 Building RDF4J distribution package
-----------------------------------------

  adding: rdf4j-server.war (deflated 0%)
  adding: rdf4j-workbench.war (deflated 0%)
        zip warning: new zip file left as: ziBuweBO
zip I/O error: No such file or directory
zip error: Could not create output file (was replacing the original zip file)

=========================================
 Building  Jetty distribution package
-----------------------------------------


.zip)rror: Zip file structure invalid (ontop-jetty-bundle-3.0.0-beta-3-SNAPSHOT


I'm running on Windows Subsystem for Linux (Ubuntu). The build/distribution folder does not look like the root folder in ontop-distribution-3.0.0-beta-2.zip, e.g. the ontop scripts are missing. Not sure what is happening here.
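
One observation on the log above: $'\r' in the quoted file names is shell notation for a carriage return character. That usually points to Windows (CRLF) line endings in a file the build script reads, for instance after a checkout with line-ending conversion, but this reading of the log is an assumption, not a confirmed diagnosis. A small sketch to detect and strip carriage returns from a suspect file; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: detect and remove carriage returns (CRLF line endings) in a file.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"          # true if any line contains a CR
}

strip_crlf() {
    tmp="$1.tmp.$$"
    # Write back through cat so the original file keeps its permissions
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}
```

Running has_crlf on the build script and on any properties or version files it reads would confirm or rule out this explanation before anything is rewritten.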

Anyway I don't want to spend more time building right now, I will try to see what I can do with the beta-2 release.

Martynas Jusevičius

Aug 9, 2018, 6:56:41 PM
to ontop4obda
Having ignored the build issues for now, I'm finally making some
progress. Right now the container manages to generate the mapping from
ENV arguments. MySQL connector is built in. The instructions follow
below for those who are interested.

I've looked at the RDF4J SPARQL endpoint installation, but I do not
understand the connection to the mapping. The server has to obtain the
mapping somehow? I'm looking how to supply it.

The Dockerfile and entrypoint can be found here:
https://github.com/AtomGraph/ontop/tree/version3-docker/docker

I place them into the root of the ontop-distribution-3.0.0-beta-2 release
for now. To build the image from that directory:

docker build -t ontop/ontop .

Now run a test DB such as MySQL:

docker run -d --name mysql -e MYSQL_ROOT_PASSWORD=obda \
  -e MYSQL_USER=ontop_user -e MYSQL_PASSWORD=ontop_pwd \
  -e MYSQL_DATABASE=ontop_db mysql:5.7.23

Now run the ontop image:

docker run -e ONTOP_JDBC_NAME=test \
  -e ONTOP_JDBC_URL=jdbc:mysql://172.17.0.2:3306/ontop_db \
  -e ONTOP_JDBC_DRIVER=com.mysql.cj.jdbc.Driver \
  -e ONTOP_JDBC_USER=ontop_user -e ONTOP_JDBC_PASSWORD=ontop_pwd \
  -e ONTOP_BASE_IRI=http://test/ ontop/ontop

You should see the mapping and the ontology in the output. Note that
172.17.0.2 address is internal to the Docker network and could be
different for you. You can find it by doing docker inspect mysql,
under NetworkSettings > IPAddress.
To expose MySQL to the host as localhost:3306, you would need to add
-p 3306:3306 before mysql:5.7.23.
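
The IP lookup can also be scripted instead of read from the full docker inspect output. A minimal sketch, assuming the container is attached to the default bridge network (on a user-defined network this particular field is empty); the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: look up a container's IP on the default bridge network and
# build the MySQL JDBC URL from it.
container_ip() {
    docker inspect -f '{{.NetworkSettings.IPAddress}}' "$1"
}

mysql_jdbc_url() {
    # $1 = container name, $2 = database name; 3306 is MySQL's default port
    echo "jdbc:mysql://$(container_ip "$1"):3306/$2"
}
```

With the test database above, mysql_jdbc_url mysql ontop_db would give the value to pass as ONTOP_JDBC_URL.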

Martynas Jusevičius

Aug 10, 2018, 7:11:29 PM
to ontop4obda
My "hacky" ontop image is starting to take shape. Now it generates the mapping & ontology, starts Workbench, and creates a repository. I can access it in my browser afterwards.

The problem I am facing now is that Workbench cannot find the JDBC driver:

WARNING: Server reports problem: org.eclipse.rdf4j.repository.RepositoryException: it.unibz.inf.ontop.exception.DBMetadataExtractionException: Cannot load the driver: com.mysql.cj.jdbc.Driver
Aug 10, 2018 11:04:23 PM org.eclipse.rdf4j.workbench.commands.SummaryServlet getResult
WARNING: Exception occured during async request.
java.util.concurrent.ExecutionException: org.eclipse.rdf4j.repository.RepositoryException: org.eclipse.rdf4j.repository.RepositoryException: it.unibz.inf.ontop.exception.DBMetadataExtractionException: Cannot load the driver: com.mysql.cj.jdbc.Driver
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)

Any suggestions?

The run command now looks like this:

  docker run --name=ontop -e ONTOP_JDBC_NAME=test -e ONTOP_JDBC_URL=jdbc:mysql://172.17.0.2:3306/ontop_db -e ONTOP_JDBC_DRIVER=com.mysql.cj.jdbc.Driver -e ONTOP_JDBC_USER=ontop_user -e ONTOP_JDBC_PASSWORD=ontop_pwd -e ONTOP_BASE_IRI=http://test/ -e ONTOP_REPOSITORY_ID=mysql -e ONTOP_REPOSITORY_TITLE=MySQL -p 8080:8080 ontop/ontop

Martynas Jusevičius

Aug 11, 2018, 5:17:32 AM
to ontop4obda
I've managed to solve the problem by copying the JDBC driver jars to $JETTY_HOME/lib/ext. So the image is operational.
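
A minimal sketch of that fix as a build or entrypoint step, assuming JETTY_HOME is set and the driver jars sit together in one directory. The function name install_jdbc_drivers and the source directory are hypothetical; the actual script in the image may differ.

```shell
#!/bin/sh
# Copy every JDBC driver jar from a source directory into Jetty's lib/ext,
# so that the web applications can load the driver class.
install_jdbc_drivers() {
  src="$1"
  dest="${JETTY_HOME:?JETTY_HOME must be set}/lib/ext"
  mkdir -p "$dest"
  for jar in "$src"/*.jar; do
    [ -e "$jar" ] || continue   # skip the unexpanded pattern when no jar exists
    cp "$jar" "$dest/"
  done
}

# Usage (hypothetical source directory):
#   install_jdbc_drivers ./jdbc
```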

I am, however, having problems generating the mapping from a very simple MySQL 5.7.23 DB that has 1 table with 2 columns:

Error occurred during bootstrapping: java.sql.SQLSyntaxErrorException: Unknown table 'belenkas' in information_schema
Debugging information for developers:
it.unibz.inf.ontop.exception.MappingBootstrappingException: java.sql.SQLSyntaxErrorException: Unknown table 'whateverest' in information_schema
        at it.unibz.inf.ontop.spec.mapping.bootstrap.impl.DirectMappingEngine.bootstrapMappingAndOntology(DirectMappingEngine.java:136)
        at it.unibz.inf.ontop.spec.mapping.bootstrap.impl.DirectMappingEngine.bootstrap(DirectMappingEngine.java:97)
        at it.unibz.inf.ontop.spec.mapping.bootstrap.impl.DefaultDirectMappingBootstrapper.bootstrap(DefaultDirectMappingBootstrapper.java:16)
        at it.unibz.inf.ontop.cli.OntopBootstrap.run(OntopBootstrap.java:40)
        at it.unibz.inf.ontop.cli.Ontop.main(Ontop.java:18)
Caused by: java.sql.SQLSyntaxErrorException: Unknown table 'belenkas' in information_schema

Is that a known issue or should I open a new one?

Guohui Xiao

Aug 11, 2018, 4:33:08 PM
to Martynas Jusevičius, ontop4obda
Hi Martynas,

This looks like a problem related to getting metadata from a database. Please open an issue.

Thanks!

Guohui

Martynas Jusevičius

Aug 11, 2018, 5:16:48 PM
to ontop4obda
Thanks, I will create an issue.

Re. the docker image, I've created a build script so now everyone should be able to reproduce the image build easily:

I've also pushed the latest version of the build to Docker Hub: https://hub.docker.com/r/atomgraph/ontop/

It obviously needs more testing, especially with a mounted ontology/mapping, but overall it seems to work. Feedback is welcome :)

Martynas

Martynas Jusevičius

Aug 17, 2018, 7:26:43 AM
to ontop4obda

Benjamin Cogrel

Aug 20, 2018, 10:01:10 AM
to ontop4obda
Hi Martynas,

Nice to see a first Docker image that is able to configure the RDF4J server on its own!

I tried the image and managed to query the SPARQL endpoint from the command line (using curl), but not from the browser using https://yasgui.org/. Perhaps there is a CORS issue.
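
One way to check the CORS guess from the command line: curl does not enforce the browser's same-origin policy, so a query can succeed there while a browser-based client is blocked. The sketch below assumes the endpoint follows the usual RDF4J Server layout (<server>/repositories/<id>), with the port and repository ID taken from the run command earlier in the thread; the /rdf4j-server context path and the helper names are assumptions.

```shell
#!/bin/sh
# Build the repository endpoint URL from the server base URL and repository ID.
sparql_endpoint() {
  printf '%s/repositories/%s\n' "${1%/}" "$2"
}

# POST a query as a form parameter and ask for JSON results (requires curl).
sparql_query() {
  curl -s -H 'Accept: application/sparql-results+json' \
       --data-urlencode "query=$2" "$1"
}

# Succeeds if the HTTP response headers on stdin contain
# an Access-Control-Allow-Origin header.
has_cors_header() {
  grep -qi '^access-control-allow-origin:'
}

# Usage (not executed here):
#   ep=$(sparql_endpoint http://localhost:8080/rdf4j-server mysql)
#   sparql_query "$ep" 'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
#   curl -s -G -D - -o /dev/null -H 'Origin: https://yasgui.org' \
#        --data-urlencode 'query=ASK { ?s ?p ?o }' "$ep" | has_cors_header
```

If the last command fails while the query itself works, the server is probably not sending CORS headers.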

Here are a few additional remarks:
   - Many environment variables could be made optional: ONTOP_JDBC_NAME, ONTOP_JDBC_DRIVER, ONTOP_BASE_IRI, ONTOP_REPOSITORY_ID and ONTOP_REPOSITORY_TITLE.
   - I would rename ONTOP_JDBC_PROPERTIES to ONTOP_PROPERTIES, as it is a central file for configuring the system and goes far beyond JDBC concerns. This file is currently overwritten; it would be nice to just overload it instead.
   - I have seen that the mapping and ontology files appear as volumes. Are volumes not supposed to be directories? After a quick test, I didn't manage to access these files.
   - H2 could be added to the list of open-source JDBC drivers. Perhaps proprietary drivers could be loaded from a "volume" directory.
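
The first remark could be handled in the entrypoint with shell parameter defaults. This is a minimal sketch, not the image's actual script: the default values and the function names are invented for illustration, and the driver class is guessed from the JDBC URL prefix for the three drivers listed.

```shell
#!/bin/sh
# Guess the driver class from the JDBC URL prefix; fail for unknown prefixes.
default_driver() {
  case "$1" in
    jdbc:mysql:*)      echo com.mysql.cj.jdbc.Driver ;;
    jdbc:postgresql:*) echo org.postgresql.Driver ;;
    jdbc:h2:*)         echo org.h2.Driver ;;
    *)                 return 1 ;;
  esac
}

# Only ONTOP_JDBC_URL is mandatory; everything else gets a fallback.
# All default values below are hypothetical placeholders.
apply_defaults() {
  : "${ONTOP_JDBC_URL:?ONTOP_JDBC_URL is required}"
  : "${ONTOP_JDBC_NAME:=default}"
  : "${ONTOP_BASE_IRI:=http://example.org/}"
  : "${ONTOP_REPOSITORY_ID:=ontop}"
  : "${ONTOP_REPOSITORY_TITLE:=$ONTOP_REPOSITORY_ID}"
  if [ -z "${ONTOP_JDBC_DRIVER:-}" ]; then
    ONTOP_JDBC_DRIVER=$(default_driver "$ONTOP_JDBC_URL") || return 1
  fi
}
```

With this in place, a run command would only need the JDBC URL plus credentials.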

The current version only works with the bootstrapper and does not support mapping and ontology files as input. It would be nice to support the latter case (and make bootstrapping optional).
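
Making bootstrapping optional could come down to a check like the one sketched here: use the supplied files when they are present and fall back to the bootstrapper otherwise. The function name mapping_source and the file paths are hypothetical, and requiring both files is a simplifying assumption.

```shell
#!/bin/sh
# Print "mounted" when both the mapping and the ontology file exist and are
# non-empty; print "bootstrap" otherwise.
mapping_source() {
  if [ -s "$1" ] && [ -s "$2" ]; then
    echo mounted
  else
    echo bootstrap
  fi
}

# Usage (hypothetical paths):
#   case "$(mapping_source /opt/ontop/mapping.obda /opt/ontop/ontology.owl)" in
#     mounted)   echo "using supplied mapping and ontology" ;;
#     bootstrap) echo "generating mapping and ontology from the database" ;;
#   esac
```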

Best,
Benjamin

Martynas Jusevičius

Aug 20, 2018, 10:56:19 AM
to ontop4obda
Benjamin,

yes, a little curl magic ;) Thanks for the remarks.

If you could provide a simple test case (ontology + mapping + SQL DDL + SQL data dump) that runs on MySQL 5.7.23, I could test mounted mappings. The VOLUME part hasn't been properly tested yet (volumes can be files as well).
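
On files as volumes: a single file can be bind-mounted with docker run -v, but the host path has to be absolute and, as far as I know, if it does not exist Docker creates a directory in its place, which may explain files that cannot be accessed. A minimal sketch of a helper that guards against both; the function name file_mount and the container path are hypothetical.

```shell
#!/bin/sh
# Print a read-only bind-mount spec (host:container:ro) for a single file.
# Fails when the host path is not an existing regular file.
file_mount() {
  host="$1"
  target="$2"
  [ -f "$host" ] || { echo "not a regular file: $host" >&2; return 1; }
  dir=$(cd "$(dirname "$host")" && pwd) || return 1   # make the path absolute
  printf '%s/%s:%s:ro\n' "$dir" "$(basename "$host")" "$target"
}

# Usage (hypothetical container path, other options omitted):
#   docker run -v "$(file_mount ./mapping.obda /opt/ontop/mapping.obda)" ontop/ontop
```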

I tried to do that myself, but as you can see earlier in the thread, I was getting errors during mapping generation.

Pull requests are also welcome ;)

Martynas Jusevičius

Sep 12, 2018, 9:52:52 AM
to Guohui Xiao, ontop4obda
Guohui,

this error occurred again on a different MySQL DB, so I finally
created an issue for it: https://github.com/ontop/ontop/issues/270

It makes ontop effectively unusable for me, and blocks further Docker
image testing. I hope you can address it soon.


Martynas

Martynas Jusevičius

Sep 22, 2018, 6:32:18 AM
to ontop4obda