jena fuseki with hdt files


Gang Fu

Dec 15, 2015, 1:03:07 PM
to BioHDT
Hi All,

I am trying to run Jena Fuseki server with HDT files. I downloaded and uncompressed 'jena-fuseki1-1.3.1'. I also downloaded and uncompressed 'hdt-java-rc2'. 

Then I prepared a configuration file according to http://hdt-java.googlecode.com/svn/trunk/hdt-jena/fuseki_example.ttl

I also changed 'fuseki-server' launch file according to http://www.rdfhdt.org/manual-of-hdt-integration-with-jena/

But when I tried to start the server using: ./fuseki-server --config=config-hdt.ttl 

I got this error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/hp/hpl/jena/assembler/Assembler


I also tried:
java -Xmx1200M -cp hdt-jena.jar:hdt-lib.jar:fuseki-server.jar org.apache.jena.fuseki.FusekiCmd --config=config-hdt.ttl 


But I got the same error message. I have copied the 'hdt-jena.jar' and 'hdt-lib.jar' files into the Fuseki working directory, and I found that the assembler class is right there, but I still cannot start the Fuseki server.
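One way to narrow down a NoClassDefFoundError like this is to check which jar on the classpath, if any, actually contains the missing class. The sketch below is a hypothetical helper, not part of Fuseki or hdt-java (the names `class_entry` and `jars_containing` are made up for illustration); it only assumes that jar files are ordinary zip archives. If `com/hp/hpl/jena/assembler/Assembler.class` is in none of the jars given to `-cp`, a version mismatch is one possible cause: Jena 3 renamed its packages from `com.hp.hpl.jena` to `org.apache.jena`, so an hdt-jena build compiled against Jena 2 would not find that class in a Jena 3 based Fuseki.

```python
import sys
import zipfile


def class_entry(class_name):
    """Turn 'a.b.C' (or 'a/b/C') into the jar entry name 'a/b/C.class'."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(class_name, jar_paths):
    """Return the jars from jar_paths whose archive listing contains the class."""
    entry = class_entry(class_name)
    hits = []
    for path in jar_paths:
        # A jar is a zip archive, so the standard zipfile module can list it.
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                hits.append(path)
    return hits


# Command-line use (hypothetical script name find_class.py):
#   python find_class.py com.hp.hpl.jena.assembler.Assembler \
#       hdt-jena.jar hdt-lib.jar fuseki-server.jar
if __name__ == "__main__" and len(sys.argv) > 2:
    found = jars_containing(sys.argv[1], sys.argv[2:])
    print("\n".join(found) if found else "not found in any of the given jars")
```

Running the same check for `org.apache.jena.assembler.Assembler` would show whether the server jar uses the newer package names instead.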

The link below contains jena fuseki files with new configuration file (config-hdt.ttl) plus hdt-jena.jar and hdt-lib.jar

Does anyone know why?


Best,
Gang

Egon Willighagen

Dec 15, 2015, 1:11:32 PM
to bio...@googlegroups.com
On Tue, Dec 15, 2015 at 7:03 PM, Gang Fu <gangf...@gmail.com> wrote:
> hdt-java-rc2

Gang, try the version on GitHub: https://github.com/rdfhdt/hdt-java

Egon


--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen

Fu, Gang (NIH/NLM/NCBI) [E]

Dec 15, 2015, 1:26:44 PM
to bio...@googlegroups.com
Thanks a lot Egon!

I am now using the GitHub project hdt-java/hdt-fuseki.

I just modified the 'fuseki_example.ttl' a little bit to make it point to two hdt files I have generated. But I got the following error message when I tried to start the fuseki with config file:
[fug2@virtuosodev11 hdt-fuseki]$ bin/hdtEndpoint.sh --config=fuseki_example.ttl
com.hp.hpl.jena.assembler.exceptions.AssemblerException: caught: Adjacency list bitmap and array should have the same size
doing:
root: file:///home/fug2/hdt-java/hdt-fuseki/fuseki_example.ttl#graph1 with type: http://www.rdfhdt.org/fuseki#HDTGraph assembler class: class org.rdfhdt.hdtjena.HDTGraphAssembler
root: file:///home/fug2/hdt-java/hdt-fuseki/fuseki_example.ttl#dataset with type: http://jena.hpl.hp.com/2005/11/Assembler#RDFDataset assembler class: class com.hp.hpl.jena.sparql.core.assembler.DatasetAssembler

-------------------------------------------------------------
The changed fuseki_example.ttl file is as follows:

@prefix : <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix hdt: <http://www.rdfhdt.org/fuseki#> .

[] rdf:type fuseki:Server ;
   # Timeout - server-wide default: milliseconds.
   # Format 1: "1000" -- 1 second timeout
   # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of query.
   # See java doc for ARQ.queryTimeout
   # ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "10000" ] ;

   # IMPORTANT: Import the HDT Assembler
   ja:loadClass "org.rdfhdt.hdtjena.HDTGraphAssembler" ;

   fuseki:services (
     <#service1>
   ) .

# HDT Classes
hdt:HDTGraph rdfs:subClassOf ja:Graph .

## ---------------------------------------------------------------
## Create a Read-Only Dataset composed by many RDF Graphs, each from an HDT File.

<#service1> rdf:type fuseki:Service ;
    fuseki:name "hdtservice" ;
    fuseki:serviceQuery "query" ;
    fuseki:serviceReadGraphStore "get" ;
    fuseki:dataset <#dataset> ;
    .

<#dataset> rdf:type ja:RDFDataset ;
    rdfs:label "Dataset" ;
    ja:defaultGraph <#graph1> ;
    ja:namedGraph
        [ ja:graphName <http://example.org/name1> ;
          ja:graph <#graph2> ] ;
    .

<#graph1> rdfs:label "RDF Graph1 from HDT file" ;
    rdf:type hdt:HDTGraph ;
    hdt:fileName "/export/home/SSD/BIGDATA/hdt/pc_compound_0.hdt" ;

    # Optional: Keep the HDT and index in memory at all times.
    # Uses more memory but it is potentially faster because avoids IO.
    # hdt:keepInMemory "true" ;
    .

<#graph2> rdfs:label "RDF Graph2 from HDT file" ;
    rdf:type hdt:HDTGraph ;
    hdt:fileName "/export/home/SSD/BIGDATA/hdt/pc_compound_1.hdt" ;
    .
--
You received this message because you are subscribed to the Google Groups "BioHDT" group.
To unsubscribe from this group and stop receiving emails from it, send an email to biohdt+un...@googlegroups.com.
To post to this group, send email to bio...@googlegroups.com.
Visit this group at https://groups.google.com/group/biohdt.
To view this discussion on the web visit https://groups.google.com/d/msgid/biohdt/CAMPqvY8d_%3DZvkJ9zFwjChZ9HRuyCeQpDezqYZKX6VoU7biwncg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Gang Fu

Dec 16, 2015, 11:55:59 AM
to BioHDT
I can launch 'hdt-java/hdt-fuseki' on top of multiple HDT files as multiple named graphs. However, nothing is returned when I query a single named graph.
The query:
select distinct ?g where { graph ?g {?s ?p ?o} }
correctly returns all of the named graphs.
However,
select *
from <named graph 1>
where {?s ?p ?o}
limit 100

returns nothing...
Any comments?

Gang Fu

Dec 16, 2015, 12:21:32 PM
to BioHDT
It seems to me that 'hdt-fuseki' only works with the default graph and not with named graphs. Am I missing something?

Gang Fu

Dec 16, 2015, 2:49:48 PM
to BioHDT
It seems the FROM and FROM NAMED clauses do not work, but if I run
select * where {graph ?g {?s ?p ?o}} limit 100
it does work.
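The GRAPH-based workaround can also be narrowed to one specific named graph instead of ?g. The following is a minimal sketch under stated assumptions: the endpoint URL http://localhost:3030/hdtservice/query combines Fuseki's default port with the service name "hdtservice" and the query endpoint "query" from the configuration posted earlier in this thread, and http://example.org/name1 is the graph name from that same file. The helper names (`graph_query`, `query_url`, `fetch_results`) are hypothetical. The code builds a query with an explicit GRAPH pattern rather than FROM and encodes it as a SPARQL protocol GET URL.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed endpoint: default Fuseki port + service name and query endpoint from the config.
ENDPOINT = "http://localhost:3030/hdtservice/query"


def graph_query(graph_iri, limit=100):
    """Build a SELECT that reads one named graph through a GRAPH pattern."""
    if any(ch in graph_iri for ch in '<>" '):
        raise ValueError("graph IRI contains characters not allowed inside <...>")
    return "SELECT * WHERE { GRAPH <%s> { ?s ?p ?o } } LIMIT %d" % (graph_iri, limit)


def query_url(endpoint, sparql):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": sparql})


def fetch_results(url):
    """Send the request and parse SPARQL JSON results (needs a running server)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = query_url(ENDPOINT, graph_query("http://example.org/name1"))
print(url)
```

Calling `fetch_results(url)` against a running hdt-fuseki instance would show whether the named graph is readable this way; the call is left out of the sketch because it needs a live server.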

Egon Willighagen

Dec 17, 2015, 2:38:48 AM
to bio...@googlegroups.com
Hi Gang,

regarding Fuseki specifically, I'm afraid I won't be of much help... I
do not use that, and do the SPARQL queries directly on the Jena Model,
and not via a SPARQL endpoint... here, I'm using the hdt-java feature
to expose the file as a Model object.

I hope someone else can be of help? Maybe there is a general (non-bio)
mailing list too, with more people? So far, I have communicated via the
issue tracker on GitHub :) That has by default room for questions, so
it could perhaps also be a place to ask a wider audience about Fuseki?

Egon



Egon Willighagen

Dec 17, 2015, 2:42:54 AM
to bio...@googlegroups.com
One limitation I have always experienced with Jena is that it
processes the SPARQL... that means that it supports its own
implementation. Now, they are compliant with the standard, but for some
time implemented SPARQL 1.0 while 1.1 was already used online... I
sidestepped that by not using Jena for remote SPARQL queries...

Now, I bring this up, because I'm wondering if this SPARQL syntax is
supported in the Jena 2.1x version used in hdt-java? Maybe you could
try the Jena3 based version, which I think they have had for about a
week?

Or maybe that has nothing to do with it, and it really is in the
hdt-java implementation and this is just noise...

(Just hoping you find an answer :)

Egon