HDT - nquads


Michel Dumontier

Aug 18, 2014, 2:33:10 PM
to Mario Arias, bio2rdf
Dear Mario,
I've tried using the HDT tool to generate HDT files from the Bio2RDF
N-Quads files (such as [1]), but I get a parse error regarding the
escaping of single quotes. However, the spec
(http://www.w3.org/TR/n-quads/) indicates that an escaped single quote
is valid in literal strings ([153s] ECHAR ::= '\' [tbnrf"'\]). Could you
look into this for me?

m.


Warning: Could not parse triple at line 2787490, ignored and not processed.
Line: <http://bio2rdf.org/affymetrix:264745_at>
<http://bio2rdf.org/affymetrix_vocabulary:gene-title> "APR2
(5\'ADENYLYLPHOSPHOSULFATE REDUCTASE 2); adenylyl-sulfate reductase/
phosphoadenylyl-sulfate reductase
(thioredoxin)"^^<http://www.w3.org/2001/XMLSchema#string>
<http://bio2rdf.org/affymetrix_resource:bio2rdf.dataset.affymetrix.R3>
.'

Error: Unescaped backslash in: "APR2 (5\'ADENYLYLPHOSPHOSULFATE
REDUCTASE 2); adenylyl-sulfate reductase/ phosphoadenylyl-sulfate
reductase (thioredoxin)"^^<http://www.w3.org/2001/XMLSchema#string>
<http://bio2rdf.org/affymetrix_resource:bio2rdf.dataset.affymetrix.R3>
Warning: Could not parse triple at line 2787490, ignored and not processed.

[1] http://download.bio2rdf.org/release/3/affymetrix/affymetrix.nq.gz
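
To make the grammar point concrete, here is a minimal Python sketch that checks the backslash sequences in a literal body against the ECHAR and UCHAR productions of the N-Quads grammar. It is not the HDT parser and says nothing about how HDT is implemented; the function name `invalid_escapes` is hypothetical and used only for illustration.

```python
import re
from typing import List

# Escapes allowed inside a string literal by the N-Quads grammar:
#   ECHAR ::= '\' [tbnrf"'\]
#   UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
_VALID_ESCAPE = re.compile(
    r"""\\(?:[tbnrf"'\\]|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})"""
)


def invalid_escapes(body: str) -> List[str]:
    """Return the backslash sequences in `body` that the grammar does not allow.

    `body` is the text between the quotes of a literal (hypothetical helper).
    """
    bad = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            i += 1
            continue
        match = _VALID_ESCAPE.match(body, i)
        if match:
            # Valid escape: skip over the whole sequence.
            i = match.end()
        else:
            # Record the backslash plus the character that follows it, if any.
            bad.append(body[i:i + 2])
            i += 2
    return bad
```

Under this check, the literal from the warning above, `APR2 (5\'ADENYLYLPHOSPHOSULFATE REDUCTASE 2)`, contains no invalid escapes, whereas a sequence such as `\x` would be reported.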

Mario Arias

Aug 19, 2014, 1:20:38 PM
to Michel Dumontier, bio2rdf
Dear Michel,

First, thanks for your interest and for taking the time to try HDT.

Could you please specify which package you are using (the C++ command line, HDT-it!, or HDT Java) and which version?

In the meantime, I converted Affymetrix for you. Just paste the attached snippet into a Unix terminal to download the dataset and software and launch a SPARQL endpoint. (On Windows, you can follow the same steps manually.)

If you have any issues or questions, do not hesitate to ask; I’ll be happy to help.

Best regards,

Mario.


—————————8<——————————

# Download Affymetrix in HDT format (251 MB hdt.gz vs. 1.5 GB nq.gz and 1.6 GB Virtuoso.gz)
curl http://gaia.dcs.fi.uva.es/hdt/bio2rdf-affymetrix.hdt.gz | gzip -cd > bio2rdf-affymetrix.hdt

# Download HDT Java software

# Launch Endpoint (The first time it takes some time to generate an .hdt.index)
hdt-fuseki-1.1.1-SNAPSHOT/bin/hdtEndpoint.sh --hdt bio2rdf-affymetrix.hdt /affymetrix

# SPARQL Endpoint ready at http://localhost:3030
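
Once the endpoint is up, it can be queried over HTTP. The following Python sketch sends a SPARQL SELECT query using only the standard library. The service path `/affymetrix/query` is an assumption based on Fuseki's usual naming for a dataset mounted at `/affymetrix`; it is not confirmed by the snippet above, so adjust it if your setup exposes a different path. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the dataset mounted at /affymetrix exposes a query service
# at /affymetrix/query. Change this if your endpoint uses another path.
ENDPOINT = "http://localhost:3030/affymetrix/query"


def build_query_url(sparql: str, endpoint: str = ENDPOINT) -> str:
    """Encode a SPARQL query as a GET URL with a `query` parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": sparql})


def run_select(sparql: str, endpoint: str = ENDPOINT) -> dict:
    """Send the query and return the parsed SPARQL JSON results document."""
    request = urllib.request.Request(
        build_query_url(sparql, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `run_select("SELECT * WHERE { ?s ?p ?o } LIMIT 5")` should return a dictionary whose `results` / `bindings` entries hold the matched rows, provided the endpoint is running and the path assumption holds.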

Samy Nathan

Sep 13, 2016, 12:46:44 PM
to bio2rdf, mario...@deri.org
Hi Mario,

I am following your steps.

1. The index was created for the HDT file:

One-time index file creation, please be patient ...
Generating /home/bigtapp/Documents/HDTSPARQL/timedataset758593.hdt.index
Could not read .hdt.index, Generating a new one.
Predicate Bitmap in 114 ms 865 us
Count predicates in 210 ms 747 us
Count Objects in 90 ms 308 us Max was: 45745
Bitmap in 13 ms 756 us
Object references in 279 ms 548 us
Sort object sublists in 204 ms 430 us
Count predicates in 352 ms 637 us
Index generated in 941 ms 595 us
11:14:00 INFO HDT dataset: file=/home/bigtapp/Documents/HDTSPARQL/timedataset758593.hdt
11:14:01 INFO Dataset path = /newdatasethdt
11:14:01 INFO Fuseki 1.0.0 2013-09-12T10:49:49+0100
11:14:01 INFO Started 2016/09/13 11:14:01 IST on port 3030


I am checking localhost:3030, but Fuseki does not work.





