On 2012-05-14, at 11:50, Eason wrote:
> Hello there,
>
> I'm trying to load a 2 GB N3 file into 4sr Stable Version 1.0 (4Store with
> RDFS inference). Because the file is too large, I split it into 178
> small files and load one file at a time using the command 4s-import -v
> demo --format n3 --add --model modelname filename.
>
> In the process some of the files loaded successfully, but a few
> failed with this error:
>
> Reading <file:///home/ryan/yagosegs/162>
> into <rolo>
> URI rolo:100660 raptor fatal error - turtle_copy_string_token failed
> URI rolo:100660 raptor error - syntax error
> 4store[6035]: import.c:397 failed to parse file "file:///home/ryan/yagosegs/162"
> Pass 1, processed 100654 triples (100654)
> Pass 2, processed 100654 triples, 1229 triples/s
>
> Looking into the original file shows that
> <Jamaica_Plain,_Boston> rdfs:label
> "\u00b7\ud801\udc61\ud801\udc69\ud801\udc65\ud801\udc71\ud801\udc52\ud801\udc69
> \u00b7\ud801\udc50\ud801\udc64\ud801\udc71\ud801\udc6f" .
> causes the problem. I've loaded the same file into other RDF stores and
> never saw any syntax problem with it. I just don't see what the
> syntax error could be.
Interesting - this looks like a bug in raptor, the RDF parser we use. I get the same error when trying it with rapper:
$ rapper -i turtle /tmp/foo.ttl
rapper: Parsing URI file:///tmp/foo.ttl with parser turtle
rapper: Serializing with serializer ntriples
rapper: Error - URI file:///tmp/foo.ttl:1 - turtle_copy_string_token failed
rapper: Error - URI file:///tmp/foo.ttl:2 - syntax error
I'll report it upstream.
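
A possible workaround in the meantime, offered as a sketch rather than a confirmed fix: the literal encodes each non-BMP character as a UTF-16 surrogate pair of \u escapes (\ud801\udc61 and so on), whereas Turtle provides the 8-digit \U form for code points above U+FFFF. Assuming the surrogate pairs are what the parser rejects (not verified here), rewriting each pair as a single \U escape before import may get the file through. The function name fix_surrogate_escapes is made up for illustration.

```python
import re

# A high-surrogate escape (D800-DBFF) immediately followed by a
# low-surrogate escape (DC00-DFFF), both written as backslash-u plus 4 hex digits.
_SURROGATE_PAIR = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
)


def fix_surrogate_escapes(line):
    """Rewrite each surrogate-pair escape as one 8-digit code point escape.

    Limitation: an escaped backslash directly before a 'u' sequence is not
    handled specially; this sketch assumes the data contains none.
    """
    def combine(match):
        high = int(match.group(1), 16)
        low = int(match.group(2), 16)
        # Standard UTF-16 decoding of a surrogate pair into a code point.
        code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        return "\\U%08X" % code_point

    return _SURROGATE_PAIR.sub(combine, line)
```

Applied line by line to a segment file, this turns \ud801\udc61 into \U00010461 and leaves BMP escapes such as \u00b7 untouched. Whether the rewritten file then loads cleanly would still need to be checked with rapper.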
> Besides, I also get lots of connection problems like:
>
> Reading <file:///home/ryan/yagosegs/110>
> into <rolo>
> Pass 1, processed 200000 triples (200000)
> Pass 2, processed 200000 triples, 43092 triples/s
> Updating index
> 4store[6674]: 4s-client.c:636 kb=demo stop_import(1) failed: no reply
> 4store[6674]: 4s-client.c:636 kb=demo stop_import(0) failed: no reply
> Index update took 0.819361 seconds
> Imported 200000 triples, average 38558 triples/s
>
> and also Connection Reset By Peer.
>
> Any idea what is happening? I'm running 4Store on Ubuntu 11.04 on
> a single machine.
That last one looks like the backend is crashing, or aborting - can you look in /var/log/messages and see if there's a report?
Manuel might need to look at that one, I'm not really familiar with how 4sr works.
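
To make the log check concrete, here is a minimal sketch that pulls out the lines mentioning 4store. The helper name find_backend_reports is hypothetical, and the log path in the usage note is an assumption: depending on the syslog configuration, the messages may end up in /var/log/syslog rather than /var/log/messages.

```python
def find_backend_reports(lines, needle="4store"):
    """Return the log lines that mention the given process tag.

    The default tag "4store" is taken from the import output quoted
    above (e.g. "4store[6674]: ..."); adjust it if the backend logs
    under a different name.
    """
    return [line.rstrip("\n") for line in lines if needle in line]
```

Usage would be along the lines of opening the log file (reading it usually needs root or membership in the relevant group), passing the file object to find_backend_reports, and looking at the entries around the time of the failed import.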
- Steve
--
Steve Harris, CTO
Garlik, a part of Experian