Failed to import N3 triples because of connection reset by peer


Eason

May 14, 2012, 6:50:05 AM
to 4store-support
Hello there,

I'm trying to load a 2 GB N3 file into 4sr Stable Version 1.0 (4store
with RDFS inference). Because the file is too large, I split it into
178 small files and load one file at a time using the command 4s-import
-v demo --format n3 --add --model modelname filename.
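
For anyone repeating this kind of chunked load, here is a minimal Python sketch that runs the same command over every segment and collects the ones that fail, so they can be inspected or retried later. The helper names (build_command, import_all) are made up for illustration. It is an assumption that 4s-import may exit with status 0 even when parsing fails, so the sketch also scans the output for the word "failed", which is what the error lines in this thread contain.

```python
import subprocess


def build_command(kb, model, path):
    """Build the 4s-import invocation quoted in this thread."""
    return ["4s-import", "-v", kb, "--format", "n3",
            "--add", "--model", model, str(path)]


def import_all(paths, kb="demo", model="modelname", run=subprocess.run):
    """Import each file in turn and return the paths that look like failures.

    `run` is injectable so the loop can be exercised without 4store installed.
    """
    failed = []
    for path in paths:
        result = run(build_command(kb, model, path),
                     capture_output=True, text=True)
        output = (result.stdout or "") + (result.stderr or "")
        # Assumption: a non-zero exit status or a "failed" line means trouble.
        if result.returncode != 0 or "failed" in output:
            failed.append(str(path))
    return failed
```

It could be called with something like import_all(sorted(Path("/home/ryan/yagosegs").iterdir())) after importing Path from pathlib, and the returned list names the segments worth a second look.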

Some of the files loaded successfully, but a few failed with this
error:

Reading <file:///home/ryan/yagosegs/162>
into <rolo>
URI rolo:100660 raptor fatal error - turtle_copy_string_token failed
URI rolo:100660 raptor error - syntax error
4store[6035]: import.c:397 failed to parse file "file:///home/ryan/yagosegs/162"
Pass 1, processed 100654 triples (100654)
Pass 2, processed 100654 triples, 1229 triples/s

Looking into the original file shows that
<Jamaica_Plain,_Boston> rdfs:label
"\u00b7\ud801\udc61\ud801\udc69\ud801\udc65\ud801\udc71\ud801\udc52\ud801\udc69
\u00b7\ud801\udc50\ud801\udc64\ud801\udc71\ud801\udc6f" .
causes the problem. I've loaded the same file into other RDF stores and
never saw any syntax problem with it. I just don't see what the syntax
error could be.

Besides, I also get a lot of connection problems like:

Reading <file:///home/ryan/yagosegs/110>
into <rolo>
Pass 1, processed 200000 triples (200000)
Pass 2, processed 200000 triples, 43092 triples/s
Updating index
4store[6674]: 4s-client.c:636 kb=demo stop_import(1) failed: no reply
4store[6674]: 4s-client.c:636 kb=demo stop_import(0) failed: no reply
Index update took 0.819361 seconds
Imported 200000 triples, average 38558 triples/s

and also Connection Reset By Peer.

Any idea what is happening? I'm running 4store on Ubuntu 11.04 on a
single machine.

Thank you very much.

Steve Harris

May 14, 2012, 8:45:20 AM
to 4store-...@googlegroups.com
On 2012-05-14, at 11:50, Eason wrote:

> Hello there,
>
> I'm trying to load a 2 GB N3 file into 4sr Stable Version 1.0 (4store
> with RDFS inference). Because the file is too large, I split it into
> 178 small files and load one file at a time using the command 4s-import
> -v demo --format n3 --add --model modelname filename.
>
> Some of the files loaded successfully, but a few failed with this
> error:
>
> Reading <file:///home/ryan/yagosegs/162>
> into <rolo>
> URI rolo:100660 raptor fatal error - turtle_copy_string_token failed
> URI rolo:100660 raptor error - syntax error
> 4store[6035]: import.c:397 failed to parse file “file:///home/ryan/
> yagosegs/162”
> Pass 1, processed 100654 triples (100654)
> Pass 2, processed 100654 triples, 1229 triples/s
>
> Look into the original file shows that
> <Jamaica_Plain,_Boston> rdfs:label
> "\u00b7\ud801\udc61\ud801\udc69\ud801\udc65\ud801\udc71\ud801\udc52\ud801\udc69
> \u00b7\ud801\udc50\ud801\udc64\ud801\udc71\ud801\udc6f" .
> causes the problem. I've loaded the same file to other rdf stores and
> never saw any syntax problem with the file. I just don't get what the
> syntax error can be?

Interesting - this looks like a bug in raptor, the RDF parser we use. I get the same error when trying it with rapper:

$ rapper -i turtle /tmp/foo.ttl
rapper: Parsing URI file:///tmp/foo.ttl with parser turtle
rapper: Serializing with serializer ntriples
rapper: Error - URI file:///tmp/foo.ttl:1 - turtle_copy_string_token failed
rapper: Error - URI file:///tmp/foo.ttl:2 - syntax error

I'll report it upstream.
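
Until that is fixed, one possible workaround is to rewrite the file before importing it. The sketch below is Python and rests on an assumption this thread does not confirm: that the parser is rejecting the \uD8xx\uDCxx surrogate-pair escapes in the literal. It replaces each such pair with the single \UXXXXXXXX escape for the same code point. The function name fix_surrogate_escapes is made up for illustration, and the sketch does not handle an escaped backslash directly in front of a \u sequence.

```python
import re

# A high-surrogate escape (D800-DBFF) immediately followed by a
# low-surrogate escape (DC00-DFFF), as in \ud801\udc61.
_SURROGATE_PAIR = re.compile(
    r'\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})'
)


def fix_surrogate_escapes(line):
    """Rewrite surrogate-pair \\u escapes as one 8-digit \\U escape."""
    def combine(match):
        high = int(match.group(1), 16)
        low = int(match.group(2), 16)
        # Standard UTF-16 decoding of a surrogate pair.
        code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        return '\\U%08X' % code_point
    return _SURROGATE_PAIR.sub(combine, line)
```

For example, the pair \ud801\udc61 from the label above becomes \U00010461, while ordinary escapes such as \u00b7 are left untouched. Running each line of a segment through this function and importing the rewritten copy would show whether the surrogate pairs are really the trigger.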

> Beside, I also get lots of connection problems like:
>
> Reading <file:///home/ryan/yagosegs/110>
> into <rolo>
> Pass 1, processed 200000 triples (200000)
> Pass 2, processed 200000 triples, 43092 triples/s
> Updating index
> 4store[6674]: 4s-client.c:636 kb=demo stop_import(1) failed: no reply
> 4store[6674]: 4s-client.c:636 kb=demo stop_import(0) failed: no reply
> Index update took 0.819361 seconds
> Imported 200000 triples, average 38558 triples/s
>
> and also Connection Reset By Peer.
>
> Any idea what is happening? I'm running 4Store on Ubuntu 11.04 and as
> a single machine.

That last one looks like the backend is crashing, or aborting - can you look in /var/log/messages and see if there's a report?

Manuel might need to look at that one, I'm not really familiar with how 4sr works.

- Steve

--
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Manuel Salvadores

May 14, 2012, 1:41:44 PM
to 4store-...@googlegroups.com
Hi,

Are you running the latest version of the raptor library?

Manuel