error importing n3 file

Mathieu Saby

Oct 9, 2018, 9:07:15 AM
to OpenRefine
Hi
In OpenRefine v3 I tried to import an n3 file, and I got this error message: "line 11977, col 5 ] Triples not terminated by DOT"
I am not a semantic web expert, so could you tell me whether the file is invalid or whether this is a bug in OpenRefine?
The file comes from the French national library (http://data.bnf.fr/semanticweb : http://data.bnf.fr/11928016/jules_verne/rdf.n3), and I could convert it without error at https://rdf-translator.appspot.com/

The issue seems to come from this line :

Ettore Rizza

Oct 9, 2018, 10:54:25 AM
to OpenRefine
Hello Mathieu, 

The RDF parser was modified a few months ago, but I think there are still some errors. In my opinion, the format guesser thinks it is a Turtle file, not an n3 file (the = syntactic sugar operator is perfectly valid in n3, but not in Turtle). The best option, for now, is probably to export your files in another RDF format, for example N-Triples: http://data.bnf.fr/11928016/jules_verne/rdf.nt (OpenRefine is able to parse N-Triples, even if the menu mentions only RDF/N3).

Mathieu Saby

Oct 9, 2018, 4:27:35 PM
to OpenRefine
Thanks
I have seen the GitHub issue; that's why I wanted to test it, to know whether I should show that feature in a training session. Hem hem, it seems I should NOT for the moment.
And what about json-ld?

Ettore RIZZA

Oct 9, 2018, 4:43:28 PM
to openrefine
JSON-LD is basically JSON, so you can try to parse it with OpenRefine.

{
  "@context": {
    "ex:contains": {
      "@type": "@id"
    }
  },
  "@graph": [
    {
      "@id": "http://example.org/library",
      "@type": "ex:Library",
      "ex:contains": "http://example.org/library/the-republic"
    },
    {
      "@type": "ex:Book",
      "dc:creator": "Plato",
      "dc:title": "The Republic"
    },
    {
      "@type": "ex:Chapter",
      "dc:description": "An introductory chapter on The Republic.",
      "dc:title": "The Introduction"
    }
  ]
}

But there is still a lot of work to do before OpenRefine parses JSON or XML correctly. I usually try to convert these formats to CSV in other ways before importing the result into OpenRefine.
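
One rough way to do that kind of conversion is sketched below, using only the Python standard library. It assumes the JSON-LD keeps its nodes in a top-level "@graph" array, as in the sample above; the function name graph_to_csv and the sample data are hypothetical, and this is plain JSON flattening, not real JSON-LD processing (the "@context" is ignored).

```python
import csv
import io
import json

# Hypothetical sample shaped like the JSON-LD document shown above.
JSONLD_TEXT = """
{
  "@graph": [
    {"@id": "http://example.org/library", "@type": "ex:Library",
     "ex:contains": "http://example.org/library/the-republic"},
    {"@type": "ex:Book", "dc:creator": "Plato", "dc:title": "The Republic"}
  ]
}
"""


def graph_to_csv(jsonld_text):
    """Flatten the top-level @graph nodes of a JSON-LD document into CSV text."""
    nodes = json.loads(jsonld_text).get("@graph", [])

    # Use every key seen in any node, in first-seen order, as a CSV column.
    columns = []
    for node in nodes:
        for key in node:
            if key not in columns:
                columns.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for node in nodes:
        # Non-string values (lists, nested objects) are stored as JSON text in one cell.
        row = {k: v if isinstance(v, str) else json.dumps(v) for k, v in node.items()}
        writer.writerow(row)
    return out.getvalue()


csv_text = graph_to_csv(JSONLD_TEXT)
print(csv_text)
```

Each node becomes one row and missing properties are left empty, so the resulting CSV can be imported into OpenRefine like any other table.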


Mathieu Saby

Oct 9, 2018, 5:19:35 PM
to OpenRefine
RDF/XML is a dialect of XML, but it has its own place in the import menu and is interpreted in a special way, not like an ordinary XML file. If you import RDF/XML, the result in OpenRefine is the same as for an N3 file. So if the same RDF data is expressed in JSON-LD, I think we should be able to import it as semantic data, not as a raw JSON file, and get the same result as with an RDF/XML or N3 file.

Owen Stephens

Oct 10, 2018, 11:08:23 AM
to OpenRefine
At first glance I don't think adding JSON-LD would be a big problem - Mathieu: could you add a GitHub issue for it?

Looking at the N3 issue - I'm finding it hard to understand exactly what Jena (the code library we use to read RDF formats) supports in relation to n3, but this page http://jena.apache.org/documentation/io/index.html says ".n3 is supported but only as a synonym for Turtle."

If this is the case, then this would explain why it is unable to read the file which contains syntax that is part of n3 but not ttl.

This would mean that currently it will only work with the subset of n3 that is also valid Turtle. If we wanted to support n3 in full, we'd have to implement an alternative parser for n3.

Owen

Thad Guidry

Oct 10, 2018, 11:01:17 PM
to openr...@googlegroups.com
Owen is right about the RIOT parser from Jena.
(Yes, Owen, we probably need a better parser for dealing with .n3, and just calling Python and using RDFLib to do the heavy lifting might be the wisest choice. It's been hit and miss with Java libraries around this specific need. Regular RDF/XML or N-Triples is usually not a problem, unlike the Notation3 (n3) format, which is a non-XML format.)

Mathieu, if you want to deal with your n3 file, I would suggest using a small Python script with RDFLib: https://rdflib.readthedocs.io/en/latest/index.html

3. Craft your script using g.parse() and g.serialize() such as in this example https://stackoverflow.com/a/6180830/1100717
 
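from rdflib import Graph

g = Graph()
# "jules_verne.n3" is a placeholder path: point it at your downloaded n3 file.
g.parse("jules_verne.n3", format="n3")
# Write the graph back out as RDF/XML ("xml" is RDFLib's name for that serializer).
g.serialize("test.rdf", format="xml")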

4. Load the RDF/XML file that you created in step 3 into OpenRefine
5. Done

Hope this helps you quicker,
