N-Quads parsing


Omid Maldar

Nov 16, 2016, 10:16:30 AM
to EasyRdf Discussion
Is it possible to parse N-Quads files using EasyRdf? I'm wondering how I can use datasets like these: http://webdatacommons.org/structureddata/2015-11/stats/how_to_get_the_data.html
Thanks

Nicholas Humfrey

Nov 18, 2016, 8:09:17 AM
to eas...@googlegroups.com, Omid Maldar
Hello Omid,

No, there isn't currently support for N-Quads in EasyRdf.

Triples = [Subject, Property, Object]
Quads = [Subject, Property, Object, Graph]

The EasyRdf parsers are currently tightly coupled to the Graph object
and don't support loading a document into multiple graphs. Also,
EasyRdf isn't well suited to loading very large data sets; it works
much better for manipulating small documents.

I am not sure exactly what you are trying to achieve, but one
possibility would be to pre-process the files and extract the subset of
data that you are interested in. Because the N-Quads format is so
simple (one statement per line), this could be done by processing one
line at a time using PHP, Perl, awk, etc.
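
As a rough illustration of that line-at-a-time approach, here is a minimal sketch in Python (the same idea carries over to PHP, Perl or awk). It is not part of EasyRdf. The names `parse_quad` and `select_quads` are made up for this example, and the regular expression is a simplified reading of the N-Quads grammar that assumes well-formed lines with whitespace between terms.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple

# Simplified term patterns (assumption: well-formed input, terms separated by whitespace).
IRI = r"<[^>]*>"
BNODE = r"_:\S+"
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'

# subject predicate object [graph] .
QUAD_RE = re.compile(
    rf"^\s*({IRI}|{BNODE})\s+({IRI})\s+({IRI}|{BNODE}|{LITERAL})"
    rf"(?:\s+({IRI}|{BNODE}))?\s*\.\s*$"
)

Quad = Tuple[str, str, str, Optional[str]]


def parse_quad(line: str) -> Optional[Quad]:
    """Return (subject, predicate, object, graph) or None for blank, comment or unparsable lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = QUAD_RE.match(stripped)
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    return subject, predicate, obj, graph


def select_quads(
    lines: Iterable[str],
    predicate: Optional[str] = None,
    graph_prefix: Optional[str] = None,
) -> Iterator[str]:
    """Yield the original lines whose predicate and graph label match the given filters."""
    for line in lines:
        quad = parse_quad(line)
        if quad is None:
            continue
        _, pred, _, graph = quad
        if predicate is not None and pred != predicate:
            continue
        if graph_prefix is not None and not (graph or "").startswith(graph_prefix):
            continue
        yield line
```

Because `select_quads` is a generator, it can be fed an open file handle and written straight to another file, so only one line is held in memory at a time. If the graph label is then dropped from each selected line, the result is plain N-Triples, which should be small enough to load into EasyRdf.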

Given the very large size of the datasets (nearly 500GB in total) you
may even want to look at something like Hadoop or Amazon EMR to
query/process/extract the dataset quickly.
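
To give a feel for how such a job splits up, here is a small local sketch in Python of the map/reduce pattern those tools apply at scale. It is only an illustration, not Hadoop or EMR code. It assumes the dataset is split into gzip-compressed N-Quads files with whitespace between terms, and the function names `count_predicates` and `merge_counts` are invented for this example.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_predicates(path: str) -> Counter:
    """Map step: count how often each predicate occurs in one gzip-compressed N-Quads file."""
    counts: Counter = Counter()
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.lstrip().startswith("#"):
                continue  # skip comment lines
            # Subjects and predicates contain no whitespace, so the
            # second whitespace-separated token is the predicate.
            parts = line.split(None, 2)
            if len(parts) == 3:
                counts[parts[1]] += 1
    return counts


def merge_counts(partials: Iterable[Counter]) -> Counter:
    """Reduce step: sum the per-file counts into one total."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Each file is processed independently, so the map step can be spread across processes or machines, and only the small per-file counters need to be combined at the end.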



nick.

Omid Maldar

Nov 18, 2016, 8:29:24 AM
to EasyRdf Discussion

Thank you very much. Yes, I was thinking along the same lines: extracting a subset of the data.