Why are contexts stored as graphs?


scossu

Feb 18, 2018, 8:43:20 PM
to rdflib-dev
Hello,
I am working on an alternate back end implementation of the Store interface that uses LMDB as its persistence layer: https://github.com/scossu/lakesuperior/blob/lmdb_strategy5/lakesuperior/store_layouts/ldp_rs/lmdb_store.py

I noticed that the Dataset class passes a Graph instance to the Store when handling contexts: http://rdflib.readthedocs.io/en/stable/_modules/rdflib/graph.html#Dataset.graph

The Sleepycat store seems to support this too.

I am not sure why the context has to be the whole graph rather than a URI reference (i.e. the context graph's identifier). This increases storage size and seems to make both the back end implementation and the upstream code that consumes Graph and Dataset instances more complex.

I guess there is a reason for this approach. Can someone explain the rationale? Otherwise, would it be OK to strip the identifier off the graph and store only that? This would require some back-and-forth conversion, but it would keep things consistent.
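
To illustrate the "strip the identifier off the graph" idea, here is a minimal sketch that does not use RDFLib itself. FakeGraph and context_key are hypothetical names invented for this example; the only assumption carried over from RDFLib is that a graph exposes its name through an `identifier` attribute, while a bare URI reference does not.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGraph:
    """Hypothetical stand-in for a graph: an identifier plus its triples."""
    identifier: str
    triples: set = field(default_factory=set)


def context_key(context):
    """Return the value a store would persist for a context.

    A graph-like object is reduced to its identifier; a bare identifier
    (or None, meaning "no context") is returned unchanged.
    """
    return getattr(context, "identifier", context)


g = FakeGraph("http://example.org/graph/1", {("s", "p", "o")})
print(context_key(g))                             # graph passed in
print(context_key("http://example.org/graph/1"))  # identifier passed in
print(context_key(None))                          # no context
```

With a helper like this at the top of each Store method, only the identifier would be persisted regardless of what the caller passes. Whether the rest of RDFLib tolerates that is exactly the open question here.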

Thank you.

Gunnar Aastrand Grimnes

Feb 19, 2018, 3:13:32 AM
to rdfli...@googlegroups.com
That's a good question, and I don't think anyone who still maintains
RDFLib actually knows :)

I wanted to fix it, but never got it out the door:
https://github.com/RDFLib/rdflib/pull/409

(4 years ago :see_no_evil:)

- Gunnar

--
http://gromgull.net

Stefano Cossu

Feb 19, 2018, 9:28:00 AM
to rdfli...@googlegroups.com, Gunnar Aastrand Grimnes
Thanks Gunnar. I started adapting my implementation to this behavior:
https://github.com/scossu/lakesuperior/blob/store_ctx_as_uri/lakesuperior/store_layouts/ldp_rs/lmdb_store.py

But the SPARQL query breaks. I believe this is because the contexts()
method is expected to return not just graphs rather than URIs, but
graphs that can be queried and that return the triples stored in that
context (at least that is how the Sleepycat implementation behaves).
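
One possible way to reconcile the two expectations, sketched here without RDFLib: persist only identifiers, and have contexts() wrap each identifier in a lightweight, queryable view on the way out. IdOnlyStore and ContextView are hypothetical names for this example, not part of RDFLib or of the LMDB store linked above. In RDFLib the wrapper would presumably be Graph(store=store, identifier=ctx_id); whether that is enough for the SPARQL engine is an assumption that has not been verified in this thread.

```python
class ContextView:
    """Hypothetical graph-like wrapper created on demand from an identifier."""

    def __init__(self, store, identifier):
        self.store = store
        self.identifier = identifier

    def __iter__(self):
        # Delegate to the store, so the view never holds triples itself.
        return self.store.triples(self.identifier)

    def __len__(self):
        return sum(1 for _ in self)


class IdOnlyStore:
    """Hypothetical store that indexes triples by context identifier only."""

    def __init__(self):
        self._by_ctx = {}  # identifier -> set of triples

    def add(self, triple, context):
        # Accept either a graph-like object or a bare identifier.
        ctx_id = getattr(context, "identifier", context)
        self._by_ctx.setdefault(ctx_id, set()).add(triple)

    def triples(self, ctx_id):
        return iter(self._by_ctx.get(ctx_id, ()))

    def contexts(self):
        # Callers get queryable objects back, but none were ever stored.
        for ctx_id in self._by_ctx:
            yield ContextView(self, ctx_id)


store = IdOnlyStore()
store.add(("s1", "p", "o"), "http://example.org/g1")
store.add(("s2", "p", "o"), ContextView(store, "http://example.org/g1"))
store.add(("s3", "p", "o"), "http://example.org/g2")

for ctx in store.contexts():
    print(ctx.identifier, len(ctx))
```

The design choice is that the conversion happens only at the Store boundary: identifiers go in, graph-like views come out, and nothing larger than an identifier is ever written as a context.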

I can try to add this change to my store implementation, or just store
graphs as Sleepycat does, but it feels like I am propagating the odd
behavior... This may be especially problematic if I inadvertently pass a
very large graph as a context and the whole of it gets stored as a
giant pickle.

Any chance that your PR could be merged, or that the Store interface
and its methods could be clarified?

Thanks,
Stefano
Stefano Cossu
Director of Application Services, Collections

The Art Institute of Chicago
116 S. Michigan Ave.
Chicago, IL 60603
312-499-4026
