Turtle-star roundtripping


Nicholas Car

Aug 10, 2022, 12:37:29 AM
to rdflib-dev
Hi RDFLib developers,

My student Xugang Song (Song) has just demoed to me a Turtle-star -> RDFLib Graph -> Turtle-star roundtrip (deserialisation, then serialisation) for all of the approved Turtle-star test cases [1]. His work is buried deep in the repo at [2].

We are considering what he might do with his remaining student time (2 months) but, whatever else he does, he will contribute his final work as a neat PR to RDFLib.

If you have a strong request for what he might do - perhaps other format conversions or benchmarking of RDFLib's implementation v. others - please let me know on this mailing list ASAP.

Regards, Nick

Iwan Aucamp

Aug 15, 2022, 3:38:55 PM
to rdflib-dev

Hi Nick

On Wednesday, 10 August 2022 at 06:37:29 UTC+2 ni...@kurrawong.net wrote:
If you have a strong request for what he might do - perhaps other format conversions or benchmarking of RDFLib's implementation v. others - please let me know on this mailing list ASAP.

Some things that need attention and may be appropriate:
  • Parsers: we could really do with some other generated parsers (e.g. Lark). We have quite a few known problems with the existing parsers; some are captured with xfails, others with issues. Some are outright problems where we don't accept valid input, but more are cases where we are too lax and accept input that we should not. Ideally these would be fixed by having a fairly lax grammar that we then interpret differently depending on the requested laxness, but even some basic Lark parsers with maximum strictness, sitting next to the existing parsers, may be a good start. We could mark them as _private at first until we are clear on how we want to expose them in the public interface.
  • JSON-LD testing: I recently overhauled all official W3C test suites except JSON-LD. This highlighted some misreporting in previous reports, and it also provides much nicer EARL reporting, which now runs on every test run and makes it easier to notice if the report output changes. This is not being used for JSON-LD yet, however, and I also don't think we are running all JSON-LD tests. We need to improve JSON-LD, and we talked about that in previous issues, but before we do this I think we need to be very clear on our current JSON-LD compliance so we can quantify the effect of incorporating pyld.
  • JSON-LD support: We need better JSON-LD support, but this is also no small task, and probably the starting point is the test suite so this is somewhat stacked on the previous item.
  • SPARQL W3C testing: We are not running all tests from the W3C test suite for SPARQL. Specifically, we are skipping ServiceDescriptionTest and ProtocolTest; it would be nice to run them as well.
  • SNYK-PYTHON-RDFLIB-1324490 vulnerability: we have this issue (#1844) that is a bit of an impediment to people using RDFLib, so it may be worth looking at. We had a PR (linked in the issue itself), but the PR is a bit hit and miss. It can be salvaged, but it needs quite a bit of work, and the problem of URI resolution is quite broad and should possibly also cover things like resolving to the filesystem, so that we can use it instead of URIMapper in our test suite.
  • Performance testing: It would be nice to have some automated performance testing that we can run on every PR so we can quantify the performance impact PRs have.

Regards
Iwan Aucamp 

Nicholas Car

Aug 15, 2022, 9:06:54 PM
to rdfli...@googlegroups.com
Hi Iwan,

That's a good list of general things for RDFLib but I was asking for particular things focussed on RDFLib + RDF-star. As a student, Song has to do new investigative things, not just engineering enhancements to something, as useful as they may be.

So let's keep the priority order you've indicated here - I agree with it all: systematic parsers, better tests etc. - but talk again about the RDF-star work.

Nick

------- Original Message -------
--
http://github.com/RDFLib
---
You received this message because you are subscribed to the Google Groups "rdflib-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rdflib-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rdflib-dev/505a1021-557d-4613-8adf-cdb9c2201d4an%40googlegroups.com.

Iwan Aucamp

Aug 16, 2022, 2:44:40 AM
to rdflib-dev
Hi Nick,

On Tuesday, 16 August 2022 at 03:06:54 UTC+2 ni...@kurrawong.net wrote:
That's a good list of general things for RDFLib but I was asking for particular things focussed on RDFLib + RDF-star. As a student, Song has to do new investigative things, not just engineering enhancements to something, as useful as they may be.

So let's keep the priority order you've indicated here - I agree with it all: systematic parsers, better tests etc. - but talk again about the RDF-star work.

Specifically for RDF-star, it would be nice to have N-Quads-star as well. N-Triples-star and TriG-star would also be nice, though if Turtle-star is a superset of N-Triples-star then N-Triples-star is not that essential, and N-Quads-star will likely be simpler than TriG-star.

Other than that, it would be nice to see conformance testing with the Turtle-star test suite. Performance testing would also be nice but may involve quite a bit of complexity. If we do performance testing, it would be nice to have it done in a way that we can integrate into the CI pipeline, and I guess ideally it should just check the performance of the parser/serializer and avoid/sidestep our store implementations. Comparing it to other implementations will likely not tell us that much, though, as it is not likely to have any performance advantage over implementations in most other languages.
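
To make the CI idea concrete, here is a rough sketch in Python using only the standard library. Everything in it is an assumption for illustration: `toy_parse` is a hypothetical stand-in for a real parser (it is not RDFLib code and only handles very simple N-Triples-like lines), and `median_seconds`, `check_budget` and the sample data are made-up names. The point is only the shape: time the parse step on in-memory text, with no store involved, and compare the median against a budget that a CI job could enforce.

```python
import statistics
import time
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


def toy_parse(data: str) -> List[Triple]:
    """Hypothetical stand-in parser: splits simple N-Triples-like lines
    into (subject, predicate, object) tuples. No store is involved."""
    triples: List[Triple] = []
    for line in data.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Drop the trailing " ." and split into exactly three terms.
        subject, predicate, obj = line.rstrip(" .").split(" ", 2)
        triples.append((subject, predicate, obj))
    return triples


def median_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Run fn several times and return the median wall-clock duration."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def check_budget(fn: Callable[[], object], budget_seconds: float,
                 repeats: int = 5) -> bool:
    """Return True if the median run time stays within the budget."""
    return median_seconds(fn, repeats) <= budget_seconds


# Synthetic in-memory input, so the measurement excludes file and network I/O.
SAMPLE = "\n".join(
    f'<http://example.org/s{i}> <http://example.org/p> "value {i}" .'
    for i in range(1000)
)

if __name__ == "__main__":
    elapsed = median_seconds(lambda: toy_parse(SAMPLE))
    print(f"median parse time: {elapsed:.6f}s")
```

In a real setup the callable passed to `median_seconds` would wrap the actual parser or serializer under test, and the budget would probably be relative to a baseline recorded on the same CI runner rather than a fixed number, since absolute timings vary between machines.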

Not sure that is very helpful, will keep it in mind and respond if I can think of more.

Regards
Iwan Aucamp


Miel Vander Sande

Aug 18, 2022, 7:40:48 AM
to rdfli...@googlegroups.com
Hi Nick,

In the context of using rdflib in data pipelines and such, a binary serialisation such as Jena's https://jena.apache.org/documentation/io/rdf-binary or RDF4J's https://rdf4j.org/documentation/reference/rdf4j-binary/ (supports RDF-star) would be very welcome. Support for SPARQL-star is another one (unless this already exists?)

Best,

Miel 

On Tue, 16 Aug 2022 at 08:44, Iwan Aucamp <auca...@gmail.com> wrote: