Input Turtle Validation


Tim McIver

Jan 13, 2020, 11:05:07 PM
to trell...@googlegroups.com
I'm not familiar with Trellis' code base, so I'm not sure what validation is done on inputs, but I wanted to report a problem I ran into that may be related to input validation.

I created some triples in Trellis by posting RDF in Turtle format to a container.  The triples were created successfully but when viewing them in the UI I noticed that the slash was missing between the prefix namespace and the term.  For example, a triple would look like this:

<http://timmciver.com/card#me> <http://xmlns.com/foaf/0.1name> "Jane Doe"@en .

(Notice the missing slash between "0.1" and "name" in the predicate.) I thought this might have been a UI display issue and didn't think much of it until I saw the same thing in my application too.  I finally discovered it was because I failed to include the trailing slash on a @prefix I had in my Turtle file.
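
To illustrate why the slash disappears: a Turtle prefixed name is expanded by joining the namespace IRI and the local part verbatim, with no separator added. The following is a minimal Python sketch of that rule, not a Turtle parser; the `expand` helper and the two prefix maps are hypothetical names used only for illustration.

```python
def expand(prefixes, pname):
    """Expand a prefixed name such as 'foaf:name' into a full IRI."""
    prefix, _, local = pname.partition(":")
    # The namespace IRI and the local part are concatenated as-is;
    # no '/' or '#' is inserted between them.
    return prefixes[prefix] + local


good = {"foaf": "http://xmlns.com/foaf/0.1/"}
bad = {"foaf": "http://xmlns.com/foaf/0.1"}  # trailing slash omitted

print(expand(good, "foaf:name"))  # http://xmlns.com/foaf/0.1/name
print(expand(bad, "foaf:name"))   # http://xmlns.com/foaf/0.1name
```

Both results are well-formed IRIs, which is why a parser has no syntactic reason to object to the second one.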

Is this something that is supposed to be validated?  If not, should it be?  My thinking is that it should be validated since it causes Trellis to return bad data.  But I'm not sure if this is a good idea or even possible.  I suppose it could be done if all prefixes are expected to end with a '/' or '#' but I'm not sure if that is part of the Turtle spec.


Aaron Coburn

Jan 14, 2020, 10:27:06 AM
to trell...@googlegroups.com
Hello Tim,
When it comes to RDF, validation can mean several different things. There is syntactic validation (and the triple you included in your message is syntactically correct). Trellis currently does syntactic validation: if you POST syntactically invalid RDF, the request is rejected with a 400 Bad Request. There is also SHACL and ShEx validation, which works on the semantics of the RDF. Trellis does not implement this, but adding it is possible; more on that below.

For syntactically valid RDF, Trellis will accept whatever triples are sent by a client and persist those. So, if a client POSTs syntactically valid (but semantically invalid, as in your case) RDF, Trellis generally won't do any manipulation, checking or validation on those triples. And as a note, Turtle prefixes are not required to end in a / or # character; the Turtle grammar for prefixes requires only an IRIREF. Though you are correct, it is certainly common for prefix definitions to end in a / or # (OTOH, I know of certain patterns where that is not the case, especially for referencing the 'current document' as in @prefix : <> . ).
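
Since the grammar allows any IRIREF, a check on prefix endings can only be an advisory, client-side lint run before POSTing, not a validity test. The sketch below is one hedged way to do that in Python; it is not something Trellis does, the `suspicious_prefixes` name is hypothetical, and the regular expression is a deliberate simplification (it handles only `@prefix` declarations and would be fooled by the same text inside a comment or string literal).

```python
import re

# Matches Turtle "@prefix name: <iri> ." declarations.
# SPARQL-style "PREFIX" declarations are not handled in this sketch.
PREFIX_RE = re.compile(r"@prefix\s+([^\s:]*):\s*<([^>]*)>\s*\.")


def suspicious_prefixes(turtle):
    """Return (prefix, iri) pairs whose namespace IRI lacks a final '/' or '#'."""
    findings = []
    for name, iri in PREFIX_RE.findall(turtle):
        # An empty IRI (<>) legitimately refers to the current document,
        # so it is not reported.
        if iri and not iri.endswith(("/", "#")):
            findings.append((name, iri))
    return findings


doc = """
@prefix foaf: <http://xmlns.com/foaf/0.1> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix : <> .
<http://timmciver.com/card#me> foaf:name "Jane Doe"@en .
"""

print(suspicious_prefixes(doc))  # [('foaf', 'http://xmlns.com/foaf/0.1')]
```

A finding here is a prompt to double-check the declaration, nothing more; other legitimate prefixes that end differently would also be reported.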

That said, there is a mechanism for validating the incoming RDF in Trellis. In fact, there is some minimal validation that happens to RDF, but only based on what LDP formally requires (e.g. ldp:membershipResource must point to an in-domain IRI -- it can't be a literal or any arbitrary IRI; ldp:hasMemberRelation must point to an IRI, which can't be ldp:contains; there are a few such rules). Those validation rules are all part of a class that implements Trellis' ConstraintService API. The Trellis code is structured in such a way that any number of ConstraintService implementations can be injected into the runtime. The deployable artifacts that you see with Trellis releases contain only the trellis-constraint-rules artifact (i.e. the minimal LDP validation rules), but if one wished to do any sort of more complex validation (including SHACL or ShEx), that would be the place to do it. That would also mean writing some Java code and building your own deployable artifact, though technically, you could just drop a jar file into the classpath of your deployed Trellis and it should work.

I would point out, though, that the issue that motivated this comment (a missing trailing slash in a prefix) is actually rather tricky. The IRI that is produced is a valid IRI and hence it is syntactically valid Turtle, and so the only real way to catch it would be to have a SHACL/ShEx rule that explicitly requires the foaf:name triple in the resource; your resource does not contain foaf:name and so the SHACL/ShEx validator would produce an error.
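
The idea behind such a rule can be sketched without a SHACL or ShEx engine. The Python below is only a stand-in for a "this predicate must appear at least once" shape: it is neither SHACL nor ShEx, it does not use the Trellis ConstraintService API, and the `missing_predicates` helper and the tuple representation of triples are assumptions made for illustration.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
ME = "http://timmciver.com/card#me"


def missing_predicates(triples, subject, required):
    """Return the required predicate IRIs for which `subject` has no triple."""
    present = {p for s, p, _ in triples if s == subject}
    return [p for p in required if p not in present]


# The triple as it was stored, with the malformed predicate IRI.
triples = [
    (ME, "http://xmlns.com/foaf/0.1name", '"Jane Doe"@en'),
]

print(missing_predicates(triples, ME, [FOAF_NAME]))
# ['http://xmlns.com/foaf/0.1/name']
```

The malformed predicate is not flagged directly; the check fails because the expected foaf:name triple is absent, which is how a shape-based validator would surface the problem.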

Hope that helps,
Aaron

