IRI syntax: first character after "#"

Aqualung

Sep 13, 2016, 1:02:52 PM

to TopBraid Suite Users

I seem to recall an unwritten rule that one should not start the local part of an IRI with a digit, i.e. no digit immediately after "#" (assuming "#" is used as the fragment delimiter). I am not sure where this comes from, and I did not question it at the time, so I pretty much followed it blindly. Is there such a rule, and if so, what is its source?

Thank you.

Jim Balhoff

Sep 13, 2016, 2:26:52 PM

to topbrai...@googlegroups.com

I think this comes from restrictions on XML IDs: http://www.w3.org/TR/xml11/#id

Richard Cyganiak

Sep 14, 2016, 5:24:25 AM

to topbrai...@googlegroups.com

Hi,

This dates back to the days when RDF/XML was the only way to serialise RDF.

In RDF/XML, the URIs of predicates had to be abbreviated as namespaced XML element names. So, for example, if you used the foaf:name predicate (http://xmlns.com/foaf/0.1/name) in one of your triples, you *had* to write it as an XML element such as <foaf:name>Richard</foaf:name>.

The problem was that the local part of an XML element name is not allowed to start with a digit. It has to be a letter or underscore. Thus, it was impossible to write predicate URIs where the local name starts with a digit. The result would be invalid XML.
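As a rough illustration of that restriction, the sketch below asks Python's standard XML parser whether a property element with a given local name is well-formed. The http://example.org/ namespace and the local names are made up for the example. The property element is written with a default namespace instead of a prefix; the same name-start rule applies to the local part of a prefixed name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and resource IRIs, used only for illustration.
TEMPLATE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description rdf:about="http://example.org/people#richard">'
    '<{name} xmlns="http://example.org/ns#">Richard</{name}>'
    '</rdf:Description>'
    '</rdf:RDF>'
)


def is_well_formed(local_name: str) -> bool:
    """Return True if a property element with this local name parses as XML."""
    try:
        ET.fromstring(TEMPLATE.format(name=local_name))
        return True
    except ET.ParseError:
        # The parser rejects element names that start with a digit.
        return False


print(is_well_formed("name"))       # letter first
print(is_well_formed("_1stPlace"))  # underscore first
print(is_well_formed("1stPlace"))   # digit first
```

Only the digit-first name should be rejected, which is why a predicate with such a local name could not be serialised as a property element.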

Note that this restriction only ever applied to predicate URIs. Class URIs such as foaf:Person also were typically written as XML elements in RDF/XML, but there was also a way to write out the full URI. And the URIs of other resources were always written out in their full form, so there was no restriction on the first character of the local name.
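To make the contrast concrete, here is a small sketch in which the subject and the class are both written as full URIs inside attribute values, so a leading digit in the local name does not affect well-formedness. The example.org IRIs are invented for the illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical IRIs: both local names ("42" and "3DModel") start with a digit,
# but they appear only inside attribute values, never as element names.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/items#42">
    <rdf:type rdf:resource="http://example.org/ns#3DModel"/>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(DOC)  # parses without error
description = root.find(f"{{{RDF}}}Description")
subject = description.get(f"{{{RDF}}}about")
class_iri = description.find(f"{{{RDF}}}type").get(f"{{{RDF}}}resource")

print(subject)
print(class_iri)
```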

But many users of RDF (like you) never grasped these details, and simply followed the “unwritten rule”.

In the syntaxes popular today, such as N-Triples, Turtle and JSON-LD, it’s possible to write down predicate URIs where the local name starts with a digit, so the restriction should really be a thing of the past. But it probably has been baked into many tools and software systems that process RDF, not to mention the minds of users!
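For instance, N-Triples writes every URI out in full between angle brackets, so the first character of the local name never matters. The helper below is a minimal sketch that formats one triple with a plain string literal as object; the function name and the example.org IRIs are assumptions made for the illustration, and it only escapes backslashes and double quotes.

```python
def ntriples_literal_triple(subject: str, predicate: str, value: str) -> str:
    """Format one N-Triples line with a plain string literal as the object."""
    # Minimal escaping only: backslash first, then double quote.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


line = ntriples_literal_triple(
    "http://example.org/people#richard",
    "http://example.org/ns#1stName",  # hypothetical predicate, digit-first local name
    "Richard",
)
print(line)
```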

My recommendation for today would be to avoid digits as the first character in local names of *classes* and *properties*. Not only because of RDF/XML, but because names of classes and properties often end up as identifiers in other formats and software systems. For example, your property names might end up as field names in an OO model or as column names in a database schema, and initial digits will cause trouble in these contexts too.
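One way to check for this kind of trouble, sketched here for Python only, is to test whether the local name of a class or property could serve as an identifier. The helper names and the rule for splitting off the local name (after "#", otherwise after the last "/") are assumptions for this example; other languages and database engines have their own identifier rules.

```python
import keyword


def local_name(iri: str) -> str:
    """Take the part after '#', or after the last '/' if there is no '#'."""
    if "#" in iri:
        return iri.rsplit("#", 1)[1]
    return iri.rsplit("/", 1)[-1]


def usable_as_python_field(iri: str) -> bool:
    """True if the local name is a valid, non-reserved Python identifier."""
    name = local_name(iri)
    return name.isidentifier() and not keyword.iskeyword(name)


print(usable_as_python_field("http://xmlns.com/foaf/0.1/name"))
print(usable_as_python_field("http://example.org/ns#1stPlace"))  # hypothetical IRI
```

The digit-first local name fails the check, so code generated from such a property would need a renaming step.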

But there’s no reason to avoid initial digits in local names for *all* resources. The idea that they need to be avoided for all resources was always a misconception.

Best,

Richard
