These are defined in
http://www.w3.org/TR/2012/rdf11-concepts/#section-XMLLiteral
http://www.w3.org/TR/2012/rdf11-concepts/#section-html
RDFLib currently does not implement equality for the XML Literal datatype (which is understandable: the old spec required a string-level comparison after canonicalization). RDF 1.1 changes the notion of XML Literal equality; it is now based on DOM Level 3, namely on the isEqualNode() method.
Unfortunately, the DOM implementations distributed in the Python core libraries do not implement isEqualNode(). To make things worse, RDF 1.1 has introduced a new datatype, rdf:HTML, which is very similar to XML Literals except that it is based on HTML5 syntax, and html5lib does not implement isEqualNode() either.
That being said, implementing isEqualNode() using other DOM methods is not terribly complicated. This makes it possible to properly support XML and HTML Literals, i.e., to test them for equality. So I did that.
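For illustration, here is a minimal sketch of how an isEqualNode-style comparison can be built from DOM methods that minidom does provide. It is not the code from the patch: the function name is_equal_node is hypothetical, and comparing attributes by (namespace URI, local name, value) while ignoring attribute prefixes is a simplifying assumption.

```python
from xml.dom import minidom, Node


def is_equal_node(a, b):
    """Approximate DOM Level 3 isEqualNode() for minidom-style nodes (sketch)."""
    if a.nodeType != b.nodeType:
        return False
    # Simple properties that must match on both nodes.
    for prop in ("nodeName", "localName", "namespaceURI", "prefix", "nodeValue"):
        if getattr(a, prop, None) != getattr(b, prop, None):
            return False
    # Attributes are compared as an unordered map (assumption: prefixes ignored).
    if a.nodeType == Node.ELEMENT_NODE:
        if dict(a.attributes.itemsNS()) != dict(b.attributes.itemsNS()):
            return False
    # Children must be pairwise equal, in the same order.
    if len(a.childNodes) != len(b.childNodes):
        return False
    return all(is_equal_node(x, y) for x, y in zip(a.childNodes, b.childNodes))


first = minidom.parseString('<p a="1" b="2">hi <b>there</b></p>').documentElement
second = minidom.parseString("<p b='2' a='1'>hi <b>there</b></p>").documentElement
print(is_equal_node(first, second))
```

Attribute order and quoting style do not affect the result, whereas a plain string comparison of the two serializations would. Calling normalize() on both trees beforehand avoids spurious differences caused by adjacent text nodes.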
Here is what I did:
- added rdf:HTML to _RDFNamespace in rdflib/namespace.py
- added three classes to rdflib/term.py:
class XMLOrHTMLLiteral(Literal)
class XMLLiteral(XMLOrHTMLLiteral)
class HTMLLiteral(XMLOrHTMLLiteral)
A literal automatically gets the right (Python) type through a modified Literal constructor. XMLOrHTMLLiteral contains a number of common methods (e.g., all numerical methods such as __add__ raise a 'not a number' exception) and also a common (static) _isEqualNode method.
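The constructor dispatch and the numeric-method behaviour can be sketched roughly as follows. This is a standalone illustration, not RDFLib code: the Sketch* classes, the _by_datatype table, and the choice of TypeError as the 'not a number' exception are all assumptions made for the example.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class SketchLiteral(str):
    """Hypothetical stand-in for Literal; not the real RDFLib class."""

    _by_datatype = {}  # datatype IRI -> subclass, filled in below

    def __new__(cls, value, datatype=None):
        # Dispatch only when the base class is called, so that the
        # subclasses can still be instantiated directly.
        target = cls._by_datatype.get(datatype, cls) if cls is SketchLiteral else cls
        self = str.__new__(target, value)
        self.datatype = datatype
        return self


class SketchXMLOrHTMLLiteral(SketchLiteral):
    """Common behaviour shared by the XML and HTML literal sketches."""

    def _not_a_number(self, *args):
        # Assumption: TypeError stands in for the 'not a number' exception.
        raise TypeError("%s is not a number" % type(self).__name__)

    __add__ = __sub__ = __mul__ = __neg__ = _not_a_number


class SketchXMLLiteral(SketchXMLOrHTMLLiteral):
    pass


class SketchHTMLLiteral(SketchXMLOrHTMLLiteral):
    pass


SketchLiteral._by_datatype.update({
    RDF_NS + "XMLLiteral": SketchXMLLiteral,
    RDF_NS + "HTML": SketchHTMLLiteral,
})

lit = SketchLiteral("<p>hello</p>", datatype=RDF_NS + "HTML")
print(type(lit).__name__)
```

With this arrangement, callers keep using the base constructor and receive the specialised subclass whenever the datatype calls for it.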
The XMLLiteral and HTMLLiteral classes redefine _toCompareValue to generate a DOM tree for the top node, using the respective parsers (minidom and html5lib), and __eq__ makes the right comparison if the types of both operands match.
I have also added HTMLLiteral and XMLLiteral to the __init__.py file so that the terms can be imported directly.
Ivan
----
Ivan Herman
4, rue Beauvallon, clos St Joseph
13090 Aix-en-Provence
France
http://www.ivan-herman.net