Getting `\t` to output as-is (not as a TAB character) in serializations


Donny Winston

Feb 23, 2022, 10:23:01 AM
to rdflib-dev

I want to store LaTeX representations as Literal values.

Without escaping (with “\”), I get what I want with JSON-LD (e.g. “\text”) but not with Turtle et al. (I get “ext”).

With escaping, I get double slashes (e.g. “\text”) with JSON-LD, Turtle, and N-Triples, but I get what I want (“\text”) with XML.

Any suggestions? I’m not sure if there’s a bug here or if there’s a workaround I can do.

Here’s a reproduction of my problem:

from rdflib import Graph, URIRef, RDFS, Literal

g = Graph()
g.add((
    URIRef("http://example.com/node"),
    RDFS.comment,
    Literal("SHAPE: $n_{\text{sites}} \times 3 \times 3$")
))
g.add((
    URIRef("http://example.com/node"),
    RDFS.comment,
    Literal("SHAPE: $n_{\\text{sites}} \\times 3 \\times 3$")
))
print("JSON-LD:")
print(g.serialize(format="application/ld+json"))
print("\nTurtle:")
print(g.serialize(format="text/turtle"))
print("XML:")
print(g.serialize(format="application/rdf+xml"))
print("N-Triples:")
print(g.serialize(format="nt"))

Output:

JSON-LD:
[
  {
    "@id": "http://example.com/node",
    "http://www.w3.org/2000/01/rdf-schema#comment": [
      {
        "@value": "SHAPE: $n_{\text{sites}} \times 3 \times 3$"
      },
      {
        "@value": "SHAPE: $n_{\\text{sites}} \\times 3 \\times 3$"
      }
    ]
  }
]

Turtle:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://example.com/node> rdfs:comment "SHAPE: $n_{        ext{sites}}         imes 3         imes 3$",
        "SHAPE: $n_{\\text{sites}} \\times 3 \\times 3$" .

XML:
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
  <rdf:Description rdf:about="http://example.com/node">
    <rdfs:comment>SHAPE: $n_{        ext{sites}}         imes 3         imes 3$</rdfs:comment>
    <rdfs:comment>SHAPE: $n_{\text{sites}} \times 3 \times 3$</rdfs:comment>
  </rdf:Description>
</rdf:RDF>

N-Triples:
<http://example.com/node> <http://www.w3.org/2000/01/rdf-schema#comment> "SHAPE: $n_{        ext{sites}}         imes 3         imes 3$" .
<http://example.com/node> <http://www.w3.org/2000/01/rdf-schema#comment> "SHAPE: $n_{\\text{sites}} \\times 3 \\times 3$" .

Screenshot of pretty-printed output: https://files.polyneme.xyz/dropshare/2022-02-23-rdflib-dev-pprint-UM2FfmQBDv.png

Donny Winston

Feb 23, 2022, 10:26:46 AM
to rdflib-dev

Looks like I got unwanted escaping in my description! Trying again:

Without escaping (with “<backslash>”), I get what I want with JSON-LD (e.g. “<backslash>text”) but not with Turtle et al. (I get “<TAB>ext”).

With escaping, I get double backslashes (e.g. “<backslash><backslash>text”) with JSON-LD, Turtle, and N-Triples, but I get what I want (“<backslash>text”) with XML.

Nicholas Car

Feb 25, 2022, 6:33:33 AM
to rdfli...@googlegroups.com
Just checked whether a Python raw string (r"...") with no escaping behaves the same as escaping the backslash ("\\"): it does.
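
A minimal sketch of that check at the Python level, before rdflib is involved at all. The variable names are made up for illustration; the point is only how Python parses each string literal.

```python
# Three ways to spell the same LaTeX fragment in Python source.
unescaped = "\text"   # "\t" is parsed as a TAB, so this is TAB + "ext"
escaped = "\\text"    # "\\" is parsed as a single backslash
raw = r"\text"        # raw string: the backslash is kept literally

print(repr(unescaped))               # shows the TAB as an escape sequence
print(escaped == raw)                # the escaped and raw forms are the same string
print(len(unescaped), len(escaped))  # the unescaped form is one character shorter
```

So in the unescaped case the Literal never contains a backslash to begin with: the TAB is already in the Python string before any serializer sees it.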

Also, HexT serialization works fine and just like JSON-LD. No surprise as they are both using the json package.

So, there really are differences in the serializers' escaping. No surprise: XML & JSON etc are really different!

It's perhaps unfair to call this a bug but, I guess, we would be interested in someone harmonising results across all serialisations...
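
One possible way to look at the doubled backslashes, sketched with only the standard-library json module (the same package mentioned above for JSON-LD and HexT). The assumption here, not verified against rdflib's serializers in this thread, is that the doubling is escape syntax in the serialized text rather than a change to the stored value, and that Turtle and N-Triples string escapes behave analogously.

```python
import json

# One backslash before "text" and "times" in the actual value.
value = r"$n_{\text{sites}} \times 3$"

encoded = json.dumps(value)    # serialized text: each backslash is written as two
decoded = json.loads(encoded)  # parsing restores the single backslash

print(encoded)
print(decoded == value)

# The unescaped case: the value holds a TAB, which JSON writes as backslash + "t".
tab_value = "\text"
tab_encoded = json.dumps(tab_value)
print(tab_encoded)
```

If that reading holds, the JSON-LD output in the unescaped case only looks right: the backslash followed by "t" in the serialized text is JSON's escape for a TAB, and a parser will give back TAB + "ext", not a LaTeX command.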

--
http://github.com/RDFLib
To view this discussion on the web visit https://groups.google.com/d/msgid/rdflib-dev/7b066616-1f13-4b94-8453-e4948926f07fn%40googlegroups.com.

Donny Winston

Feb 25, 2022, 10:25:15 AM
to JB
Hmm. I guess if serialization differences are not just a Python/rdflib thing and would be expected across any platform (i.e. aren't implied by any W3C recommendation), then there's no clear interoperability benefit to harmonizing.

Also, TIL about HexT. Thanks!

--
Donny Winston, PhD (he/him/his) | Polyneme LLC

If I've emailed you, I'd love to speak with you.
Schedule a meeting (15min+): https://meet.polyneme.xyz
