Why ISO-8859-1 charset from sparql.json endpoint?

Christopher Johnson

Mar 26, 2016, 9:50:54 AM
to Getty Vocabularies as Linked Open Data
Is there a reason why the Content-Type header from the http://vocab.getty.edu/sparql.json endpoint returns:
application/sparql-results+json;charset=ISO-8859-1?

I have noticed a similar problem at http://collection.britishmuseum.org/sparql.json.

If conneg is not supported, then the json endpoint should definitely return a Unicode (UTF-8) response by default.


Vladimir Alexiev

Apr 4, 2016, 12:38:00 PM
to Getty Vocabularies as Linked Open Data
Thanks for the report! It's a bug. The same holds for the XML format. Can be checked with:

https://www.w3.org/TR/sparql11-results-json/ says "the encoding considerations of the SPARQL Query Results JSON Format is identical to those of the "application/json" as specified in [JSON-RFC]",
and http://www.ietf.org/rfc/rfc4627.txt says "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8".

Posted as https://jira.getty.edu/browse/ITSLOD-460 (it's a closed jira).

Is it only the misleading header, or have you also noticed characters coming out wrong?

Christopher Johnson

Apr 4, 2016, 5:34:08 PM
to Getty Vocabularies as Linked Open Data
Thank you for the follow-up. Yes, I suspect that the characters are also not returned in Unicode. The following is the output of a run from a REST client:
{
  "head" : {
    "vars" : [ "obj" ]
  },
  "results" : {
    "bindings" : [ {
      "obj" : {
        "type" : "literal",
        "value" : "Göttingen"
      }
    } ]
  }
}

Vladimir Alexiev

Apr 5, 2016, 1:44:39 AM
to Getty Vocabularies as Linked Open Data
When I open this as a UTF-8 file, I see "value" : "Göttingen" (i.e. OK).

Christopher Johnson

Apr 5, 2016, 5:14:01 AM
to Getty Vocabularies as Linked Open Data
Yes, I think that if the request is made through a browser, the ISO-8859-1 charset is typically rendered correctly in the response stream. However, the raw response charset is definitely not Unicode (as can only be seen from a REST client). For XHR (my use case), a JSON response should be UTF-8, or else I see garbage for special characters. Alternatively, one can work around the issue with encodeURIComponent(), though this feels very inefficient, as it re-encodes the response as an escaped UTF-8 sequence that then has to be unescaped before rendering.
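
For readers hitting the same thing outside the browser, here is a rough Python analogue of that kind of client-side repair. It is only a sketch: the function name `repair_mojibake` is made up for illustration, and it assumes the text was UTF-8 that got decoded as ISO-8859-1 somewhere along the way.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8 body that was mistakenly decoded as ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    try:
        # Turn the characters back into the original bytes, then decode
        # those bytes with the encoding they were really written in.
        return text.encode("iso-8859-1").decode("utf-8")
    except UnicodeError:
        # Not recoverable this way (e.g. the text was already correct),
        # so return it unchanged.
        return text


print(repair_mojibake("GÃ¶ttingen"))  # Göttingen
```

Like the encodeURIComponent() route, this is a patch-up after the damage is done, and it can misfire on text that legitimately contains sequences such as "Ã¶".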

Vladimir Alexiev

Apr 9, 2016, 1:43:08 PM
to Getty Vocabularies as Linked Open Data
It looks to me like the body is correct UTF-8, and only the header is wrong?
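
If that is the case, a client that has access to the raw bytes can ignore the declared charset and decode the body as UTF-8 itself. A minimal Python sketch, assuming the body really is UTF-8; `parse_sparql_json` and the `sample` payload are illustrative names, not anything from the Getty service:

```python
import json


def parse_sparql_json(body: bytes) -> dict:
    """Parse a SPARQL JSON result, ignoring the charset in the header.

    Hypothetical helper: assumes the body bytes are UTF-8.
    """
    return json.loads(body.decode("utf-8"))


# Sample payload shaped like the result quoted earlier in the thread.
sample = (
    '{"head": {"vars": ["obj"]}, "results": {"bindings": '
    '[{"obj": {"type": "literal", "value": "Göttingen"}}]}}'
).encode("utf-8")

# Trusting the wrong header reproduces the garbled value.
garbled = json.loads(sample.decode("iso-8859-1"))
print(garbled["results"]["bindings"][0]["obj"]["value"])

# Decoding as UTF-8 gives the intended value.
parsed = parse_sparql_json(sample)
print(parsed["results"]["bindings"][0]["obj"]["value"])
```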

Christopher Johnson

Apr 10, 2016, 3:19:25 AM
to Getty Vocabularies as Linked Open Data
I agree. The problem is with the webserver. Possibly the Apache conf file may just require the directive "AddDefaultCharset utf-8" (the default Apache charset is ISO-8859-1).

Rolf Blijleven

Apr 4, 2018, 11:15:55 AM
to Getty Vocabularies as Linked Open Data
Hi Vladimir, 

Please CMIIW, but I think I ran into this very same problem, see here

Would there be any chance of getting this fixed at the server, do you think? 

Cheers, 
Rolf

On Saturday, April 9, 2016 at 19:43:08 UTC+2, Vladimir Alexiev wrote:

Vladimir Alexiev

Apr 5, 2018, 9:29:20 AM
to Getty Vocabularies as Linked Open Data
Rolf, thanks for reminding us again! The issue was fixed 10m ago but not yet deployed :-( I've reopened the issue and hope it will be deployed very soon.

Vladimir Alexiev

Apr 11, 2018, 6:38:39 AM
to Getty Vocabularies as Linked Open Data
Gregg Garcia reports: "The new Forest build has been deployed. The sparql results now show the correct charset."

It works both when you request by endpoint extension and by content type:

Content-Disposition: attachment; filename="sparql.xml"
Content-Type: application/sparql-results+xml;charset=UTF-8

Content-Disposition: attachment; filename="sparql.xml"
Content-Type: application/sparql-results+xml;charset=UTF-8

>curl -I -Haccept:application/sparql-results+json "http://vocab.getty.edu/sparql?query=select*%7B%3Fs+%3Fp+%3Fo%7Dlimit+2"
Content-Disposition: attachment; filename="sparql.json"
Content-Type: application/sparql-results+json;charset=UTF-8

>curl -I -Haccept:application/sparql-results+xml "http://vocab.getty.edu/sparql?query=select*%7B%3Fs+%3Fp+%3Fo%7Dlimit+2"
Content-Disposition: attachment; filename="sparql.xml"
Content-Type: application/sparql-results+xml;charset=UTF-8

Content-Disposition: attachment; filename="sparql.tsv"
Content-Type: text/tab-separated-values;charset=utf-8

Content-Disposition: attachment; filename="sparql.tsv"
Content-Type: text/tab-separated-values;charset=utf-8

Content-Disposition: attachment; filename="sparql.csv"
Content-Type: text/csv;charset=utf-8

Content-Disposition: attachment; filename="sparql.csv"
Content-Type: text/csv;charset=utf-8
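
To check headers like these in a script rather than by eye, something along the following lines could work. It is a sketch using only the Python standard library; `charset_of` is a made-up helper name, and the header strings are copied from this thread rather than fetched live.

```python
from email.message import Message
from typing import Optional


def charset_of(content_type: str) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type value,
    or None if no charset is declared. Hypothetical helper."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Header values as reported in this thread, before and after the fix.
before = "application/sparql-results+json;charset=ISO-8859-1"
after = "application/sparql-results+json;charset=UTF-8"

for value in (before, after, "text/csv;charset=utf-8"):
    print(value, "->", charset_of(value))
```

The value passed in would normally come from the response headers of whatever HTTP client is in use.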

