Problem with response encoding


Tartos

Jan 19, 2012, 12:06:06 PM
to spray-user
Hi,

It seems the complete method of RequestContext re-encodes the string I
want to send back. My string is encoded in UTF-8 and contains French
accents. When I look at the result in my web browser, it tells me the
encoding is ISO-8859-1. How can I force the complete method to send
back a UTF-8 string? Thank you.

Regards,
Nicolas

Chris Carrier

Jan 19, 2012, 12:44:58 PM
to spray...@googlegroups.com
What encoding is your browser set to? Have you tried calling from
curl with the proper accept header? Maybe try something like:

curl -vvv -H "Accept: text/html;charset=UTF-8" <YOUR_URL>

Assuming the response is HTML. That will tell you what encoding Spray
is returning in the header. I haven't tried different encodings, but so
far Spray has honored the Accept header for different content types.

Chris

Nicolas Bonnel

Jan 19, 2012, 1:09:27 PM
to spray...@googlegroups.com
curl -vvv -H "Accept: text/plain;charset=UTF-8"  MyUrl > tmpFile

produces a file that I can't open with utf-8 encoding. But opening it with iso-8859-1 works.

curl -vvv -H "Accept: text/plain;charset=ISO-8859-1"  MyUrl > tmpFile

gives the same result.

My web server runs Debian squeeze with UTF-8 locales, and my workstation runs Ubuntu 11.04, also with UTF-8 locales.

I browsed some spray source files and saw a getOrElse("iso-8859-1") somewhere in the default marshaller. Could my string be automatically encoded in ISO-8859-1 by the default marshaller?
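
The symptom above (a file that opens as ISO-8859-1 but not as UTF-8) is what one would expect if the body was encoded with ISO-8859-1. A minimal sketch using only the JDK charset API, with no spray involved; the object and value names are made up for illustration:

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}

object CharsetMismatch {
  // "Anaïs et le garçon", written with escapes so the example does not
  // depend on the encoding of the source file.
  val text = "Ana\u00efs et le gar\u00e7on"

  // The same text encoded two ways: one byte per accented letter in
  // ISO-8859-1, two bytes per accented letter in UTF-8.
  val latin1Bytes: Array[Byte] = text.getBytes(ISO_8859_1)
  val utf8Bytes: Array[Byte] = text.getBytes(UTF_8)

  // Decoding ISO-8859-1 bytes as UTF-8 hits malformed sequences, which
  // the JDK replaces with U+FFFD, so the original text is not recovered.
  val misread: String = new String(latin1Bytes, UTF_8)

  // Decoding with the charset that was actually used recovers the text.
  val roundTrip: String = new String(latin1Bytes, ISO_8859_1)
}
```

If the server really did pick ISO-8859-1, the saved file would contain the equivalent of `latin1Bytes`, which is why only an ISO-8859-1 reader displays it correctly.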

Nicolas


Mathias

Jan 20, 2012, 3:26:28 AM
to spray...@googlegroups.com
Nicolas & Chris,

the `Accept` header does not define a `charset` parameter for a media type. The latter is used with the `Content-Type` header (which also contains a media type but describes the content of the current message not the requirements for a response).

What you are looking for is the `Accept-Charset` header:

curl -vvv -H "Accept-Charset: utf-8" MyUrl > tmpFile

spray honors the client requirements here (even though quality values are not yet supported).
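
To make the negotiation concrete, here is a rough sketch of how a server could pick a response charset from `Accept-Charset`. This is not spray's implementation: `chooseCharset` is a hypothetical helper, it ignores quality values (as described above), and it assumes ISO-8859-1 as the fallback when nothing usable is requested.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import scala.util.Try

object CharsetNegotiation {
  // Hypothetical helper: returns the first charset named in the header
  // that this JVM supports, or ISO-8859-1 if there is none.
  def chooseCharset(acceptCharset: Option[String]): Charset = {
    val requested: List[String] = acceptCharset.toList
      .flatMap(_.split(",").toList)
      .map(_.takeWhile(_ != ';').trim)          // drop any ";q=..." part
      .filter(name => name.nonEmpty && name != "*")

    requested
      .flatMap(name => Try(Charset.forName(name)).toOption) // skip unknown names
      .headOption
      .getOrElse(StandardCharsets.ISO_8859_1)   // assumed default
  }
}
```

With this sketch, `chooseCharset(Some("utf-8"))` yields UTF-8, while `chooseCharset(None)` yields ISO-8859-1.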

Cheers,
Mathias

---
mat...@spray.cc
http://www.spray.cc

Nicolas Bonnel

Jan 20, 2012, 4:09:56 AM
to spray...@googlegroups.com
Mathias,

curl -vvv -H "Accept-Charset: utf-8" MyUrl > tmpFile

works correctly.

curl -vvv MyUrl > tmpFile

does not work, although both my web server and my workstation use UTF-8.

Is it possible to make my web server complete requests using UTF-8 as the default charset?

Regards,
Nicolas


Mathias

Jan 20, 2012, 5:22:04 AM
to spray...@googlegroups.com
Nicolas,

the HTTP spec states: "If no Accept-Charset header is present, the default is that any character set is acceptable."
Since the default charset for text content in HTTP/1.1 is ISO-8859-1, spray falls back to ISO-8859-1 if the client does not request something else.

Of course you can still produce a response using UTF-8, e.g. in one of the following ways:

- Respond with an explicit HttpContent:

ctx.complete(HttpContent(ContentType(`text/plain`, `UTF-8`), "Anaïs et le garçon"))


- Transform the inner response:

respondWithCharset(`UTF-8`) {
  completeWith("Anaïs et le garçon")
}

def respondWithCharset(charset: HttpCharset) = transformResponse {
  _.withContentTransformed { content =>
    HttpContent(ContentType(`text/plain`, charset), content.as[String].right.get)
  }
}

- Use a custom StringMarshaller
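
All three options come down to the same thing: the body bytes and the charset declared in `Content-Type` have to agree. A library-independent sketch of that idea follows; `PlainTextResponse` and `plainText` are hypothetical stand-ins for illustration, not spray types.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object ExplicitCharset {
  // Hypothetical stand-in for an HTTP response: a header value plus raw body bytes.
  final case class PlainTextResponse(contentType: String, body: Array[Byte])

  // Encode the text with the given charset and declare that same charset
  // in the Content-Type value, so a client decodes the body correctly.
  def plainText(text: String, charset: Charset): PlainTextResponse =
    PlainTextResponse(s"text/plain; charset=${charset.name}", text.getBytes(charset))
}
```

For example, `ExplicitCharset.plainText("Anaïs et le garçon", StandardCharsets.UTF_8)` carries UTF-8 bytes together with a header value that says so.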

HTH and cheers,
Mathias

---
mat...@spray.cc
http://www.spray.cc

Nicolas Bonnel

Jan 20, 2012, 5:50:56 AM
to spray...@googlegroups.com
Mathias,

I didn't know the default charset for HTTP is ISO-8859-1. I would have bet on Unicode or ASCII!

ctx.complete(HttpContent(ContentType(MediaTypes.`text/plain`, HttpCharsets.`UTF-8`), "Anaïs et le garçon"))

did the trick. Thanks for the fast and precise answer!

Cheers,
Nicolas
