(can this be done with Jetty?)
2010/8/10 limux <liumen...@gmail.com>:
--
Communication is essential. So we need decent tools when communication
is lacking, when language capability is hard to acquire...
- http://esperanto.net - http://esperanto-jongeren.nl
Linux-user #496644 (http://counter.li.org) - first touch of linux in 2004
I spoke to him on #clojure and from what I could tell from some
experiments I asked him to run:
(map int "刘孟江") -> (21016 23391 27743)
=> no source file encoding issues
(.name (java.nio.charset.Charset/defaultCharset)) -> "GBK"
=> his OS default encoding is GBK ("GBK is an extension of the
GB2312 character set for simplified Chinese characters, used in the
People's Republic of China.")
Some libs might erroneously assume that the OS default
encoding is UTF-8 (or some other fixed encoding).
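To illustrate why that assumption matters, here is a minimal sketch (the helper name encoded-length is made up for this example, and it assumes the JVM ships the GBK charset). The same string turns into different bytes depending on the charset, so code that calls .getBytes without an argument silently depends on the platform default:

```clojure
;; Hypothetical helper: number of bytes a string occupies in a named charset.
(defn encoded-length
  [^String text ^String charset-name]
  (count (.getBytes text charset-name)))

(comment
  ;; Each of these characters needs 3 bytes in UTF-8 ...
  (encoded-length "刘孟江" "UTF-8") ; => 9
  ;; ... but only 2 bytes in GBK (assumes GBK is available on this JVM).
  (encoded-length "刘孟江" "GBK")   ; => 6
  ;; With no argument, .getBytes uses the platform default charset,
  ;; so the result varies from machine to machine.
  (count (.getBytes "刘孟江")))
```

Passing the charset explicitly, as in the ByteArrayInputStream experiment, takes the OS default out of the picture.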
(defroutes app
  (GET "/" []
    (java.io.ByteArrayInputStream.
      (.getBytes "<html><head><meta http-equiv='Content-Type' content='text/html; charset=UTF-8'></head><body>刘孟江</body></html>"
                 "UTF-8"))))
-> showed up correctly
=> works, since we do the encoding ourselves
(defroutes app
  (GET "/" []
    "<html><head><meta http-equiv='Content-Type' content='text/html; charset=UTF-8'></head><body>刘孟江</body></html>"))
-> showed up correctly
=> Ring uses UTF-8 as the default encoding no matter what the OS
default is. A very reasonable behavior, since the result is then
always deterministic.
This leads me to the conclusion that the error is caused by hiccup
somehow (which he also used), since everything seems to work fine
without it. I might look into this later this evening to see if I can
reproduce the error that occurred for him.
// raek, your encoding wizard
When playing around with the repl, I was reminded that JLine (used by
lein repl) does not support multibyte encodings (including UTF-8 and
GBK). Could this be the problem, Limux?
// raek
2010/8/10 Rasmus Svensson <ra...@lysator.liu.se>:
The solution you mention is some middleware that sets the content-type
charset header to a specific value.
Has this fixed the issue? I was under the impression from Rasmus's
post that raw strings worked fine, and it was just an issue with
Hiccup.
However, that in itself is odd, as Hiccup only uses raw strings and
the str function to join them together. I believe this should maintain
the correct string encoding. So assuming both str and literal strings
work, Hiccup should work.
I guess we need to determine whether the string itself has the wrong
encoding, or whether an incorrect encoding has been specified in the
content type.
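The second case is easy to simulate at the REPL. A minimal sketch (decode-as is a made-up helper, not part of Ring or Hiccup): encode the string correctly, then decode the bytes with a different charset, the way a browser would if the content type named the wrong one:

```clojure
;; Hypothetical helper: encode `text` with one charset, then decode the
;; resulting bytes with another, as a client trusting a wrong header would.
(defn decode-as
  [^String text ^String sent-as ^String read-as]
  (String. (.getBytes text sent-as) read-as))

(comment
  ;; Matching charsets round-trip cleanly.
  (decode-as "刘孟江" "UTF-8" "UTF-8")
  ;; Mismatched charsets: the 9 UTF-8 bytes come back as 9 Latin-1
  ;; characters of garbage, even though the bytes on the wire were fine.
  (count (decode-as "刘孟江" "UTF-8" "ISO-8859-1"))) ; => 9
```

If the page shows this kind of garbage, the string itself is probably fine and the declared charset is the thing to look at.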
- James
From what I can tell, the problem he had was caused by compojure's
default content type "text/html" being replaced by "text/html;
charset=iso8859-1". If he added the charset attribute (with the
middleware proposed in the link) the problem went away.
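For reference, middleware of that kind can be very small. What follows is only a sketch under assumptions, not the code from the link: the name wrap-default-charset is made up, it assumes a Ring-style handler (request map in, response map out) with a "Content-Type" string key in :headers, and it needs Clojure 1.8+ for starts-with? and includes?:

```clojure
(require '[clojure.string :as str])

;; Hypothetical middleware: append a charset to text/* content types
;; that do not already declare one. Other responses pass through untouched.
(defn wrap-default-charset
  [handler charset]
  (fn [request]
    (let [response (handler request)
          ctype    (get-in response [:headers "Content-Type"])]
      (if (and ctype
               (str/starts-with? ctype "text/")
               (not (str/includes? (str/lower-case ctype) "charset=")))
        (assoc-in response [:headers "Content-Type"]
                  (str ctype "; charset=" charset))
        response))))

(comment
  ;; Wrapping a handler that returns a bare "text/html" content type:
  (def app
    (wrap-default-charset
      (fn [_] {:status 200 :headers {"Content-Type" "text/html"} :body "刘孟江"})
      "utf-8"))
  (get-in (app {}) [:headers "Content-Type"])) ; => "text/html; charset=utf-8"
```

Because the charset is then always present, nothing further down the stack has a reason to fill one in.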
I asked him to check the page info in firefox to see what content-type
the web server served. Without the middleware, it was "text/html;
charset=iso8859-1" and with it, it was "text/html; charset=utf-8", as
expected. The only value existing in the compojure code is
"text/html", iirc.
It appears that Jetty rewrites any text/html content type it serves
and adds a charset attribute (maybe with the dreaded "OS default
charset" as its value) if there isn't one.
Time for some tests, maybe?
// Rasmus
Well, the charset could potentially be set to the default encoding of
the JVM, but that might produce inconsistent results. If you develop
on a JVM with a default encoding of X, but your production machine has
a default encoding of Y, you'll run into problems.
Another option is to have a default charset, such as UTF-8. I don't
think Ring should have a default charset, because it's too "low
level". But Compojure could be set up with a default charset. However,
this won't help people who, say, use Shift-JIS.
I think it would be worth adding some charset setting middleware to
Ring, though, and perhaps document this behaviour. Github has new
Wikis that I'd like to try out :)
- James
> I think it would be worth adding some charset setting middleware to
> Ring, though, and perhaps document this behaviour.
+1 -- character encoding is exactly the kind of thing one would want to set up application-wide.
-Steve