Forcing UTF-8 encoding (possibly a Jetty issue?)


Rob

Nov 9, 2014, 11:05:48 AM
to dropwiz...@googlegroups.com
Is there a way to configure Jetty in Dropwizard to use UTF-8 encoding in the output? I have a suspicion Jetty is causing a problem I'm seeing: foreign characters display in the browser as a black diamond with question mark. Additional info: 

I have a method like this:

@GET
@Path("/foo")
@Produces(MediaType.TEXT_HTML)
public View foo() 
{
    // returns a View constructed with a Freemarker template
}

The Freemarker template has some variables inserted in it (e.g. ${someString}). Those variables are rendered as what appears to be ISO-8859-1, NOT as UTF-8, despite: 

* All source files and templates (Freemarker) being UTF-8 encoded
* Maven configured to use UTF-8 (maven-resources-plugin and maven-compiler-plugin) 
* Having set JVM option: -Dfile.encoding=UTF-8 (also tried environment variable LANG=en_US.UTF-8)
* The browser's encoding is set to UTF-8
* The variable text is stored in the DB as UTF-8.
* The variable text retrieved from the DB is correct when inspected in a debugger.

In this case the characters have umlaut diacritics (i.e. German), and they display in the browser as a black diamond with a question mark.

Further diagnostics:
* I put a character with an umlaut diacritic in the template. Result: character renders correctly.
* I had the Freemarker template output the encoding it's using via: ${.output_encoding!"Not set"}. Result: no encoding is set.
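
For readers following along: the black diamond with a question mark is usually how a browser draws U+FFFD, the Unicode replacement character, which appears when bytes are not valid in the encoding the reader expects. The sketch below reproduces that symptom with plain Java and no Dropwizard involved. The class name, method name, and sample word are made up for illustration; it only shows the general mechanism (text written as ISO-8859-1, read as UTF-8) and is not a diagnosis of this particular setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReplacementCharDemo {

    // Encode the text with one charset, then decode the resulting bytes with
    // another. This mimics a server writing in `written` while the browser
    // interprets the response as `read`.
    static String roundTrip(String text, Charset written, Charset read) {
        byte[] bytes = text.getBytes(written);
        return new String(bytes, read);
    }

    public static void main(String[] args) {
        // "Mueller" spelled with u-umlaut; the escape keeps this source file
        // independent of the editor's own encoding.
        String original = "M\u00fcller";

        String matching = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mismatched = roundTrip(original, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("same charset on both sides keeps the text: " + matching.equals(original));
        System.out.println("mismatch yields U+FFFD: " + mismatched.contains("\uFFFD"));
    }
}
```

A literal umlaut typed into the template can still render correctly in this situation, because the template file and the inserted variables may pass through different encoding steps.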

I'm sure I'm not the first user of the Dropwizard stack to run into this gotcha, and I'm wondering if I'm overlooking a setting somewhere to ensure UTF-8 encoding is used.

Thanks,

Rob

Rob

Nov 9, 2014, 4:08:24 PM
to dropwiz...@googlegroups.com
Update: 

Tried three more things:

* Upgraded to 0.8-RC1 to get the latest versions of Jetty and Jersey. No difference.
* Changed the Produces annotation on the offending method to: @Produces(MediaType.TEXT_HTML + "; charset=utf-8"). Still not working.
* Added a servlet filter to Jetty to force encoding to UTF-8 for all content (i.e. in doFilter, just call HttpServletResponse.setCharacterEncoding("UTF-8")). Nope...

I'm officially out of ideas. 

Brad

Nov 12, 2014, 1:43:30 AM
to dropwiz...@googlegroups.com
I think I'm in the same boat on Windows.

I cloned the repo onto my Debian machine. I then did a mvn clean install which forces Dropwizard to be rebuilt and tested. All tests succeeded.

I copied this same directory to my Windows machine and tried the same thing, but the Freemarker view module render test failed with what seems like an encoding issue. The values are visually the same, but the test output doesn't show the encoding. java.lang.String doesn't appear to have an encoding-type attribute...

Brad

Ryan Kennedy

Nov 12, 2014, 11:47:39 AM
to dropwiz...@googlegroups.com
Can you create an issue on GitHub and then post the link back in this thread? TravisCI doesn't support Windows platforms yet (https://github.com/travis-ci/travis-ci/issues/216), so we can't check for regressions of this problem during testing. But it would be good to get unit tests around encodings if we don't have them already.

Ryan


Rob

Nov 14, 2014, 12:02:24 AM
to dropwiz...@googlegroups.com
Will do.

My working theory right now is that it's a JVM bug (or "working as designed", depending on your perspective). I don't have the time to do a deep investigation right now, but essentially I suspect that whatever Writer Freemarker is using is an OutputStreamWriter that is hardcoded to use the default system encoding for the JVM for that platform -- i.e. not any encoding set as an environment variable or JVM parameter.
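
To make the mechanism in that theory concrete, here is a small stdlib-only sketch. It is not taken from Freemarker or Dropwizard, and the class and method names are invented. It only shows that the one-argument OutputStreamWriter constructor uses the JVM's default charset, while the two-argument form pins the encoding regardless of platform.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WriterCharsetDemo {

    // Writes the text through an OutputStreamWriter and returns the raw bytes.
    // Passing null for the charset selects the one-argument constructor, which
    // falls back to the JVM default charset.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = (charset == null)
                ? new OutputStreamWriter(out)
                : new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String umlaut = "\u00fc"; // u-umlaut

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("bytes with default charset: " + write(umlaut, null).length);
        System.out.println("bytes with explicit UTF-8:  " + write(umlaut, StandardCharsets.UTF_8).length);
        System.out.println("bytes with ISO-8859-1:      " + write(umlaut, StandardCharsets.ISO_8859_1).length);
    }
}
```

The first byte count depends on the machine, which is the point: code that never names a charset can behave differently on Debian and Windows. From JDK 18 onward the default charset is UTF-8 on all platforms, so this particular platform difference mostly affects older JVMs such as the ones in use when this thread was written.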

Ultimately it's a non-issue for me, since I don't use Windows for production systems. 

Johan Wirde

Nov 17, 2014, 12:58:23 PM
to dropwiz...@googlegroups.com
If I remember correctly I had to call this super constructor with UTF-8 charset to get encoding right in my Freemarker views:


/**
 * Creates a new view.
 *
 * @param templateName the name of the template resource
 * @param charset the character set for {@code templateName}
 */
protected View(String templateName, Charset charset) {
    this.templateName = resolveName(templateName);
    this.charset = charset;
}

Rob

Nov 19, 2014, 11:03:10 PM
to dropwiz...@googlegroups.com
What do you know, passing StandardCharsets.UTF_8 to the View constructor fixed the problem.

Mystery solved. Thank you Johan!
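
For anyone landing here later, the fix looks roughly like the sketch below. To keep it compilable without Dropwizard on the classpath, it declares a simplified stand-in View class that mirrors only the constructor Johan quoted; in a real project you would extend io.dropwizard.views.View instead and delete the stand-in. FooView, foo.ftl, and someString are hypothetical names chosen to match the example earlier in the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Simplified stand-in for Dropwizard's View, present only so this sketch
// compiles on its own. The real class also resolves the template name
// relative to the view class; that step is omitted here.
abstract class View {
    private final String templateName;
    private final Charset charset;

    protected View(String templateName, Charset charset) {
        this.templateName = templateName;
        this.charset = charset;
    }

    String getTemplateName() {
        return templateName;
    }

    Charset getCharset() {
        return charset;
    }
}

// Hypothetical view backing the /foo resource method.
class FooView extends View {
    private final String someString;

    FooView(String someString) {
        // Call the two-argument super constructor so the view declares UTF-8
        // explicitly instead of leaving the charset unset.
        super("foo.ftl", StandardCharsets.UTF_8);
        this.someString = someString;
    }

    // Exposed to the template as ${someString}.
    public String getSomeString() {
        return someString;
    }
}
```

The resource method then returns new FooView(...) as before; the only change is that the view now states its charset.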