
How to compile source files encoded in UTF-8 with Eclipse?


parmenides

May 12, 2017, 6:58:57 AM
Hi all,

I am developing a tiny JSP app and have run into a strange problem. I use
Eclipse as my IDE and want everything encoded in UTF-8: source
files, JSP pages and class files. These are the settings I have changed:
1. Window > Preferences > General > Workspace > Text file encoding > UTF-8.
2. Window > Preferences > General > Content Types > Text > Java Source
File > UTF-8.
3. Window > Preferences > General > Content Types > Java Class File > UTF-8.
Then I restarted Eclipse.

The server is Tomcat 9.0. I created a servlet with this doGet() code:

protected void doGet(HttpServletRequest request,
                     HttpServletResponse response)
        throws ServletException, IOException {
    response.setContentType("text/html;charset=utf-8");
    response.getWriter().write("****");
}

P.S. Each * stands for a Chinese character.

When running on the server (started from Eclipse), some '?' characters
appear on the page, so I suspected an encoding problem. I then
recompiled the source file on the command line with the following command:

javac -encoding utf-8 filename -classpath ...

and copied the generated class file into Tomcat's app directory. The
page then displayed the Chinese characters normally. I think the difference is
in the encoding of the class file. I can change that on the command line with
the -encoding option, but how do I do that in Eclipse? Or is there something I
am not aware of?

Marcel Mueller

May 12, 2017, 9:05:20 AM
On 12.05.17 12.58, parmenides wrote:
> javac -encoding utf-8 filename -classpath ...
>
> and copied the generated class file into Tomcat's app directory. The
> page then displayed the Chinese characters normally. I think the difference is
> in the encoding of the class file.

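I don't think a class file has an 'encoding' in that sense. String
constants are stored in the class file in one fixed format, and Java strings
are always UTF-16 at run time (not quite the same thing as UCS-2, which has
no surrogate pairs).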
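To illustrate the UTF-16 point, here is a small sketch (the class and method names are made up for this example). It counts UTF-16 code units versus Unicode code points; the literals are written as escapes so the example itself does not depend on the source file encoding.

```java
public class Utf16Demo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+4E2D lies in the Basic Multilingual Plane: one code unit.
        String bmp = "\u4e2d";
        // U+20BB7 lies outside the BMP: stored as a surrogate pair.
        String supplementary = new String(Character.toChars(0x20BB7));

        System.out.println(codeUnits(bmp) + " unit(s), " + codePoints(bmp) + " code point(s)");
        System.out.println(codeUnits(supplementary) + " unit(s), "
                + codePoints(supplementary) + " code point(s)");
    }
}
```

Whatever encoding the source file had, the compiled string ends up in this form, so only the step where the compiler reads the source can go wrong.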

But of course the compiler needs to be able to read your string literals
while compiling. That's the problem. You need to tell the compiler which
encoding the source uses. AFAIK javac falls back to the system's (or user's)
default setting, so if you override the encoding in Eclipse you need to
override it for javac too.
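A rough sketch of what goes wrong when the source bytes are read with the wrong charset. ISO-8859-1 is only a stand-in here for whatever the platform default happens to be (I don't know the actual default on the machine in question), and the class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class SourceEncodingDemo {
    // Decode raw source bytes the way a compiler would when it assumes
    // the given charset for the file.
    static String readAs(byte[] sourceBytes, Charset assumed) {
        return new String(sourceBytes, assumed);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes to keep this file ASCII-only.
        String literal = "\u4e2d\u6587";
        // What an editor saving as UTF-8 puts on disk: 3 bytes per character.
        byte[] onDisk = literal.getBytes(StandardCharsets.UTF_8);

        String right = readAs(onDisk, StandardCharsets.UTF_8);
        String wrong = readAs(onDisk, StandardCharsets.ISO_8859_1);

        System.out.println("read as UTF-8 matches: " + right.equals(literal));
        System.out.println("read as ISO-8859-1 length: " + wrong.length());
    }
}
```

Read as UTF-8 the literal survives intact. Read as ISO-8859-1 every byte becomes its own character, so the string constant that gets compiled into the class is already garbage before the servlet ever runs.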

> I can change that by command line with
> -encoding option, but how to do that with Eclipse? Or there something I
> have not been aware of.

This is probably a setting of the Java builder in the project settings.
(I just can't check because I have currently only CDT installed.)

However, you may also set the environment for the entire Eclipse IDE by
setting the environment variable LANG appropriately before starting
Eclipse. But keep in mind that on Unix-like systems this also applies to
file names. So if your file names or paths contain special characters
they may no longer display correctly after this change. Windows handles
file names differently: they are always converted from or to UTF-16.

Another option that might work: start each of your files with a BOM
as the first UTF-8 character. Most applications will then auto-detect them
as UTF-8. Maybe javac does too. Some applications have an extra
setting for this purpose: "UTF-8 with BOM" or something like that.
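For reference, the UTF-8 BOM is the three bytes EF BB BF at the very start of the file. A minimal sketch of a check for it (class and method names are made up for this example; it says nothing about whether a particular compiler accepts the BOM):

```java
import java.nio.charset.StandardCharsets;

public class BomCheck {
    // True if the data begins with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) {
        // U+FEFF encodes to EF BB BF in UTF-8.
        byte[] withBom = "\uFEFFclass A {}".getBytes(StandardCharsets.UTF_8);
        byte[] withoutBom = "class A {}".getBytes(StandardCharsets.UTF_8);

        System.out.println("with BOM: " + startsWithUtf8Bom(withBom));
        System.out.println("without BOM: " + startsWithUtf8Bom(withoutBom));
    }
}
```

In practice you would pass in the first bytes of the real source file, e.g. from java.nio.file.Files.readAllBytes.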


Marcel