Mojibake question

Peter Clark

Jul 28, 2008, 4:18:43 AM
to Honyaku E<>J translation list
Dear Yakkers,
I am having a problem with mojibake on the web.
It is similar to the situation with Wikipedia where
http://ja.wikipedia.org/wiki/ジャコバイト
and
http://ja.wikipedia.org/wiki/%E3%82%B8%E3%83%A3%E3%82%B3%E3%83%90%E3%82%A4%E3%83%88
lead you to the same page. What is the issue that leads to the %EF
type characters appearing?

Peter Clark

Alfred S Chamass

Jul 28, 2008, 4:51:27 AM
to hon...@googlegroups.com


2008/7/28 Peter Clark <petercl...@hotmail.com>

Changing the character encoding to Unicode (UTF-8) works fine.

Regards
--
Alfred Salib Chamass
schamass@gmail

JimBreen

Jul 28, 2008, 5:16:32 AM
to Honyaku E<>J translation list
On Jul 28, 6:18 pm, Peter Clark <peterclarkat...@hotmail.com> wrote:
> I am having a problem with mojibake on the web.
> It is similar to the situation with Wikipedia where http://ja.wikipedia.org/wiki/ジャコバイト
> and http://ja.wikipedia.org/wiki/%E3%82%B8%E3%83%A3%E3%82%B3%E3%83%90%E3%...
> lead you to the same page. What is the issue that leads to the %EF
> type characters appearing?

Those %E3%82%B8... sequences are not mojibake. They are the UTF-8 bytes
of the characters, written in the standard "URL-escape" (percent-encoding)
format that is specified for use in URLs and within the HTTP protocol
when dealing with non-ASCII characters. Some browsers convert them so
you see ジャコバイト, and others (e.g. Firefox) leave them encoded.
Either way it should work OK.
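
As a rough illustration of that escaping, here is a minimal Python sketch showing that the two Wikipedia URLs carry the same text. Python and the variable names are assumptions for illustration; nothing in the thread says what language is involved.

```python
from urllib.parse import quote, unquote

# Page title from the Wikipedia example in this thread.
title = "ジャコバイト"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
escaped = quote(title)
print(escaped)  # %E3%82%B8%E3%83%A3%E3%82%B3%E3%83%90%E3%82%A4%E3%83%88

# unquote() reverses the escaping, decoding the bytes as UTF-8 by default.
restored = unquote(escaped)
print(restored == title)  # True
```

Each Japanese character here occupies three bytes in UTF-8, which is why every character turns into three %XX groups.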

HTH

Jim

Peter Clark

Jul 28, 2008, 6:59:57 AM
to hon...@googlegroups.com
Dear Jim,
Thanks for the lead.
I have a page that accepts, reads and displays Japanese just fine. It then converts the Japanese to the UTF-8 characters, then tries to convert them back, giving me mojibake such as ƒW��ƒRƒoƒCƒg. No good at all.
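
One plausible mechanism for output like this is that the text ends up as Shift_JIS bytes that are then displayed as Windows-1252. This is a guess rather than a diagnosis, since the thread never confirms the cause. A minimal Python sketch under that assumption follows; the codec names are Python's, and nothing here comes from the actual page.

```python
# Assumption: the text is encoded as Shift_JIS somewhere along the way,
# but whatever displays it treats the bytes as Windows-1252.
original = "ジャコバイト"

sjis_bytes = original.encode("shift_jis")
garbled = sjis_bytes.decode("cp1252")
print(garbled)  # ƒWƒƒƒRƒoƒCƒg

# If no bytes were lost, reversing the two steps recovers the text.
repaired = garbled.encode("cp1252").decode("shift_jis")
print(repaired == original)  # True
```

The result matches the sample above except for the two characters that arrived as replacement marks, which suggests looking for a step that assumes Shift_JIS or a Western encoding instead of UTF-8.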
 
Any further leads appreciated.
 
Peter Clark



> From: jimb...@gmail.com
> Those %E3%82%B8.... are not mojibake. They are UTF-8 encoded
> characters in the standard "URL-escape" format that is specified
> for use in URLs and within the HTTP protocol when dealing with non-ASCII
> characters. Some browsers convert them so you see ジャコバイト, and others
> (e.g. Firefox) leave them encoded. Either way it should work OK.

Jeroen Ruigrok van der Werven

Jul 28, 2008, 7:05:35 AM
to hon...@googlegroups.com
-On [20080728 13:00], Peter Clark (petercl...@hotmail.com) wrote:
>I have a page that accepts, reads and displays Japanese just fine. It then
>converts the Japanese to the UTF-8 characters, then tries to convert them back,
>giving me mojibake such as ƒW��ƒRƒoƒCƒg. No good at all.

What encoding does the page declare?
Another possibility is that the server is forcing its own default encoding.
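
A small sketch of that check, assuming Python and made-up sample values for the header and the markup (neither comes from Peter's site). When an HTTP Content-Type header names a charset, browsers give it precedence over a meta tag in the page, so a server default can silently override a UTF-8 declaration.

```python
import re
from email.message import Message


def header_charset(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, None if absent


def meta_charset(html):
    """Return the charset named in the first <meta> tag that has one, or None."""
    match = re.search(r'<meta[^>]+charset=["\']?([\w-]+)', html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(content_type, html):
    """Prefer the HTTP header; fall back to the page's own <meta> declaration."""
    return header_charset(content_type) or meta_charset(html)


# Hypothetical sample values for illustration only.
page = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
print(effective_charset("text/html; charset=Shift_JIS", page))  # shift_jis
print(effective_charset("text/html", page))                     # utf-8
```

If the first case matches what the server actually sends, the meta tag saying UTF-8 would not be enough on its own.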

>> From: jimb...@gmail.com
>> Some browsers convert them so you see ジャコバイト, and others
>> (e.g. Firefox) leave them encoded. Either way it should work OK.

Actually, Firefox 3 and Opera 9.5 are showing the Japanese and not the
url_encoded values.

--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
In this short time of promise, you're a memory...

Peter Clark

Jul 28, 2008, 7:13:02 AM
to hon...@googlegroups.com
Dear Jeroen,
Charset is given as UTF8, and the only other references to encoding I could find are also UTF8.
I suspect you may be right about the server's encoding. Thanks for the hint.
 
Peter Clark

> Date: Mon, 28 Jul 2008 13:05:35 +0200
> From: asm...@in-nomine.org
> To: hon...@googlegroups.com
> Subject: Re: Mojibake question