Is it doable? I tried to take a look at it with Word but it didn't
give me anything useful (I saw the very same characters).
--
Kindly
Konrad
---------------------------------------------------
May all spammers die an agonizing death; have no burial places;
their souls be chased by demons in Gehenna from one room to
another for all eternity and more.
Sleep - thing used by ineffective people
as a substitute for coffee
Ambition - a poor excuse for not having
enough sense to be lazy
---------------------------------------------------
I think the format above still preserves the double-byte nature of the
original characters; the viewer just doesn't know that they are intended to
be read as double-byte characters rather than single-byte characters.
In Word, go to the General tab of Options and click the checkbox that says
"confirm conversion at open." Then, when you double click on a text file, it
will give you a dialog where you can open it in various formats. When you
open your file, try opening it as encoded text. It will then give you a
second dialog where you choose the encoding to open it with. It may
recognize that there are Japanese characters and default to Japanese. If
not, choose it from the menu and hit OK. This may enable you to open the
file and actually view the Japanese.
Jeff
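For anyone who would rather script this than click through Word's dialogs, the same idea (tell the reader which encoding the bytes are in) can be sketched in Python. This is only a minimal sketch: the function name reencode, the file names, and the choice of euc_jp as the source encoding are assumptions for illustration, not something taken from the poster's file.

```python
import os
import tempfile


def reencode(src_path, dst_path, src_encoding, dst_encoding="utf-8"):
    """Read text in one encoding and write it back out in another."""
    # Passing an explicit encoding plays the role of Word's "encoded text" dialog.
    # newline="" keeps line endings exactly as they are in the source file.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    return text


if __name__ == "__main__":
    # Build a small EUC-JP sample file (hypothetical data) and convert it.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "words.euc")
    dst = os.path.join(workdir, "words.utf8.txt")
    with open(src, "wb") as f:
        f.write("外国|がいこく|a foreign country\n".encode("euc_jp"))
    print(reencode(src, dst, "euc_jp"), end="")
```

If the guess at the source encoding is wrong, the open call will usually raise UnicodeDecodeError rather than silently produce garbage, which is a useful signal in itself.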
> I have a text file (sort of) where every row contains three columns
> that are: 1. Kanji 2. Kana-reading 3. English meaning.
> I'd like to be able to read (and write to) the file but it's encoded
> in a format that looks like:
> 外|ã ã ¨|outside
> 外出㠙る|㠌㠄・㠗ゅ㠤㠙る|to go out
> 外国|㠌㠄・㠓ã |a foreign country
> 外科|㠒・㠋|surgery
>
> Is it doable? I tried to take a look at it with Word but it didn't
> give me anything useful (I saw the very same characters).
>
It looks like EUC-JP to me. Try giving it a .euc extension and opening it
with JWPCE (available for free download at numerous places).
KWW
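A guess like "looks like EUC-JP" can be checked roughly by trying to decode the raw bytes strictly under a few candidate encodings. The sketch below is a heuristic, not a detector: the function name guess_encoding and the candidate list are assumptions, and a clean decode only shows that the bytes are valid in that encoding, not that it is the right one.

```python
CANDIDATES = ("utf-8", "euc_jp", "shift_jis", "iso2022_jp")


def guess_encoding(data, candidates=CANDIDATES):
    """Return the first candidate that decodes `data` without errors, else None."""
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    # Hypothetical sample row, encoded as EUC-JP for the demonstration.
    sample = "外国|がいこく|a foreign country".encode("euc_jp")
    print(guess_encoding(sample))
```

UTF-8 is tried first because its byte patterns are strict enough that text in the other Japanese encodings rarely passes as valid UTF-8 by accident.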
Hmmm... doesn't edict use a similar format?
A quote from the documentation that comes with it:
>
> FORMAT
>
> EDICT's format is that of the original "EDICT" format used by the
> early PC
> Japanese word-processor MOKE (Mark's Own Kanji Editor). It uses EUC-JP
> coding for kana and kanji, however this can be converted to JIS
> (ISO-2022-JP) or Shift-JIS by any of the several conversion programs
> around.
> It is a text file with one entry per line. The format of entries is:
>
> KANJI [KANA] /English_1/English_2/.../
>
> or
>
> KANA /English_1/.../
>
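The two entry layouts quoted above are regular enough to parse with one pattern. The following is a minimal sketch under assumptions: the name parse_edict_line and the sample entries in the usage lines are made up for illustration and are not copied from the EDICT file, and real entries may carry extra markers this pattern does not treat specially.

```python
import re

# KANJI [KANA] /gloss/gloss/.../   or   KANA /gloss/.../
ENTRY_RE = re.compile(r"^(\S+)(?:\s+\[([^\]]+)\])?\s+/(.*)/\s*$")


def parse_edict_line(line):
    """Split one EDICT-style line into (headword, reading, glosses)."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"not an EDICT-style entry: {line!r}")
    headword, reading, body = match.groups()
    glosses = [g for g in body.split("/") if g]
    # Kana-only entries have no bracketed reading; the headword is the reading.
    return headword, reading or headword, glosses


if __name__ == "__main__":
    # Hypothetical entries written in the documented layout.
    print(parse_edict_line("外国 [がいこく] /foreign country/"))
    print(parse_edict_line("そと /outside/"))
```

The file must already be decoded to text (for example, opened with the right encoding) before lines are passed to the parser.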
>> I'd like to be able to read (and write to) the file but it's encoded
>> in a format that looks like:
>> 外|ã ã ¨|outside
>> 外出㠙る|㠌㠄・㠗ゅ㠤㠙る|to go out
>> 外国|㠌㠄・㠓ã |a foreign country
>> 外科|㠒・㠋|surgery
>>
>> Is it doable? I tried to take a look at it with Word but it didn't
>> give me anything useful (I saw the very same characters).
>>
> It looks like EUC-JP to me. Try giving it a .euc extension and opening it
> with JWPCE (available for free download at numerous places).
>
> KWW
>
You might also try, as a quick check, opening it in a web browser and
then trying different character encodings. For example, Mozilla Firebird
(which I use) has "View -> Character Coding -> More -> East Asian ->
Japanese (...)" in the menu; I don't know where it is in other browsers.
--
Single line signatures are neat :-)
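The browser trick of flipping through encodings amounts to decoding the same bytes several ways and seeing which result is readable. A rough Python equivalent is sketched below; the function name preview and its default encoding list are assumptions. One thing it shows: the UTF-8 bytes for 外, when displayed as Windows-1252, come out as "外", which matches the first column of the quoted sample, so the file may in fact be UTF-8 rather than EUC-JP.

```python
def preview(data, encodings=("utf-8", "euc_jp", "shift_jis", "cp1252")):
    """Decode the same bytes under several encodings for side-by-side comparison."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return {name: data.decode(name, errors="replace") for name in encodings}


if __name__ == "__main__":
    raw = "外".encode("utf-8")  # the three bytes E5 A4 96
    for name, text in preview(raw).items():
        print(f"{name:10} {text!r}")
```

Whichever line of the output shows sensible Japanese is the encoding worth trying in the editor.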
> I have a text file (sort of) where every row contains three columns
> that are: 1. Kanji 2. Kana-reading 3. English meaning.
> I'd like to be able to read (and write to) the file but it's encoded
> in a format that looks like:
> i"??a*?-|a~??a~?¨|outside
> a*?-a*‡^(o)a~?^(TM)a~,<|a~?OEa~?"a~f?a~?―a~,…a~??a~?^(TM)a~,<|to go out
> a*?-a*>1/2|a~?OEa~?"a~f?a~?“a~??|a foreign country
> a*?-c,§‘|a~?’a~f?a~?<|surgery
>
> Is it doable? I tried to take a look at it with Word but it didn't
> give me anything useful (I saw the very same characters).
>
外 |そと - soto | outside
外出する |がいしゅつする - gaishutsusuru| to go out
外国 |がいこく - gaikoku |a foreign country
外科 |げか - geka | surgery
First attempt at sending a non-Western encoding, hope it works.
(If not, someone will have to tell me how to get Thunderbird/Mozilla to
send in the right encoding.)
Hope this works (copied and pasted into JWPce and it showed up fine for
me - slightly edited to remove crap and to add romaji :)
--
Iain
> 外 |そと - soto | outside
> 外出する |がいしゅつする - gaishutsusuru| to go out
> 外国 |がいこく - gaikoku |a foreign country
> 外科 |げか - geka | surgery
>
> First attempt at sending a non-Western encoding, hope it works.
> (If not, someone will have to tell me how to get Thunderbird/Mozilla to
> send in the right encoding.)
> Hope this works (copied and pasted into JWPce and it showed up fine for
> me - slightly edited to remove crap and to add romaji :)
The "crap" you removed wasn't crap. These dots were there to mark the
boundaries between the kanji.
Peter
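If the dots are kept, they make it easy to recover one reading per kanji when reading the three-column file, and to write the row back unchanged. The sketch below assumes the separator is the katakana middle dot (U+30FB) and that the surgery row reads 外科|げ・か|surgery once the encoding is repaired; both are inferences from the garbled sample, and the names parse_row and format_row are hypothetical.

```python
MIDDLE_DOT = "\u30fb"  # the "・" separator between per-kanji readings


def parse_row(line):
    """Split 'kanji|reading|meaning' into fields, keeping per-kanji readings."""
    kanji, reading, meaning = (
        field.strip() for field in line.rstrip("\n").split("|", 2)
    )
    return kanji, reading.split(MIDDLE_DOT), meaning


def format_row(kanji, reading_parts, meaning):
    """Inverse of parse_row: rebuild the pipe-delimited line."""
    return "|".join((kanji, MIDDLE_DOT.join(reading_parts), meaning))


if __name__ == "__main__":
    row = parse_row("外科|げ・か|surgery\n")
    print(row)
    print(format_row(*row))
```

Rows whose reading has no dot simply come back as a one-element list, so the same code handles single-kanji entries.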
It looks like you got it. I also use Mozilla on Linux and have to
wrestle with multi-lingual configs --- it can be a bit challenging. :)
--
David Nettles
web: http://www.miteyo.org
email: tetsuo...@yahoo.co.jp
> Farrell wrote:
>
>>First attempt at sending a non-Western encoding, hope it works.
>>(If not, someone will have to tell me how to get Thunderbird/Mozilla to
>>send in the right encoding.)
>>Hope this works (copied and pasted into JWPce and it showed up fine for
>>me - slightly edited to remove crap and to add romaji :)
>>
>
>
> It looks like you got it. I also use Mozilla on Linux and have to
> wrestle with multi-lingual configs --- it can be a bit challenging. :)
>
especially at 4 o'clock in the morning after a hard night's drinking...
--
Iain
Yeah, I know - but it's not used in writing (that I've seen), and the
separation should be obvious when they're next to each other - hence 'crap'.
Probably should have used a different word, though (but 4am...drink...meh.)
--
Iain