
Create a file with "Japanese(Shift-JIS)" encoding


vt

Oct 31, 2002, 2:03:25 PM
Hello,

I need to create a file with "Japanese (Shift-JIS)" encoding in .NET. By
default, I am getting UTF-8 encoding.
How do I do that? Unfortunately, the help is not clear.

Regards

Kevin Peck

Oct 31, 2002, 2:42:37 PM
What API calls are you using now?

"vt" <ev...@hotmail.com> wrote in news:O6wI6dQgCHA.1688@tkmsftngp09:

Willem Kokke

Oct 31, 2002, 3:01:57 PM
I have never done any of this; it's all just theoretical knowledge from
reading the docs:

using System.IO;
using System.Text;

// you can use

Encoding enc = Encoding.GetEncoding(932);

// or

// Encoding enc = Encoding.GetEncoding("shift-jis");

// to get an encoding object for your codepage

string path = @"c:\test";
StreamWriter writer = new StreamWriter(path, false, enc);


That should be it

HTH,
Willem

"Kevin Peck" <kev_w_...@spamyahoo.com> wrote in message
news:Xns92B88B9761E0B...@207.46.239.39...

Chad Myers

Oct 31, 2002, 3:11:20 PM

"vt" <ev...@hotmail.com> wrote in message
news:O6wI6dQgCHA.1688@tkmsftngp09...

Well, it is clear, but you're just not looking in the right places :)

.NET only supports the following file encodings:

ASCII
UTF-7
UTF-8
Unicode
Big Endian Unicode

I don't know what Japanese (Shift-JIS) is, but if it doesn't
fall into one of those, then you'll have to write your own
encoder/decoder by overriding:

System.Text.Encoding

-c


Don Dumitru

Oct 31, 2002, 3:54:21 PM
(Lots of cross-posted groups snipped...)

I don't think that's entirely accurate. You should be able to get an
encoder for an arbitrary codepage (assuming that the codepage is installed
on the system) by calling System.Text.Encoding.GetEncoding(number); where
number is the number of the codepage you want.

Shift-JIS appears to be codepage 932. Under just what conditions it
actually gets installed on any particular machine, I don't know. (I would
hazard a guess that it has to do with what version of Internet Explorer is
installed on the machine, since that's where most of the MLANG support seems
to get packaged.)

--Don


--
This posting is provided "AS IS" with no warranties, and confers no rights.

"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> wrote in message
news:I7gw9.239387$8o3.7...@twister.austin.rr.com...

Chad Myers

Oct 31, 2002, 3:58:50 PM
I didn't realize that Shift-JIS was a codepage on top of Unicode.

I thought it was a separate encoding altogether.

Codepages are for Unicode, so he'd be using the Unicode encoding
with a specific code page for translation.

-c

"Don Dumitru" <do...@online.microsoft.com> wrote in message
news:Ol07B#RgCHA.1648@tkmsftngp10...

Till Meyer

Oct 31, 2002, 4:33:32 PM
"Willem Kokke" <w.k...@dion-software.com> wrote in
news:OG5AxeRgCHA.1516@tkmsftngp09:

> Encoding enc = Encoding.GetEncoding(932);
> // or
> Encoding enc = Encoding.GetEncoding("shift-jis");

AFAIK the name is "iso-2022-jp".
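
For reference, Shift-JIS and ISO-2022-JP are two different Japanese encodings, so the two names are not interchangeable. A minimal sketch for checking which code page a name resolves to, assuming a modern .NET runtime where the code-page encodings are registered through CodePagesEncodingProvider (on the .NET Framework they are built in). EncodingNames and CodePageOf are hypothetical names used only for illustration:

```csharp
using System;
using System.Text;

static class EncodingNames
{
    // Hypothetical helper: resolve an encoding name to its code page number.
    public static int CodePageOf(string name)
    {
        // Assumption: modern .NET, where legacy code pages come from this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(name).CodePage;
    }

    static void Main()
    {
        Console.WriteLine(CodePageOf("shift_jis"));   // 932
        Console.WriteLine(CodePageOf("iso-2022-jp")); // a different code page (7-bit JIS)
    }
}
```

If the file has to be Shift-JIS, code page 932 or the name "shift_jis" is the one to ask for.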

Till Meyer

Willem Kokke

Oct 31, 2002, 4:47:42 PM
I'm not blaming you at all

I've just spent an interesting half hour at www.unicode.org

And what a monster it is ;-)

Willem

"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> wrote in message

news:eQgw9.239615$8o3.7...@twister.austin.rr.com...

Willem Kokke

Oct 31, 2002, 5:19:50 PM
oops, could very well be, i'm no expert in unicode :-)


"Till Meyer" <Till...@aroh.de> wrote in message
news:apsb3s...@user.aroh.de...

Andreas Suurkuusk

Nov 1, 2002, 3:27:46 AM
What about Encoding.GetEncoding(int codepage)?

The documentation even uses Japanese Shift-JIS as an example.

From the documentation:
"A specific code page might not be supported by certain platforms. For
example, the Japanese shift-jis code page (code page 932) might not be
supported in the United States version of Windows 98. In that case, the
GetEncoding method throws NotSupportedException when the following C# code
is executed:

Encoding enc = Encoding.GetEncoding(932); "

Hopefully it will work on a Japanese OS.

Andreas Suurkuusk (and...@scitech.se)
SciTech Software AB

"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> skrev i meddelandet
news:I7gw9.239387$8o3.7...@twister.austin.rr.com...

vt

Nov 1, 2002, 12:33:54 PM

kevin,

I am using VB.NET and the 'System.Text' namespace.

Dim fl As File

I am using fl.CreateText to create a file. It creates the file in UTF-8 format.

I am also trying to use fl.Create, but I am not sure how to set the file
encoding to Shift_JIS.

I have been trying for some time, but the end doesn't seem to be near.

regards

"Kevin Peck" <kev_w_...@spamyahoo.com> wrote in message
news:Xns92B88B9761E0B...@207.46.239.39...

Chad Myers

Nov 1, 2002, 11:39:39 AM
Use the StreamWriter class.

using (StreamWriter writer = new StreamWriter(
    @"c:\foo.txt", false,
    Encoding.GetEncoding("shift-jis")))
{
    writer.WriteLine("(japanese stuff here)");
}

-c


"vt" <ev...@hotmail.com> wrote in message

news:uN#CkQcgCHA.1760@tkmsftngp12...

Chad Myers

Nov 1, 2002, 11:41:37 AM
One other note:

If that encoding isn't present on the system,
(Shift-JIS isn't present on US systems by default),
you'll get a NotSupportedException from your call
to Encoding.GetEncoding().

You may or may not want to trap that specifically
and handle it appropriately.
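
A minimal sketch of trapping that exception, assuming that falling back to another encoding is acceptable for your application (often it is not, and reporting the error is the better choice). SafeEncoding and GetOrFallback are hypothetical names used only for illustration:

```csharp
using System;
using System.Text;

static class SafeEncoding
{
    // Hypothetical helper: returns the requested code page,
    // or the supplied fallback if the code page is unavailable.
    public static Encoding GetOrFallback(int codePage, Encoding fallback)
    {
        try
        {
            return Encoding.GetEncoding(codePage);
        }
        catch (NotSupportedException)
        {
            // The code page is not available on this system.
            return fallback;
        }
    }

    static void Main()
    {
        Encoding enc = GetOrFallback(932, Encoding.UTF8);
        // Prints the name of whichever encoding was actually obtained.
        Console.WriteLine(enc.WebName);
    }
}
```

Printing or logging the encoding that was actually obtained makes a silent fallback visible.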

-c

"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> wrote in message

news:f7yw9.257062$121.7...@twister.austin.rr.com...

vt

Nov 1, 2002, 12:42:48 PM
Thanks for the response.

Yes, I used Encoding enc = Encoding.GetEncoding(932) to get the encoding,
but how do I apply it to a file or a StreamWriter? There is an Encoding
property there, but it is read-only.

"Andreas Suurkuusk" <And...@thermometric.se> wrote in message
news:upvbeDYgCHA.1848@tkmsftngp10...

Chad Myers

Nov 1, 2002, 12:08:29 PM
StreamWriter takes the encoding in its constructor.

There are several constructors for StreamWriter and 2 or 3 of
them take the encoding.

For example:

StreamWriter(string path, bool append, Encoding encoding)

-c

"vt" <ev...@hotmail.com> wrote in message

news:uLLuiVcgCHA.1960@tkmsftngp08...

vt

Nov 1, 2002, 1:32:58 PM
Thanks for the help ... I missed the overloaded constructor on StreamWriter
(sw).

I opened the StreamWriter as discussed below, but when I pass a string
(the string has some katakana characters)

to sw.WriteLine(str) and close the StreamWriter, it still doesn't create a
file with JIS encoding.

I am sure about that because when I open the doc, it doesn't ask me about
the encoding type.

Is sw.WriteLine doing something strange here?

"Willem Kokke" <w.k...@dion-software.com> wrote in message
news:#LkvzrSgCHA.2308@tkmsftngp12...

Chad Myers

Nov 1, 2002, 1:02:07 PM
Write the file like that, and then read the file back in
with StreamReader (don't specify an encoding and it should
determine it automagically).

Then print out the Encoding property of the StreamReader
and see what it says. It should say something like 932 or
"Shift-JIS". If it says anything else, then something else
is wrong.

Let us know what it says.

-c

"vt" <ev...@hotmail.com> wrote in message

news:#LoekxcgCHA.2436@tkmsftngp10...

vt

Nov 1, 2002, 3:28:54 PM

Dim filename As String = "C:\JPY " & DateString & ".txt"
Dim sw As StreamWriter = New StreamWriter(filename, False, _
    System.Text.Encoding.GetEncoding(932))
MessageBox.Show(sw.Encoding.EncodingName)   ' returns Shift-JIS

sw.WriteLine(str)   ' str is an English + katakana string (from a database nchar column)

MessageBox.Show(sw.Encoding.EncodingName)   ' returns Shift-JIS
sw.Close()

Dim swr As StreamReader = New StreamReader(filename)
swr.ReadLine()
MessageBox.Show(swr.CurrentEncoding.EncodingName)   ' returns UTF-8

It looks like the file I saved is not saved as Shift-JIS, or the katakana
text is messing up my encoding.
Not sure where the issue is ..
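
A UTF-8 result from that last call does not by itself show that the file was written incorrectly. As far as I know, a StreamReader opened without an encoding only detects one from a byte order mark at the start of the file, and otherwise keeps its UTF-8 default. A minimal sketch of the same experiment, assuming a modern .NET runtime (hence the provider registration) and a throwaway file in the temp directory. ReaderDetection and CodePageSeenByDefaultReader are hypothetical names:

```csharp
using System;
using System.IO;
using System.Text;

static class ReaderDetection
{
    // Writes text as Shift-JIS, then reads it back without naming an encoding.
    // Returns the code page the reader ended up using.
    public static int CodePageSeenByDefaultReader(string path, string text)
    {
        // Assumption: modern .NET; on the .NET Framework code page 932 is built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding sjis = Encoding.GetEncoding(932);

        using (StreamWriter sw = new StreamWriter(path, false, sjis))
        {
            sw.WriteLine(text);
        }

        // No encoding given: the reader looks for a byte order mark only.
        using (StreamReader sr = new StreamReader(path))
        {
            sr.ReadLine();
            return sr.CurrentEncoding.CodePage;
        }
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "sjis-test.txt");
        Console.WriteLine(CodePageSeenByDefaultReader(path, "\u30a1\u30a2\u30a3")); // 65001 (UTF-8)
    }
}
```

Checking the bytes on disk, or reading the file back with the 932 encoding passed explicitly, is a more reliable test than CurrentEncoding.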


"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> wrote in message

news:zkzw9.257092$121.7...@twister.austin.rr.com...

Chad Myers

Nov 1, 2002, 2:46:05 PM
One more test,

When you open the StreamReader, use this constructor:

StreamReader(string filePath, Encoding encoding)

When you read it, specify the 932 encoding and see if you
get better results.

-c

"vt" <ev...@hotmail.com> wrote in message

news:ef8nWydgCHA.1640@tkmsftngp10...

vt

Nov 1, 2002, 4:01:18 PM
I tried that; however, it gives me a default of "UTF-8", or if I specify the
encoding, I get Shift-JIS. The file is still messed up.

Here is where the issue lies, I think:

1. To save a Japanese string in the database, we need to use Unicode
(nvarchar, nchar).
2. When we retrieve the Unicode string and write it to a StreamWriter with
Shift-JIS encoding, it may not work correctly.

I think we need to convert the unicode string into a Shift-Jis string and
then insert into the file ..

Not sure though .


"Chad Myers" <cmy...@N0.SP.4M.austin.rr.com> wrote in message

news:1SAw9.257111$121.7...@twister.austin.rr.com...

Chad Myers

Nov 1, 2002, 3:09:43 PM
Well, .NET strings are unicode, and codepages are just
views of Unicode, so it shouldn't matter one way or the
other.

My guess is that StreamReader/Writer is/are screwed up
somehow.

-c

"vt" <ev...@hotmail.com> wrote in message

news:eiHWdEegCHA.2364@tkmsftngp08...

Till Meyer

Nov 2, 2002, 2:28:50 PM
"vt" <ev...@hotmail.com> wrote in
news:eiHWdEegCHA.2364@tkmsftngp08:

> I tried that, however its get me a default of "UTF-8" or if i specify
> the encoding, i get the Shift-JIS. The file is still messed up.

I wrote
string hiragana = "\u3041\u3042\u3043";
string katakana = "\u30a1\u30a2\u30a3";
to a file using Japanese (Shift-JIS) encoding and made a hex dump
of the file.
The result was :
82 9F 82 A0 82 A1 0D 0A 83 40 83 41 83 42 0D 0A

And there were no problems reading the strings back from the file.

( If you use utf8 for encoding you get :
EF BB BF E3 81 81 E3 81 82 E3 81 83 0D 0A E3 82
A1 E3 82 A2 E3 82 A3 0D 0A
)

This is the code I used to write the strings in the file and read
them back :

Encoding encJ = Encoding.GetEncoding(932);

FileStream fs = new FileStream("japan.txt", FileMode.Create);
StreamWriter sw = new StreamWriter(fs, encJ);
sw.WriteLine(hiragana);
sw.WriteLine(katakana);
sw.Close();
fs.Close();

StringBuilder sb = new StringBuilder();
sb.AppendFormat("{0}\n{1}\n",hiragana, katakana);

sb.Append("From file :\n");
fs = new FileStream("japan.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs, encJ);
string line;
while((line = sr.ReadLine())!=null) sb.Append(line + "\n");
sr.Close();
fs.Close();

I viewed 'sb.ToString()' in a richedit text box. The results
were as expected.
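
The same byte values can be checked without writing a file at all. A minimal sketch, assuming a modern .NET runtime (hence the provider registration); HexDump and Hex are hypothetical names used only for illustration:

```csharp
using System;
using System.Text;

static class HexDump
{
    // Encodes text with the given code page and formats the bytes as hex.
    public static string Hex(string text, int codePage)
    {
        // Assumption: modern .NET; on the .NET Framework code page 932 is built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] bytes = Encoding.GetEncoding(codePage).GetBytes(text);
        return BitConverter.ToString(bytes);
    }

    static void Main()
    {
        Console.WriteLine(Hex("\u3041\u3042\u3043", 932));   // 82-9F-82-A0-82-A1
        Console.WriteLine(Hex("\u3041\u3042\u3043", 65001)); // E3-81-81-E3-81-82-E3-81-83
    }
}
```

GetBytes returns only the encoded characters, so the EF BB BF marker and the 0D 0A line endings from the file dumps above do not appear here.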

Till Meyer

vt

Nov 3, 2002, 6:52:55 PM
Till,

I tried the code you wrote below and created the japan.txt file.

My question,
When the file is in Shift-JIS format, and if we open it in MS Word 2K / Word
Pad, does it throw
a window saying that "This file has Shift-Jis encoding blah , blah ... ?".
( i think it does)

When I open japan.txt, I do not get that encoding type window (does it mean
MS Word is treating it as ASCII?).

Can you verify and let me know if you got that window?

Or am i not correct in assuming that ? (When i created UTF-8 files and tried
opening with word, i got that window)

Regards,
VT


"Till Meyer" <Till...@aroh.de> wrote in message

news:aq1ci1...@user.aroh.de...

Till Meyer

Nov 4, 2002, 9:43:44 AM
"vt" <ev...@hotmail.com> wrote in news:uLcOut4gCHA.1300@tkmsftngp08:

> I tried the code you wrote below and created the japan.txt file.

> My question,
> When the file is in Shift-JIS format, and if we open it in
> MS Word 2K / Word Pad, does it throw a window saying that
> "This file has Shift-Jis encoding blah , blah ... ?".
> ( i think it does)

Are you sure? I have no original file, so I can't test this.

> When i open Japan.txt, i do not get that encoding type window
> (does it means MS Word is treating it as ascii ?).

There is no encoding information stored in the file. So it is
handled like every other txt file with the windows.default
codepage.

> Can you verify and let me know if you got that window?

I don't get that window, but I didn't expect one.

> Or am i not correct in assuming that ? (When i created
> UTF-8 files and tried opening with word, i got that window)

When you use a utf-8 coded file (or utf-7 or utf-16), the first
bytes in the file indicate the encoding format. This information
is automatically written to the file by the StreamWriter class.

Because of this information, every editor you open the
file with knows the right encoding format (and a window pops up).

But this class doesn't write any encoding information to the
file if shift-jis is used. I don't know if this is right or not.
You would have to look at an original shift-jis encoded file to decide
if header bytes are used (needed?).
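
The marker bytes in question are what .NET calls the encoding's preamble, and they can be inspected directly. Shift-JIS itself defines no such marker, so a file with none is normal. A minimal sketch, assuming a modern .NET runtime (hence the provider registration); Preambles and PreambleLength are hypothetical names:

```csharp
using System;
using System.Text;

static class Preambles
{
    // Number of marker bytes the encoding supplies for the start of a file.
    public static int PreambleLength(Encoding enc)
    {
        return enc.GetPreamble().Length;
    }

    static void Main()
    {
        // Assumption: modern .NET; on the .NET Framework code page 932 is built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Console.WriteLine(PreambleLength(Encoding.UTF8));             // 3 (EF BB BF)
        Console.WriteLine(PreambleLength(Encoding.GetEncoding(932))); // 0
    }
}
```

An editor that recognizes a Shift-JIS file therefore has to rely on something else, such as guessing from the byte patterns or the system's default code page.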

Till Meyer

vt

Nov 4, 2002, 11:43:52 AM
Yes. Positive. I have a sample .txt file (in Shift-Jis) and it throws up
that window.

I think you are correct about the initial byte. StreamWriter class writes if
it is utf-8 encoding but not Shift-Jis.

Maybe if i use FileStream class and write the initial byte myself, it will
interpret the file as Shift-Jis. I am not sure if this will work though.

(Please let me know if you need a sample Shift-JIS file for testing. I can
email it.)

"Till Meyer" <Till...@aroh.de> wrote in message

news:aq64j...@user.aroh.de...

Mihai N.

Nov 5, 2002, 4:23:04 AM
>> My question,
>> When the file is in Shift-JIS format, and if we open it in
>> MS Word 2K / Word Pad, does it throw a window saying that
>> "This file has Shift-Jis encoding blah , blah ... ?".
>> ( i think it does)
For Word 2K you should go to "Tools | Options" in the "General" tab and
check "confirm conversion at Open". When asked about the file type, select
"Encoded Text File" and you will get the dialog about encoding.
Otherwise the file is treated as a text file using the default system
encoding (1252 if you are on a US system).

Mihai

vivek kumar

Feb 4, 2021, 2:21:07 AM
The DLL 'System.Text.Encoding.CodePages.dll' can be downloaded from Microsoft.
Then add the code below and it should work.

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
Encoding enc = Encoding.GetEncoding("Shift-JIS");

Thanks,
Vivek

😉 Good Guy 😉

Feb 4, 2021, 9:27:46 AM

Vivek,

You responded to a 2002 post and Mihai might not even be here or he may have died of Chinese virus.

How is India doing as far as Corona Virus is concerned? Is Modi doing anything constructive or is he occupied with the farmers' dispute?


--

With over 1.2 billion devices now running Windows 10, customer satisfaction is higher than any previous version of windows.
