How to find 1-byte and 2-byte characters

mokomoji

Jan 21, 2014, 4:18:03 AM

Hello,

I would like to search a string for "ASCII 1BYTE" and "2BYTE special character", that is, to tell the 1-byte ASCII characters apart from the 2-byte special characters.
What do you recommend?

I wrote some source code, but it is poorly made.
I would greatly appreciate a good idea.

My source displays a result for each character: 0 correct, 1 wrong.

@echo off
set "var=adlk jald★ ◆ asddn▲ asdf♥☎"
for /l %%l in (1,1,28) do (
call set num=%%l
call set varx=%!%var:~%%num%%,1%!%
call set zvar=%%varx%%
call :zasc
)
goto :end
:zasc
call set "zvar1=%%%zvar%%%"
set "asccode=abcdefghijklmnopqrstuvwxyz~!@#$%^*()_+`1234567890-=\][';/.,?:{}"
for /f "usebackq delims=" %%f in (`echo "%asccode%"^|call find /i /c "%zvar1%"`) do (
echo "%zvar1%"=%%f
)
goto :eof
:end
pause
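
For comparison, here is a minimal sketch of the same classification in Python rather than batch. It assumes that Python's built-in `cp949` codec matches the Korean Windows code page closely enough for this purpose; the function names are made up for illustration and are not part of the script above.

```python
def byte_width_cp949(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as code page 949."""
    # Assumption: Python's "cp949" codec stands in for the console code page.
    return len(ch.encode("cp949"))


def classify(text: str) -> list[tuple[str, int]]:
    """Pair every character of the text with its CP949 byte width."""
    return [(ch, byte_width_cp949(ch)) for ch in text]


if __name__ == "__main__":
    # Same test string as in the batch script above.
    for ch, width in classify("adlk jald★ ◆ asddn▲ asdf♥☎"):
        # ascii() keeps the output printable on any console code page.
        print(f"{ascii(ch)} = {width} byte(s)")
```

Letters, digits and spaces come out as 1 byte and the special characters as 2 bytes, which is the distinction the batch script tries to make with its `asccode` lookup string.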

Frank P. Westlake

Jan 21, 2014, 10:21:22 AM

On 01/21/2014 01:18 AM, mokomoji wrote:
> I would like to search for. "ASCII 1BYTE" and "2BYTE special character"

I think this needs to be done by someone with a non-US Windows. I think
non-US Windows permits UTF-8 consoles (maybe Windows 8 does too, I
don't know) and that might affect how the script functions.

Frank

mokomoji

Jan 21, 2014, 10:16:18 PM

T_T

oTL

Orz

oh my god

The purpose of the source is to tell Korean and English characters apart.

In the English version of my post I used special characters instead of Korean,
because the special characters are also 2-byte characters.
The special characters stand in for Korean characters.

@echo off
set "var=adlk jald우리 나라 asddn대한 asdf민국basdf"

mokomoji

Jan 21, 2014, 10:49:47 PM

Please, could someone on an English Windows run this:
dir /b c:\windows\system32\*.nls >list.txt

I am on Windows XP SP3, Korean version. My list:

big5.nls
bopomofo.nls
ctype.nls
c_037.nls
c_10000.nls
c_10001.nls
c_10002.nls
c_10003.nls
c_10006.nls
c_10007.nls
c_10008.nls
c_10010.nls
c_10017.nls
c_10029.nls
c_10079.nls
c_10081.nls
c_10082.nls
c_1026.nls
c_1250.nls
c_1251.nls
c_1252.nls
c_1253.nls
c_1254.nls
c_1255.nls
c_1256.nls
c_1257.nls
c_1258.nls
c_1361.nls
c_20000.nls
c_20127.nls
c_20261.nls
c_20290.nls
c_20866.nls
c_20905.nls
c_20932.nls
c_20936.nls
c_20949.nls
c_21027.nls
c_21866.nls
c_28591.nls
c_28592.nls
c_28593.nls
C_28594.NLS
C_28595.NLS
C_28597.NLS
c_28598.nls
c_28599.nls
c_28603.nls
c_28605.nls
c_437.nls
c_500.nls
c_737.nls
c_775.nls
c_850.nls
c_852.nls
c_855.nls
c_857.nls
c_860.nls
c_861.nls
c_863.nls
c_865.nls
c_866.nls
c_869.nls
c_874.nls
c_875.nls
c_932.nls
c_936.nls
c_949.nls
c_950.nls
geo.nls
ksc.nls
locale.nls
l_except.nls
l_intl.nls
normidna.nls
normnfc.nls
normnfd.nls
normnfkc.nls
normnfkd.nls
prc.nls
prcp.nls
sortkey.nls
sorttbls.nls
unicode.nls
xjis.nls

or

@echo off
if exist chcplist.txt del chcplist.txt

for /l %%j in (1,1,100000) do (
chcp 437 >nul
call :oz %%j
)
goto :end

:oz
echo %~1--------------------
chcp %~1
if "%errorlevel%" equ "0" (
echo chcp %~1 %errorlevel% >> chcplist.txt

)
goto :eof
:end
pause

Please test it and post the resulting chcplist.txt.

mokomoji

Jan 21, 2014, 11:14:35 PM

Please advise me of create law asciidec

foxidrive

Jan 22, 2014, 12:25:32 AM

On 22/01/2014 15:14, mokomoji wrote:
> Please advise me of create law asciidec
>

What is law asciidec?

I know what ascii is, but I don't know that term.

frank.w...@gmail.com

Jan 22, 2014, 5:05:06 AM

From mokomoji :

>@echo off
>if exist chcplist.txt del chcplist.txt

>for /l %%j in (1,1,100000) do (
>chcp 437 >nul
>call :oz %%j
>)
>goto :end

>:oz
>echo %~1--------------------
>chcp %~1
>if "%errorlevel%" equ "0" (
>echo chcp %~1 %errorlevel% >> chcplist.txt
>)
>goto :eof
>:end
>pause

>chcplist.txt test...
>plz..

He might be asking for someone to run this script and
show the result.

Frank

mokomoji

Jan 22, 2014, 5:56:06 AM

asciidec = ascii code a = 64 16 hex to 10 dec code
b = 65
c = 66
d = 67
e = 68
f = 69
g = 70

code number

dec code..
one byte = dec code...
two byte = none code...error
true

but hex
on byte = hex code..one byte
two byte = two hex code two byte
faile...

not use vba...not use debug...
windows xp type...coding..
can u scripting..?

ascii2hex T_Ta nonononono..
ascii2dec ^^ yesssssssssss...

foxidrive

Jan 22, 2014, 6:13:03 AM

On 22/01/2014 21:56, mokomoji wrote:
> asciidec = ascii code a = 64 16 hex to 10 dec code
> b = 65
> c = 66
> d = 67
> e = 68
> f = 69
> g = 70
>
> code number

Capital A is 65.

Lower case a is 97

> dec code..
> one byte = dec code...
> two byte = none code...error
> true
>
> but hex
> on byte = hex code..one byte
> two byte = two hex code two byte
> faile...
>
> not use vba...not use debug...
> windows xp type...coding..
> can u scripting..?

I still don't follow what you need to do.

You have single byte characters and maybe also Unicode which are double byte characters.

Can you give some examples of what you want converted, and also the answer of what you want for those
examples?

Something like this?

A = 41 hex
...
Z = 5A hex
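
To make the decimal and hex values concrete, here is a small sketch in Python (not batch). The helper name `ascii_codes` is hypothetical; it only wraps the built-in `ord` and rejects anything outside 7-bit ASCII.

```python
def ascii_codes(ch: str) -> tuple[int, str]:
    """Return the decimal code and two-digit hex code of one ASCII character."""
    code = ord(ch)
    if code > 0x7F:
        # Not 7-bit ASCII: this is where a "2-byte" character would be reported.
        raise ValueError(f"{ch!a} is not an ASCII character")
    return code, format(code, "02X")


if __name__ == "__main__":
    for ch in "AZaz":
        dec, hx = ascii_codes(ch)
        print(f"{ch} = {dec} dec = {hx} hex")
```

Here the `ValueError` plays the role of the "two byte = error" case described earlier in the thread: a character without a 7-bit code is flagged instead of converted.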

frank.w...@gmail.com

Jan 22, 2014, 7:04:02 AM

From mokomoji :

It looks like he wants a routine which will give the
decimal value of an ASCII character without resorting
to CSCRIPT or DEBUG.

There are a few tricks but I don't think they will give
the range he wants. Does anyone recall those tricks?

It looks like his ultimate desire is to do something
that requires translating from UTF-8 multi-byte
characters to ASCII. I think we did some work on that
here also.

Frank

mokomoji

Jan 22, 2014, 7:20:38 AM

Perhaps you do not understand the Korean character codes.
I do not fully understand the Korean codes either.
The Korean version of Windows comes with these codes:

  949  ks_c_5601-1987           Korean
 1200  utf-16                   Unicode
 1201  unicodeFFFE              Unicode (Big-Endian)
 1361  Johab                    Korean (Johab)
10003  x-mac-korean             Korean (Mac)
12000  utf-32                   Unicode (UTF-32)
12001  utf-32BE                 Unicode (UTF-32 Big-Endian)
20127  us-ascii                 US-ASCII
20833  x-EBCDIC-KoreanExtended  IBM EBCDIC (Korean Extended)
20949  x-cp20949                Korean Wansung
50225  iso-2022-kr              Korean (ISO)
51949  euc-kr                   Korean (EUC)
65000  utf-7                    Unicode (UTF-7)
65001  utf-8                    Unicode (UTF-8)

Korean Windows systems and web pages mainly use these codes:
949
6500x
51949
and the Unicode encodings.

The cmd code is 949, which is called the MS949 code in Korea.
It is an oddity: it is not an international standard
but a code made by Microsoft.

Korean mode default 949 -> 2-byte characters
ASCII code -> 8-bit? KS (Korean standard) ASCII code, 1-byte characters

chcp 437 -> ASCII code, 7-bit, 1-byte characters

Also, cmd on Korean Windows mixes 1-byte and 2-byte characters:

English is 1-byte and Korean is 2-byte.
It is not plain 7-bit ASCII.
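
The mix of 1-byte and 2-byte characters can be seen by encoding the same text under a few of the code pages listed above. This is a sketch in Python, not batch, and it assumes that Python's `cp949` and `euc_kr` codecs correspond to Windows code pages 949 and 51949; the names `ENCODINGS` and `encoded_sizes` are illustrative only.

```python
# Codec names are Python's; their mapping to Windows code pages is an assumption.
ENCODINGS = ["cp949", "euc_kr", "utf-8", "utf-16-le"]


def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text needs in each encoding."""
    return {enc: len(text.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    for sample in ("A", "한"):
        # ascii() keeps the output printable on any console code page.
        print(ascii(sample), encoded_sizes(sample))
```

An English letter takes 1 byte in 949 and a Korean syllable takes 2, while UTF-8 needs 3 bytes for the same syllable and UTF-16 uses 2 bytes even for the English letter.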

mokomoji

Jan 22, 2014, 7:59:28 AM

I know about the VBS (cscript) and DEBUG approaches.
The VBS approach uses functions to convert the character output.
The DEBUG approach writes files and reads them back with DEBUG.
I found them by searching this group.
But...

Look at my original source code:
set "asccode=abcdefghijklmnopqrstuvwxyz~!@#$%^*()_+`1234567890-=\][';/.,?:{}"
Why do you think I wrote it this way?

Because I cannot use DEBUG?
Because I cannot use cscript?

Because I did not know how to use them?
No.
I have used DEBUG as an editor to make a game.

I want to know what you think.

Can I make it?
"ASCII 1BYTE" and "2BYTE special character" <-- this is what I am making.
I want to know what you think.

The "ascii2dec" source code is not the final source.
The final source will classify characters as 1-byte or 2-byte.

"ascii2dec" does not work for 2-byte characters,
so it is intended as a way to identify the 2-byte characters.

Frank P. Westlake

Jan 22, 2014, 8:58:23 AM

On 01/22/2014 04:59 AM, mokomoji wrote:
> my original..souce code
> set "asccode=abcdefghijklmnopqrstuvwxyz~!@#$%^*()_+`1234567890-=\][';/.,?:{}"

> "ASCII 1BYTE" and "2BYTE special character" <-- i make it..;;

> "ascii2dec code souce" is not the final source.
> Final source of "1byte 2byte" would classify characters.
>
> "ascii2dec" is "2byte" characters so that they are not used
> "2byte" is intended to classify the characters.

I can't tell what you want to convert:

Unicode (2-byte) -> ASCII (1-byte)
Unicode (2-byte) -> Multi-byte (UTF-8, 1-4-byte)
ASCII (1-byte) -> Multi-byte (UTF-8, 1-4-byte)
ASCII (1-byte) -> Unicode (2-byte)
Multi-byte (UTF-8, 1-4-byte) -> ASCII (1-byte)
Multi-byte (UTF-8, 1-4-byte) -> Unicode (2-byte)

Maybe this will help: a script that converts Unicode values to multi-byte (UTF-8) characters.
WARNING -- WORD WRAP! There are two spaces before each line of script.

:: BEGIN SCRIPT :::::::::::::::::::::::::::::::::::::
:: UtoUTF8.cmd
:: Unicode value to UTF-8 character
:: Frank P. Westlake, 2009-07-20
:: The following presentation has been formatted to fit your screen.
@Echo OFF
SetLocal ENABLEEXTENSIONS

For /F "tokens=2 delims=:" %%a in ('CHCP') Do Set /A "CP=%%a"
If /I "%1" EQU "/?" (
Echo.Converts a Unicode value to a UTF-8 multi-byte character.
Echo.
Echo. %0 [/VAR=variable_name] /CI or U-value [U-value] [U-value] [...]
Echo.
Echo. U-Values
Echo. The Unicode values either in hexadecimal (with the
Echo prefix '0x'^) or in decimal.
Echo.
Echo. /CI Print or set the three character content indicator
Echo. (ï»¿^) which often begins a UTF-8 File to indicate
Echo. its content.
Echo.
Echo. /VAR=variable_name
Echo. Optional name of a variable to set with the result.
Echo.
Echo.If a variable name is not given the UTF-8 characters will be
Echo.echoed to standard output.
Echo.
Echo.Example (copy and paste into the console window^):
Echo. Echo OFF
Echo. %0 0x20AC /var=Euro
Echo. %0 0x201C /var=LQ
Echo. %0 0x201D /var=RQ
Echo. %0 0x2192 /var=ARROW
Echo. %0 /var=Russian 1047 1072 1088 1077 1075 1080 1089 1090
Echo. Echo %%LQ%%The cost is 14 %%Euro%%.%%RQ%% ^>%%TEMP%%\file.txt
Echo. Echo Russian %%ARROW%% %%RUSSIAN%% ^>^>%%TEMP%%\file.txt
Echo. CHCP 65001 ^>NUL:^&CMD /CTYPE %%TEMP%%\file.txt^&CHCP %CP% ^>NUL:
Echo. ERASE %%TEMP%%\file.txt
Echo. Echo ON
Echo. ::
Goto :EOF
)

Set "HighBit=€ ‚ƒ„…†‡ˆ‰Š‹Œ Ž ‘’“”•–—˜™š›œ žŸ ¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüý"
Set "VAR="&Set "Answer="
:Loop
If /I "%1" EQU "/CI" (
Set "Answer=%Answer%ï»¿"
SHIFT
If "%2" NEQ "" Goto :Loop
Goto :EOF
)
If /I "%1" EQU "/VAR" (
Set "VAR=%2"
SHIFT
SHIFT
If "%3" NEQ "" Goto :Loop
Goto :Return
)
Set "byte1=0"&Set "byte2="&Set "byte3="&Set "byte4="
Set /A U=%1
Set /A T=%U%
If %U% LSS 0x80 (Echo.%0: Use ASCII characters directly. >&2 & Goto :EOF)
If %U% GTR 0x10FFFF (Echo.%0: Illegal character in UTF-8. >&2 & Goto :EOF)

If %U% GEQ 0x010000 Set /A "byte4=T&0x3F, T>>=6, byte1|=0x10"
If %U% GEQ 0x000800 Set /A "byte3=T&0x3F, T>>=6, byte1|=0x20"
If %U% GEQ 0x000080 Set /A "byte2=T&0x3F, T>>=6, byte1|=0x40"
Set /A "byte1|=T"

Call Set "byte1=%%HighBit:~%byte1%,1%%"
If DEFINED byte2 Call Set "byte2=%%HighBit:~%byte2%,1%%"
If DEFINED byte3 Call Set "byte3=%%HighBit:~%byte3%,1%%"
If DEFINED byte4 Call Set "byte4=%%HighBit:~%byte4%,1%%"

Set "Answer=%Answer%%byte1%%byte2%%byte3%%byte4%"
SHIFT
If "%1" NEQ "" Goto :Loop

:Return
IF DEFINED VAR (
EndLocal
Set "%VAR%=%Answer%"
) Else (
Echo.%Answer%
)
Goto :EOF
:: END SCRIPT :::::::::::::::::::::::::::::::::::::::
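
The bit arithmetic in the script above (take 6 bits per trailing byte, put the remaining bits and the length marker into the first byte) can be sketched in Python for readers who want to check values by hand. This is an illustration, not a port of UtoUTF8.cmd: the function name `utf8_encode` is hypothetical, and unlike the script it also accepts ASCII and rejects surrogate values.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 using explicit bit operations."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {code_point:#x}")
    if code_point < 0x80:        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


if __name__ == "__main__":
    # 0x20AC is the Euro sign used in the script's help text.
    print(utf8_encode(0x20AC).hex())
```

The result can be compared against Python's own encoder, for example `chr(0x20AC).encode("utf-8")`.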

Herbert Kleebauer

Jan 22, 2014, 11:18:29 AM

On 22.01.2014 14:58, Frank P. Westlake wrote:

>> "ASCII 1BYTE" and "2BYTE special character" <-- i make it..;;
>
>> "ascii2dec code souce" is not the final source.

> I can't tell what you want to convert:

I think he is not asking for help but he wants to
show us his code which solves a problem. But I
don't understand what the problem is.

JJ

Jan 22, 2014, 12:38:25 PM

Don't use batch file to directly read/parse binary files.
Use VBScript instead.

Frank P. Westlake

Jan 22, 2014, 2:33:02 PM

On 01/21/2014 01:18 AM, mokomoji wrote:

https://ko.wikipedia.org/wiki/UTF-8

You need to find bytes greater than 127 (ASCII > 127). If a byte is 128
or higher (BYTE >= 128) then it is part of a multi-byte character. In
UTF-8, BYTE 1 (0xC2 through 0xF4) tells you how many bytes total for the
multi-byte character, and bytes 0x80 through 0xBF are continuation bytes.

This script at the bottom of this message uses DEBUG but maybe it will
help. Here is the part which uses the BYTE 1 to determine what to do:

For /F "skip=1 tokens=1,2 delims=: " %%a in (
'FC /b %FileIn% %TmpFile%.fc') Do (
Set /A "byte=0x%%b"
If !byte! LSS 0x80 ( REM ASCII, Byte 1 of 1
Set /A "U=byte, b=1, n=1"
) Else If !byte! LSS 0xC0 ( REM Byte 2, 3 or 4
Set /A "U<<=6, byte&=0x3F, U|=byte, b+=1"
) Else If !byte! LSS 0xC2 ( REM Overlong
Echo.%Me%: Aborting. Overlong encoding of ASCII character at %%ah. >&2
Exit /B 1
) Else If !byte! LSS 0xE0 ( REM Byte 1 of 2
Set /A "U=byte&0x1F, n=2, b=1"
) Else If !byte! LSS 0xF0 ( REM Byte 1 of 3
Set /A "U=byte&0x0F, n=3, b=1"
) Else If !byte! LSS 0xF5 ( REM Byte 1 of 4
Set /A "U=byte&0x07, n=4, b=1"
) Else ( REM Restricted or undefined.
Echo.%Me%: Aborting. Restricted or undefined character at %%ah. >&2
Exit /B 2
)
If !b! EQU !n! (
If !U! GTR 0xFFFF (
Set /A "U-=0x10000, UL=U&0x3FF, UL|=0xDC00"
Set /A "U>>=10, U&=0x3FF, U|= 0xD800"

Set /A "L=U&0x00FF, U>>=8, n=0"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!U! >>%FileOut% 2>NUL:

Set /A "L=UL&0x00FF, UL>>=8"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!UL! >>%FileOut% 2>NUL:
) Else (
Set /A "L=U&0x00FF, U>>=8, n=0"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!U! >>%FileOut% 2>NUL:
)
)
)
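
The byte ranges tested in the excerpt above can be written as one small function. The following is a Python sketch (not batch) that uses the same thresholds; the name `utf8_sequence_length` and the convention of returning 0 for a continuation byte are assumptions made for this illustration.

```python
def utf8_sequence_length(byte: int) -> int:
    """Classify one UTF-8 byte.

    Returns the total length of the sequence if the byte is a first byte,
    or 0 if it is a continuation byte (byte 2, 3 or 4 of a sequence).
    """
    if byte < 0x80:
        return 1          # ASCII, byte 1 of 1
    if byte < 0xC0:
        return 0          # continuation byte
    if byte < 0xC2:
        raise ValueError("overlong encoding of an ASCII character")
    if byte < 0xE0:
        return 2          # byte 1 of 2
    if byte < 0xF0:
        return 3          # byte 1 of 3
    if byte < 0xF5:
        return 4          # byte 1 of 4
    raise ValueError("restricted or undefined byte")


if __name__ == "__main__":
    data = "a€".encode("utf-8")
    print([utf8_sequence_length(b) for b in data])
```

Counting the nonzero results gives the number of characters, and any byte with a result other than 1 belongs to a multi-byte character.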

Frank

:: BEGIN FILE ::::::::::::::::::::::::::::::::::::::::::::::::::::
:: UTF8to16.cmd
:: Write a UTF-16 file from a UTF-8 file.
:: Frank P. Westlake, 2009-07-27
@Echo OFF
SetLocal ENABLEEXTENSIONS ENABLEDELAYEDEXPANSION

If /I "%1" EQU "/?" (

Echo.Writes UTF-16LE from a UTF-8 file.
Echo.
Echo. %0 filein fileout
Echo.
Echo. filein Name of the existing UTF-8 file.
Echo. fileout Name of the new UTF-16 file.
Echo.
Echo.Example:
Echo. %0 UTF8.txt UTF16.txt
Goto :EOF
)
Set "Me=%~n0"
Set "FileIn="
Set "FileOut="
:: Alterable environment:
Set "MyDir=%temp%\ASCII"
Set "TmpFile=%TEMP%\%Me%"
:: End alterable environment
If "%1" EQU ":WriteBinaryFiles" (Shift & Goto :WriteBinaryFiles)
:args
If /I "%1" EQU "/NOCI" (
Set "CI="
Shift
) Else If DEFINED FileOut (
Echo.%Me%: Too many filenames. >&2
Goto :EOF
) Else If DEFINED FileIn (
Set "FileOut=%1"
Shift
) Else (
Set "FileIn=%1"
Shift
)
IF "%1" NEQ "" Goto :args
If "%FileIn%" EQU "" (
Set /P "FileIn=%Me%: Please enter the name of the existing UTF-8 file: " >&2
)
If "%FileOut%" EQU "" (
Set /P "FileOut=%Me%: Please enter the name of the new UTF-16 file: " >&2
)
If "%FileIn%" EQU "" (Echo.%Me%: Aborting. Need input filename. >&2 & Goto :EOF)
If "%FileOut%" EQU "" (Echo.%Me%: Aborting. Need output filename. >&2 & Goto :EOF)
For %%f in (%FileIn%) Do (Set "FileIn=%%~ff" & Set "fs=%%~zf")
For %%f in (%FileOut%) Do (Set "FileOut=%%~ff")
Set "TmpFile=%TEMP%\%~n0.tmp"
Set "HX=0123456789ABCDEF"
Start "" /wait /MIN %Me% :WriteBinaryFiles %MyDir%
ChDir /d %MyDir%
Set "FSUtil=1"
If NOT EXIST FSUTIL.EXE (
For %%f in (FSUTIL.EXE) Do (
If NOT EXIST %%~$PATH:f (
Set "FSUtil="
)
)
)
If DEFINED FSUTIL (
FSUtil FILE CREATENEW %TmpFile%.fc %fs% >NUL:
) Else (
TYPE NUL: >%TmpFile%.fc
For /L %%i in (1 1 %fs%) Do TYPE ASCII00.0 >>%TmpFile%.fc
)
Set /a b=-1, U=0, n=0
Type NUL: >%FileOut%
For /F "skip=1 tokens=1,2 delims=: " %%a in (
'FC /b %FileIn% %TmpFile%.fc') Do (
Set /A "byte=0x%%b"
If !byte! LSS 0x80 ( REM ASCII, Byte 1 of 1
Set /A "U=byte, b=1, n=1"
) Else If !byte! LSS 0xC0 ( REM Byte 2, 3 or 4
Set /A "U<<=6, byte&=0x3F, U|=byte, b+=1"
) Else If !byte! LSS 0xC2 ( REM Overlong
Echo.%Me%: Aborting. Overlong encoding of ASCII character at %%ah. >&2
Exit /B 1
) Else If !byte! LSS 0xE0 ( REM Byte 1 of 2
Set /A "U=byte&0x1F, n=2, b=1"
) Else If !byte! LSS 0xF0 ( REM Byte 1 of 3
Set /A "U=byte&0x0F, n=3, b=1"
) Else If !byte! LSS 0xF5 ( REM Byte 1 of 4
Set /A "U=byte&0x07, n=4, b=1"
) Else ( REM Restricted or undefined.
Echo.%Me%: Aborting. Restricted or undefined character at %%ah. >&2
Exit /B 2
)
If !b! EQU !n! (
If !U! GTR 0xFFFF (
Set /A "U-=0x10000, UL=U&0x3FF, UL|=0xDC00"
Set /A "U>>=10, U&=0x3FF, U|= 0xD800"

Set /A "L=U&0x00FF, U>>=8, n=0"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!U! >>%FileOut% 2>NUL:

Set /A "L=UL&0x00FF, UL>>=8"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!UL! >>%FileOut% 2>NUL:
) Else (
Set /A "L=U&0x00FF, U>>=8, n=0"
TYPE ASCII??.!L! >>%FileOut% 2>NUL:
TYPE ASCII??.!U! >>%FileOut% 2>NUL:
)
)
)
For %%x in (fc) Do Erase %TmpFile%.%%x
Goto :EOF
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:WriteBinaryFiles path
SetLocal
MkDir %MyDir% >NUL: 2>&1
ChDir /d %MyDir%
FOR /L %%i in (0 1 0xFF) Do If NOT EXIST ASCII*.%%i (
Set /A "h1=(%%i&0xF0)>>4, h2=(%%i&0x0F)"
Call Set "h=%%HX:~!h1!,1%%%%HX:~!h2!,1%%"
(
CALL Echo N ascii%%h%%.%%i
CALL Echo E 0000 %%h%%
Echo R CX
Echo 1
Echo W 0
Echo Q
) | DEBUG >NUL:
)
Exit
Goto :EOF
:: END FILE ::::::::::::::::::::::::::::::::::::::::::::::::::::::
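
The output side of the script above (split each value into a low and a high byte, and use a surrogate pair for values above 0xFFFF) can be sketched in Python as follows. This is an illustration of the arithmetic only, not a replacement for UTF8to16.cmd; the function name `utf16le_encode` is hypothetical.

```python
def utf16le_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-16 little-endian bytes."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {code_point:#x}")
    if code_point > 0xFFFF:
        # Surrogate pair: subtract 0x10000, then split into two 10-bit halves.
        value = code_point - 0x10000
        units = [0xD800 | (value >> 10), 0xDC00 | (value & 0x3FF)]
    else:
        units = [code_point]
    out = bytearray()
    for unit in units:
        out.append(unit & 0xFF)   # low byte first (little-endian)
        out.append(unit >> 8)     # then high byte
    return bytes(out)


if __name__ == "__main__":
    print(utf16le_encode(0x1F600).hex())
```

The low-byte-first order matches the order in which the script appends its `ASCII??.!L!` and `ASCII??.!U!` files.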

frank.w...@gmail.com

Jan 22, 2014, 2:49:22 PM

From "Frank P. Westlake" :
>This script at the bottom of this message uses DEBUG ...

DEBUG can be replaced by CERTUTIL, but I can't do that
without a Windows computer. It can also be replaced by
CSCRIPT but I think you don't want either CSCRIPT or
DEBUG.

Maybe you don't want CERTUTIL either; maybe you want to
find bytes > 127 using 'IF "%BYTE%' EQU'. I don't think
it can be done, but I am not certain I recall that
correctly.

Frank

mokomoji

Jan 22, 2014, 3:11:07 PM

http://msdn.microsoft.com/en-US/goglobal/cc305154.aspx
chcp 949 xp sp2
41 = U+0041 : LATIN CAPITAL LETTER A
42 = U+0042 : LATIN CAPITAL LETTER B
43 = U+0043 : LATIN CAPITAL LETTER C
44 = U+0044 : LATIN CAPITAL LETTER D
45 = U+0045 : LATIN CAPITAL LETTER E
46 = U+0046 : LATIN CAPITAL LETTER F
47 = U+0047 : LATIN CAPITAL LETTER G

u'a ascii code chcp437
65 = A
97 = a
http://blog.naver.com/mokomoji/130066940193

mokomoji

Jan 22, 2014, 3:17:09 PM

SBCS (Single Byte Character Set) Codepages

1250 (Central Europe)
1251 (Cyrillic)
1252 (Latin I)
1253 (Greek)
1254 (Turkish)
1255 (Hebrew)
1256 (Arabic)
1257 (Baltic)
1258 (Vietnam)
874 (Thai)

DBCS (Double Byte Character Set) Codepages

In these graphical representations, leadbytes are indicated by light gray background shading. Each of these leadbytes hyperlinks to a new page showing the 256 character block associated with that leadbyte. Unused leadbytes are identified by a darker gray background.

932 (Japanese Shift-JIS)
936 (Simplified Chinese GBK)
949 (Korean)
950 (Traditional Chinese Big5)

http://msdn.microsoft.com/en-us/goglobal/bb964654

Sorry, I am not a native English speaker.
Let me leave Korean aside.
Think of Chinese or Japanese characters instead.
In cmd mode, Chinese and Japanese characters are not Unicode.
A separate 2-byte character set is applied.
Numbers and English letters are ASCII codes.
However, the characters of each country use 2 bytes.

I want a parser that splits English text into words at spaces
and splits Chinese or Japanese text into individual characters.

Example:
input
一本语 japanese japanese 一本语一本语

output
一 ^ 本 ^ 语 ^ ^ japanese ^ ^ japanese ^ ^ 一 ^ 本 ^ 语 ^ 一 ^ 本 ^ 语
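
A minimal sketch of such a parser, in Python rather than batch, might look like this. It assumes the input has already been decoded to text and that every character outside 7-bit ASCII should become its own token; the function name `tokenize` is made up for the example, and spaces are kept as separate tokens to match the output shown above.

```python
def tokenize(text: str) -> list[str]:
    """Split text into ASCII words, single spaces, and single non-ASCII characters."""
    tokens: list[str] = []
    word = ""
    for ch in text:
        if ch != " " and ord(ch) < 0x80:
            word += ch                 # still inside an ASCII word
        else:
            if word:
                tokens.append(word)    # the ASCII word ends here
                word = ""
            tokens.append(ch)          # a space or one 2-byte character
    if word:
        tokens.append(word)
    return tokens


if __name__ == "__main__":
    tokens = tokenize("一本语 japanese japanese 一本语一本语")
    # ascii() keeps the output printable on any console code page.
    print(ascii(tokens))
```

Joining the tokens with " ^ " gives a line like the expected output, with each space appearing as a token of its own between two carets.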

One thing you have to understand:
you all seem to think this is Unicode.
The text is written as a mix of 1-byte and 2-byte characters.

The file is not Unicode; it is an ASCII-code (code page) document.

frank.w...@gmail.com

Jan 22, 2014, 4:25:47 PM

From mokomoji :

>SBCS (Single Byte Character Set)

>DBCS (Double Byte Character Set)

>http://msdn.microsoft.com/en-us/goglobal/bb964654

OK, I understand that now. Those old code pages were not
absorbed into Unicode as I thought; they bear little
resemblance to Unicode. And being English I never
looked at DBCS to see what it really is -- I assumed it
was always 16-bit Unicode (UTF-16). Good thing I'm not a
programmer!

It seems to me that the only way for you to get the
decimal value of an ASCII character is with one of
these:

FC
DEBUG
CSCRIPT
CERTUTIL

Those procedures should be given by Dr. Kleebauer or
someone else here -- not by me. Do you want one of those
procedures?

Am I still wrong?

Frank

mokomoji

Jan 23, 2014, 10:20:13 AM

As I said, MS949.

949 is called the MS949 code in Korea.
It is an oddity: it is not an international standard
but a code made by Microsoft.

I think the basic character sets for cmd around the world
are the ones on the website below:
http://msdn.microsoft.com/en-us/goglobal/bb964654

The code that we use is not Unicode.
I think it is a code used only by Microsoft.

Who is Dr. Kleebauer?

Thank you for listening to my questions.

Frank P. Westlake

Jan 23, 2014, 10:26:19 AM

On 01/23/2014 07:20 AM, mokomoji wrote:
> Thank you for having to listen to my questions

I'm sorry I wasn't able to help.

Frank

Herbert Kleebauer

Jan 23, 2014, 12:08:30 PM

On 23.01.2014 16:20, mokomoji wrote:
> i said ms949...
>
> 949 -> So called in Korea. - MS949 CODE
> Is a freak. International standards are not
> freak code is used. Microsoft Company
>
> I think. The basic character of the world cmd
> the website below
> http://msdn.microsoft.com/en-us/goglobal/bb964654
>
> The code that we use, not unicode
> I think the only code of Microsoft Corporation

I think the problem is, that we don't understand what
exactly your problem is. So maybe you can answer a few
questions:

--
Do you need a solution only for your PC (Windows XP, 32 bit,
Korean version) or for different operating system versions
(XP, Vista, Win7, Win8, 32 and 64 bit versions)?

--
You have a batch variable which contains a text string
encoded in the Double Byte Character Codepage 949
(if the 8th bit of a byte is 0, then the byte represents
the ascii character, if the 8th bit is 1 then the remaining
7 bits are a pointer to a code table and the next byte is
an index into this code table).

How is this text string generated? Is it user input, the
output of a command, or read in from a file?

What exactly do you want to do with this text string?

Because maybe nobody here has a Korean cmd version: what
happens when you extract a single character from the string
using the form: %var:~0,1%
Do you get the two bytes representing the first character or
only the first byte?

Maybe it is easier to echo the string in a temporary file
and process it as a binary file. This way it wouldn't matter
whether the cmd is a Korean or English version.
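
Processing the string as raw bytes could look like the following Python sketch (not batch). It assumes the bytes are in code page 949 and treats 0x81 through 0xFE as lead bytes, which is a slightly narrower test than "8th bit set"; the function name `split_cp949_bytes` is hypothetical.

```python
def split_cp949_bytes(data: bytes) -> list[bytes]:
    """Split CP949-encoded bytes into 1-byte and 2-byte character chunks."""
    chunks: list[bytes] = []
    i = 0
    while i < len(data):
        # Assumption: 0x81..0xFE marks the first byte of a 2-byte character.
        if 0x81 <= data[i] <= 0xFE and i + 1 < len(data):
            chunks.append(data[i:i + 2])
            i += 2
        else:
            chunks.append(data[i:i + 1])
            i += 1
    return chunks


if __name__ == "__main__":
    raw = "adlk 우리 나라".encode("cp949")
    for chunk in split_cp949_bytes(raw):
        print(chunk.hex(), len(chunk), "byte(s)")
```

Because the loop looks only at byte values, the result does not depend on whether it runs on a Korean or an English system.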

mokomoji

Jan 29, 2014, 12:05:41 PM

Thank you