Re: Mysterious accents problem when using the DIR command...


Herbert Kleebauer

Jul 7, 2005, 3:41:58 AM
KILOWATT wrote:

> I'm generating a directory listing with the following command:
>
> dir /s /b D:\directorytolist >c:\windows\desktop\listedfiles.txt
>
> The resulting file, opened in Notepad, shows that all filenames with their
> full paths are correctly listed. Everything is fine except that if the file
> happens to be larger than the maximum size Notepad can read in Windows98
> (about 64KB ?), it's opened with Wordpad. Then, all the accented characters
> aren't displayed properly. If I change the redirect to produce a file named
> listedfiles.htm instead of listedfiles.txt then I have the same results
> when InternetExplorer displays the file...accented characters are wrong.
> I've tried to send the text file to some friends using Win2000 and XP. They
> can't see the accented characters properly...even in notepad! To add to the
> confusion, on my side... in the DOS box, if I use the following command:
> type c:\windows\desktop\listedfiles.htm to show the content of the htm file
> in the dos box, then the accented characters are displayed properly! Very
> interesting...well...confusing! :-) TIA for any replies!

DOS and Windows use different character encodings. If possible,
use only 7-bit ASCII characters for file names. Otherwise you have
to convert your list file to the Windows encoding. Here is a simple
batch file you can use for this purpose. You can specify the encoding
for any of the 256 bytes by modifying the table below (in the
example below only the German umlauts äÄöÖüÜß are converted).

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
@echo off
echo Bj@jzh`0X-`/PPPPPPa(DE(DM(DO(Dh(Ls(Lu(LX(LeZRR]EEEUYRX2Dx=>_x_.com
echo 0DxFP,0Xx.t0P,=XtGsB4o@$?PIyU WwX0GwUY Wv;ovBX2Gv0ExGIuht6>>_x_.com
echo EB{zFBwxzVxdx@?_k?A_{ok?KuHCrpkO_CA?rE{OICnBCa?NaAG{iQej]t>>_x_.com
echo JtJy\x@x@G~s?GLZp{@x`o_St`H@JbaVjns@Jb@z?H`L?Ks@x[sBa10xxx>>_x_.com

echo 000102030405060708090a0b0c0d0e0f 101112131415161718191a1b1c1d1e1f>>_x_.com
echo 202122232425262728292a2b2c2d2e2f 303132333435363738393a3b3c3d3e3f>>_x_.com
echo 404142434445464748494a4b4c4d4e4f 505152535455565758595a5b5c5d5e5f>>_x_.com
echo 606162636465666768696a6b6c6d6e6f 707172737475767778797a7b7c7d7e7f>>_x_.com
echo 80fc8283e485868788898a8b8c8dc48f 90919293f695969798d6dc9b9c9d9e9f>>_x_.com
echo a0a1a2a3a4a5a6a7a8a9aaabacadaeaf b0b1b2b3b4b5b6b7b8b9babbbcbdbebf>>_x_.com
echo c0c1c2c3c4c5c6c7c8c9cacbcccdcecf d0d1d2d3d4d5d6d7d8d9dadbdcdddedf>>_x_.com
echo e0dfe2e3e4e5e6e7e8e9eaebecedeeef f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff>>_x_.com

:: Umlaut:             AE ae OE oe UE ue ss
:: DOS byte (hex):     8E 84 99 94 9A 81 E1
:: Windows byte (hex): C4 E4 D6 F6 DC FC DF

_x_.com <%1 >%2
del _x_.com
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
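For readers who would rather not run an encoded binary, the table lookup described above can be sketched in Python. This is not a port of the posted _x_.com, only an illustration of the same idea: a 256-entry table that is the identity except for the seven umlaut bytes listed in the batch file's comment. The names UMLAUTS, build_table, dos_to_windows and convert_file are made up for this sketch, and it assumes the DOS side uses code page 437 or 850 and the Windows side code page 1252.

```python
# Sketch only: a Python stand-in for the table lookup, not the posted _x_.com.
# Assumption: input bytes are DOS code page 437/850, output is Windows-1252.

# DOS byte -> Windows byte, copied from the comment table in the batch file.
UMLAUTS = {
    0x8E: 0xC4,  # AE
    0x84: 0xE4,  # ae
    0x99: 0xD6,  # OE
    0x94: 0xF6,  # oe
    0x9A: 0xDC,  # UE
    0x81: 0xFC,  # ue
    0xE1: 0xDF,  # ss
}


def build_table(overrides):
    """Return a 256-byte table: identity, except for the given overrides."""
    table = bytearray(range(256))
    for dos_byte, win_byte in overrides.items():
        table[dos_byte] = win_byte
    return bytes(table)


TABLE = build_table(UMLAUTS)


def dos_to_windows(data):
    """Translate a bytes object through the table, one byte at a time."""
    return data.translate(TABLE)


def convert_file(src_path, dst_path):
    """Read src_path as raw bytes and write the translated bytes to dst_path."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(dos_to_windows(src.read()))
```

Calling convert_file("listedfiles.txt", "listedfiles_win.txt") would play the role of `_x_.com <%1 >%2`; other characters can be handled by adding entries to UMLAUTS.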

Ted Davis

Jul 7, 2005, 9:18:20 AM
On Thu, 07 Jul 2005 09:41:58 +0200, Herbert Kleebauer <kl...@unibwm.de>
wrote:

>DOS and Windows use different character encodings. If possible,
>use only 7-bit ASCII characters for file names. Otherwise you have
>to convert your list file to the Windows encoding. Here is a simple
>batch file you can use for this purpose. You can specify the encoding
>for any of the 256 bytes by modifying the table below (in the
>example below only the German umlauts äÄöÖüÜß are converted).

Aside from the usual cautions about executing unknown binaries found
in newsgroups, it must also be noted that the premise behind this one
is (almost) complete nonsense. Reality is a combination of what
William said and what Todd said.

Notepad is notorious for changing character encodings ... and writing
the changes back to the file without notifying you - there are many
*very* much better text editors available. There is no such thing as
"Windows encoding" - later versions allow Unicode, but otherwise
character coding varies from font to font. This can be very annoying
- Unicode is supposed to take care of the problem, but it is useless
in batch files.

--
T.E.D. (tda...@gearbox.maem.umr.edu)
SPAM filter: Messages to this address *must* contain "T.E.D."
somewhere in the body or they will be automatically rejected.

Herbert Kleebauer

Jul 8, 2005, 2:07:24 PM
Ted Davis wrote:

> it must also be noted that the premise behind this one
> is (almost) complete nonsense.

> There is no such thing as
> "Windows encoding" - later versions allow Unicode, but otherwise
> character coding varies from font to font. This can be very annoying

It doesn't matter what you call things. If you want to store a
character then you have to assign it a binary number. And if
you want an A to be displayed as an A in any text font,
then you have to do this in a consistent way. You will find a
table of the MacRomanEncoding or WinAnsiEncoding in the appendix
of the PDF specification, which you can download from Adobe's
web site. But this WinAnsiEncoding is different from the DOS
code pages. Therefore when you open a text file in Microsoft Word,
you are asked whether you want to open it as "text file" or as
"MSDOS text file". The same conversion which is done by Word
internally you can also do with the posted batch program by
specifying the conversion table. I use it to convert the German
umlauts from DOS to Windows and have never had any problem.
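The difference between the DOS code pages and the Windows encoding can be seen directly with Python's built-in codecs. The sketch below assumes the DOS box uses code page 850 (the thread does not say which one; the German umlauts sit at the same positions in 437) and that "Windows encoding" means code page 1252. The function name recode is made up for this example.

```python
# Sketch under assumptions: DOS side = cp850, Windows side = cp1252.

def recode(data, src="cp850", dst="cp1252"):
    """Decode bytes from the source code page and re-encode for the target.

    Characters with no equivalent in the target (for example the DOS
    box-drawing characters) are replaced by '?' instead of raising an error.
    """
    return data.decode(src).encode(dst, errors="replace")


# The byte DIR writes for a lowercase a-umlaut in a DOS box:
dos_a_umlaut = b"\x84"

# Read as Windows-1252, the same byte is a low double quotation mark,
# which is why the listing looks wrong in Wordpad or Internet Explorer.
misread = dos_a_umlaut.decode("cp1252")

# Converted properly, it becomes the byte Windows expects for a-umlaut.
converted = recode(dos_a_umlaut)
```

Unlike a hand-written table, this covers every character the two code pages share, at the cost of needing a Python installation.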

Norman L. DeForest

Jul 9, 2005, 4:36:48 AM

On Fri, 8 Jul 2005, Herbert Kleebauer wrote:

> Ted Davis wrote:
>
> > it must also be noted that the premise behind this one
> > is (almost) complete nonsense.
>
> > There is no such thing as
> > "Windows encoding" - later versions allow Unicode, but otherwise
> > character coding varies from font to font. This can be very annoying
>
> It doesn't matter what you call things. If you want to store a
> character then you have to assign it a binary number. And if
> you want an A to be displayed as an A in any text font,
> then you have to do this in a consistent way. You will find a
> table of the MacRomanEncoding or WinAnsiEncoding in the appendix
               ^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^
> of the PDF specification, which you can download from Adobe's
> web site. But this WinAnsiEncoding is different from the DOS
> code pages. Therefore when you open a text file in Microsoft Word,
> you are asked whether you want to open it as "text file" or as
> "MSDOS text file". The same conversion which is done by Word
> internally you can also do with the posted batch program by
> specifying the conversion table. I use it to convert the German
> umlauts from DOS to Windows and have never had any problem.

For national character sets, a more complete set of references would be
the files at:
http://ftp.unicode.org/Public/MAPPINGS
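
As a rough illustration of how such mapping files could be turned into a byte conversion table, here is a Python sketch. It assumes the usual layout of those files: one entry per line, the byte value in hex, then the Unicode code point in hex, then a `#` comment, with unassigned bytes having no second column. The sample lines are short hand-written excerpts in that layout, not downloaded data, and the function names are made up for this example.

```python
# Sketch only. Assumed line layout: "0xBB<tab>0xUUUU<tab>#COMMENT".

DOS_SAMPLE = """\
0x41\t0x0041\t#LATIN CAPITAL LETTER A
0x81\t0x00fc\t#LATIN SMALL LETTER U WITH DIAERESIS
0x84\t0x00e4\t#LATIN SMALL LETTER A WITH DIAERESIS
"""

WIN_SAMPLE = """\
0x41\t0x0041\t#LATIN CAPITAL LETTER A
0x81\t      \t#UNDEFINED
0xE4\t0x00E4\t#LATIN SMALL LETTER A WITH DIAERESIS
0xFC\t0x00FC\t#LATIN SMALL LETTER U WITH DIAERESIS
"""


def parse_mapping(text):
    """Return {byte value: Unicode code point} for every assigned byte."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment line, or unassigned byte
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


def build_byte_table(src_map, dst_map):
    """Build a 256-byte table sending each source byte to the target byte
    that encodes the same character; bytes without a match stay unchanged."""
    code_point_to_byte = {cp: b for b, cp in dst_map.items()}
    table = bytearray(range(256))
    for src_byte, code_point in src_map.items():
        if code_point in code_point_to_byte:
            table[src_byte] = code_point_to_byte[code_point]
    return bytes(table)


table = build_byte_table(parse_mapping(DOS_SAMPLE), parse_mapping(WIN_SAMPLE))
```

With the complete mapping files for the two code pages in place of the samples, `data.translate(table)` would convert a whole listing.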

--
Windows is *not* a "Toy OS". A screenshot of my current desktop:
http://www.chebucto.ns.ca/~af380/MyDeskTop-Jun-22-2005.gif
Have your own desktop just like it, includes icons for A to Z and both an
empty and a full toilet: http://www.chebucto.ns.ca/~af380/EtchASketch.zip

