Is there anybody with Warp 3 and codepage 850, 437 who has found a
solution?
Thanks,
A
Yet that is what you have to do. The HTML entity &eacute; is the only
thing guaranteed to be translated as "e accent aigu" by all the operating
systems. If you put in character 130, which is é in cp 850/437, and upload
the page, your FTP program should automatically translate it to character
233, which is é in Latin-1. Your Web browser, if it does not translate back,
will see 233 and interpret it as character 233 in cp 850, that is Ú. On a
Macintosh it would show up as È and on NeXT as Ø.
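The byte values above can be checked with a short Python sketch. It assumes that Python's built-in `cp850`, `cp437`, `latin-1` and `mac_roman` codecs match the code pages being discussed; NeXT is left out because, as far as I know, the standard library ships no codec for it.

```python
# Character 130 is e-acute in both OS/2 code pages mentioned in the thread.
raw = bytes([130])
assert raw.decode("cp850") == "é"
assert raw.decode("cp437") == "é"

# What a translating transfer program does: cp850 text becomes Latin-1 bytes.
uploaded = raw.decode("cp850").encode("latin-1")
print(uploaded[0])                    # 233, the byte stored on the server

# What an application shows if it reads that byte with the wrong table.
print(uploaded.decode("cp850"))       # Ú
print(uploaded.decode("mac_roman"))   # È
```

The same byte is only "e-acute" once both sides agree on the table, which is why the entity form is the safe choice inside the HTML source.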
Pierre
--
Pierre Jelenc | The Spring Gig: Friday May 8th
Home Office Records | - Lorijo Manley - Trina Hamlin -
http://www.web-ho.com/ | - Polygraph Lounge - Homer Erotic -
|at CB's Gallery, 313 Bowery, New York City
How must one set up one's system so that all the applications (EPM,
Netscape and also the Windows editors) obey the same rule?
Only for a final result and as a presentation to the outside world,
the special ASCII signs should be translated to the special HTML code
with & and the ;.
A
When you edit text written to ISO 8859-1, CP 1004 will get it right.
Make a small command file to start the editor in that codepage.
It only needs to contain
CHCP 1004
EPM %1 %2 %3 %4 %5 %6 %7 %8 %9
and can be used as your standard text editor in, for instance, Netscape.
Netscape does not use Latin-1 in OS/2, it uses whatever code page you have
installed. The HTML code that you get from the net assumes Latin-1, but
the communication applications translate automatically back and forth
completely transparently.
I am connected to my Unix account via Kermit, right now. If I type ALT-130
I will see e-acute on my screen, yet if I use a binary editor on the file
I am writing, it will tell me that I typed character 233. It's all done in
the background:
OS/2       ===Kermit===>      Unix      ===Kermit===>     OS/2
chr. 130   cp850 => Latin-1   chr. 233  Latin-1 => cp850  chr. 130
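The round trip in the diagram can be sketched in Python. The function names `to_unix` and `to_os2` are hypothetical, and this only imitates the effect of the translation with Python's `cp850` and `latin-1` codecs; it makes no claim about how Kermit implements it.

```python
def to_unix(data: bytes) -> bytes:
    """OS/2 -> Unix leg: re-encode cp850 bytes as Latin-1 bytes."""
    return data.decode("cp850").encode("latin-1")


def to_os2(data: bytes) -> bytes:
    """Unix -> OS/2 leg: the inverse translation."""
    return data.decode("latin-1").encode("cp850")


typed = bytes([130])        # ALT-130 on the OS/2 keyboard
stored = to_unix(typed)     # what a binary editor on the Unix side shows
echoed = to_os2(stored)     # what comes back to the OS/2 screen
print(typed[0], stored[0], echoed[0])   # 130 233 130
```

Characters that exist in only one of the two sets (the cp850 box-drawing characters, for example) would raise an error in this sketch; a real transfer program has to pick a substitute for them.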
Pierre
--
Tired of TV reruns? Help is on the way!
New York City | Home Office
Beer Guide | Records
http://www.nycbeer.org/ | http://www.web-ho.com/
For HTML, the only possible encoding is ISO Latin 1. This is defined
as such in the HTML specification. Thus, Netscape has no choice in
this matter. They must interpret every document as a Latin 1 document,
unless the document uses a "charset" command to override this.
There's no such thing as "higher ASCII signs" when it comes to HTML
documents.
> How must one set up one's system so that all the applications -EPM,
> Netscape and also the windows editors obey the same rule?
You could change to a codepage that matches ISO Latin 1, although I
have no idea which. Normally I always write my documents in the
ASCII character set; for accented characters I always use the SGML
entities (the thingies like &amp;, &eacute; or &agrave;).
> Only for a final result and as a presentation to the outside world,
> the special ASCII signs should be translated to the special HTML code
> with & and the ;.
They're actually called "entities" and are an SGML invention. They
usually refer to characters that aren't in the ASCII character set,
although there are also entities for characters that are (like &amp;
for ampersand).
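For anyone who would rather type the accented characters and convert afterwards, here is a minimal Python sketch. The helper `to_entities` is a hypothetical name; it relies on the standard library's `html.entities.codepoint2name` table and leaves alone any character that has no named entity there.

```python
import html
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace each non-ASCII character that has a named entity by &name;."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append(f"&{name};")
        else:
            out.append(ch)
    return "".join(out)


encoded = to_entities("café à la carte")
print(encoded)                  # caf&eacute; &agrave; la carte
print(html.unescape(encoded))   # café à la carte
```

The result is plain ASCII, so it survives any code page translation on the way to the server.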
--
\/ Arnoud "Galactus" Engelfriet - gala...@stack.nl This space
5th year Business & Computing Science student left blank
URL: http://www.stack.nl/~galactus/ PGP: 0x461A1A35 intentionally.
In my not so humble opinion this thread is a big pro for the BeOS and
any upcoming JavaOS. The BeOS already (in this early stage of
development) has Unicode support throughout the system, and an
upcoming JavaOS probably will too, since the latest JDKs use it.
Many languages have alphabets with more letters than English,
including (from personal usage) Swedish, Norwegian, Danish and German.
It's not acceptable to have to write everyday words in these and other
languages using "entities". Imagine having to write the simple verb
"is" as "&auml;r" instead of "är", as in Swedish.
When it comes to IBM-ASCII (8-bit) and the so-called multilingual
codepage 850 (which is the OS/2 default codepage for the U.S. and most
European countries, including Sweden), the last letters of the Swedish
alphabet (Å, Ä and Ö) are not ordered as they appear in the alphabet.
This means, among other things, that you can't use simple string
comparison ("strcmp" in the C programming language) to sort words,
addresses, names etc. (fortunately OS/2 takes care of this problem when
you use its Presentation Manager and system controls [like drop-down
lists]). The unnecessary mess is caused by American/English ignorance
of the world around them, pushed onto the rest of us with a
great deal of arrogance.
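The ordering complaint can be reproduced with a small Python sketch. It sorts by the raw cp850 byte values, which is roughly what a plain strcmp would do on such text; the word list is made up for the illustration.

```python
words = ["Ö", "Å", "Ä", "Z"]

# Sort by the cp850 byte values, as a byte-wise string comparison would.
by_bytes = sorted(words, key=lambda w: w.encode("cp850"))

for w in by_bytes:
    print(w, w.encode("cp850")[0])
# Z 90
# Ä 142
# Å 143
# Ö 153
```

In the Swedish alphabet the order is Å, Ä, Ö, so the byte order swaps the first two.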
I had to write this because I hate to see good people behaving
stupidly and I love good technology like the BeOS (an extremely
promising new OS developed by the small American Be corporation),
which solves the above mess and adds great speed to multimedia
compared to systems based on old technology like Unix, Windows and
even OS/2.
For more info on the BeOS follow this link: http://www.be.com As you
can read there, its hardware support is still quite limited, but as
soon as possible I will replace my current NT4 installation with this
truly new-technology OS. Everyone who has registered (or will register)
my shareware app WarpCalc (currently only available for OS/2) and is
interested in a BeOS version will get one free of charge.
I will continue using and developing for OS/2 as long as it's working
and there's nothing better around. The emotionally crucial point for
me is the long overdue release of IBM's development tool VisualAge
for C++ version 4 (the current version is very old and not
up-to-date with C++ standards). If it comes out this year and is as
good as it looks, I think many more will start or at least continue
developing for OS/2.
Magnus
-
Magnus Olsson
mag...@ibm.net
Author of WarpCalc
Well, FWIW, NT has had Unicode support from day one, too. Does this mean
you are using it? :-)
>When it comes to IBM-ASCII (8-bit) and the so called multilingual
>codepage 850 (which is the OS/2 default codepage for the U.S. and most
>European countries, including Sweden) the last letters of the swedish
>alphabet (Å, Ä, and Ö) are not ordered as they appear in the alphabet.
>This means among other things that you can't use simple string
>comparison ("strcmp" in the C programming language) to sort words,
>addresses, names etc (fortunately OS/2 takes care of this problem when
>using its presentation manager and system controls [like drop down
>lists]). The unnecessary mess is caused by american/english ignorance
>of the world around them and pushing it onto the rest of us with a
>great deal of arrogance.
It's not that simple. The problem is that sorting order is
country-dependent. A simple example: in French, one expects to see,
for example, 'ö' between 'o' and 'p'. But in Icelandic, you expect to
see 'ö' after 'y'. Unicode does not (and cannot) solve this.
The solution here is I18N (internationalization). It is provided by
OS/2, Java, NT, etc. But it is up to the developer to use it.
Look for collation and such.
--
Martin Lafaix <laf...@ibm.net>
Team OS/2
http://www.mygale.org/~lafaix
> In article <F0bNi6i4K1KK-p...@slip139-92-64-20.st.se.ibm.net>,
> mag...@ibm.net (Magnus Olsson) wrote:
> >In my not so humble opinion this thread is a big pro for the BeOS and
> >any upcoming JavaOS. The BeOS already (in this early stage of
> >development) has unicode support through-out the system and an
> >upcoming JavaOS probably will since the latest JDK:s use it.
>
> Well, FWIW, NT has unicode support from day one, too. Does this mean
> you are using it? :-)
I didn't know that about NT; I was under the impression Unicode
(at least in its current format used by Java and BeOS) postdated NT.
The followup question is a joke, right? :-)
>
> >When it comes to IBM-ASCII (8-bit) and the so called multilingual
> >codepage 850 (which is the OS/2 default codepage for the U.S. and most
> >European countries, including Sweden) the last letters of the swedish
> >alphabet (Å, Ä, and Ö) are not ordered as they appear in the alphabet.
> >This means among other things that you can't use simple string
> >comparison ("strcmp" in the C programming language) to sort words,
> >addresses, names etc (fortunately OS/2 takes care of this problem when
> >using its presentation manager and system controls [like drop down
> >lists]). The unnecessary mess is caused by american/english ignorance
> >of the world around them and pushing it onto the rest of us with a
> >great deal of arrogance.
>
> It's not that simple. The problem is that sorting order is country
> dependant. A simple example: in French, one expect to see for
> example 'ö' between 'o' and 'p'. But in Icelandic, you expect to see
> 'ö' after 'y'. Unicode does not (and cannot) solve this.
>
Of course it's not simple to solve the French problem (sorry, no offence
intended :-). However, in the Swedish example above it would be very
simple, since no other language uses these letters together anyway (it
wouldn't break support of any other language to simply order them
as they appear in the Swedish alphabet).
> The solution here is I18N (internationalization). It is provided by
> OS/2, Java, NT, etc. But it is up to the developper to use it.
That's the problem (especially for American/English developers). With
BeOS (and Java too?), Unicode solves the problem described in this
thread (letters appearing in different locations in different
codepages). It should be just as easy to target, for example, German as
to target English. This could easily have been considered by those
responsible for IBM-ASCII and ANSI C, and solved with a good and simple
8-bit code page and a decent C standard library to go with it.
One big advantage of the 16-bit Unicode is that it also covers Asia,
which is where most people live anyway. In addition to Unicode
throughout the system, BeOS also has an object-oriented API and a
flexible and responsive (even during disk I/O) GUI. Since OS/2 is
geared towards big corporations and is the most mature PC OS, and BeOS
is a new and evolving system geared towards home and small businesses
with heavy multimedia demands, these two systems look like an ideal
pair for my computer.
>
> Look for collation and such.
Thanks, but it seems it's the Americans who need to learn about this
the most; just about everyone else knows about it all too well.
> --
> Martin Lafaix <laf...@ibm.net>
> Team OS/2
> http://www.mygale.org/~lafaix
-
Well, the problem is not unique to France :-) For example, in my
Merriam-Webster's dictionary, Anders Jonas Ångström stands between Sir
Norman Angell and Anna Ivanovna. If it were after Huldrych
Zwingli, quite a few would miss it. It's a US dictionary,
>It should be just as easy to target for example German as
>to target English. This could have been easily considered by those
>responsible for IBM-ASCII and ANSI C, solved with a good and simple
>8-bit code page and a decent C standard library to go with it.
Well, this problem is solved now. Just use the locale-sensitive
functions when sorting/formatting text/numbers. They are available on
OS/2, Java, NT, Unixes, ... Locale-sensitive sort is much better than
using the individual characters' ordinal values, anyway (what about the
special ordering for ij in Dutch or ll in Spanish or ...).
There's no such thing as a 'one true ordering' which suits everybody's
needs.
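To make the "no one true ordering" point concrete, here is a minimal Python sketch with two hand-written alphabet tables. The tables, the `make_key` helper and the name list are assumptions made for the illustration (the second table follows the description given elsewhere in this thread, with Ä filed under A and Å last); a real program would call the platform's locale-sensitive functions instead.

```python
def make_key(alphabet, fold=None):
    """Return a sort key ranking letters by their position in `alphabet`.

    `fold` maps letters treated as mere variants onto their base letter.
    """
    fold = fold or {}
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        letters = [fold.get(ch, ch) for ch in word.lower()]
        # Letters missing from the table sort after all known letters.
        return [rank.get(ch, len(alphabet)) for ch in letters]

    return key


# Hypothetical tables, hand-written for this example only.
swedish = make_key("abcdefghijklmnopqrstuvwxyzåäö")
norwegian_style = make_key("abcdefghijklmnopqrstuvwxyzæøå", fold={"ä": "a"})

names = ["Åberg", "Zetterberg", "Ärlig", "Andersson"]
print(sorted(names, key=swedish))
# ['Andersson', 'Zetterberg', 'Åberg', 'Ärlig']
print(sorted(names, key=norwegian_style))
# ['Andersson', 'Ärlig', 'Zetterberg', 'Åberg']
```

The same four names come out in two different orders, and neither order can be obtained from a single fixed arrangement of the character set.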
> In article <F0bNi6i4K1KK-pn2-DuuU7dY6dqVc@localhost>,
> mag...@ibm.net (Magnus Olsson) wrote:
> >> It's not that simple. The problem is that sorting order is country
> >> dependant. A simple example: in French, one expect to see for
> >> example 'ö' between 'o' and 'p'. But in Icelandic, you expect to see
> >> 'ö' after 'y'. Unicode does not (and cannot) solve this.
> >>
> >Ofcourse it's not simple to solve the French problem (sorry no offence
> >intended:-). However, in the Swedish example above it would be very
> >simple since no other language uses these letters together anyway (it
> >wouldn't break support of any other language by simply ordering them
> >as they appear in the Swedish alphabet).
>
> Well, the problem is not unique to France :-) For example, in my
> Meriam-Webster's dictionary, Anders Jonas Ångström stands between Sir
> Norman Angell and Anna Ivanovna. If it was after Huldrych
> Zwingli, quite a few would miss it. It's an US dictionary,
I appreciate the elaboration, but my point is that even if you can
*only* solve the Swedish sorting easily, it's worth doing (all that's
needed is Å before Ä instead of Ä before Å). But I feel I've written
more than enough about this now. It was only secondary to the problem
of exchanging text internationally, which BeOS solves through the use
of Unicode, which also adds mathematical symbols and the scientific
symbol for ohm, and btw Å for Ångström :-).
> >It should be just as easy to target for example German as
> >to target English. This could have been easily considered by those
> >responsible for IBM-ASCII and ANSI C, solved with a good and simple
> >8-bit code page and a decent C standard library to go with it.
>
> Well, this problem is solved now. Just use the locale-sensitive
> functions when sorting/formatting text/numbers. They are available on
> OS/2, Java, NT, Unixes, ... Locale-sensitive sort is much better than
> using the individual characters ordinal values, anyway (what about the
> special ordering for ij in dutch or ll in spanish or ...).
I don't know about sorting in Dutch or Spanish, but I'm pretty sure I
understand your point. I'm always (doesn't say a lot yet :-) using
locale-sensitive sorting (you may remember I wrote earlier that OS/2
solved this with its standard controls for lists).
This just proves that the problem is, indeed, unsolvable by just arranging
the character set in a particular way. The Swedes and Finns are the only
people I know of who would want to have them in that order. You need
not go further than Norway and Denmark to find people who prefer their
sorting sequence with Ä after A, but with Å at the very end. The ISO
registry of national encodings for character sets has hundreds of
examples of this kind of thing.
Everything can work fine provided the programmer does not live in the
US, and luckily most software development has been moved to other,
more cosmopolitan parts of the world like Hungary and India. And
even in the US, some programmers seem to be catching on.
Or even German: "ae" (two characters) sorts equal to "a"-umlaut (one
character).
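A sketch of that German rule in Python: the key expands each umlaut into its two-letter spelling before comparing. The `EXPANSIONS` table, the `phonebook_key` name and the sample names are assumptions for the illustration, not taken from any standard library.

```python
# Hypothetical expansion table; only the "ä" entry comes from the post above.
EXPANSIONS = {"ä": "ae", "ö": "oe", "ü": "ue"}


def phonebook_key(word: str) -> str:
    """Expand each umlaut into its two-letter spelling before comparing."""
    return "".join(EXPANSIONS.get(ch, ch) for ch in word.lower())


print(phonebook_key("Bäcker") == phonebook_key("Baecker"))   # True
print(sorted(["Bad", "Bäcker", "Baecker", "Bach"], key=phonebook_key))
# ['Bach', 'Bad', 'Bäcker', 'Baecker']
```

One character comparing equal to two is something no arrangement of single byte values can express.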
>> Well, FWIW, NT has unicode support from day one, too. Does this mean
>> you are using it? :-)
>I didn't know that about nt, I was under the impression unicode
>(atleast in its current format used by Java and BeOS) postdated nt.
NT has always included full Unicode support, from its first version and
all the way down through the APIs, the kernel and the file systems.
I think NT was the first operating system in the world to fully support
Unicode (though I'm not 100% sure).
Maybe someone else can confirm/deny this?
Besides NT, its mini version CE, JavaOS and BeOS, Apple's next-generation
OS Rhapsody will also have full support for Unicode.
In NT they have supported Unicode in TrueType fonts by using the WGL4 set,
but the next font standard from Adobe/Microsoft (who are merging the
technologies from TT and PS Type 1 fonts), OpenType, will have native
support and be built on the Unicode standard.
For more info on Unicode and its specifics see http://www.unicode.org.
Best regards,
m a r t i n | n
--
Martin Nisshagen ICQ UIN: 689662 __O verdi +
MTS Technology, Sweden -\<, callas =
martin-at-mts-se (MIME 1.0) PGP 5.5: 0x45D423AC (·)/(·) 100% pleasure
>It's not that simple. The problem is that sorting order is country
>dependant. A simple example: in French, one expect to see for
>example 'ö' between 'o' and 'p'. But in Icelandic, you expect to see
>'ö' after 'y'. Unicode does not (and cannot) solve this.
>
>The solution here is I18N (internationalization). It is provided by
>OS/2, Java, NT, etc. But it is up to the developper to use it.
Good point.
Unicode also wasn't designed to solve this problem and you shouldn't rely on
it for that purpose either (it's designed to provide *one* single unified
standard for systems with 16-bit character support, while still being
backwards compatible with older 7-bit and 8-bit standards).
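The backwards compatibility can be checked in a few lines of Python. This is only a sketch using Python's built-in `ascii`, `latin-1` and `cp850` codecs; it shows that 7-bit ASCII and ISO 8859-1 map straight onto the first Unicode code points, while other 8-bit code pages need a lookup table.

```python
# The first 128 Unicode code points coincide with 7-bit ASCII ...
assert all(bytes([i]).decode("ascii") == chr(i) for i in range(128))

# ... and the first 256 coincide with ISO 8859-1 (Latin-1).
assert all(bytes([i]).decode("latin-1") == chr(i) for i in range(256))

# Other 8-bit code pages need a real lookup: cp850 byte 130 is U+00E9.
e_acute = bytes([130]).decode("cp850")
print(hex(ord(e_acute)))   # 0xe9
```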
> In article <F0bNi6i4K1KK-pn2-BRXt51doWDJ3@localhost>,
> mag...@ibm.net (Magnus Olsson) wrote:
> |I appreciate the elaboration, but my point is that even if you *only*
> |can solve the Swedish sorting easily it's worth doing (all that's
> |needed is Å before Ä instead of Ä before Å)
>
> This just proves that the problem is, indeed, unsolvable by just arranging
> the character set in a particular way. The Swedes and Finns are the only
> people I know of who would want to have them in that order.
Å and Ä are only used together in the Swedish alphabet (therefore no
problem for any other language). But thanks to this thread I'm now
fully aware of sorting problems in other languages as well. I can add
to this the sorting order used in German, where Ä is listed under
A and Ö under O (Å is unused). It seems that, though using ordinal
values for sorting isn't generally workable, the C language assumes it
is. It works only in English and it could easily have worked in
Swedish as well. Sweden has many faults, but the alphabet sure isn't
one of them. Thanks to everyone for all input.
> In article <6iqb18$i...@mtts01.Teleglobe.net>,
> am...@nonsense.dds.nl (Amon, Take out the NONSENSE in your reply) wrote:
> |I don't seem to get it right. A character directly imported from an
> |HTML-page in Netscape with i.e. an "e accent aigu" is translated as
> |something like an " U with a ' " and vice versa by all of the other
> |OS/2 applications.
> ...|
> |Is there anybody with Warp 3 and codepage 850, 437 who has found an
> |solution?
> |
> The best way is to change to codepage 850, 1004. CP 437 has been
> obsolete for quite a few years now, and any codepage-specific text
> which relies on it should have been converted to CP 850.
>
> When you edit text written to ISO 8859-1, CP 1004 will get it right.
> Make a small command file to start the editor in that codepage.
> It only needs to contain
> CHCP 1004
> EPM %1 %2 %3 %4 %5 %6 %7 %8 %9
> and can be used as your standard text editor in for instance Netscape.
That's a really nice feature of the Mister ED programmer's editor,
that it has a built-in toggle to/from CP1004, which it calls the
"Windows Codepage." Any file that has to read properly under Windows
will definitely present fewer surprises if it is converted and saved
in CP1004.
Buddy Donnelly
csolid @ ibm.net
I guess your confusion stems from the fact that, while Ä is seen as
a separate alphabetic letter in Swedish, it is only considered an
accented letter in Norwegian. It is used, just like á and à, but
it is quite rare. In any case, with free movement of people in Europe,
somebody with any of those national characters in their name might
move temporarily or permanently to any country. Because of that,
we even need to handle Ñ and Ç. It is a common fallacy among many
software developers to think that it is not common to write words
and sentences from different languages in the same document, as
when you quote an original text or refer to names of people and places.
It would probably not be a great hardship if my encyclopedia lacked
a reference for "Eläköön" or "Älvsborg", but I'm happy to say it has them.
They appear where I would expect them in a Norwegian encyclopedia
rather than where you would want them in a Swedish one.
Vermo's suggestion seemed very useful for the lazy and stingy person
that I am: codepage 850, 1004
CHCP 1004
EPM %1 %2 %3 %4 %5 %6 %7 %8 %9
Alas, during boot up Warp III doesn't recognise / accept the part with
1004
I can imagine that in the near future letters by snailmail will be
rare and the fax will become obsolete. All exchange of information
that needs a special layout will be done in HTML or its
successor. No more different RTF, Word, WordPro or WordPerfect
formats, but all in a universal HTML.
Therefore we must be able to cut and paste from browser to text
editor, from resources on the Web to my document. I haven't tried it
to any great extent under Windows, but there it seemed to work
reasonably faultlessly. Why isn't this possible between Netscape/2 and a
native OS/2 editor?
There seems to be something basically wrong with OS/2.
Amon
|Vermo's suggestion seemed very useful for the lazy and stingy person
|that I am: codepage 850, 1004
|CHCP 1004
|EPM %1 %2 %3 %4 %5 %6 %7 %8 %9
|
|Alas, during boot up Warp III doesn't recognise / accept the part with
|1004
|
v>Maybe you have not enabled 1004 in your config.sys?
v>CODEPAGE=850,1004
v>should do it.
Unless a fixpack later than #17 added more codepage support,
codepage 1004 is not supported in Warp 3.
From the online docs:
CODEPAGE Command: xxx and yyy Parameters.
Specifies the primary code page (xxx) and the
secondary code page (yyy).
The OS/2 operating system supports these code
pages:
437 U.S.
850 Multilingual
852 Latin 2 (Czechoslovakia, Hungary, Poland, Yugoslavia)
857 Turkish
860 Portuguese
861 Iceland
862 Hebrew-speaking
863 Canada (French-speaking)
864 Arabic-speaking
865 Nordic
936 People's Republic of China
942 Japanese SAA*
944 Korean SAA
948 Taiwan
You need a late fixpack.
> Therefore we must be able to cut and paste from browser to text
> editor, from resources on the Web to my document. I haven't tried it
> in great extent under Windows only, but there it seemed te work
> reasonably faultless. Why isn't this possible between Netscape/2 and a
> native OS/2 editor?
It works perfectly well. I don't know what you're doing wrong, but there
is absolutely no problem doing a cut & paste from Netscape to any OS/2
editor.
It is not supported as the (only) codepage in fullscreen or AVIO, but it
is supported as a codepage that can be switched into at need in PM, and
does not need CONFIG.SYS to do it.
The switching method for EPM needs the codepage to be in the CONFIG.SYS,
because it takes place in a VIO session which starts EPM. According to the
author, EPM does not support codepage switching but will use the codepage
in effect when it is started. CP 1004 has been supported as an additional
codepage for a long time - it is in my documentation for the OS/2 1.2
Developer Toolkit - but it seems some of the support was not included
in the standard release. In that case, the newer fixpacks should do it.
>Amon, Take out the NONSENSE in your reply <am...@nonsense.dds.nl> writes:
>>
>> Alas, during boot up Warp III doesn't recognise / accept the part with
>> 1004
>You need a late fixpack.
You're right!
Thanks to you and all others for their advice.
A