special characters in html (&#162)

Unknown

May 30, 1996, 3:00:00 AM

What is the name for the table of special characters and symbols
used to get the copyright symbol, small 1/2, etc.?

Do all browsers support the use of &#___ ?

Is it font specific ?

tim glaesemann
t...@circa.com


Alan J. Flavell

May 30, 1996, 3:00:00 AM

On Thu, 30 May 1996, it was written:

> What is the name for the table of special characters and symbols
> used to get the copyright symbol, small 1/2, etc.

I don't exactly understand your question, but there are two sections
at the end of the HTML2.0 spec, see:

http://www.w3.org/pub/WWW/MarkUp/html-spec/html-spec_toc.html

which will help you. Section 13 shows the numerical values; section
14 shows named entities. (These are Appendix A and B in an earlier
printed version of the spec.)

Beware, though, that although pretty well all browsers support the
accented letters as named entities, the remaining named entities are not
supported by all browsers yet.

Note that section 13 is part of the standard, whereas section 14 is only
"proposed". In fact, coverage for the entity names of the Latin-1
accented letters was already normal by that time. Many browser makers
buckled down and supported the rest of the proposal, but Netscape only got
around to it with their ATLAS version, and MS IE 2.0 does not support the
missing ones either.

The safest authoring strategy at this time IMO is:

- For the HTML metachars < > & (and " where needed), use
&lt; &gt; &amp; (and &quot;)

- For Latin-1 characters, e.g. accented letters, use the name, e.g. &eacute;

- For other entities, use the &#number; representation

Copyright (circle-C) and Registered (circle-R) are also well
covered by now (&copy; &reg;), as is the no-break space (&nbsp;)
if you aren't too worried about older browser versions.
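
The three-tier strategy above can be sketched in Python. This is only an illustration, not anything from the HTML 2.0 spec: the function name `entify` and the choice of which names count as "well covered" (the Latin-1 letters plus &copy;, &reg; and &nbsp;) are assumptions made for the example.

```python
from html.entities import codepoint2name

# Metacharacters that are always written as named entities.
METACHARS = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;"}

# Assumption: the "well covered" names are the Latin-1 letters
# (U+00C0 to U+00FF, minus the multiplication and division signs)
# plus no-break space, copyright and registered.
SAFE_NAMED = {cp for cp in range(0xC0, 0x100) if cp not in (0xD7, 0xF7)}
SAFE_NAMED |= {0xA0, 0xA9, 0xAE}


def entify(text: str) -> str:
    """Rewrite text as 7-bit ASCII following the three-tier strategy."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in METACHARS:
            out.append(METACHARS[ch])               # tier 1: metacharacters
        elif cp < 128:
            out.append(ch)                          # plain ASCII is left alone
        elif cp in SAFE_NAMED:
            out.append(f"&{codepoint2name[cp]};")   # tier 2: well-covered names
        else:
            out.append(f"&#{cp};")                  # tier 3: numeric reference
    return "".join(out)


print(entify("caf\u00e9 <\u00bd> \u00a9"))
```

With these assumptions the accented letter and the copyright sign come out as names, while the one-half sign falls back to its number.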

For more discussion than you probably want at this stage, you can
consult my briefing and report:

http://ppewww.ph.gla.ac.uk/%7Eflavell/iso8859/

> Do all browsers support the use of &#___ ?

Not "all browsers", no, but all browsers that people are likely
to be using today will indeed honour &#number; representation -
it is a requirement of the HTML2.0 standard.

Please terminate your &-entities with a semicolon - it isn't
always essential, but it is never wrong, and it's good practice
to always add the semicolon IMHO.
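
To see why the semicolon matters, here is a small sketch using Python's standard `html.unescape`. That function follows a modern parsing algorithm rather than any 1996 browser, so treat it as an illustration of the ambiguity only: when a digit follows an unterminated numeric reference, the digit is absorbed into the number.

```python
import html

# Terminated: the reference ends at the semicolon, so the "1" stays text.
terminated = html.unescape("&#233;1")

# Unterminated: the trailing "1" is read as part of the number (2331).
unterminated = html.unescape("&#2331")

print(repr(terminated))
print(terminated == unterminated)
```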

> Is it font specific ?

In theory no: the numbered points mean what they mean, regardless
of any details of the browser configuration.
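
As a rough check of what "the numbered points mean what they mean" implies, the sketch below compares a few &#number; references with the ISO-8859-1 byte of the same value. It uses Python's standard `html` module, and the three sample numbers (cent, copyright, one-half) are chosen only for illustration.

```python
import html

# Sample numbers: 162 (cent sign), 169 (copyright), 189 (one half).
for number in (162, 169, 189):
    from_reference = html.unescape(f"&#{number};")
    from_latin1 = bytes([number]).decode("iso-8859-1")
    # Both routes name the same character, with no font involved.
    assert from_reference == from_latin1 == chr(number)
    print(number, from_reference)
```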

In practice you may find that browsers don't behave that way when
non-standard fonts are selected. All Mac-standard fonts are
non-standard in this sense, unfortunately, and will mis-represent
at least fourteen characters of the repertoire. Of course, in
this mode the browser is no longer in compliance with the HTML2.0
standard - but some communities of readers have agreed amongst
themselves on particular conventions, as well they might on the
"World" wide web, while waiting for a standardised way to do what
they need.

Some browsers deliberately support other character sets (e.g.
Hebrew or Cyrillic) by font tricks, but they are then not compliant
with the HTML2.0 standard. There is an official definition (ISO-10646)
of how to represent other characters, and HTML intends to use
that standard, but browser coverage for this has not really been
rolled out yet.

best regards


Heikki Kantola

Jun 9, 1996, 3:00:00 AM

Alan J. Flavell <fla...@mail.cern.ch> provided the following
for the eyes of comp.infosystems.www.authoring.html:

>The safest authoring strategy at this time IMO is:
>
>- For the HTML metachars < > & (and " where needed), use
> &lt; &gt; &amp; (and &quot;)
>
>- For Latin-1 e.g accented letters, use the name e.g &eacute;

Here I beg to have a different opinion: I prefer using the ISO Latin 1
characters as is if possible (granted that getting those characters out of
some keyboards might be a bit hard) because it makes the code much more
readable.

>- For other entities, use the &#number; representation

--
Heikki "Hezu" Kantola, <Heikki....@IKI.FI>
By sending advertisements or other unsolicited e-mail to the above
address you agree to pay FIM 500 per commenced hour for proofreading
services.


Dave the Magni

Jun 10, 1996, 3:00:00 AM

Last I read, hkan...@cc.Helsinki.FI (Heikki Kantola) opined:

>Alan J. Flavell <fla...@mail.cern.ch> provided the following
>for the eyes of comp.infosystems.www.authoring.html:
>>The safest authoring strategy at this time IMO is:
>>
>>- For the HTML metachars < > & (and " where needed), use
>> &lt; &gt; &amp; (and &quot;)
>>
>>- For Latin-1 e.g accented letters, use the name e.g &eacute;
>
>Here I beg to have a different opinion: I prefer using the ISO Latin 1
>characters as is if possible (granted that getting those characters out of
>some keyboards might be a bit hard) because it makes the code much more
>readable.

You're welcome to your different working style, of course, but there's a
small advantage to using the name: If a browser does not have the glyph
available, it may simply display the source. If you enter the
characters directly, they may show up on the display as mysterious
numbers which might not make any sense to the reader, but if you write
the name it will suggest your intention much more clearly.

--
dar...@tezcat.com
http://www.tezcat.com/~darsal/
<blink><font color="#ff0000">12:00</font></blink>

Alan J. Flavell

Jun 12, 1996, 3:00:00 AM

On 9 Jun 1996, Heikki Kantola wrote:

> Alan J. Flavell <fla...@mail.cern.ch> provided the following

> >- For Latin-1 e.g accented letters, use the name e.g &eacute;


>
> Here I beg to have a different opinion: I prefer using the ISO Latin 1
> characters as is if possible

You're right - my answer was too brief. I already have a longer
discussion of this in my character code briefing - where I made a point
of saying that I was chiefly addressing readers who work in an
English-speaking environment, and it would be understandable that other authors
might reach different conclusions. The issues are briefly these...

In an English-speaking environment, many authors cannot remember how
to key in the accented letters (yes, I am speaking for myself also).
So the &entity; mechanism is useful to them anyway.

In a language environment where accented letters are the norm, this
will presumably not be a problem, so I would not argue with you
doing whatever seems natural to you.

A standards-compliant browser will display all three representations
(8-bit characters, &entityname; , &#number; ) correctly, if you
send it out correctly.
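
A quick way to convince yourself that the three representations are equivalent is to decode them with a standards-following parser. The sketch below uses Python's standard `html.unescape` as a stand-in for such a browser, which is an assumption about tooling and not something any browser of the day would run.

```python
import html

# The same letter written three ways: raw 8-bit character,
# named entity, and numeric reference.
forms = ["\u00e9", "&eacute;", "&#233;"]

decoded = [html.unescape(form) for form in forms]
print(decoded)
print(len(set(decoded)) == 1)  # all three forms decode to one character
```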

However, when you come to transfer files (see for example this one:

http://ppewww.ph.gla.ac.uk/~flavell/bahn320/bahn320.txt

which the author's shareware licensing conditions demand should be
provided unaltered), you have the problem of incompatible storage codes.
The author prepared that document in a DOS code page, and it cannot be
read properly when treated as ISO-8859-1. However, as an HTML file it
could have been "entified" and in that way expressed in purely 7-bit
US-ASCII, and represented by the same bit patterns irrespective of
whether it is stored on DOS, Mac, unix, or even in KOI8-R...
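
The storage-code problem, and the way entifying avoids it, can be sketched as follows. The sample word and the helper name `to_ascii_entities` are made up for the illustration, and Python's `cp437` codec stands in for "a DOS code page" (the actual code page of the file mentioned above is not stated).

```python
from html.entities import codepoint2name


def to_ascii_entities(text: str) -> str:
    """Replace every non-ASCII character with an entity or numeric reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


word = "M\u00fcnchen"  # hypothetical sample text with one accented letter

# Stored under a DOS code page, then wrongly read as ISO-8859-1:
dos_bytes = word.encode("cp437")
misread = dos_bytes.decode("iso-8859-1")
print(misread == word)  # False: the accented letter did not survive

# Entified, the text is pure 7-bit ASCII and the bytes are identical
# under every one of these storage codes.
entified = to_ascii_entities(word)
encodings = ["ascii", "cp437", "iso-8859-1", "mac_roman", "koi8_r"]
byte_forms = {entified.encode(enc) for enc in encodings}
print(entified, len(byte_forms) == 1)
```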

To sum up again:

- if you can handle accented characters natively, then don't let
me stop you - that hadn't been my intention

- take care when you transfer files, especially when offering files
to people who may be using a different storage code

- when transferring files, the safest procedure is to "entify" the
8-bit characters. But if you get good results without doing that, fine!

best regards


Claudio Estrugo

Jun 12, 1996, 3:00:00 AM

dar...@tezcat.com (Dave the Magni) wrote:


>You're welcome to your different working style, of course, but there's a
>small advantage to using the name: If a browser does not have the glyph
>available, it may simply display the source. If you enter the
>characters directly, they may show up on the display as mysterious
>numbers which might not make any sense to the reader, but if you write
>the name it will suggest your intention much more clearly.

And send flames to Netscape, because it doesn't support &deg; or
&acute;. It sometimes makes my pages look strange...
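
One possible workaround, in line with the "use &#number; for other entities" advice earlier in the thread, is to rewrite the less widely supported names as numeric references before publishing. The following is a sketch only: the function name `named_to_numeric` and the set of names left untouched are assumptions, and the name table comes from Python's standard `html.entities` module, not from any browser.

```python
import re
from html.entities import name2codepoint

# Assumption: the metacharacter names are supported everywhere and stay as-is.
KEEP = {"lt", "gt", "amp", "quot"}


def named_to_numeric(markup: str) -> str:
    """Rewrite known named entities such as &deg; as numeric references."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or always-safe names alone
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, markup)


print(named_to_numeric("25&deg;C &lt; 30&deg;C"))
```

Only semicolon-terminated names are rewritten here, so an unterminated &acute would be left as it stands.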

Micro-BIOS
