(X)HTML entities


Liyang HU

Jan 1, 2008, 3:17:20 AM
to TiddlyWikiDev
Hello. Again. :)

I'd been wondering why my use of &hellip; and &dagger; and other
(X)HTML entities was leading to errors in TW (under Firefox), so I
investigated. Incidentally, this started after I renamed my TW
to .xhtml - it does, after all, claim to be XHTML 1.0 Strict, and I
figure I should probably make my server send out the correct Content-
Type.

As it turns out, XHTML 1.0 only specifies five entities, for <, >, &,
', and ". The rest need to be loaded via separate <!ENTITY ...>
declarations. (See http://www.w3.org/TR/xhtml1/#h-A2 .) The W3C's
XHTML DTDs do actually load them in, and Firefox (at least) seems to
recognise them when parsing static XHTML. But not when it's dynamic
(HTML?) content set with the (apparently non-standard) foo.innerHTML =
...; there, Firefox prefers to tell me to feck off instead.

A bit of creative Googling for "innerHTML innerXHTML" found me this:
http://www.stevetucker.co.uk/page-innerxhtml.php

I tried it, but it merely replaced the original error with the literal
string '&hellip;'. I should have looked at the comments on the above
page first. At least the author is aware of the issue.

I thought someone should know about this. I don't have (the time to
figure out) a fix for it right now. :3

Cheers,
/Liyang
PS: this also affects the -- mdash formatter.
PPS: it would be nicer if mdash took ---, and another ndash took --
instead.
PPPS: I'm a LaTeX user. :)
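
To illustrate the LaTeX-style convention asked for in the PPS, here is a minimal sketch. The function name latexDashes is hypothetical and this is not how TiddlyWiki's formatter is written; it only shows that matching the longer run first lets --- and -- coexist.

```javascript
// Hypothetical helper illustrating the LaTeX dash convention.
// Not TiddlyWiki code: the formatter discussed above maps "--" to an em dash.
function latexDashes(text) {
  // Try "---" before "--" so a triple hyphen is not consumed as "--" plus "-".
  return text.replace(/---|--/g, (run) =>
    run === "---" ? "\u2014" : "\u2013" // U+2014 em dash, U+2013 en dash
  );
}

console.log(latexDashes("pages 10--12 --- roughly"));
```

Single hyphens are left alone, so ordinary hyphenated words are unaffected.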

FND

Jan 3, 2008, 12:06:58 PM
to Tiddly...@googlegroups.com
> this started after I renamed my TW to .xhtml - it does, after all,
> claim to be XHTML 1.0 Strict, and I figure I should probably make my
> server send out the correct Content-Type

Unfortunately, XHTML isn't really well supported by browsers (and IE in
particular). For that reason, XHTML websites are usually served as
"text/html" (AKA "tag soup").
This is also why XHTML should be avoided for regular websites and HTML
be used instead.[1]

However, despite my earlier reservations[2], TiddlyWiki being XHTML
actually makes sense. That's because proper XHTML enables external tools
(like r4tw[3]) to parse TW documents as XML.

As for the original issue, I believe there's nothing to fix because
TiddlyWiki is to be served as HTML, not as XHTML.


-- F.


[1] some background information on this contentious issue:
http://www.webdevout.net/articles/beware-of-xhtml
http://hixie.ch/advocacy/xhtml
http://www.webstandards.org/learn/articles/askw3c/oct2003/
[2] http://trac.tiddlywiki.org/ticket/411
[3] http://randomibis.com/r4tw/

Liyang HU

Jan 3, 2008, 6:05:49 PM
to TiddlyWikiDev
Hi there,

On 3 Jan 2008, at 17:06, FND wrote:
> Unfortunately, XHTML isn't really well supported by browsers (and IE in particular).

I'm not too bothered about IE 6. IE 7 seems to render the tiddly as
XHTML alright.

> This is also why XHTML should be avoided for regular websites and HTML
> be used instead.[1]

I know, but isn't it worse sending XHTML out as text/html? TiddlyWiki
claims to be XHTML.

http://hixie.ch/advocacy/xhtml

I read and pretty much (eventually) agreed with this something like
five years ago, hence http://liyang.hu/ is still HTML 4.01. It's time
for change though.

> As for the original issue, I believe there's nothing to fix because
> TiddlyWiki is to be served as HTML, not as XHTML.

Send it as text/html even though its DOCTYPE says it's XHTML? Now I'm
confused!

That's probably beside the point: I guess all I'm saying is, the use
of the non-standard (but nevertheless-implemented-by-most-browsers)
innerHTML is probably a Bad Idea, particularly in light of the fact
that TiddlyWiki claims to be XHTML. It is after all just a shortcut
and you can get the same behaviour with createElement /
createTextNode.
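
For what it's worth, the createElement / createTextNode route might look roughly like the sketch below. The function name appendParagraph and its doc parameter are made up for illustration and are not TiddlyWiki code; doc stands for anything that offers the standard DOM methods, which in a browser would be document.

```javascript
// A sketch only: `doc` is any object providing the standard DOM methods
// createElement and createTextNode (in a browser, pass `document`).
function appendParagraph(doc, parent, text) {
  const p = doc.createElement("p");
  // createTextNode stores the string literally. No markup or entity parsing
  // takes place, so the text never goes through the XML parser.
  p.appendChild(doc.createTextNode(text));
  parent.appendChild(p);
  return p;
}
```

In a browser this could be called as appendParagraph(document, someElement, "See note \u2020"). Because nothing is parsed, a string such as "&dagger;" would show up as those eight literal characters, so the character itself (or a JavaScript escape like \u2020) has to be passed in.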

Never mind, I can type a fair few of the (X)HTML entities I want
straight from my keyboard anyway. It's just sometimes I can't remember
(or even – figure them out, when I'm not on my own machine) what the
keystrokes for † or ‡ are, say. I might end up just writing a plugin
for the usual LaTeX symbols (e.g. \dagger, \ddagger) I frequently use
and not bother expecting TiddlyWiki to do the Right Thing with (X)HTML
entities.

Cheers,
/Liyang

PS: I'm sure Google Groups will strip out my non-ASCII characters from
the above post. Oh well.

FND

Jan 6, 2008, 4:01:24 AM
to Tiddly...@googlegroups.com
> I'm not too bothered about IE 6. IE 7 seems to render the tiddly as
> XHTML alright.

I'm not sure whether there have been any significant improvements in IE7
with regard to XHTML handling (though I doubt it).
Either way, TiddlyWiki has to care about IE6, since it's still used by a
large number of people (which is unfortunate, of course).

> I know, but isn't it worse sending XHTML out as text/html?

From a pragmatic perspective, no; it's the only way XHTML works
cross-browser.

> the use of the non-standard (but nevertheless-implemented-by-most-browsers)
> innerHTML is probably a Bad Idea [...] It is after all just a shortcut
> and you can get the same behaviour with createElement /
> createTextNode.

I've had the same qualms, and did some thorough research on this issue.
I still think innerHTML is evil, but it's a necessary evil, particularly
in the context of TW (not least because using DOM methods would likely
require a lot more code). I can't go into the details now, but here's a
good (and concise) overview of the issue:
http://www.dustindiaz.com/innerhtml-vs-dom-methods/

> PS: I'm sure Google Groups will strip out my non-ASCII characters from
> the above post. Oh well.

I guess it didn't - yay! :)


-- F.

Jack

Jan 8, 2008, 3:51:09 AM
to TiddlyWikiDev
> As it turns out, XHTML 1.0 only specifies five entities, for <, >, &,
> ', and ". The rest need to be loaded via separate <!ENTITY ...>
> declarations.

Or you can use the numeric equivalents. For &dagger; this would be
&#8224; (&#8225; is &Dagger;, the double dagger).

http://www.w3.org/TR/REC-html40/sgml/entities.html
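
A rough sketch of doing that substitution automatically before the markup reaches an XML parser. The helper name toNumericEntities and the small lookup table are assumptions for illustration, not anything TiddlyWiki ships; the code points come from the entity list linked above, and only a handful are included.

```javascript
// Hypothetical helper, not part of TiddlyWiki: rewrites a few named (X)HTML
// entities as numeric character references, which an XML parser accepts
// without any extra <!ENTITY ...> declarations.
const ENTITY_CODE_POINTS = {
  hellip: 8230, // horizontal ellipsis, U+2026
  dagger: 8224, // dagger, U+2020
  Dagger: 8225, // double dagger, U+2021
  ndash: 8211,  // en dash, U+2013
  mdash: 8212,  // em dash, U+2014
  nbsp: 160,    // no-break space, U+00A0
};

function toNumericEntities(markup) {
  // Names missing from the table (including the five predefined ones:
  // lt, gt, amp, apos and quot) are left untouched.
  return markup.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(ENTITY_CODE_POINTS, name)
      ? `&#${ENTITY_CODE_POINTS[name]};`
      : match
  );
}

console.log(toNumericEntities("Wait&hellip; &dagger; &amp; &Dagger;"));
```

Entity names are case-sensitive, which is why dagger and Dagger get separate entries.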