XHTML Strict form attribute values - implications for array notations in PHP and Javascript

Kenn White

Mar 13, 2004, 8:10:01 AM

In trying to migrate a PHP application to XHTML strict on the client
side, I have discovered several interesting data points. Hopefully the
information below will save others time and grief. Have patience, this
is on-topic, I promise...

One of the useful features of PHP is the ability to pass associative
arrays from forms using bracketed notation.

If I have a form, name="f", and, say, an input text box,
name="user_data[Password]", then in Javascript, to reference it I would
do something like:

var foo = f['user_data[Password]'].value;

(As a quick aside, a very nice feature is that multiple selects can be
used by name='user_data[]' where each chosen value is instantiated
server side as a numerically indexed array, but I digress).
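
A rough illustration of the grouping described above, written in JavaScript so that it sits alongside the client-side code in this thread. It is a simplified imitation, not PHP's actual parser: the function name groupBracketedFields is hypothetical, and only plain names, name[] and name[key] (one level deep) are handled.

```javascript
// Simplified imitation of how bracketed field names are grouped server side.
// Input: an array of [name, value] pairs as they would arrive from a form.
function groupBracketedFields(pairs) {
  const result = {};
  for (const [name, value] of pairs) {
    // Matches "base[]" and "base[key]"; anything else is a plain name.
    const match = /^([^\[\]]+)\[([^\[\]]*)\]$/.exec(name);
    if (!match) {
      result[name] = value; // plain name: the last value wins
      continue;
    }
    const base = match[1];
    const key = match[2];
    if (key === '') {
      // "base[]": append to a numerically indexed list
      if (!Array.isArray(result[base])) result[base] = [];
      result[base].push(value);
    } else {
      // "base[key]": store under a string key
      if (typeof result[base] !== 'object' || Array.isArray(result[base])) {
        result[base] = {};
      }
      result[base][key] = value;
    }
  }
  return result;
}
```

Passing the pairs user_data[Password], user_data[Login] and two colors[] entries yields one keyed user_data entry and one two-element colors list.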

Now, say that in making the switch to XHTML strict, I decide to fully
embrace standards compliance, and change my form to id="f", and the
input text box to id="user_data[Password]"

Because these now have id instead of name, I discover that all my
javascript validation routines just broke. It seems that I have to now
change all my js code to something like:

document.getElementById( 'user_data[Password]' ).focus();

I test this on all the major modern browsers, and I'm thinking, Great!
Until I try to validate said page. It turns out that the bracket
characters are invalid in id attributes. Ack! So I read this thread:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&th=78dea36fd65d9bbe&seekm=pqx99.19%24006.13377%40news.ca.inter.net#link11
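
To make the id restriction concrete, here is a small check, assuming the HTML 4.01 rule for ID and NAME tokens (a letter, followed by letters, digits, hyphens, underscores, colons or periods). The pattern is an ASCII approximation, the function name isValidIdToken is hypothetical, and XHTML's XML-based id rule differs in detail, although it rejects brackets as well.

```javascript
// ASCII approximation of the HTML 4.01 ID/NAME token rule:
// a letter, then any number of letters, digits, "-", "_", ":" or ".".
const ID_TOKEN = /^[A-Za-z][A-Za-z0-9_:.-]*$/;

// Returns true when the value could be used as an id under that rule.
function isValidIdToken(value) {
  return ID_TOKEN.test(value);
}
```

Under this rule 'user_data[Password]' fails because of the brackets, while 'user_data_Password' passes.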

What does this mean, I start asking myself? Do I have to abandon my
goal to migrate to XHTML strict? Transitional seems so unsatisfying.
And why bother with a technique that seems to work on most browsers if
it's broken? Fortunately, there is hope.

But then I read http://www.w3.org/TR/xhtml1/#h-4.10 carefully. It says
"name" is deprecated as a form attribute, but *NOT* specifically as an
attribute in form *elements*. It seems my solution is to use "id" for
the form itself, but I can legally use "name" for the individual form
components, such as select and text input boxes. I get the impression
that "name" as an attribute is eventually going away completely, but in
extensive testing using the W3C validator, it passes "name" on form
components, as long as "id" (or, strangely, nothing) is used to denote
the form itself.

So for XHTML strict, the bottom line:
1. form, use id, not name
2. input, use id if you can, but if you need to use bracketed notation
(for example, passing PHP arrays), i.e., foo[], you *MUST* use name for
XHTML strict validation.

Some might argue that, with respect to the W3C recommendations, the
above advice adheres to the "letter of the law" but violates its
spirit. I plead guilty. Then again, I understand that
"eventually" img is being deprecated in favor of object, but I have a
feeling that by the time that move is widely embraced by the browser
market, I should be fine.

Lastly, I am still a bit confused on whether it's "officially" legal
(W3C) for id and name to both be used, and even if it is, whether this
will choke most modern browsers. I welcome your comments.

-kenn

Jukka K. Korpela

Mar 13, 2004, 8:29:13 AM

Kenn White <kennwhite....@hotmail.com> wrote:

> In trying to migrate a PHP application to XHTML strict on the
> client side, I have discovered several interesting data points.
> Hopefully the information below will save others time and grief.

To save time and grief, just don't do XHTML.

But assuming that someone uses a weapon (such as money) to force you
into doing XHTML, you can still survive:

> Now, say that in making the switch to XHTML strict, I decide to
> fully embrace standards compliance, and change my form to id="f",
> and the input text box to id="user_data[Password]"

It seems to be a common misunderstanding that XHTML (in some version)
proclaims the name="..." attribute illegal, or at least deprecated. And
there's just enough truth in this to confuse people. The name="..."
attribute _for the <a> element_ (and some other elements) is being
phased out. Never mind the details now, since this does _not_ and
simply _cannot_ apply to form fields.

The only way to make some data from some element get included into a
form data set is to assign a name and a value to the element, and the
only way to assign a name is to use a name="..." attribute.

You can use an id="..." attribute for e.g. an <input> element as well,
but it will _not_ affect the basic mechanism. You would use it for
other purposes, such as associating <label> with <input>, or when
referring to the field in JavaScript code.
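
A minimal sketch of that mechanism, assuming each control is represented by a plain object (the function name buildFormDataSet and the object shape are hypothetical). It ignores selects, buttons and file inputs, so it is an illustration rather than a full account of form submission.

```javascript
// Builds the list of name/value pairs a form would submit.
// Each control is a plain object standing in for a form field.
function buildFormDataSet(controls) {
  const pairs = [];
  for (const control of controls) {
    if (!control.name) continue;      // no name attribute: never submitted
    if (control.disabled) continue;   // disabled controls are skipped
    const isToggle = control.type === 'radio' || control.type === 'checkbox';
    if (isToggle && !control.checked) continue; // only checked toggles count
    pairs.push([control.name, control.value]);
  }
  return pairs;
}
```

A control carrying only an id contributes nothing, and radio buttons that share one name contribute only the checked button's value.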

> But then I read http://www.w3.org/TR/xhtml1/#h-4.10 carefully. It
> says "name" is deprecated as a form attribute, but *NOT*
> specifically as an attribute in form *elements*.

It is not deprecated at all for form fields.

> in extensive testing using
> the W3C validator, it passes "name" on form components,

Of course.

> as long as
> "id" (or, strangely, nothing) is used to denote the form itself.

A <form> element needs no identifying attribute for its basic job.
If you need to identify it e.g. in client-side scripting, there are
several alternate ways.

> So for XHTML strict, the bottom line:
> 1. form, use id, not name

If you wish to do so. It's the official recommendation in XHTML 1.0,
and causes little harm in most cases.

> 2. input, use id if you can,

No, you need the name="..." attribute in any case (unless you are using
a form for client-side scripting only). You can always use id="..." too
if you like.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Ivo

Mar 13, 2004, 8:46:03 AM

> extensive testing using the W3C validator, it passes "name" on form
> components, as long as "id" (or, strangely, nothing) is used to denote
> the form itself.

What is so strange about that? The "name" attribute on forms (now
deprecated) was included late in HTML3 only to allow authors to access that
form via script. Now "id" has completely taken over that role. Never has
there been any rule which requires any such attribute for forms.

> So for XHTML strict, the bottom line:
> 1. form, use id, not name
> 2. input, use id if you can, but if you need to use bracketed notation
> (for example, passing PHP arrays), i.e., foo[], you *MUST* use name for
> XHTML strict validation.

There is nothing that says "id" should be favored over "name" in form
elements. In fact, for that element to be included as a name-value pair when
the form is submitted, a name attribute is still very much required. Only
when a label is marked up to focus a specific element, an id is needed.
<label for="samp">Click</label><input id="samp"> works.
<label for="samp">Click</label><input name="samp"> will not work.

> Lastly, I am still a bit confused on whether it's "officially" legal
> (W3C) for id and name to both be used, and even if it is, whether this
> will choke most modern browsers. I welcome your comments.

Yes it is, as long as they have the same value:
<input id="samp" name="samp"> is OK
<input id="samp" name="samp2"> is not
This goes for any element that allows id and name attributes, not just form
elements.

Ivo

Andrew Urquhart

Mar 13, 2004, 9:11:38 AM

Ivo wrote:

> Kenn White wrote:
>> Lastly, I am still a bit confused on whether it's "officially" legal
>> (W3C) for id and name to both be used, and even if it is, whether
>> this will choke most modern browsers. I welcome your comments.
>
> Yes it is, as long as they have the same value:
> <input id="samp" name="samp"> is OK
> <input id="samp" name="samp2"> is not
> This goes for any element that allows id and name attributes, not
> just form elements.

Bit puzzled as to how this fits in with radio button form fields:

<label for="stuff_1">One</label>
<input type="radio" name="stuff" id="stuff_1" value="1" />
<label for="stuff_2">Two</label>
<input type="radio" name="stuff" id="stuff_2" value="2" />
<label for="stuff_3">Three</label>
<input type="radio" name="stuff" id="stuff_3" value="3" />

Best,
--
Andrew Urquhart
- Reply: www.andrewu.co.uk/about/contact/

Alan J. Flavell

Mar 13, 2004, 9:31:54 AM

On Sat, 13 Mar 2004, Ivo wrote:

[unattributed quote]

> > Lastly, I am still a bit confused on whether it's "officially" legal
> > (W3C) for id and name to both be used, and even if it is, whether this
> > will choke most modern browsers. I welcome your comments.
>
> Yes it is, as long as they have the same value:
> <input id="samp" name="samp"> is OK
> <input id="samp" name="samp2"> is not

This is rubbish. You seem to be confusing it with the (now
deprecated) use of <a name="foo"> alongside id="foo" to mark anchors
in the document, or to label something that you want to reference from
a script. The purpose of the "name" attribute on form controls is
quite different, and it's by no means deprecated.

> This goes for any element that allows id and name attributes, not
> just form elements.
       ^^^^^^^^^^^^^

No, it doesn't. Think about radio buttons, for example.

The "form" element itself is not a form control, however.

Jim Cochrane

Mar 13, 2004, 2:41:47 PM

In article <Xns94AB9D726D2B...@193.229.0.31>, Jukka K. Korpela wrote:
> Kenn White <kennwhite....@hotmail.com> wrote:
>
>> In trying to migrate a PHP application to XHTML strict on the
>> client side, I have discovered several interesting data points.
>> Hopefully the information below will save others time and grief.
>
> To save time and grief, just don't do XHTML.

What's the reason for this advice - Because currently-used browsers are not
up to handling all of XHTML, or because XHTML is inadequate, or ...?

[Yes, I'm pretty new at this stuff.]

Thanks.
--
Jim Cochrane; j...@dimensional.com
[When responding by email, include the term non-spam in the subject line to
get through my spam filter.]

Darin McGrew

Mar 13, 2004, 3:07:46 PM

Jukka K. Korpela wrote:
>> To save time and grief, just don't do XHTML.

Jim Cochrane <j...@shell.dimensional.com> wrote:
> What's the reason for this advice - Because currently-used browsers are not
> up to handling all of XHTML, or because XHTML is inadequate, or ...?

The short version is:

- A commonly used browser-like OS component (not to mention a number of
browsers) doesn't understand XHTML unless you pretend that it is HTML.

- Pretending that XHTML is HTML involves workarounds to avoid tripping up
browsers that think it really is HTML.

- Why not just send real HTML in the first place?

There's a long version at http://www.hixie.ch/advocacy/xhtml
--
Darin McGrew, da...@TheRallyeClub.org, http://www.TheRallyeClub.org/
A gimmick car rallye is not a race, but a fun puzzle testing your
ability to follow instructions. Upcoming gimmick car rallye in
Silicon Valley: Clue (Saturday, April 3)

Andy Dingley

Mar 13, 2004, 5:29:43 PM

On 13 Mar 2004 12:41:47 -0700, Jim Cochrane
<j...@shell.dimensional.com> wrote:

>> To save time and grief, just don't do XHTML.
>
>What's the reason for this advice

Mainly because Jukka hates XHTML. 8-)

I think XHTML is _wonderful_ stuff and I use it a lot. It's wonderful
for a couple of reasons. Because it's XML, then it's easily parsed and
loaded into a DOM object. With this, I can improve automatic
content-creation processes on the server side. I can also use
client-side features with a DOM to improve usability, especially when
I'm building complex data browsers for interactive client-side
navigation. The code isn't very portable, but then I'm mainly working
on intranets.

Being XML also allows me to use namespacing. Again it's not useful for
much mainstream webbage, but it does have its use.

The downside of XHTML is that it's impossible to use it correctly.
That's it - you just can't. Any real use of it means breaking various
minor aspects of the protocols (read Appendix C of the XHTML TR).

OTOH, no-one cares. Only three people in the world even understand
this problem. Two spend their entire lives on Usenet, the other is
insane and keeps babbling about Schleswig-Holstein. Just ignore
Appendix C, serve it up as text/html and all will be well. Given how
majorly broken most of the web already is, then the "XHTML issue" just
isn't going to keep me awake at night.

XHTML (that XML thing again) is also likely to work better on mobile
phones than HTML does. The proxies that transcode it onto the phone
networks have better reliability for XHTML than SGML-based HTML.

--
Smert' spamionam

Alan J. Flavell

Mar 13, 2004, 6:23:45 PM

On Sat, 13 Mar 2004, Andy Dingley wrote:

> >What's the reason for this advice
>
> Mainly because Jukka hates XHTML. 8-)

I think that's unfair. I've seen Jukka recommending a practical
solution in situations where he clearly doesn't actually like it.
Suggesting that he would base technical advice on a mere personal
whim, even in jest, is pretty close to libellous in my book, I must
say.

> I think XHTML is _wonderful_ stuff and I use it a lot. It's wonderful
> for a couple of reasons. Because it's XML, then it's easily parsed and
> loaded into a DOM object. With this, I can improve automatic
> content-creation processes on the server side. I can also use
> client-side features with a DOM to improve usability, especially when
> I'm building complex data browsers for interactive client-side
> navigation. The code isn't very portable, but then I'm mainly working
> on intranets.

I'm sure you're right about all of that, but, being so versatile,
you're not unaware that it can produce HTML as its end-product, right?

> The downside of XHTML is that it's impossible to use it correctly.
> That's it - you just can't. Any real use of it means breaking various
> minor aspects of the protocols (read Appendix C of the XHTML TR).

You're talking about XHTML/1.0 App C, are you? Appendix C of
XHTML/1.1 is something quite different.

> OTOH, no-one cares. Only three people in the world even understand
> this problem. Two spend their entire lives on Usenet, the other is
> insane and keeps babbling about Schleswig-Holstein.

Oh dear. He seemed coherent enough when I used to converse with him.

> Just ignore Appendix C, serve it up as text/html and all will be
> well.

If you ignore all of it, then I'd surmise it won't be very long at all
before all will *not* be well.

> Given how majorly broken most of the web already is, then the "XHTML
> issue" just isn't going to keep me awake at night.

Given how majorly broken the web already is, I don't see any grounds
for applying what might be the last straw. Particularly as XML-based
software could still produce HTML as final output. I'll wait until
XHTML-based stuff brings tangible benefits on XHTML-based client
agents, and meantime, those tag-soup-slurpers that were designed to
get HTML, I'll carry on sending them HTML, I think.

> XHTML (that XML thing again) is also likely to work better on mobile
> phones than HTML does.

Do servers get an Accept for a specific XHTML content-type? Then you
could consider using server-side negotiation.

Jim Cochrane

Mar 13, 2004, 8:44:01 PM

In article <Pine.LNX.4.53.04...@ppepc56.ph.gla.ac.uk>, Alan J. Flavell wrote:
> On Sat, 13 Mar 2004, Andy Dingley wrote:
>
>> OTOH, no-one cares. Only three people in the world even understand
>> this problem. Two spend their entire lives on Usenet, the other is
>> insane and keeps babbling about Schleswig-Holstein.
>
> Oh dear. He seemed coherent enough when I used to converse with him.

I wonder if I'll ever be able to find out whom you guys are talking about.
BTW, is Alan one of the two who spend their entire lives on usenet? :-]
(I'm only guessing this because I see he posts here a lot, but perhaps he's
just very efficient with his time.)

>> Just ignore Appendix C, serve it up as text/html and all will be
>> well.
>
> If you ignore all of it, then I'd surmise it won't be very long at all
> before all will *not* be well.
>
>> Given how majorly broken most of the web already is, then the "XHTML
>> issue" just isn't going to keep me awake at night.
>
> Given how majorly broken the web already is, I don't see any grounds
> for applying what might be the last straw. Particularly as XML-based
> software could still produce HTML as final output. I'll wait until
> XHTML-based stuff brings tangible benefits on XHTML-based client
> agents, and meantime, those tag-soup-slurpers that were designed to
> get HTML, I'll carry on sending them HTML, I think.

It sounds like, perhaps, the answer to my Q is something like: While xhtml
is useful in certain situations, its usefulness as an improved replacement
for html is yet to be determined and won't be determinable for a number of
years.

Jim Cochrane

Mar 13, 2004, 8:45:07 PM

In article <c2vpmi$cdi$1...@blue.rahul.net>, Darin McGrew wrote:
> Jukka K. Korpela wrote:
>>> To save time and grief, just don't do XHTML.
>
> Jim Cochrane <j...@shell.dimensional.com> wrote:
>> What's the reason for this advice - Because currently-used browsers are not
>> up to handling all of XHTML, or because XHTML is inadequate, or ...?
>
> The short version is:
>
> - A commonly used browser-like OS component (not to mention a number of
> browsers) doesn't understand XHTML unless you pretend that it is HTML.
>
> - Pretending that XHTML is HTML involves workarounds to avoid tripping up
> browsers that think it really is HTML.
>
> - Why not just send real HTML in the first place?
>
> There's a long version at http://www.hixie.ch/advocacy/xhtml

Thanks!

Andy Dingley

Mar 13, 2004, 8:52:58 PM

On Sat, 13 Mar 2004 23:23:45 +0000, "Alan J. Flavell"
<fla...@ph.gla.ac.uk> wrote:

>I think that's unfair. I've seen Jukka recommending a practical
>solution in situations where he clearly doesn't actually like it.

Yes, but never to the extent of liking XHTML.

Anyway the poor bugger hasn't seen daylight in six months. We can
forgive an awful lot for cabin fever.

>I'm sure you're right about all of that, but, being so versatile,
>you're not unaware that it can produce HTML as its end-product, right?

Can it ? <xsl:output method="html" /> isn't the only game in town.

>You're talking about XHTML/1.0 App C, are you?

Of course.

>> Just ignore Appendix C, serve it up as text/html and all will be
>> well.
>
>If you ignore all of it, then I'd surmise it won't be very long at all
>before all will *not* be well.

Of course. I'm assuming familiarity with a lot of c.i.w.a.h past
history. The sticky part is that issue over MIME type, and whether we
should serve XHTML as "a better tag soup", or leave it on the shelf
because we still can't as yet do it "right".

>Do servers get an Accept for a specific XHTML content-type?

No. No one ever got an Accept header that meant anything useful. If
you want a laugh, try reading the list of accept headers for brand-new
G3 mobile phones. image/bmp ? Now _that_ is suffering legacy support
beyond the call of duty.

--
Smert' spamionam

Nick Kew

Mar 14, 2004, 2:00:23 AM

In article <2i175096ocqq3tmla...@4ax.com>,
Andy Dingley <din...@codesmiths.com> writes:

> I think XHTML is _wonderful_ stuff and I use it a lot.

Jolly good for you. I try to remain agnostic on the subject,
except perhaps when someone advocates that ultimate silliness XHTML1.1.

> It's wonderful
> for a couple of reasons.

... which are shared by HTML ...

> Because it's XML, then it's easily parsed and
> loaded into a DOM object.

HTML is also easily parsed and loaded into a DOM object.

> With this, I can improve automatic
> content-creation processes on the server side.

Indeed, we have a range of useful processors from SAX to XSLT. They
work equally on HTML4 or XHTML.

> I can also use
> client-side features with a DOM to improve usability, especially when
> I'm building complex data browsers for interactive client-side
> navigation.

the DOM is required for any nontrivial clientside scripting, innit?

> The code isn't very portable, but then I'm mainly working
> on intranets.

I thought DOM+script was reasonably portable these days ... though
the documentation-black-hole doesn't help with that.

> Being XML also allows me to use namespacing. Again it's not useful for
> much mainstream webbage, but it does have its use.

Absolutely (are you using mod_xmlns or anything comparable)?
Though if Arjun were still with us I daresay he might explain why
SGML Architectural forms do the job better.

> OTOH, no-one cares. Only three people in the world even understand
> this problem.

Dammit, I must be missing something; I see more than three people in
this thread who seem to me to understand it. Anyone who has read
Hixie on the subject should see it.

> Two spend their entire lives on Usenet, the other is
> insane and keeps babbling about Schleswig-Holstein.

So which category does Hixie fall into? I don't recollect either seeing
him on Usenet or hearing him mention any province of Germany.

> XHTML (that XML thing again) is also likely to work better on mobile
> phones than HTML does. The proxies that transcode it onto the phone
> networks have better reliability for XHTML than SGML-based HTML.

I'm getting an entirely different story - though confidentiality
precludes discussing it here. And as for proxies: well, if some of
them perform poorly on HTML, then mine clearly has a competitive
advantage.

--
Nick Kew

Michael Winter

Mar 14, 2004, 7:24:11 AM

On Sat, 13 Mar 2004 08:10:01 -0500, Kenn White
<kennwhite....@hotmail.com> wrote:

[snip]

> If I have a form, name="f", and, say, an input text box,
> name="user_data[Password]", then in Javascript, to reference it I would
> do something like:
>
> var foo = f['user_data[Password]'].value;

[snip]

You mean to say you're writing something like this?

function someFunction() {
  var foo = f['user_data[Password]'].value;
}

Well don't. Mozilla-based browsers, amongst others assuredly, don't
support using element names as global identifiers in scripts. You should,
at least, write:

document.formName.elementName

That was just a general suggestion by the way - it won't actually solve
your problem (I'm getting to that).

Your actual solution lies in the use of the forms collection. With it, you
can reference a form whether it uses a name or an id attribute for
identification.

function someFunction() {
  var foo = document.forms['f'].elements['user_data[Password]'].value;
}

Your document will now validate. You also avoid using
document.getElementById this way, which isn't supported so universally as
the forms collection.
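
A slightly more defensive variant of the lookup above, as a sketch only. The helper name getControlValue is hypothetical, and the document is passed in as a parameter (an assumption made so the function can be exercised outside a browser). It returns null instead of throwing when the forms collection, the form, or the control is missing.

```javascript
// Looks up a control's value through the forms collection.
// doc is expected to behave like a browser document object.
function getControlValue(doc, formKey, controlName) {
  if (!doc || !doc.forms) return null;       // no forms collection available
  const form = doc.forms[formKey];           // works for a form name or id
  if (!form || !form.elements) return null;  // no such form
  const control = form.elements[controlName];
  return control ? control.value : null;     // null when no such control
}
```

In a page this would be called as getControlValue(document, 'f', 'user_data[Password]'). When several controls share one name, as radio buttons do, the elements lookup may return a collection rather than a single control, which this sketch does not handle.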

Mike

--
Michael Winter
M.Wi...@blueyonder.co.invalid (replace ".invalid" with ".uk" to reply)

Bertilo Wennergren

Mar 14, 2004, 7:51:08 AM

Andy Dingley:

> The downside of XHTML is that it's impossible to use it correctly.

Impossible? (I'll give you "difficult", but "impossible"?)

> That's it - you just can't. Any real use of it means breaking various
> minor aspects of the protocols (read Appendix C of the XHTML TR).

Please tell me where my use of XHTML breaks anything:

<URL:http://www.bertilow.com>

If you find any errors, I'll be happy to correct them.

--
Bertilo Wennergren <bert...@gmx.net> <http://www.bertilow.com>

Jukka K. Korpela

Mar 14, 2004, 8:14:26 AM

Bertilo Wennergren <bert...@gmx.net> wrote:

> Please tell me where my use of XHTML breaks anything:
>
> <URL:http://www.bertilow.com>

The server does not send XHTML but an HTML 4.01 version (naturally as
text/html), despite the fact that my current browser specifies

Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-powerpoint, application/vnd.ms-excel,
application/msword, */*

Since the browser thus says that every media type is equally
acceptable, we must deduce that your content negotiation uses a
preference for HTML 4.01 over XHTML.

This is good policy in practice, but you're not really using XHTML. Not
from the viewpoint of the vast majority of users who don't get an XHTML
version, and not as a matter of principle since you definitely give
preference to HTML - presumably you _only_ give XHTML to browsers that
explicitly specify that they prefer XHTML over HTML.

Bertilo Wennergren

Mar 14, 2004, 8:50:15 AM

Jukka K. Korpela:

> Bertilo Wennergren <bert...@gmx.net> wrote:

>> Please tell me where my use of XHTML breaks anything:

>> <URL:http://www.bertilow.com>

> The server does not send XHTML but an HTML 4.01 version (naturally as
> text/html), despite the fact that my current browser specifies

> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> application/vnd.ms-powerpoint, application/vnd.ms-excel,
> application/msword, */*

Indeed. I ignore "*/*" (actually any use of "*" or "?") when I detect
"application/xhtml+xml" in the Accept header.

> Since the browser thus says that every media type is equally
> acceptable, we must deduce that your content negotiation uses a
> preference for HTML 4.01 over XHTML.

In the final comparison of the q values, I actually favour XHTML. If the
q values are equal, I serve XHTML.

> This is good policy in practice, but you're not really using XHTML.

Visit the page with a browser that explicitly says it likes XHTML, and
you'll get XHTML. That's using XHTML, in my opinion.

> Not
> from the viewpoint of the vast majority of users who don't get an XHTML
> version, and not as a matter of principle since you definitely give
> preference to HTML - presumably you _only_ give XHTML to browsers that
> explicitly specify that they prefer XHTML over HTML.

Yes.

I do use XHTML, since some user agents do get it (those that
specifically ask for it). And my use does not break any protocols, as
far as I understand. Or could ignoring "*/*" be an error? I'm not sure
about that.
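
The selection rule described in this post can be sketched roughly as follows. This is a guess at the logic from the description, not the site's actual code: the function names parseAccept and chooseMarkup are hypothetical, wildcards are simply never consulted, and a missing text/html entry is treated as q=0.

```javascript
// Parses an Accept header into { mediaType: q }. Entries without q get 1.
function parseAccept(header) {
  const entries = {};
  for (const part of header.split(',')) {
    const fields = part.trim().split(';');
    const type = fields[0].trim().toLowerCase();
    if (!type) continue;
    let q = 1;
    for (const param of fields.slice(1)) {
      const [key, val] = param.trim().split('=');
      const parsed = parseFloat(val);
      if (key === 'q' && !Number.isNaN(parsed)) q = parsed;
    }
    entries[type] = q;
  }
  return entries;
}

// Serves XHTML only when it is explicitly listed and its q value is at
// least that of text/html; wildcards such as "*/*" are ignored entirely.
function chooseMarkup(acceptHeader) {
  const accepted = parseAccept(acceptHeader);
  const xhtmlQ = accepted['application/xhtml+xml'];
  if (xhtmlQ === undefined || xhtmlQ === 0) return 'text/html';
  const htmlQ = accepted['text/html'] === undefined ? 0 : accepted['text/html'];
  return xhtmlQ >= htmlQ ? 'application/xhtml+xml' : 'text/html';
}
```

With the Accept header quoted above, which lists no XHTML type, this returns text/html. A header that lists application/xhtml+xml with a q value equal to or higher than that of text/html returns application/xhtml+xml.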

Andrew Urquhart

Mar 14, 2004, 8:57:33 AM

Michael Winter wrote:
<post trimmed/>

> Your actual solution lies in the use of the forms collection. With
> it, you can reference a form whether it uses a name or an id
> attribute for identification.
>
> <form ... id="f">
> ...
> <input ... name="user_data[Password]">
> ...
> </form>
>
> function someFunction() {
>   var foo = document.forms['f'].elements['user_data[Password]'].value;
> }
>
> Your document will now validate. You also avoid using
> document.getElementById this way, which isn't supported so
> universally as the forms collection.

More info at http://jibbering.com/faq/#FAQ4_25

Michael Winter

Mar 14, 2004, 9:15:09 AM

On Sun, 14 Mar 2004 13:57:33 -0000, Andrew Urquhart <re...@website.in.sig>
wrote:

> Michael Winter wrote:

[discussion of the document.forms collection]

> More info at http://jibbering.com/faq/#FAQ4_25

I forgot it was in the FAQ. Thank you.

Alan J. Flavell

Mar 14, 2004, 9:15:28 AM

On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

> > Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> > application/vnd.ms-powerpoint, application/vnd.ms-excel,
> > application/msword, */*

> Or could ignoring "*/*" be an error? I'm not sure
> about that.

Technically, I'd say yes, it's wrong. Given a range of possible
choices of content format, a technically correct thing to send, in
response to that Accept header, would be MS Word, in preference to
HTML or XHTML. Which is exactly what I found happening on one
MultiViews site that I had set up (Apache 1.3.something).

AFAIUI, wildcard choices should only be considered if no explicit
match can be found, irrespective of your qs values.

However, when that browser-like object (if it's what I think it is)
does a reload, it only sends "Accept: */*", so at that point - to the
surprise of the user - you would send them your highest qs match
instead, which would (you say) be application/xhtml+xml

Alternatively, you might want to exclude non-WWW-conforming clients
from the negotiation mechanism, by whatever means you choose.

I'd say that pragmatically, what you're doing (I mean, what you say
you're doing - I haven't tried provoking your server) is OK, in the
circumstances. Pity about the bigger picture: it would be nice to
pressure the major vendor to follow the rules of the WWW, but I can't
see any likelihood of it happening.

cheers

Bertilo Wennergren

Mar 14, 2004, 10:35:09 AM

Alan J. Flavell:

> On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

>> > Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
>> > application/vnd.ms-powerpoint, application/vnd.ms-excel,
>> > application/msword, */*

>> Or could ignoring "*/*" be an error? I'm not sure
>> about that.

> Technically, I'd say yes, it's wrong. Given a range of possible
> choices of content format, a technically correct thing to send, in
> response to that Accept header, would be MS Word, in preference to
> HTML or XHTML. Which is exactly what I found happening on one
> MultiViews site that I had set up (Apache 1.3.something).

> AFAIUI, wildcard choices should only be considered if no explicit
> match can be found, irrespective of your qs values.

The effect of what I'm doing, is that I play it safe, serving HTML 4.01,
except when I get an explicit value for "application/xhtml+xml", in
which case I give a slight preference for XHTML 1.1. That can't really
be bad, can it?

> Alternatively, you might want to exclude non-WWW-conforming clients
> from the negotiation mechanism, by whatever means you choose.

I could do that: always playing it safe (serving HTML 4.01) if the user
agent seems to call itself "MSIE". In a way that's what I'm doing, when
I ignore "*/*".

> I'd say that pragmatically, what you're doing (I mean, what you say
> you're doing - I haven't tried provoking your server) is OK, in the
> circumstances. Pity about the bigger picture: it would be nice to
> pressure the major vendor to follow the rules of the WWW,

I sympathize. But getting the content to my users takes preference.

> but I can't see any likelihood of it happening.

--

Kenn White

Mar 14, 2004, 11:07:25 AM

Andrew and Michael, thanks! This was exactly what I was looking for,
especially the mention of the risk of breakage in Mozilla (my
primary browser). My overriding goal is to avoid any browser detect
hacks, and degrade gracefully on non-DOM compliant UAs (so far my app
works on every browser I can get my hands on for all the major
platforms). As with many things, one question leads to many more.

Concerning the FAQ you referred to in the post (below), the second link
is broken, and the third returns raw HTML in Mozilla (ironic, given that
the c.l.j FAQ author berates "crap" PHP programming practices; maybe
he/she should focus on getting their page to actually render...).

Anyway, when read in IE, the specific link at
http://jibbering.com/faq/#FAQ4_25 is wrong, I believe, when stating that
bracket characters "are illegal in the standard (x)HTML doctypes".

The third link (http://jscript.dk/faq/php.asp) quotes the W3C guidance
on basic data types (http://www.w3.org/TR/html401/types.html#type-id)
and concludes that bracket characters are illegal as form component
(specifically select) name attribute values.

If this is true, either the W3C validator is missing it (e.g., for form
input text boxes, name='foo[]' passes XHTML strict), or this is now
allowed in XHTML, or perhaps it is only a restriction on the form name
itself, or...?

Thanks!

-kenn

Alan J. Flavell

Mar 14, 2004, 11:19:19 AM

On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

> > Technically, I'd say yes, it's wrong.
>

> The effect of what I'm doing, is that I play it safe, serving HTML 4.01,
> except when I get an explicit value for "application/xhtml+xml", in
> which case I give a slight preference for XHTML 1.1. That can't really
> be bad, can it?

I thought I'd made my answers plain enough:

- what you're doing seems to be pragmatically OK,

- technically, to the best of my understanding, it's not correct.

Here's how it *should* work, according to the theory AIUI:

- you assign a slightly lower source quality to your XHTML than to
your HTML - it only needs a tiny factor - 0.99 would suffice

- clients which say they accept */* will then get HTML

- clients which accept XHTML but not HTML (with or without */*) will
get XHTML,

- clients which accept HTML but not XHTML will get HTML,

- clients which want XHTML rather than HTML, but which say they accept
either, will have to express a sufficiently higher preference for
XHTML in order to override your source-quality. Here's Mozilla doing
just that:

HTTP_ACCEPT =
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9, [...]

Here's Opera7.23, by contrast, saying that it prefers HTML over XHTML:

HTTP_ACCEPT = text/html, application/xml;q=0.9, application/xhtml+xml;q=0.9
[...]

- which, even under your present arrangements, presumably means that
you're sending Opera the HTML content-type rather than XHTML, no?
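
A minimal sketch of the procedure described above, written in JavaScript only for illustration. The names `parseAccept`, `clientQuality`, `negotiate` and `VARIANTS` are hypothetical, and the code simplifies RFC 2616: it ignores media-type parameters other than q and does not handle a missing Accept header. Each variant's score is the client's q-value multiplied by the server's source quality, and the highest non-zero score wins.

```javascript
// Split an Accept header into { type, q } entries (q defaults to 1).
function parseAccept(header) {
  return header.split(',').map(function (part) {
    var pieces = part.trim().split(';');
    var q = 1;
    for (var i = 1; i < pieces.length; i++) {
      var m = /^\s*q\s*=\s*([0-9.]+)\s*$/.exec(pieces[i]);
      if (m) q = parseFloat(m[1]);
    }
    return { type: pieces[0].trim().toLowerCase(), q: q };
  }).filter(function (entry) { return entry.type !== ''; });
}

// The client's q for one media type: the most specific matching range wins
// (exact type, then "major/*", then "*/*"); 0 if nothing matches.
function clientQuality(accepted, mediaType) {
  var major = mediaType.split('/')[0];
  var best = 0;
  var bestSpecificity = -1;
  accepted.forEach(function (entry) {
    var specificity = -1;
    if (entry.type === mediaType) specificity = 2;
    else if (entry.type === major + '/*') specificity = 1;
    else if (entry.type === '*/*') specificity = 0;
    if (specificity > bestSpecificity) {
      bestSpecificity = specificity;
      best = entry.q;
    }
  });
  return best;
}

// Pick the variant with the highest (client q * source quality) product.
// Returns the chosen media type, or null if no variant is acceptable.
function negotiate(acceptHeader, variants) {
  var accepted = parseAccept(acceptHeader);
  var winner = null;
  var winnerScore = 0;
  variants.forEach(function (variant) {
    var score = clientQuality(accepted, variant.type) * variant.qs;
    if (score > winnerScore) {
      winner = variant;
      winnerScore = score;
    }
  });
  return winner ? winner.type : null;
}

// Source qualities as suggested in the post: XHTML slightly below HTML.
var VARIANTS = [
  { type: 'text/html', qs: 1.0 },
  { type: 'application/xhtml+xml', qs: 0.99 }
];

negotiate('*/*', VARIANTS); // 'text/html'
```

With these source qualities a client has to rate XHTML noticeably above HTML, as the Mozilla header does, before it is sent XHTML.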

But MSIE is still going to get MS Word for preference, if that content
type is available, because that's what it asks for. If that isn't
what its users want, [... fill in the blanks here ...]

Note however that Apache 2.0 say they've taken some liberties with the
negotiation procedure, in the interests of pragmatically better
results. So - what with Apache being such a dominant web server -
that pretty-much means that there's no longer any motivation on folks
to get it right per the spec, which means I'll be increasingly accused
of being over-fussy when I try to discuss these details. Ho hum.

Bertilo Wennergren

Mar 14, 2004, 12:28:11 PM

Alan J. Flavell:

> On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

>> > Technically, I'd say yes, it's wrong.

>> The effect of what I'm doing, is that I play it safe, serving HTML 4.01,
>> except when I get an explicit value for "application/xhtml+xml", in
>> which case I give a slight preference for XHTML 1.1. That can't really
>> be bad, can it?

> Here's how it *should* work, according to the theory AIUI:

> - you assign a slightly lower source quality to your XHTML than to
> your HTML - it only needs a tiny factor - 0.99 would suffice

> - clients which say they accept */* will then get HTML

> [...]

> Note however that Apache 2.0 say they've taken some liberties with the
> negotiation procedure, in the interests of pragmatically better
> results.

You're talking about how to do this with Apache content negotiation
("assign a slightly lower source quality to your XHTML"). Actually I'm
doing this in PHP, with my own home grown code. In that code I've added
an exception for user agents calling themselves "Netscape 6". They will
get HTML whatever they say about "application/xhtml+xml", because I know
they can't really handle XHTML. As far as I know, I couldn't have such
exceptions if I were to use the Apache content negotiation.

Anyway, that means there is no XHTML source quality factor to set for me.

Andrew Urquhart

Mar 14, 2004, 1:03:48 PM

Kenn White wrote:

> Andrew Urquhart wrote:
>> More info at http://jibbering.com/faq/#FAQ4_25

> Concerning the FAQ you referred to in the post (below), the second
> link is broken, and the third returns raw HTML in Mozilla (ironic,
> given that the c.l.j FAQ author berates "crap" PHP programming
> practices; maybe he/she should focus on getting their page to
> actually render...).

Hi Kenn, I'd suggest contacting the FAQ maintainer (Jim Ley) directly
(see http://jibbering.com/faq/#FAQ5_2) and the maintainer of
http://jscript.dk/faq/php.asp (Thor Larholm, email address in his blog -
http://blog.jscript.dk/) although the warning at http://jscript.dk/
might have something to do with it. Alternatively a post in
comp.lang.javascript with <FAQENTRY> tags around it will get Jim's and
others' attention.

> Anyway, when read in IE, the specific link at
> http://jibbering.com/faq/#FAQ4_25 is wrong, I believe, when stating
> that bracket characters "are illegal in the standard (x)HTML
> doctypes".
>
> The third link (http://jscript.dk/faq/php.asp) quotes the W3C guidance
> on basic data types (http://www.w3.org/TR/html401/types.html#type-id)
> and concludes that bracket characters are illegal as form component
> (specifically select) name attribute values.
>
> If this is true, either the W3C validator is missing it (e.g., for
> form input text boxes, name='foo[]' passes XHTML strict), or this is
> now allowed in XHTML, or perhaps it is only a restriction on the form
> name itself, or...?

In the context of an HTML 4.01 input element the "name" attribute value
is CDATA, in XHTML1 it's ... http://www.w3.org/TR/xhtml1/#C_8. Ergo I'll
leave it to someone more knowledgeable on this subject to provide the
clear and unambiguous answer :-D

Jim Ley

Mar 14, 2004, 1:13:51 PM

On Sun, 14 Mar 2004 11:07:25 -0500, Kenn White
<kennwhite....@hotmail.com> wrote:

>Concerning the FAQ you referred to in the post (below), the second link
>is broken,

Thanks for the heads-up. Unfortunately Netscape are rather incompetent
at keeping URLs constant, and tracking URLs for a FAQ is a
difficult job. We do have an automated process to ensure that links
are valid when we do updates, but even that's not perfect (because
sites often do 200 redirects to "page not found sites").

>and the third returns raw HTML in Mozilla (ironic, given that
>the c.l.j FAQ author berates "crap" PHP programming practices; maybe
>he/she should focus on getting their page to actually render...).

I don't believe I've ever described PHP as crap. I certainly don't in
the CLJ FAQ, and don't recall doing so at other times. I'm sure there
are times, though, that I've suggested certain PHP practices are crap -
I've said the same about most languages, it's the sort of thing I do.

>Anyway, when read in IE, the specific link at
>http://jibbering.com/faq/#FAQ4_25 is wrong, I believe, when stating that
>bracket characters "are illegal in the standard (x)HTML doctypes".

It's not wrong in claiming that bracket characters are illegal in ID;
it is wrong in claiming they're disallowed in NAME. That's a known issue
which is already fixed. As you may or may not be aware, changing a FAQ
is something which takes time: we need to ensure that the group feels
it is accurate, that the links are relevant, etc. Hopefully it will be
updated soon.

Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Kenn White

Mar 14, 2004, 1:39:24 PM

Jim Ley wrote:

[snip...]

> Thanks for the heads up, unfortunately Netscape are rather incompetent
> in not keeping urls constant, and tracking urls for a FAQ is a
> difficult job, we do have an automated process to ensure that links
> are valid when we do updates, but even that's not perfect (because
> sites often do 200 redirects to "page not found sites")

Hi Jim. Didn't mean to come across as critical. It's just frustrating
when one is seeking very specific answers to very specific technical
problems on usenet, and kneejerk advice to "read the FAQ" leads to the
woes described above. No, in case folks don't tell you guys enough,
thank you, thank you, thank you for maintaining these documents. Really.

[snip...]

> I don't believe I've ever described PHP as crap, I certainly don't in
> the CLJ faq., and don't recall doing so other times, I'm sure there
> are times though that I've suggested certain PHP practices are crap -
> I've said the same about most languages, it's the sort of thing I do.

Oops. That was misattributed -- my mistake. I was referring to comments
made in a *link* to which the c.l.j FAQ points
(http://jscript.dk/faq/php.asp), which says of the use of bracketed array
notation by "newbie" PHP programmers that "this naming methodology is
crap". I'm not a newbie and, more to the point for this group, I'm *not*
using a "crap" technique, but rather what I believe to be a valid XHTML
Strict- and DOM-compliant programmatic reference. I was chuckling at
the author berating such "crap" techniques in PHP web authoring, because
the very page in which he makes these statements renders as raw HTML
source code in Mozilla (given that it is an .asp page, one could only
assume he never bothered to view it in non-IE browsers).

-kenn

Alan J. Flavell

Mar 14, 2004, 1:14:58 PM

On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

> You're talking about how to do this with Apache content negotiation

Well, yes and no. I'm talking about how to do this with the
RFC-defined procedure, as best I can understand it (see RFC2616
section 14.1 etc.) - and that's what's implemented in Apache 1.3.x.
(Apache 2.0, as I say, announces that it takes some liberties).

> ("assign a slightly lower source quality to your XHTML"). Actually I'm
> doing this in PHP, with my own home grown code. In that code I've added
> an exception for user agents calling themselves "Netscape 6".

But then if you're to be protocol-correct you'd have to add a
corresponding Vary: header saying that the resource is dependent on
the user agent string, which makes the pages effectively uncacheable.

Whereas a "Vary: Accept" could still be cacheable whenever the same
Accept string is seen - which is much more common than seeing the
exact same user agent string!

> They will get HTML whatever they say about "application/xhtml+xml",
> because I know they can't really handle XHTML.

What Accept string do they actually send, do you know? I.e. do they
actually _say_ they prefer XHTML, even though they don't?

> As far as I know, I couldn't have such
> exceptions if I were to use the Apache content negotiation.

Haven't thought that through, but I doubt that the negotiation
algorithm itself would do that for you. Possibly want to factor-out
that case in some other way (mod_rewrite?), if one were otherwise
using Apache's own negotiation.

If you were negotiating on dimensions of content-type, charset,
language, and so on, I suspect you'd be glad of having Apache doing it
for you... But if PHP turns you on, that's fine too ;-))

Bertilo Wennergren

Mar 14, 2004, 2:25:30 PM

Alan J. Flavell:

> On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

>> You're talking about how to do this with Apache content negotiation

> Well, yes and no. I'm talking about how to do this with the
> RFC-defined procedure, as best I can understand it (see RFC2616
> section 14.1 etc.) - and that's what's implemented in Apache 1.3.x.
> (Apache 2.0, as I say, announces that it takes some liberties).

I assumed that the part about giving a certain source quality to XHTML
was something that's a part of actual tweaking of the Apache
negotiation. Maybe I misunderstood.

>> ("assign a slightly lower source quality to your XHTML"). Actually I'm
>> doing this in PHP, with my own home grown code. In that code I've added
>> an exception for user agents calling themselves "Netscape 6".

> But then if you're to be protocol-correct you'd have to add a
> corresponding Vary: header saying that the resource is dependent on
> the user agent string, which makes the pages effectively uncacheable.

Interesting. That is all new to me. I've never heard about a Vary
header. (It's always nice to learn something new.) Do you have a nice
link about that?

>> They will get HTML whatever they say about "application/xhtml+xml",
>> because I know they can't really handle XHTML.

> What Accept string do they actually send, do you know? I.e do they
> actually _say_ they prefer XHTML, even though they don't?

I think Netscape 6 sends the same Accept header as Mozilla:

Accept: text/xml, application/xml, application/xhtml+xml,
text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2,
text/plain;q=0.8, text/css, */*;q=0.1

That would mean that it says it prefers XHTML over HTML. So the
exception seems to be needed.

>> As far as I know, I couldn't have such
>> exceptions if I were to use the Apache content negotiation.

> Haven't thought that through, but I doubt that the negotiation
> algorithm itself would do that for you. Possibly want to factor-out
> that case in some other way (mod_rewrite?), if one were otherwise
> using Apache's own negotiation.

> If you were negotiating on dimensions of content-type, charset,
> language, and so on, I suspect you'd be glad of having Apache doing it
> for you... But if PHP turns you on, that's fine too ;-))

It's actually an attractive idea to leave all that to Apache, not having
to litter my complicated PHP with content negotiation.

I have however one more special thing I do: There's a switch that you
can add to the query string ("_html" or "_xml") to override all other
factors in the content negotiation: "_html" gives HTML 4.01, "_xml"
gives XHTML 1.1. While I wouldn't mind ditching the exception for
Netscape 6 (a very minor user agent), I'm not too keen on dropping that
switch. It's actually needed in order to validate the pages e.g. with
the W3C validator.
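
The order of precedence described in this post can be sketched roughly as follows. This is not the poster's PHP, which the thread never shows; it is a JavaScript illustration under several assumptions: the function name `chooseMarkup` and the shape of the `request` object are hypothetical, the switch is assumed to appear as a bare `_html` or `_xml` token in the query string, Netscape 6 is assumed to be recognisable by a "Netscape6" token in its user agent string, and the Accept check is a plain substring test that ignores q-values.

```javascript
// Decide which content type to serve, in the order described in the post:
// 1. an explicit "_html" / "_xml" switch in the query string wins outright;
// 2. user agents identifying themselves as Netscape 6 always get HTML;
// 3. otherwise XHTML only if the Accept header names it explicitly.
function chooseMarkup(request) {
  var tokens = (request.query || '').split('&');
  if (tokens.indexOf('_html') !== -1) return 'text/html';
  if (tokens.indexOf('_xml') !== -1) return 'application/xhtml+xml';

  // Assumption: Netscape 6 user agent strings contain "Netscape6".
  if (/Netscape6/.test(request.userAgent || '')) return 'text/html';

  var accept = (request.accept || '').toLowerCase();
  if (accept.indexOf('application/xhtml+xml') !== -1) {
    return 'application/xhtml+xml';
  }
  return 'text/html'; // the safe default
}
```

Because rule 2 looks at the user agent string, a response chosen this way depends on more than the Accept header, which is the point about the Vary header raised elsewhere in the thread.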

Richard Cornford

Mar 14, 2004, 2:22:47 PM

Kenn White wrote:
> Andrew and Michael, thanks! This was exactly what I was looking for,
> especially the mention of risk of breaking Mozilla collection (my
> primary browser).

This is actually one of the points where the silliness of serving XHTML
as text/html shows itself. The javascript property accessor:-

document.forms['f'].elements['user_data[Password]'].value

- is the most cross-browser form available for a form with the ID "f",
because it conforms to the W3C HTML DOM specification and the
"convenience" properties it defines on the HTMLDocument interface, which
formalise some pre-existing browser behaviour. It won't work with
Netscape 4 because that browser does not make IDed form elements
available as named members of the document.forms collection, but IE 4
and all later scriptable HTML UAs are fine with it.

But the W3C HTML DOM specification lists numerous differences between
HTML implementations and XHTML implementations; namespaces, case
sensitivity and so on. There is even some argument as to how much of the
W3C HTML DOM specification should be included in XHTML implementations,
which has led the Mozilla developers to not implement the HTMLDocument
interface on the document object in their XHTML implementation. No
HTMLDocument interface means no "convenience" properties, and suddenly
the most cross-browser form property accessor doesn't work. There is no
alternative but to use document.getElementById in its place.

That is Mozilla's XHTML implementation of the DOM, but if you send XHTML
as text/html Mozilla treats it as HTML, error-corrects it back to HTML
and uses its HTML implementation of the DOM, with the "convenience"
properties, etc.

So from the point of view of wanting to script a DOM, are you trying to
script an HTML implementation (in which case why use anything but HTML),
or are you trying to script an XHTML implementation that may include
Mozilla's (in which case you need to write script appropriate for that
task and forget about back compatibility with scriptable HTML UAs), or
are you going to attempt to write a script that can cope with both types
of DOM (lots of extra work)?

For the time being it appears that sending XHTML as text/html will
result in it being treated as HTML and an HTML implementation of the W3C
HTML DOM being used (where scripting is possible). But does it really
make sense to be authoring one type of document and scripting another?
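
As a rough illustration of the third option (a script that copes with both types of DOM), here is a sketch of an accessor that tries the document.forms route first and falls back to document.getElementById. The function name `getControl` and its parameters are hypothetical. It assumes the control carries both a name (which may contain brackets) and a separate ID without brackets, and it takes the document as an argument so that it can be exercised without a browser.

```javascript
// Look up a form control in a way that works whether or not the document
// object offers the HTML DOM "convenience" collections.
//   doc         - the document object (or a stand-in with the same shape)
//   formId      - ID of the form, e.g. 'f'
//   controlName - name of the control, e.g. 'user_data[Password]'
//   controlId   - ID of the same control, which must not contain brackets
function getControl(doc, formId, controlName, controlId) {
  // HTML DOM route: document.forms[...].elements[...]
  if (doc.forms && doc.forms[formId] && doc.forms[formId].elements) {
    var byName = doc.forms[formId].elements[controlName];
    if (byName) return byName;
  }
  // Fallback for a DOM without the HTMLDocument conveniences.
  if (typeof doc.getElementById === 'function') {
    return doc.getElementById(controlId) || null;
  }
  return null;
}
```

In a page this might be called as getControl(document, 'f', 'user_data[Password]', 'user_data_Password'), where the last ID is made up for the example.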

<snip>

> Anyway, when read in IE, the specific link at
> http://jibbering.com/faq/#FAQ4_25 is wrong, I believe, when stating
> that bracket characters "are illegal in the standard (x)HTML
> doctypes".

<snip>

The next revision of the comp.lang.javascript FAQ will correct that
error. Form control NAME attributes are CDATA type and largely
unrestricted in the characters that may be used.
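
The difference between the two attribute types can be shown with a small check. In HTML 4.01 an ID token must begin with a letter and may continue with letters, digits, hyphens, underscores, colons and periods, whereas CDATA has no such restriction. The helper name `isValidHtml4Id` is hypothetical, and the sketch covers only the HTML 4.01 rule, not the XML name rules that apply to XHTML.

```javascript
// HTML 4.01 ID and NAME tokens: a letter, then letters, digits,
// hyphens, underscores, colons or periods.
var HTML4_ID_PATTERN = /^[A-Za-z][A-Za-z0-9\-_:.]*$/;

// True if the value could be used as an HTML 4.01 id attribute.
function isValidHtml4Id(value) {
  return HTML4_ID_PATTERN.test(value);
}

isValidHtml4Id('user_data[Password]'); // false: brackets are not allowed
isValidHtml4Id('user_data.Password');  // true
```

The same string is fine as the name attribute of an input, since that attribute is CDATA.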

Richard.

Alan J. Flavell

Mar 14, 2004, 3:05:53 PM

On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

> > corresponding Vary: header saying that the resource is dependent on
> > the user agent string, which makes the pages effectively uncacheable.
>
> Interesting. That is all new to me. I've never heard about a Vary
> header.

RFC2616 section 14.44, and its use is discussed in 13.6

> (It's always nice to learn something new.) Do you have a nice
> link about that?

Well, my knee-jerk reaction was to say http://www.mnot.net/cache_docs/
- but I find, slightly to my surprise, that he doesn't mention that
specific detail.

That cacheing tutorial still comes well-recommended, though.

It's an HTTP/1.1 feature which informs the client and - more
importantly - any cache proxies en route, as to which factors were
used in the negotiation, and thus help any co-operative cache proxies
to know whether they can re-issue the cached object without a fresh
negotiation.
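
To make the caching consequence concrete, here is a simplified sketch of how a cache could use Vary to decide whether a stored response matches a new request: it compares the values of the request headers that Vary names. The function name `varyKey` is hypothetical, and real caches do considerably more (header normalisation, freshness checks and so on).

```javascript
// Build a comparison key from the request headers listed in a Vary header.
// Two requests with the same key can share one cached response.
// Returns null for "Vary: *", which means no request can be matched
// without going back to the origin server.
function varyKey(varyHeader, requestHeaders) {
  // Header names are case-insensitive, so index them in lower case.
  var lowered = {};
  Object.keys(requestHeaders).forEach(function (name) {
    lowered[name.toLowerCase()] = requestHeaders[name];
  });

  var names = varyHeader.split(',')
    .map(function (s) { return s.trim().toLowerCase(); })
    .filter(function (s) { return s !== ''; })
    .sort();

  if (names.indexOf('*') !== -1) return null;

  return names.map(function (name) {
    var value = lowered[name] === undefined ? '' : lowered[name];
    return name + ': ' + value;
  }).join('\n');
}
```

Two browsers sending the same Accept header but different User-Agent strings produce the same key under "Vary: Accept" and different keys under "Vary: User-Agent", which is why the latter makes pages effectively uncacheable.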

> I think Netscape 6 sends the same Accept header as Mozilla:
>
> Accept: text/xml, application/xml, application/xhtml+xml,
> text/html;q=0.9, image/png, image/jpeg, image/gif;q=0.2,
> text/plain;q=0.8, text/css, */*;q=0.1
>
> That would mean that it says it prefers XHTML over HTML. So the
> exception seems to be needed.

D'accord.

> I have however one more special thing I do: There's a switch that you
> can add to the query string ("_html" or "_xml") to override all other
> factors in the content negotiation: "_html" gives HTML 4.01, "_xml"
> gives XHTML 1.1.

Well, putting aside your exception rules for a moment, Apache
MultiViews does what you're describing here, quite straightforwardly.

> While I wouldn't mind ditching the exception for
> Netscape 6 (a very minor user agent), I'm not too keen on dropping that
> switch.

It's really no problem with MultiViews.

I don't have too many practical examples to show; the ones that I do
have are based on language, not content-type, but the principle's the
same at the server. Try this:

http://ppewww.ph.gla.ac.uk/~flavell/charset/quick

Then try changing your language preference to Greek, and reload..

If you know how to retrieve the HTTP headers, you see this:

HTTP/1.1 200 OK
Date: Sun, 14 Mar 2004 19:41:26 GMT
Server: Apache/1.3.26 (Unix) PHP/4.2.2
Content-Location: quick.en.html
Vary: negotiate,accept-language,accept-charset
...etc.

, for the English page (and .el. for the Greek page).

Now try directly accessing quick.en or quick.en.html etc.

In effect, MultiViews supplies the missing qualifiers (filename
extensions), based on the dimensions of negotiation that have been
enabled in the server. In this case it's language (en or el), and
charset (iso-8859-7 or windows-1253 for the Greek page).

[Note - that page goes back to around 1997-8 , whereas the
English version has been worked-over to some extent since.]

Some further notes about language negotiation at:
http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html

- some of which are also relevant to negotiating content-type.

> It's actually needed in order to validate the pages e.g. with
> the W3C validator.

Sure thing. You can see how MultiViews would deal with that.

Alan J. Flavell

Mar 14, 2004, 5:42:27 PM

On Sun, 14 Mar 2004, Alan J. Flavell wrote:

[stuff...]

I really meant to change the subject header, sorry.

Also, google suggests we should see http://squid.sourceforge.net/vary/
which seems like a good idea.

It found http://www.msxnet.org/fast-website too

Bertilo Wennergren

Mar 14, 2004, 6:36:43 PM

Alan J. Flavell:

> On Sun, 14 Mar 2004, Bertilo Wennergren wrote:

>> I have however one more special thing I do: There's a switch that you
>> can add to the query string ("_html" or "_xml") to override all other
>> factors in the content negotiation: "_html" gives HTML 4.01, "_xml"
>> gives XHTML 1.1.

> Well, putting aside your exception rules for a moment, Apache
> MultiViews does what you're describing here, quite straightforwardly.

Very interesting. I really should look into that. It could help me clean
up my PHP code. That is, if Apache MultiViews and content negotiation
can be combined with the pages still being processed with PHP... I do a
lot of other stuff with PHP that I'm almost sure can't be duplicated
with built-in Apache tricks.