Now, I'm pretty picky about my markup, and I'm certainly willing to go
to unusual lengths to get it just the way I want it, but it'd be
awfully nice if there were some way to get HTML-style output from
newforms without having to manually subclass all the widgets and
override their rendering to remove trailing slashes.
Unfortunately, I don't really have a good proposal for how to handle
this, except maybe to further break down the Widget API to include
'as_html' and 'as_xhtml'. Any ideas?
--
"May the forces of evil become confused on the way to your house."
-- George Carlin
+1 to this proposal. I found myself writing the code below, which is
quite scary but does the trick:
[[[
"""
Remove XHTML endings from tags to make them HTML 4.01 compliant

Usage:

{% load html4 %}
{% html4 %}
My long template with {{ variables }} and
{% block whatever %} blocks {% endblock %}
{% endhtml4 %}
"""
from django import template


def do_html4(parser, token):
    # Collect everything up to {% endhtml4 %} and wrap it in a single node.
    nodelist = parser.parse(('endhtml4',))
    parser.delete_first_token()
    return Html4Node(nodelist)


class Html4Node(template.Node):
    def __init__(self, nodelist):
        self.nodelist = nodelist

    def render(self, context):
        # Render the enclosed template first, then strip the " />" endings.
        output = self.nodelist.render(context)
        return output.replace(' />', '>')


register = template.Library()
register.tag('html4', do_html4)
]]]
Cheers.
--
Antonio
The question is where to stop. Pickiness may lead further to having an
option to omit quotes around attribute values, have uppercase tag names,
omit end tags of <li> etc... This is all working HTML (even valid by DTD).
<rant mode=purist>
Since all these things happily work in browsers, the only difference
between "/>" and the rest is that it is not DTD-valid HTML 4.01. However,
from my purist point of view this is not a problem, because DTD
validation is effectively useless. The only user agent that does DTD
validation is the W3C's validator itself. No real browser has ever
treated HTML as an SGML application or used a DTD to validate it. In
fact, what the folks at the WHATWG [1] are doing now for HTML5 is
specifying exactly the syntax that browsers use for parsing HTML. And
"/>" will be valid HTML5.
So from my point of view a real HTML purist would ignore HTML 4.01
validation altogether. :-)
</rant>
In fact I'm wrong here... I just checked, and the W3C's validator doesn't
object to "<br />"s. This is valid HTML 4.01:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>Test</title>
<p><br />
So even "invalidity" is not the point. What is, then?
If XHTML-style tags are valid in HTML 4 strict, then I don't see a
point in creating a separate output format for each widget. If you
want to be religious about whether there's a slash in your HTML tag,
clearly you care about it enough to have the (minimal) energy to write
a custom method on your Form. Or, write a custom Form subclass once
and subclass it for each form you use.
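A minimal sketch of the "write it once and subclass it" idea, assuming only that the form class offers an `as_table()` method. `Html4FormMixin`, `as_html4_table`, `FakeForm` and `ContactForm` are hypothetical names made up for illustration, and `FakeForm` stands in for a real form class so the snippet runs on its own:

```python
class Html4FormMixin:
    """Hypothetical mixin: adds an HTML 4 variant of a form's table output."""

    def as_html4_table(self):
        # Delegate to the form's normal renderer, then drop the XHTML-style
        # " />" endings. This is a blunt string replace, so it would also
        # rewrite " />" appearing in text or attribute values.
        return self.as_table().replace(" />", ">")


class FakeForm:
    """Stand-in for a real form class; only as_table() matters here."""

    def as_table(self):
        return '<tr><td><input type="text" name="mytext" /></td></tr>'


class ContactForm(Html4FormMixin, FakeForm):
    pass


print(ContactForm().as_html4_table())
```

Every form that inherits from the mixin picks up the extra method, so the slash stripping lives in one place.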
Adrian
--
Adrian Holovaty
holovaty.com | djangoproject.com
I had the same thought but I wrote a quick test with an HTML 4 strict
doc-type, put an input tag in it like this: <input type="text"
name="mytext" />, and it was still valid.
I'd like to see agnostic HTML from Django too, but if it isn't
producing anything that breaks HTML 4, I'm happy.
One of my worries is that the W3C, from what I've read, is going to
re-invigorate HTML and work on HTML5 as well as XHTML2. If these two
diverge enough, Django may need support for both.
Somewhere I think I suggested the possibility of passing a template
string to the forms to control how they render, but the only purpose of
this would be to remove the "/", and it requires knowledge of the local
variables. Last I looked, the forms stuff was pretty flexible about
adding attributes (e.g. class names) and things, so there's not a lot
of benefit in doing that.
-Rob
> Unfortunately, I don't really have a good proposal for how to handle
> this, except maybe to further break down the Widget API to include
> 'as_html' and 'as_xhtml'. Any ideas?
build up the output using a light-weight DOM with a nice Python-level
syntax, and serialize it on the way out, using either a standard XHTML
serializer, or a user-provided alternative serializer.
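As a rough illustration of that idea (not a proposal for the actual Widget API), the standard library's `xml.etree.ElementTree` can already serialize one tree either way. The element and attribute names here are just sample data:

```python
import xml.etree.ElementTree as ET

# Build the widget as a small element tree instead of a string.
row = ET.Element("p")
label = ET.SubElement(row, "label")
label.text = "Name"
ET.SubElement(row, "input", {"name": "mytext", "type": "text"})
ET.SubElement(row, "br")

# Pick the serializer on the way out: "xml" writes empty elements with a
# trailing slash, "html" writes HTML-style empty tags.
as_xhtml = ET.tostring(row, encoding="unicode", method="xml")
as_html = ET.tostring(row, encoding="unicode", method="html")

print(as_xhtml)
print(as_html)
```

The tree is built once, and the choice between XHTML and HTML output is deferred to the final serialization step.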
(should I duck now?)
</F>
They are valid but have a completely different meaning which browsers
don't interpret correctly; in HTML4, the closing slash is a form of
SGML SHORTTAG syntax, and '<br />' in HTML4 is meant to be interpreted
as a 'br' element followed by a literal greater-than sign.
WHAT-WG's HTML5 will do away with this and make the closing slash
semantically meaningless in HTML, but that's still a ways off in the
future.
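To make the difference concrete, here is a toy model of that reading. It is not an SGML parser, only a regex that mimics the one consequence described above for a handful of empty elements: the "/" ends the tag and the ">" is left over as character data. The function name `shorttag_reading` is made up, and attribute values containing "/" are not handled:

```python
import re

# Toy model only: this is not an SGML parser. It mimics one consequence of
# SHORTTAG for empty elements: the "/" ends the tag, so the ">" after it
# is left over as character data.
EMPTY_TAG_WITH_SLASH = re.compile(
    r"<(br|hr|img|input)\b([^<>/]*)/>", re.IGNORECASE
)


def shorttag_reading(markup):
    """Rewrite markup to show what the SHORTTAG reading would see."""
    return EMPTY_TAG_WITH_SLASH.sub(
        lambda m: "<%s%s>&gt;" % (m.group(1), m.group(2).rstrip()), markup
    )


print(shorttag_reading("<p>one<br />two"))
```

The stray `&gt;` in the output is the literal greater-than sign that a browser silently swallows.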
> They are valid but have a completely different meaning which
(most)
> browsers don't interpret correctly; in HTML4, the closing slash is a
> form of SGML SHORTTAG syntax, and '<br />' in HTML4 is meant to be
> interpreted as a 'br' element followed by a literal greater-than sign.
full details:
http://www.cs.tut.fi/~jkorpela/html/empty.html
</F>
Also, there is a valid problem here; if I produce HTML4, and just say
"it validates, I don't care if it's correct", then my HTML will work
in browsers, but actual SGML parsers (which do exist and do get used
on occasion) will produce a different document tree when they parse my
HTML, because they'll read the SHORTTAG syntax correctly. I haven't
yet had a moment to verify that the standard Python sgmllib will do
this, but I know for a fact that nsgmls will.
Also, we're the framework for "perfectionists"; let's get this right ;)
Yup. And in fact, I do that (quite deliberately).
But I'm not asking for that; mostly that's a matter of templating and
doing some deep hacking in markdown.py (at least in my case), and I'm
willing to put in that work.
I'm just asking for a simple way to get form inputs without trailing
slashes, because even though it's a nitpicky thing and maybe there
aren't very many people who actually care about the difference between
XML empty-element syntax and SGML SHORTTAG, there are potential issues
with things like SGML parsers, and (above all) it's just not "right"
:)
On IRC a moment ago, Jacob suggested an 'html4' template filter which
would just strip trailing slashes from empty tags; I'd be happy with
that (and willing to put in some time to implement it), provided we
advertise clearly that the Django forms system is going to produce
XHTML and that if you want HTML4 you'll need to use the filter.
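A rough sketch of what such a filter's core could look like, written as a plain function so it runs without Django. The name `html4` follows Jacob's suggestion; everything else (the regex, the constant names) is my own assumption. It only touches the elements declared EMPTY in the HTML 4.01 DTD:

```python
import re

# Elements declared EMPTY in the HTML 4.01 DTD.
EMPTY_ELEMENTS = (
    "area", "base", "basefont", "br", "col", "frame",
    "hr", "img", "input", "isindex", "link", "meta", "param",
)

# Match an empty element's start tag that ends in optional whitespace and
# "/>". Quoted attribute values are consumed whole, so a "/" or ">" inside
# them does not end the match early.
_XHTML_EMPTY_TAG = re.compile(
    r"<(?P<tag>%s)\b(?P<attrs>(?:[^<>\"']|\"[^\"]*\"|'[^']*')*?)\s*/>"
    % "|".join(EMPTY_ELEMENTS),
    re.IGNORECASE,
)


def html4(value):
    """Strip the trailing slash from XHTML-style empty tags."""
    return _XHTML_EMPTY_TAG.sub(r"<\g<tag>\g<attrs>>", value)


print(html4('<p><input type="text" name="mytext" /><br /></p>'))
```

In a template library this could presumably be hooked up with `register.filter('html4', html4)`. Unlike a bare string replace, it leaves a " />" in ordinary text alone.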
> Since all these things happily work in browsers, the only difference
> between "/>" and the rest is that it is not DTD-valid HTML 4.01. However,
> from my purist point of view this is not a problem, because DTD
> validation is effectively useless. The only user agent that does DTD
> validation is the W3C's validator itself. No real browser has ever
> treated HTML as an SGML application or used a DTD to validate it.
Steps to reproduce:
1. Put together an HTML4 document with valid DOCTYPE declaration and everything.
2. Drop a closing slash into a BR, IMG or INPUT element somewhere.
3. Run through nsgmls and look at how it gets parsed
There's a real-world difference there. You may say that nobody's ever
used a real SGML parser on HTML4, but I actually have (in fact, I once
ran into a situation where it was the only way to find a bug that the
standard W3C validator settings couldn't catch), and I know for a fact
that you get different output from an SGML parser than you do from a
web browser. That's an interoperability problem :)
> In fact, what the folks at the WHATWG [1] are doing now for HTML5 is
> specifying exactly the syntax that browsers use for parsing HTML. And
> "/>" will be valid HTML5.
Yup. I'm subscribed to their mailing list and I've been following the
discussion of that, and the proposal to allow 'xmlns' in HTML, with
some trepidation. These things are necessary now because for years
people have been doing stuff that was demonstrably incorrect but still
"worked". I don't want Django to become part of that crowd.
I posted a tag earlier in this thread that does what you ask (it's
pretty trivial). Are my messages coming through, anyway?
--
Antonio
> There's a real-world difference there. You may say that nobody's ever
> used a real SGML parser on HTML4, but I actually have (in fact, I once
> ran into a situation where it was the only way to find a bug that the
> standard W3C validator settings couldn't catch), and I know for a fact
> that you get different output from an SGML parser than you do from a
> web browser. That's an interoperability problem :)
the Planet RSS aggregator used to use an SGML parser (sgmllib?) to
clean up embedded HTML, which caused rather interesting output when
people used crappy blog tools that inserted <br /> all over the
place.
</F>
Nice to meet a like-minded person :-)
> I'm just asking for a simple way to get form inputs without trailing
> slashes
As you said, the problem is how to make it simple enough... What about a
middleware that, on seeing a 'text/html' content type, would htmlize the
content?
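Something along these lines, perhaps. This is only a sketch against a duck-typed response object that offers `.get(header, default)` and a bytes `.content` attribute; `Html4Middleware` is a hypothetical name, and the rewrite is deliberately blunt (it would also touch a "/>" inside text or scripts):

```python
import re

# Hypothetical middleware, written against a duck-typed response object that
# offers .get(header, default) and a bytes .content attribute.
_SLASH_END = re.compile(rb"\s*/>")


class Html4Middleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        content_type = response.get("Content-Type", "")
        # Leave application/xhtml+xml, JSON, images, etc. untouched.
        if content_type.split(";")[0].strip().lower() == "text/html":
            response.content = _SLASH_END.sub(b">", response.content)
        return response
```

Keying on the content type means pages deliberately served as application/xhtml+xml keep their slashes.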
> There's a real-world difference there. You may say that nobody's ever
> used a real SGML parser on HTML4, but I actually have
In the near future I think The Right Thing would be to use a real HTML
parser for such things. There have been many messages on the WHATWG list
from people writing such tools in many languages, including Python:
http://code.google.com/p/html5lib/
Well... define "near future" ;)
Whenever HTML5-the-specification is finished and
HTML5-the-cross-browser-implementation is available, then yeah,
that'll work. In the meantime, HTML4 with SGML tools is all I've got
available to me, and every once in a while that catches things the W3C
validator's default settings won't notice.
When the library is usable.
> Whenever HTML5-the-specification is finished and
> HTML5-the-cross-browser-implementation is available, then yeah,
> that'll work.
You don't have to wait for this, because html5lib would work with
existing content, which is pretty much the point of the whole spec.
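I can't show html5lib's own API from memory, so as a stand-in here is the standard library's `html.parser`, which is not html5lib but is also a browser-style (non-SGML) HTML parser. It shows how such a parser reads existing content with "<br />" in it; the `TagLogger` class is just illustration:

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record the events produced while parsing a fragment."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for tags written XHTML-style, e.g. <br />.
        self.events.append(("startend", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


parser = TagLogger()
parser.feed("<p>one<br />two</p>")
parser.close()
print(parser.events)
```

The "<br />" comes out as a single empty element with no stray ">" in the text, which matches what browsers do with it.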
That might be valid in the SGML sense, but it means something
completely different. See http://hixie.ch/advocacy/xhtml
> One of my worries is that the W3C, from what I've read, is going to
> re-invigorate HTML and work on HTML5 as well as XHTML2. If these two
> diverge enough, Django may need support for both.
Well, I suppose that depends on how many browser implementations XHTML2
will get...
Well, it reflects what's been implemented for years in browsers.
Yes, but it's still not right ;)
When HTML5 finalizes, then I'll feel a little more comfortable
migrating to it and this won't be an issue anymore. For now, though,
I'm using HTML 4.01 (quite happily, I might add) and running into
annoyance with the fact that both Django's newforms and the old
manipulator system default to XHTML-style tags with no way to override
that.
I may have to just resort to that 'html4' template tag posted further
up in the thread...
James,
I'm in the same boat. I'm curious why you don't use XHTML?
For me, it's some of the exact reasons that Ian Hickson states, but I
was curious about others.
http://www.hixie.ch/advocacy/xhtml
-Rob
In no particular order:
1. I've done the content-negotiation thing before, and I don't really
want to go there again.
2. I don't have need of any XML-specific features, so I don't really
have a valid reason to dump something that's been working remarkably
well up until now.
3. HTML 4.01 lets me be more terse by omitting various tags and other
bits, which appeals to my minimalist side.
4. I just feel like being ornery sometimes.
> 4. I just feel like being ornery sometimes.
Don't let him fool you -- this is actually reason #1 :)
Jacob
Then I don't understand why you still insist on some artificial DTD
validity that doesn't matter to anyone except the tool that checks
for it. Ignoring "/>" IS a reality. Whether HTML5 approves it
next year or in ten years won't change a thing...
For me, it's the difference between specified behaviour and an
accidental implementation detail. Maybe future clients go into a
kind of quirks mode when they see " />" and output a warning that
"this is not proper HTML4 and unsafe to consume"? And they would
even be right.
BTW, what keeps me from XHTML is simply that my JavaScript
library of choice (YUI) doesn't support it in all components. I
feel that Django leads one into a trap by rendering XHTML style,
and it's not a good idea. XHTML currently has a tendency to break
your web site when it grows.
Michael
--
noris network AG - Deutschherrnstraße 15-19 - D-90429 Nürnberg -
Tel +49-911-9352-0 - Fax +49-911-9352-100
http://www.noris.de - The IT-Outsourcing Company
Hmmm. Can you elaborate? We're using YUI for a few things as well and
I wasn't aware of this. (We can take this offline if it's preferable.)
> XHTML currently has a tendency to break your web site when it grows.
I'd like to hear more about this one too.
Rob Hudson wrote:
> Michael Radziej wrote:
>> BTW, what keeps me from XHTML is simply that my JavaScript
>> library of choice (YUI) doesn't support it in all components.
>
> Hmmm. Can you elaborate? We're using YUI for a few things as well and
> I wasn't aware of this. (We can take this offline if it's preferable.)
Menubars don't work in documents delivered as content type
application/xhtml+xml. There's already an entry in the bug database.
>
>> XHTML currently has a tendency to break your web site when it grows.
>
> I'd like to hear more about this one too.
Well, not XHTML by itself, of course. It's just that JavaScript and CSS
work differently in XHTML and HTML4. But since some well-known crap
browsers don't work with XHTML, you are forced to deliver the content
either as application/xhtml+xml or as text/html, based on content
negotiation.
And when your business buys or has some CSS, the probability is high
that the CSS designer has never heard about XHTML ...
Michael
I still think that all these "strategies" do not belong in the BaseForm
class. Using a strategy pattern, something like I've suggested before
[1], gives you the same flexibility of subclassing (and maybe more),
yet makes it easier to change existing code to use a new strategy (I
wouldn't have to go changing the classes my Forms inherit).
FormFormatter is to Form as Widget is to Field. We aren't adding
as_CheckboxSelectMultiple-like methods to the Field class, so why are
we doing it with Form?
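To show the shape I have in mind, here is a bare-bones sketch with no Django in it. All the class names are hypothetical, and `Form` is a stand-in that just holds already-rendered (label, widget HTML) pairs:

```python
class FormFormatter:
    """Hypothetical base strategy: turns (label, widget_html) pairs into markup."""

    def format(self, rows):
        raise NotImplementedError


class TableFormatter(FormFormatter):
    def format(self, rows):
        return "\n".join(
            "<tr><th>%s</th><td>%s</td></tr>" % (label, widget)
            for label, widget in rows
        )


class Html4TableFormatter(TableFormatter):
    def format(self, rows):
        # Same layout, but with XHTML-style endings removed from the widgets.
        return super().format(
            [(label, widget.replace(" />", ">")) for label, widget in rows]
        )


class Form:
    """Stand-in form: holds rendered rows and delegates layout to a formatter."""

    def __init__(self, rows, formatter=None):
        self.rows = rows
        self.formatter = formatter or TableFormatter()

    def render(self):
        return self.formatter.format(self.rows)


rows = [("Name", '<input type="text" name="name" />')]
print(Form(rows).render())
print(Form(rows, formatter=Html4TableFormatter()).render())
```

Switching output style means passing a different formatter. The form classes themselves, and whatever they inherit from, stay unchanged.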
If we want to be able to specify dynamically how a form gets rendered
in the template instead of in the code, then maybe we could introduce a
templatetag:
{% form myform as_xhtml %}
which might be a nice idea for Fields too:
{% field myform.myfield CheckboxSelectMultiple %}
Anyway, this could also probably help the people wanting to grab their
Widget HTML from templates instead of code [2].
[1]
http://groups-beta.google.com/group/django-developers/browse_thread/thread/e3bcd07da81c3275/b13b6385d1b6696e#msg_b13b6385d1b6696e
[2]
http://groups-beta.google.com/group/django-developers/browse_thread/thread/b2ace4f7f69a73f6/8412579768316e9c
XHTML has become the de facto standard for Django, which is great, but
the vast majority of pages are still HTML 4. So if there's to be one
standard it should be that.
Each widget and form could have a template that could be overridden at
the project level (like the admin templates, for example), and at the
template level we could simply pass the name of the template we want to
use to a template filter, something like:
{{ form|template:"as_table.html" }}
or
{{ form[name]|template:"text_input.html" }}
This gives every user the option to create or modify the presentation of
their form fields, and a way to refactor and slim down the code ;). It
also allows admins and web developers to create custom form field
presentations (AJAX, custom options, etc.) without needing any
programming knowledge.
With this, the default could be XHTML, but if someone wants to create the
same set of templates in HTML he can do it easily, and he can also do it
in XUL or whatever format he wants.
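A very rough sketch of the lookup, using `string.Template` from the standard library in place of real Django templates. The dictionaries, the `render_widget` function and the template contents are all made up for illustration, and no escaping of values is done:

```python
from string import Template

# Default templates shipped with the (hypothetical) forms library.
DEFAULT_TEMPLATES = {
    "text_input.html": Template(
        '<input type="text" name="$name" value="$value" />'
    ),
}

# Project-level overrides, here an HTML 4 flavour of the same widget.
PROJECT_TEMPLATES = {
    "text_input.html": Template(
        '<input type="text" name="$name" value="$value">'
    ),
}


def render_widget(template_name, context, overrides=None):
    """Render a widget, preferring a project override when one exists."""
    templates = dict(DEFAULT_TEMPLATES)
    templates.update(overrides or {})
    return templates[template_name].substitute(context)


context = {"name": "mytext", "value": "hello"}
print(render_widget("text_input.html", context))
print(render_widget("text_input.html", context, overrides=PROJECT_TEMPLATES))
```

The widget code never decides between XHTML and HTML; whoever supplies the templates does.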
Only my thoughts ;)