Outline:
* What I know about HTML4 and Django
* Some info about past efforts and discussions
* Thoughts and curiosities about what we can do
What I know about HTML4 and Django
============================
First, let's not let this turn into an argument about HTML vs XHTML.
People have their preference one way or the other, and I believe
Django should be agnostic.
Currently, however, Django *is* biased toward producing XHTML output in
the various places that output HTML, such as comments, formtools,
csrf, forms.widgets, and various filters and HTML utilities that
output "<br />" tags. For someone who prefers an HTML4 doctype, this
is a hassle with no easy answer.
Some info about past efforts and discussions
=================================
Some tickets already open on this...
Ticket #6925: CSRF html output is not valid html (it is xhtml)
http://code.djangoproject.com/ticket/6925
Ticket #7281: Add doctype tag to webdesign template tags
http://code.djangoproject.com/ticket/7281
Ticket #7452: Settings for HTML4 or XHTML output
http://code.djangoproject.com/ticket/7452
Proposal from Mar 2008:
Form rendering with filters...
http://groups.google.com/group/django-developers/browse_thread/thread/5f3694b8a19fb9a1
Proposal from Sept 2008:
{% doctype %} and {% field %} tag for outputting form widgets as HTML
or XHTML...
http://groups.google.com/group/django-developers/browse_thread/thread/f04aed2bc60274f0
Since those tickets aren't closed as "wontfix" or "invalid", and much
of the conversation surrounding this agrees that *something* should be
done, I'm hoping to start a conversation as to what that something
might be.
Thoughts and curiosities about what we can do
==================================
After thinking for some time, I've only been able to come up with two ideas...
1. Incorporate something along the lines of Simon's django-html
project (http://github.com/simonw/django-html).
I think the general idea is this: You set a doctype in your base
template via a template tag (e.g. {% doctype "html4" %}), and any
rendering that is done after that (in inherited or included templates)
is based on this doctype.
Advantages:
* It puts the doctype decision into the hands of the designer.
* It allows for different parts of an application to have different doctypes.
Shortcomings:
* Fixes the symptom, not the bug. [1] I think the fix should not do
string replacement, but output the correct HTML in the first place.
(I realize this is the best one can hope for as a 3rd party app, but
if something were to come into core it would have the ability to fix
things properly.)
* Doesn't "fix" the rest of the areas where Django outputs XHTML. Is
it possible?
* Ties various parts of Django to the-thing-that-produces-HTML.
2. Add a new setting, e.g. settings.DOCTYPE, that various parts of
Django and 3rd party apps use to render HTML.
Advantages:
* Simple and straightforward
Shortcomings:
* Yet another setting
* Doesn't allow for template level decisions of doctype by designers
Since I think idea #1 has the best chance of making headway, I've
tried to look at how it might be done. Unfortunately, I don't know
the template system well enough to see how setting {% doctype "html4"
%} might be able to affect other areas of Django. Here was my thought
process the other night...
If Django widgets and other parts of Django always used the Template
system to render HTML, it might be possible to set some global context
variable of the current doctype. Perhaps this wasn't done initially
because each `get_template` call results in a read from the
filesystem (or because of thread-safety concerns?). In that case, we
should consider fixing #6262, Cache templates. The downside is that
using Templates everywhere would inhibit re-use of pieces of Django
outside of Django, as they would all then rely on Django's templates.
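To make the "global context variable" idea a bit more concrete, here
is a minimal sketch in plain Python with no Django imports. The names
RenderContext, set_doctype, void_tag_end and render_hidden_input are
hypothetical; they only illustrate how a doctype recorded once (as a
{% doctype %} tag might do) could be read by later rendering code.
This is not how Django's template engine actually works.

```python
from html import escape


class RenderContext(dict):
    """Hypothetical stand-in for a context shared during one render."""


def set_doctype(context, name):
    # Roughly what a {% doctype "html4" %} tag might do: record the
    # choice in the context and output nothing itself.
    context["_doctype"] = name
    return ""


def void_tag_end(context):
    # Assumption: default to XHTML so existing output stays unchanged.
    doctype = context.get("_doctype", "xhtml1")
    return " />" if doctype.startswith("xhtml") else ">"


def render_hidden_input(context, name, value):
    # A widget-like renderer that asks the context how to close the tag.
    return '<input type="hidden" name="%s" value="%s"%s' % (
        escape(name, quote=True),
        escape(value, quote=True),
        void_tag_end(context),
    )
```

With an empty context the input is rendered with " />"; after
set_doctype(context, "html4") the same call ends in a plain ">".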
There's also the option of something like Werkzeug's HTMLBuilder[2]
and their use in Solace widgets[3].
I don't claim to have the answer, but I'm willing to put time and
effort into helping solve this. Thoughts? Other ideas?
Thanks,
Rob
[1] http://www.pointy-stick.com/blog/2007/09/04/fix-bug-not-symptom/
[2] http://dev.pocoo.org/projects/werkzeug/browser/werkzeug/utils.py#L126
[3] http://bitbucket.org/plurk/solace/src/tip/solace/utils/forms.py#cl-294
I don't think either of those conclusions can be drawn from the facts.
XHTML 2 was a bold attempt to redefine what it is exactly that web pages
produce: less crud, more content. I'm, personally, a bit disappointed
that it has been "canceled", but I assume that smart people will
continue to explore concepts in that direction.
XHTML 2's cancellation does not say that XHTML 1 was a bad idea or is in
any way invalidated and it absolutely says nothing about XML. (Also,
whether or not XML is the "future" of the web, it is a strong part of
what defines the "present".)
Furthermore, support for XHTML "5" (which is indeed a part of the HTML 5
standard) shows that XHTML 1's principles are still around and still
respected. Django's XHTML output can't be "out of date" if XHTML 5 is
considered a successor to XHTML 1.
--
--Max Battcher--
http://worldmaker.net
This is the bit I was hoping we could get to. What would that
refactoring look like, you think? Would it involve making forms use
Templates? Or something else?
> I think the key problem is template filters, which often produce XHTML
> (linebreaksbr for example). This could be solved by allowing template
> filters selective access to the template context, which I'm in favour
> of in the absence of a compelling argument against. This could be done
> in a way that allows existing filters to continue to work - maybe
> something as simple as this:
>
> register.filter('cut', cut, takes_context=True)
>
> Which emulates the register.inclusion_tag API.
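As a rough illustration of the takes_context idea quoted above, here
is a sketch in plain Python. FilterLibrary is a hypothetical,
much-simplified stand-in for a filter registry (it is not Django's
Library class), and this linebreaksbr skips the escaping and newline
normalisation that the real filter performs. The point is only that
existing one-argument filters keep working while opted-in filters
receive the context first.

```python
class FilterLibrary:
    """Hypothetical, simplified filter registry."""

    def __init__(self):
        self.filters = {}

    def filter(self, name, func, takes_context=False):
        # Remember whether this filter asked to see the context.
        self.filters[name] = (func, takes_context)

    def apply(self, name, value, context):
        func, takes_context = self.filters[name]
        if takes_context:
            return func(context, value)
        # Existing filters are called exactly as before.
        return func(value)


def linebreaksbr(context, value):
    # Assumption: the context carries a "doctype" key, defaulting to XHTML.
    is_xhtml = context.get("doctype", "xhtml1").startswith("xhtml")
    return value.replace("\n", "<br />" if is_xhtml else "<br>")


def upper(value):
    return value.upper()


register = FilterLibrary()
register.filter("linebreaksbr", linebreaksbr, takes_context=True)
register.filter("upper", upper)
```

Here register.apply("linebreaksbr", text, {"doctype": "html4"}) joins
lines with "<br>", while an empty context falls back to "<br />".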
We've got django.forms and tags/filters on the plate. How might we
fix something like comments which has templates which write out XHTML
(or other little places which hand code HTML)?
For example, if one includes contrib.comments in their HTML4 website,
it doesn't validate...
http://validator.w3.org/check?uri=http%3A%2F%2Fjacobian.org%2Fwriting%2Fsnakes-on-the-web%2F&charset=%28detect+automatically%29&doctype=Inline&group=0
Ahem. Sorry Jacob!
I'd really hate to see something like this everywhere:
<input type="hidden" name="next" value="{{ next }}"{% if doctype.xhtml %} /{% endif %}>
> Really not keen on that - designers should be able to pick their
> doctype, and there are some cases where specific pages of a site (or
> of a reusable app) might need to use a specific doctype - MathML
> output still requires XHTML for example.
I agree.
Simon mentioned how HTML and XHTML could result in Javascript and DOM
differences -- should we be concerned about being too dynamic about
swapping out HTML for XHTML, and vice versa, for fear of breaking
Javascript and CSS that rely on them?
For example, I think the Django admin is fine as XHTML since it isn't
intended to be an included part of your website. But say we were able
to easily set a different doctype that rendered as HTML4. Does that
have the possibility of really breaking things?
> I've somewhat foolishly volunteered for a bunch of 1.2 related hacking
> already (CSRF, logging, signing) but I remain actively interested in
> this, so I'll try to keep an eye on things and see if there's anything
> I can contribute.
If we can come up with something that everyone is happy with, I'm in
for volunteering time to help code it. I'm sure it will be a learning
experience for me as far as Django internals, but I'm willing to put
the effort into it.
Thanks,
Rob
This is an idea that I think has a lot of merit. Django has always had a
policy of being client-side agnostic - the XHTML-specific output
format has always been the chink in that armor. The impending
significance of HTML5 increases the incentive to have this debate (and
more importantly, to get it right).
If this requires changes to the way forms and fields are rendered, I'm
happy to entertain those ideas, as long as they retain backwards
compatibility.
However, as with the signed cookie discussion, I don't have any
particularly strong opinions on the exact form that the API should
take, so I will stay out of the discussion and let the community
evolve the idea.
By way of greasing the wheels towards trunk: if the outcome of this
mailing list thread was a wiki page that digested all the ideas,
concerns and issues into a single page, it would make the final
approval process much easier. Luke Plant's wiki page on the proposed
CSRF changes [1] is a good model to follow here - I wasn't involved in
the early stages of that discussion, but thanks to that wiki page, I
was able to come up to speed very quickly and see why certain ideas
were rejected.
[1] http://code.djangoproject.com/wiki/CsrfProtection
Yours,
Russ Magee %-)
But I'm not sure that is a correct assertion. I don't think there is a
strong enough difference between HTML and the XHTML that Django
generates for there to be a need for more complicated mechanisms of
rendering. If the only point of contention is self-closing tags (<tag
/>), then HTML 4 might not support it explicitly (although in fact most
modern browsers support it implicitly), but HTML 5 (the HTML form
factor, not just XHTML 5) explicitly supports self-closing tags:
http://dev.w3.org/html5/spec/Overview.html#start-tags
Under Section 9.1.2.1 -- Start Tags, I quote:
6. Then, if the element is one of the void elements, or if the
element is a foreign element, then there may be a single U+002F SOLIDUS
(/) character. This character has no effect on void elements, but on
foreign elements it marks the start tag as self-closing.
I really don't see what the fuss here is about. If we are worried about
forwards-compatibility, HTML 5 takes care of it. If we are worried about
better backwards-compatibility with HTML 4, everyone else is saying that
the future is now and the focus should be HTML 5...
What is this argument really about?
For lack of knowing about anything better, I keep falling back to
Werkzeug's HTMLBuilder class[1]. Pulled out and stripped of comments,
it weighs in at 77 lines of code...
Here's a brief Python shell of how it works...
>>> html = HTMLBuilder('html')
>>> html.input(type='text', name='blah', value='"Quote & Ampersand"')
u'<input type="text" name="blah" value="&quot;Quote &amp; Ampersand&quot;">'
>>> html.select(name='template', id='id_template', *[html.option(v, value=k) for k, v in dict({1: 'One', 2: 'Two', 3: 'Three'}).iteritems()])
u'<select id="id_template" name="template"><option
value="1">One</option><option value="2">Two</option><option
value="3">Three</option></select>'
I really like how it handles children nicely, as in the select/option
example above.
>>> xhtml = HTMLBuilder('xhtml') # XHTML dialect
>>> xhtml.input(type='text', name='blah', value='"Quote & Ampersand"')
u'<input type="text" name="blah" value="&quot;Quote &amp; Ampersand&quot;" />'
This automatic CDATA escaping in XHTML is also nice:
>>> html.script('var id=document.getElementById("id")')
u'<script>var id=document.getElementById("id")</script>'
>>> xhtml.script('var id=document.getElementById("id")')
u'<script>/*<![CDATA[*/var id=document.getElementById("id")/*]]>*/</script>'
I could see using something like this and making a template tag
wrapper around it like the namespace template tag you mention below.
I *think* that would simplify a lot of what you see in the Django
widget render code that deals with attributes (e.g. build_attrs,
flatatt), which should make writing widgets and subclassing widgets
a bit easier.
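To show the shape of the idea without pulling in Werkzeug, here is a
much-reduced imitation in plain Python. TinyBuilder is a hypothetical
name and this is not Werkzeug's code: attributes are sorted for
predictable output, children are joined as-is (assumed to be already
safe markup), and only the three behaviours shown in the shell session
above are covered.

```python
from html import escape

# Elements that never have a closing tag.
VOID_ELEMENTS = {
    "area", "base", "basefont", "br", "col", "frame",
    "hr", "img", "input", "link", "meta", "param",
}


class TinyBuilder:
    """Hypothetical, minimal builder for either dialect."""

    def __init__(self, dialect):
        if dialect not in ("html", "xhtml"):
            raise ValueError("dialect must be 'html' or 'xhtml'")
        self.dialect = dialect

    def __getattr__(self, tag):
        # Only reached for unknown attributes, i.e. tag names.
        if tag.startswith("_"):
            raise AttributeError(tag)

        def build(*children, **attrs):
            return self._render(tag, children, attrs)

        return build

    def _render(self, tag, children, attrs):
        attr_text = "".join(
            ' %s="%s"' % (key, escape(str(value), quote=True))
            for key, value in sorted(attrs.items())
        )
        if tag in VOID_ELEMENTS:
            slash = " /" if self.dialect == "xhtml" else ""
            return "<%s%s%s>" % (tag, attr_text, slash)
        body = "".join(children)
        if tag in ("script", "style") and self.dialect == "xhtml":
            # Wrap inline code so it is valid when parsed as XML.
            body = "/*<![CDATA[*/%s/*]]>*/" % body
        return "<%s%s>%s</%s>" % (tag, attr_text, body, tag)
```

A widget's render method could then call something like
builder.input(type="text", name=name, value=value) and let the builder
worry about escaping and about how the tag is closed.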
> The form.as_p stuff works as either HTML or XHTML. It would be nice to
> further emphasise the ease with which people can create their own
> reusable form templates (define them as an includable fragment that
> iterates over the form), and it would be nice if there were more
> finely grained methods for things like accessing the HTML ID of a form
> field. I don't see any reason to templatise those parts in particular
> though - unless someone has smart ideas about how baked in default
> templates could dramatically improve the overall form experience.
I agree with what I think I'm reading here -- a goal being to give
designers more fine grained control over the form and form elements at
the template level.
> That's tricky. There are really only a few tags that actually differ -
> anything that needs to be self closing, which means the following:
>
> <area />
> <base />
> <basefont />
> <br />
> <hr />
> <input />
> <img />
> <link />
> <meta />
Also:
<col />
<frame />
<param />
> Of these, only meta, link, img, input and br are really common. I can
> think of a few ways of dealing with this, none of them particularly
> enticing:
>
> 1. a {% selfclose %} template tag:
>
> <br{% selfclose %}>
>
> {% selfclose %} outputs either blank or " /" depending on the doctype.
>
> 2. a {% tag %} tag:
>
> {% tag br %}
>
> Like the {% field %} tag, this could take optional attributes:
>
> {% tag br class="break" %}
>
> 3. {% field %} style tags for all of the self-closing XHTML tags:
>
> {% br %} {% br class="break" %}
> {% hr %}
> {% meta name="dc:author" value="Simon" %}
>
> This option really sucks - that's 9 new template tags polluting our
> default template namespace which do almost nothing. That said, if we
> added template tag namespacing we could at least do {% tag.br %}, {%
> tag.hr %} etc.
>
> They're all pretty horrible, but I think out of those I prefer option
> 1 (maybe with a better, shorter name).
I think you might want both 1 and 3. (1) for those that want finer
control or just don't want to use the underlying HTML wrapper, and (3)
for those that do.
Would it be worth considering a special-case tag, similar to the
comment syntax, that represents the self-closing slash depending on
the current context's doctype? For example, something like {% / %}?
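Just to show what the output of such a marker would look like, here is
a tiny sketch in plain Python. It uses a regular expression as a
stand-in for the template engine, so it is not a real template tag;
expand_selfclose and SELFCLOSE_MARKER are hypothetical names, and both
the {% selfclose %} and {% / %} spellings are treated the same way.

```python
import re

# Matches {% selfclose %} or the shorter {% / %} spelling.
SELFCLOSE_MARKER = re.compile(r"\{%\s*(?:selfclose|/)\s*%\}")


def expand_selfclose(template_source, doctype):
    # XHTML doctypes get " /", anything else gets nothing.
    closing = " /" if doctype.startswith("xhtml") else ""
    return SELFCLOSE_MARKER.sub(closing, template_source)
```

For example, '<br{% selfclose %}>' would become '<br>' under "html4"
and '<br />' under "xhtml1".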
[1] http://dev.pocoo.org/projects/werkzeug/browser/werkzeug/utils.py#L126
I feel like I'm starting to get a picture in mind for all the pieces
at play here.
Thanks,
Rob
I don't mind trying to piece a Wiki page together documenting the
current conversation. I agree it will make a good pointer for
reference and future discussions.
-Rob