The patch provided in #7806[1] splits __init__.py into separate
modules[2] and introduces a TokenStream class that allows parsing of
literals, vars (called lookups) and filter expressions.
Here's how it would work:
    @register.tag
    @uses_token_stream
    def mytag(parser, bits):
        expr = bits.parse_expression(required=True)
        return MyNode(expr)
`uses_token_stream` replaces the Token argument to the parser
function with a TokenStream object.
If the token is not fully parsed, a TemplateSyntaxError is raised.
For better examples, have a look at the patched version of
`django.template.defaulttags`.
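
For readers without the patch at hand, here is a rough sketch of what a
decorator like `uses_token_stream` might do. It is not the code from #7806:
`FakeTokenStream` (which only splits on whitespace), the local
`TemplateSyntaxError` and the error message are stand-ins invented for
illustration.

```python
import functools


class TemplateSyntaxError(Exception):
    """Stand-in for django.template.TemplateSyntaxError."""


class FakeTokenStream:
    """Hypothetical stand-in for the patch's TokenStream (whitespace split only)."""

    def __init__(self, parser, source):
        self.parser = parser
        self.tokens = source.split()
        self.offset = 0

    def pop(self):
        token = self.tokens[self.offset]
        self.offset += 1
        return token


def uses_token_stream(func):
    """Call func(parser, bits) with a stream and reject leftover tokens."""
    @functools.wraps(func)
    def wrapper(parser, token):
        bits = FakeTokenStream(parser, token)
        node = func(parser, bits)
        # Anything the tag function did not consume is treated as an error.
        if bits.offset < len(bits.tokens):
            rest = " ".join(bits.tokens[bits.offset:])
            raise TemplateSyntaxError("unparsed input: %r" % rest)
        return node
    return wrapper


@uses_token_stream
def mytag(parser, bits):
    return ("MyNode", bits.pop())
```

In this sketch the leftover check lives in the decorator, so individual tag
functions never have to remember it.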
TokenStream API (first stab)
============================
``def __init__(self, parser, source)``
parser: a django.template.compiler.Parser instance
source: a string or a django.template.compiler.Token instance
TokenStream tokenizes its source into "string_literal",
"numeric_literal", "char", and "name" tokens, stored in self.tokens
as (type, lexeme) pairs. Whitespace will be discarded.
You can "read" tokens via the following methods; the current position
in self.tokens is maintained in self.offset.
A "char" token is a single character in ':|=,;<>!?%&@"\'/()[]{}`*+-'.
A name is any sequence of characters that contains neither "char"
characters nor whitespace and is not a string or numeric literal.
>>> TokenStream(parser, r'"quoted\" string"|filter:3.14 as name').tokens
[('string_literal', '"quoted\\" string"'), ('char', '|'),
 ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'),
 ('name', 'as'), ('name', 'name')]
>>> TokenStream(parser, r' "quoted\" string" | filter : 3.14 as name').tokens
[('string_literal', '"quoted\\" string"'), ('char', '|'),
 ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'),
 ('name', 'as'), ('name', 'name')]
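
The tokenizing rules above can be approximated with one regular expression.
This is only a sketch, not the tokenizer from the patch: `tokenize`,
`TOKEN_RE` and the exact literal patterns (for example, unsigned decimal
numbers only) are assumptions made for illustration.

```python
import re

# The single-character "char" tokens listed in the description.
CHAR_TOKENS = ':|=,;<>!?%&@"\'/()[]{}`*+-'
_chars = re.escape(CHAR_TOKENS)

# Alternatives are tried in order: literals first, then chars, then names.
TOKEN_RE = re.compile("|".join([
    r'''(?P<string_literal>"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')''',
    r"(?P<numeric_literal>\d+(?:\.\d+)?)",
    r"(?P<char>[%s])" % _chars,
    r"(?P<name>[^\s%s]+)" % _chars,
]))


def tokenize(source):
    """Return (type, lexeme) pairs; whitespace between tokens is skipped."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


tokens = tokenize(r'"quoted\" string"|filter:3.14 as name')
```

With these assumed patterns, both example inputs above produce the same
seven pairs, since whitespace never reaches the result.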
Low level API
-------------
``def pop(self)``
Returns a pair (tokentype, lexeme) and offset+=1; tokentype is one of
"string_literal", "numeric_literal", "char", "name".
``def pop_lexem(self, match)``
Returns True and offset+=1 if the next token's lexeme equals `match`.
Returns False otherwise.
``def pop_name(self)``
Returns the next token's lexeme and offset+=1 if its tokentype is
"name". Returns None otherwise.
``def pushback(self)``
offset-=1
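
A minimal sketch of how the four low-level methods could behave, assuming
tokens are already available as (type, lexeme) pairs. `TokenStreamSketch` is
a hypothetical name, the spelling `pop_lexeme` is used here, and what `pop`
does at the end of the stream is not specified above (this sketch simply
lets the IndexError propagate).

```python
class TokenStreamSketch:
    """Toy cursor over pre-tokenized (type, lexeme) pairs."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.offset = 0

    def pop(self):
        # Return the next pair and advance; IndexError at end (assumption).
        token = self.tokens[self.offset]
        self.offset += 1
        return token

    def pop_lexeme(self, match):
        # Advance only if the next lexeme equals `match`.
        if self.offset < len(self.tokens) and self.tokens[self.offset][1] == match:
            self.offset += 1
            return True
        return False

    def pop_name(self):
        # Advance only if the next token is a "name"; otherwise return None.
        if self.offset < len(self.tokens):
            tokentype, lexeme = self.tokens[self.offset]
            if tokentype == "name":
                self.offset += 1
                return lexeme
        return None

    def pushback(self):
        # Step back one token.
        self.offset -= 1


stream = TokenStreamSketch([("name", "x"), ("name", "as"), ("name", "y")])
```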
High level API
--------------
These methods raise TokenSyntaxError and leave offset untouched if
the expected result cannot be parsed.
Each accepts a boolean required=False kwarg which turns
TokenSyntaxErrors into TemplateSyntaxErrors if True.
``def parse_string(self, bare=False)``
Returns the value of the following string literal. If bare=True,
unquoted strings will be accepted.
``def parse_int(self)``
Returns the value of the following numeric literal, if it is an int.
``def parse_value(self)``
Returns an Expression that evaluates the following literal, variable
or translated value.
``def parse_expression(self)``
Returns an Expression that evaluates the following value or
filterexpression.
``def parse_expression_list(self, minimum=0, maximum=None, count=None)``
Returns a list `e` of expressions; minimum <= len(e) <= maximum.
count=n is a shortcut for minimum=maximum=n.
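
The error contract of the high-level methods (leave offset untouched on
failure, and upgrade the error when required=True) could be factored out
roughly like this. `high_level` and `MiniStream` are hypothetical names, and
both exception classes are local stand-ins rather than the ones in the
patch.

```python
import functools


class TokenSyntaxError(Exception):
    """Stand-in for the recoverable parse error described above."""


class TemplateSyntaxError(Exception):
    """Stand-in for django.template.TemplateSyntaxError."""


def high_level(method):
    """Restore offset on failure; escalate the error if required=True."""
    @functools.wraps(method)
    def wrapper(self, *args, required=False, **kwargs):
        start = self.offset
        try:
            return method(self, *args, **kwargs)
        except TokenSyntaxError as exc:
            self.offset = start
            if required:
                raise TemplateSyntaxError(str(exc)) from exc
            raise
    return wrapper


class MiniStream:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.offset = 0

    @high_level
    def parse_int(self):
        # Accept a numeric literal only if it consists of digits.
        if self.offset < len(self.tokens):
            tokentype, lexeme = self.tokens[self.offset]
            if tokentype == "numeric_literal" and lexeme.isdigit():
                self.offset += 1
                return int(lexeme)
        raise TokenSyntaxError("expected an integer literal")


stream = MiniStream([("numeric_literal", "3.14"), ("numeric_literal", "42")])
```

A tag that can fall back to another syntax would call `parse_int()` and
catch the TokenSyntaxError; a tag that insists on an integer would pass
required=True and let the TemplateSyntaxError reach the template author.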
I'm unhappy with the naming of TokenStream and uses_token_stream
(using_token_stream?). And I just noticed the English spelling of
lexeme has a trailing "e".
[1] http://code.djangoproject.com/ticket/7806
[2] See [1] for details. Yes, this is orthogonal to the concern of
the ticket. I gave up splitting the patch when I stumbled upon the
first circular dependency issue - a half-assed effort - and decided
it's not worth it.
As was pointed out the first time you brought this up, keep in mind that
there still need to be ways to manually control the lexing phase. Not
every template tag has the same requirements there.
Also, since the Variable class is part of the public API (we document
how to write custom template tags), you need to preserve backwards
compatibility. For example, anything written according to
custom-template-tags.txt should continue to work unchanged. There are
possibly some reasonable other methods in template/__init__.py that
should continue to remain available as well.
That being said, I'm completely in favour of a bit of shuffling in those
files to make things easier to understand and fix a bunch of little
inconsistencies. However, I'd encourage doing it in slightly smaller
steps than the original patches I read so that we can see that backwards
compatibility is maintained and do the smallest number of changes
necessary to tidy things up. Basically, keep the refactoring and the
feature/API addition portions distinct.
For example, it's useful (and necessary) to make the Variable class a
lot more consistent with respect to also resolving the effect of applying
filters to variables (right now, anything using Variable doesn't have
filter effects applied, which was probably just an oversight
originally).
Helper methods that make it easy for the common cases to say "I expect
three parameters matching the 'x y as z' pattern where x and z are
strings and y must be an integer" would be very handy. This is where
methods like your parse_int and parse_string would be useful additions.
Something like parse_expression is probably overkill for a
high-level API, since resolving the effect of filters should really just
be a natural part of Variable.resolve_context(). I'm not sure I see the
use-case at the moment for parse_expression; perhaps you could elaborate
with an example.
Regards,
Malcolm
> As was pointed out the first time you brought this up, keep in mind that
> there still need to be ways to manually control the lexing phase. Not
> every template tag has the same requirements there.
>
> Also, since the Variable class is part of the public API (we document
> how to write custom template tags), you need to preserve backwards
> compatibility. For example, anything written according to
> custom-template-tags.txt should continue to work unchanged. There are
> possibly some reasonable other methods in template/__init__.py that
> should continue to remain available as well.
It is completely optional. If you don't use @uses_token_stream you'll
get a Token instance that can be parsed as hitherto. And afaics the
old API is still there and functional.
> That being said, I'm completely in favour of a bit of shuffling in those
> files to make things easier to understand and fix a bunch of little
> inconsistencies. However, I'd encourage doing it in slightly smaller
> steps than the original patches I read so that we can see that backwards
> compatibility is maintained and do the smallest number of changes
> necessary to tidy things up. Basically, keep the refactoring and the
> feature/API addition portions distinct.
Basically, agreed.
> For example, it's useful (and necessary) to make the Variable class a
> lot more consistent with respect to also resolving the effect of applying
> filters to variables (right now, anything using Variable doesn't have
> filter effects applied, which was probably just an oversight
> originally).
> Helper methods that make it easy for the common cases to say "I expect
> three parameters matching the 'x y as z' pattern where x and z are
> strings and y must be an integer" would be very handy. This is where
> methods like your parse_int and parse_string would be useful additions.
> Something like parse_expression is probably overkill for a
> high-level API, since resolving the effect of filters should really just
> be a natural part of Variable.resolve_context(). I'm not sure I see the
> use-case at the moment for parse_expression; perhaps you could elaborate
> with an example.
parse_expression() does not resolve filters. It returns an Expression
(Lookup, Literal, or FilterExpression) instance that has a resolve
(context) method.
Lookup handles looking up variables in a context. FilterExpression
groups an Expression and (func, arg) pairs. Literal boxes literal
values.
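
To make the three Expression kinds concrete, here is a toy model. It is an
assumption-laden sketch, not the classes from the patch: `Lookup` only does
dictionary lookups on a dotted path, and the filter argument is assumed to
be either None or another Expression.

```python
class Expression:
    def resolve(self, context):
        raise NotImplementedError


class Literal(Expression):
    """Boxes a literal value; resolving ignores the context."""

    def __init__(self, value):
        self.value = value

    def resolve(self, context):
        return self.value


class Lookup(Expression):
    """Looks a dotted path up in the context (dict lookups only here)."""

    def __init__(self, path):
        self.bits = path.split(".")

    def resolve(self, context):
        current = context
        for bit in self.bits:
            current = current[bit]
        return current


class FilterExpression(Expression):
    """Groups an Expression with (func, arg) pairs applied in order."""

    def __init__(self, expr, filters):
        self.expr = expr
        self.filters = filters

    def resolve(self, context):
        value = self.expr.resolve(context)
        for func, arg in self.filters:
            if arg is None:
                value = func(value)
            else:
                value = func(value, arg.resolve(context))
        return value


def truncate(value, length):
    return value[:length]


expr = FilterExpression(
    Lookup("user.name"),
    [(str.upper, None), (truncate, Literal(3))],
)
result = expr.resolve({"user": {"name": "malcolm"}})
```

The point of the sketch is that parsing only builds the object; nothing is
looked up or filtered until resolve(context) is called at render time.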
    # Here's how your example could be parsed with the proposed API:
    x = bits.parse_string(required=True)
    y = bits.parse_int(required=True)
    if not bits.pop_lexeme('as'):
        raise TemplateSyntaxError
    z = bits.parse_string(required=True)

    # A {% with %} parser example:
    def do_with(parser, bits):
        expr = bits.parse_expression(required=True)
        if not bits.pop_lexeme('as'):  # this should be more compact
            raise TemplateSyntaxError
        name = bits.pop_name()
        if not name:
            raise TemplateSyntaxError
        nodelist = parser.parse_nodelist(('endwith',))
        return WithNode(expr, name, nodelist)
    do_with = register.tag('with', uses_token_stream(do_with))
Yes, that would work.
> Would @register.tag(token_stream=True) work instead, or am I missing
> something?
Um .. welll. :-(
Parts of it are very well thought out and if it had been a post on "how
Jinja works" it would have been excellent. Other parts are completely
unconstrained by facts or acknowledgement that Django's standard
templates and Jinja's ones have different goals (which is important,
since this isn't an apples to apples comparison at all). So that the
flip side is on the record, here's my single contribution to it:
The good news is that, as far as I can see, all the actual bugs he notes
are already in tickets. I went through and checked quickly yesterday and
found all but one, which I'm pretty sure is in there, so I'll look again
today.
Whilst reading that, keep in mind that Armin apparently misunderstands
(or possibly just omits to clearly state) what a template object
instance is in Django. It's a state machine for rendering templates and
includes the current state as well as the transitions. This isn't an
invalid or particularly novel approach to executing a domain specific
language. In fact, it makes a lot of sense if you want to write
self-contained node-based state instances. Jinja and Django templates
have different compilation strategies and return different types of
objects and the difference is a vital distinction in any discussion of
appropriate usage. The blog post appears to imply that it would be
natural for any random data structure to be used as a global variable
(by extrapolation, since he doesn't establish that a Template instance
-- not a class, the instance of a compiled template -- is special in any
sense, so it must be somehow "normal") and that those structures will
work beautifully when multiple threads cause its internal state to be
modified.
This has two problems: (1) it's pretty much false by default (Python !=
Haskell, Erlang, etc and mutable objects are very normal in Python), and
(2) it would imply that using lots of globals to store stuff in
multi-threaded programs is a good idea that people should be doing more
often.
There's a name for that sort of programming style and it's not
"Excellent, Maintainable Code". :-)
Using the same logic, Python is not thread-safe at all and shouldn't be
used, since lists, dictionaries, sets, and pretty much any random
instance of a class object contain state in the instance that will
change upon use. Tuples and strings are specially noted to be immutable
and so are usable in those situations without extra locking, but they're
special-cases, not the norm, in a pass-by-reference language. Using
module-level globals sparingly is something we certainly encourage.
Fortunately, this isn't a real issue, since there's nowhere that we
suggest the assumption of immutability would be valid (and, really, a
decent programmer is going to know that any arbitrary data structure
has the same behaviour unless it's noted as immutable; minimal
experience says that making assumptions contrary to that always leads to
trouble). You create the template object as part of a view and render it
and things are (mostly -- see next para) fine. There's no real need to
put template instances into globals.
The "mostly" bit is because we do have some reset-ability issues with
things like the cycle tag -- so rendering it multiple times in the same
thread of execution, such as creating mass mail, will lead to differing
results.
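
The reset-ability problem can be illustrated with a toy model. This is not
Django's actual cycle tag implementation; `CycleNodeSketch` and
`TemplateSketch` are hypothetical names, and the only assumption carried
over is that the node keeps its position between renders.

```python
import itertools


class CycleNodeSketch:
    """Toy node that keeps its cycle position on the instance."""

    def __init__(self, values):
        self._values = itertools.cycle(values)

    def render(self, context):
        return next(self._values)


class TemplateSketch:
    """Toy compiled template: renders one node once per row in the context."""

    def __init__(self, node):
        self.node = node

    def render(self, context):
        return " ".join(self.node.render(context) for _ in context["rows"])


template = TemplateSketch(CycleNodeSketch(["odd", "even"]))
first = template.render({"rows": range(3)})   # "odd even odd"
second = template.render({"rows": range(3)})  # "even odd even"
```

Because the position survives the first render, the second render starts on
the other value. In this toy model, compiling a fresh template per render
avoids the difference.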
Also, so that it's on the record: Armin denigrating Eric Holscher for
ticket #7501 was a low blow for multiple reasons. One being that Armin
either misunderstood or ignored the type of object that a Template
instance is, as noted above. Another being that nominally avoiding
mentioning specifics whilst giving enough information that somebody
could easily find it out anyway is simply sleazy. It devalues Eric's
work for that part of the audience who don't know the history and makes
the author of the piece just look small. To fill in the missing pieces,
I explicitly asked Eric to change the title at the Lawrence Sprint, as I
was busy doing something else at the time and had been meaning to change
it for a while and Eric was going through doing the scut work of making
the changes through the Web browser interface whilst Jacob and I fed him
decisions on various points. I made this change in full awareness of the
difference between multi-threaded and multi-run use-cases and knowing
that Template instances are just like lists, dictionaries,
sets, ...(insert 150 other data structures here).
All the bits in that blog post about Jinja, once understood in the
context of why Jinja is a different templating approach to Django, were
well done in that article. Reading the Jinja source is a pretty
interesting thing to do, too, and people should definitely play with the
language if they haven't already. But it's not a one-for-one replacement
for Django templates, nor is it a superset of functionality in either
direction. It has different goals and neither Jinja's nor Django's goals
are invalidated by the other existing.
The bugs noted in Django templates are things we've noted down and
Johannes' changes address a large number of them. We've always accepted
speed improvements that don't completely break things, but we also
realise that template rendering is just one component of the timeslice
for a request and isn't the major piece unless your net time is pretty
tiny anyway. With a sensible caching strategy and real-world data and
computation, the amortised effect is that it's "fast enough" for
most cases, and we make it easy enough to use a different
templating language when it's not. That's a pretty good goal. Django
ships with a template language for designers, not for Python
programmers; another large difference between Django and Jinja
templates, leading to the somewhat-perceived dichotomy over "logic in
templates" (although in a lot of cases the big arguments people have
come down to which template tags and filters should ship by default).
It's always nice when somebody sits down and writes out a holistic state
of play of a particular portion of anything. It's very encouraging when,
reading through, you realise that there aren't any really big surprises
in the factual areas and it helps clarify where the differences of
opinion are. Sanity checks of that nature should not be undervalued. They
can be written in a less confrontational style, however.
Regards,
Malcolm
http://docs.djangoproject.com/en/dev/misc/design-philosophies/#template-system
In particular:
"""
The goal is not to invent a programming language. The goal is to offer
just enough programming-esque functionality, such as branching and
looping, that is essential for making presentation-related decisions.
The Django template system recognizes that templates are most often
written by designers, not programmers, and therefore should not assume
Python knowledge.
"""
Jacob