Removing HTML formatting from Compojure - good idea?

James Reeves

Jan 8, 2009, 6:32:02 PM
to Compojure
I've been thinking about removing the formatting from the html
function in compojure.html. Currently Compojure indents block tags
like <p> and <div>, and renders tags like <h1> and <li> on their own
line. This provides a fairly nice layout.

However, it also interferes with the <pre> tag, adding in extra
indentation that messes up the content on screen, and I can't think of
any neat way to fix this. It also increases the size of HTML documents
by a small percentage.

In an age of Firebug and IE's developer toolbar, I'm wondering whether
trying to give the output HTML a nice format is still relevant. Any
web developer will look through the source using a DOM viewer, and any
casual user won't be looking at the source code.

Does anyone have any thoughts on this? Should Compojure output HTML
without any indentation?

- James

Mark Ryall

Jan 8, 2009, 6:39:35 PM
to comp...@googlegroups.com
Seems like a sensible idea.  Lots of editors allow you to easily reformat if you really needed to look at the source outside a browser.

If you're doing lots of javascript DOM manipulation, viewing the source is pretty useless regardless of how nicely indented it is.

Phil Hagelberg

Jan 8, 2009, 6:53:02 PM
to comp...@googlegroups.com
James Reeves <weave...@googlemail.com> writes:

> Does anyone have any thoughts on this? Should Compojure output HTML
> without any indentation?

I like the indentation. I think reading HTML output in the REPL or in
test results (not to mention curl) is a common enough use case that it's
worth spending a bit of extra effort to keep it readable.

It's easy enough to turn off by using xml instead of html. But I haven't
been bitten by any problems such as with pre. Is it infeasible to
special-case pre?

-Phil

James Reeves

Jan 8, 2009, 7:32:05 PM
to Compojure
On Jan 8, 11:53 pm, Phil Hagelberg <p...@hagelb.org> wrote:
> I like the indentation. I think reading HTML output in the REPL or in
> test results (not to mention curl) is a common enough use case that it's
> worth spending a bit of extra effort to keep it readable.

The problem is that it's turning into a significant effort to maintain
three different formatters: raw xml, indented xml, and formatted html.
Removing formatting means I can cut all the formatting code that's
giving me trouble.

It would also open up the possibility of using Mark McGranaghan's clj-
html library, which should be faster than Compojure's current
implementation.

Readable output is useful. However, Java's XML libraries can parse and
output indented XML, so it wouldn't be too difficult to whip up an XML
pretty-printer function for this scenario. If I added a ppxml
function to the standard libraries, would this be an acceptable
substitute?
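
A rough sketch of what such a ppxml function might look like. It assumes the input is well-formed XML (plain HTML with unclosed tags would not parse) and that the JDK's default transformer honours the Xalan indent-amount property. The name ppxml comes from the message above; everything else is illustrative and is not Compojure's code.

```clojure
(import '(javax.xml.transform TransformerFactory OutputKeys)
        '(javax.xml.transform.stream StreamSource StreamResult)
        '(java.io StringReader StringWriter))

(defn ppxml
  "Re-serialise a well-formed XML string with indentation (illustrative sketch)."
  [^String xml]
  (let [transformer (.newTransformer (TransformerFactory/newInstance))
        out         (StringWriter.)]
    ;; Ask the identity transformer to indent its output.
    (.setOutputProperty transformer OutputKeys/INDENT "yes")
    ;; Xalan-specific property; assumed to be supported by the default JDK transformer.
    (.setOutputProperty transformer "{http://xml.apache.org/xslt}indent-amount" "2")
    ;; Leave out the <?xml ...?> header so the result starts with the root tag.
    (.setOutputProperty transformer OutputKeys/OMIT_XML_DECLARATION "yes")
    (.transform transformer
                (StreamSource. (StringReader. xml))
                (StreamResult. out))
    (str out)))

(println (ppxml "<div><p>hi</p></div>"))
```

Because the pretty-printing happens after rendering, the renderer itself can emit compact output and only callers who want readable markup pay for it.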

> It's easy enough to turn off by using xml instead of html. But I haven't
> been bitten by any problems such as with pre. Is it infeasible to
> special-case pre?

pre is a special case already, in that it is rendered without any
additional indentation. The problem is that when pre is rendered
inside a block element like div, the indentation from div affects the
contents of pre.

In order to get around this, I need to add an indentation argument to
each formatter, so that each element would render its own indentation,
rather than being indented by its parent. This would allow pre to
ignore the indentation level.
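
A minimal sketch of the explicit-argument approach described above, assuming a simplified tree of `[:tag & children]` vectors and strings with no attribute maps. The function name render-indented is hypothetical; this is not the compojure.html source.

```clojure
(require '[clojure.string :as str])

(defn render-indented
  "Render a node at the given indentation level (illustrative sketch)."
  ([node] (render-indented node 0))
  ([node level]
   (let [pad (apply str (repeat level "  "))]
     (if (string? node)
       ;; Text: pad every line with the current indentation.
       (str/join "\n" (map #(str pad %) (str/split-lines node)))
       (let [[tag & children] node
             open  (str "<" (name tag) ">")
             close (str "</" (name tag) ">")]
         (if (= tag :pre)
           ;; <pre> ignores the level it was given: children render at level 0.
           (str pad open
                (str/join "\n" (map #(render-indented % 0) children))
                close)
           ;; Every other element passes level + 1 down to its children.
           (str pad open "\n"
                (str/join "\n" (map #(render-indented % (inc level)) children))
                "\n" pad close)))))))

(println (render-indented [:div [:p "hi"] [:pre "a\n  b"]]))
```

Each element renders its own padding, so the text inside pre is never touched by an ancestor's indentation. The cost is the extra level parameter threaded through every call.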

However, it seems a rather cluttered solution and enough extra work
that it has raised the question whether it's worth continuing to spend
time on it. The time I spend trying to get HTML neatly formatted is
time I'm starting to think could be better spent elsewhere.

- James

Phil Hagelberg

Jan 8, 2009, 7:54:20 PM
to comp...@googlegroups.com
James Reeves <weave...@googlemail.com> writes:

> The problem is that it's turning into a significant effort to maintain
> three different formatters: raw xml, indented xml, and formatted html.
> Removing formatting means I can cut all the formatting code that's
> giving me trouble.
>
> It would also open up the possibility of using Mark McGranaghan's clj-
> html library, which should be faster than Compojure's current
> implementation.

OK, I buy that. I like more maintainable code. =)

> Readable output is useful. However, Java's XML libraries can parse and
> output indented XML, so it wouldn't be too difficult to whip up an XML
> pretty-printer function for this scenario. If I added a ppxml
> function to the standard libraries, would this be an acceptable
> substitute?

Yeah, sure. We would need output from higher-level tests to use it where
appropriate, but that's not hard. Speaking of higher-level tests, the
example app is a little short on them; is there anywhere else we could
find guidelines on compojure tests, or is this another one of those "the
trail is yours to blaze" situations?

> In order to get around this, I need to add an indentation argument to
> each formatter, so that each element would render its own indentation,
> rather than being indented by its parent. This would allow pre to
> ignore the indentation level.

Could you use a dynamic variable to do this without passing an argument
around all over the place? If not, I'd vote for dropping it in favour of
xmlpp.

-Phil

Mark McGranaghan

Jan 8, 2009, 8:14:09 PM
to comp...@googlegroups.com
I think that dropping the indented output makes sense; Firebug has
been all I've needed to make sense of HTML output. It also makes
maintaining the HTML library much easier (I know because I tried to
write a pretty-printing 'interpreter' to complement clj-html's
'compiler' but dropped it because it didn't seem worth the cost) and
reduces the odds of subtle bugs like <pre> rendering.

Dropping the indentation also makes a dual compiler/interpreter system
more viable, in which one could switch between the flexibility of an
'interpreting' HTML library like Compojure's and the speed of a
'compiling' library like clj-html. The markup code given to these two
libraries is very similar* so using two libraries like this in the
same project could make sense. I don't think that purely-interpreted
view templates are viable on high-traffic sites, so it's nice to know
that there is at least the possibility of using a more performant
library in critical templates. Likewise, I'm not sure that I want to
be limited to compiled templates for all my view code, especially
things like form helpers where interpreted templates using lots of
helpers might be more convenient.

Sorry if I strayed a bit from the original question of indentation,
but I thought this was a good time to share these thoughts.

- Mark

* Compilation introduces a few restrictions on the input of the html
macro in clj-html. Two that I know of:
1. Attribute maps must be literal, ie [:div {:id "foo" :class "bar"}]
is OK but [:div (merge {:id "foo"} {:class "bar"})] is not
2. The html macro must be called inside helper functions, i.e. helper
functions return HTML strings, not nested vectors.
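
The first restriction follows from a macro only ever seeing the unevaluated form it is given. The toy macro below is not clj-html's implementation, and the name open-tag* is made up for illustration. It builds an opening tag at macroexpansion time, so it can only work with a literal map:

```clojure
(defmacro open-tag*
  "Toy compile-time renderer: turns a literal attribute map into a string
  while the code is being compiled (illustrative only)."
  [tag attrs]
  ;; At macroexpansion time `attrs` is a form. A literal {...} arrives as a
  ;; map, but (merge ...) arrives as an unevaluated list.
  (when-not (map? attrs)
    (throw (IllegalArgumentException.
            "attribute map must be a literal at compile time")))
  (str "<" (name tag)
       (apply str (for [[k v] attrs]
                    (str " " (name k) "=\"" v "\"")))
       ">"))

;; Works: the macro sees the map itself.
(println (open-tag* :div {:id "foo"}))

;; Would fail to compile: the macro sees the list (merge {...} {...}).
;; (open-tag* :div (merge {:id "foo"} {:class "bar"}))
```

An interpreting library evaluates its arguments first, so it never hits this limitation. That difference is the flexibility-for-speed trade described above.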

dcbcrafts

Jan 8, 2009, 8:41:32 PM
to Compojure
I've been bitten by this with textarea form input also. I figure I
can come up with some way around it, but would prefer to have some way
to disable the indentation.

James Reeves

Jan 9, 2009, 4:37:06 AM
to Compojure
On Jan 9, 12:54 am, Phil Hagelberg <p...@hagelb.org> wrote:
> Yeah, sure. We would need output from higher-level tests to use it where
> appropriate, but that's not hard. Speaking of higher-level tests, the
> example app is a little short on them; is there anywhere else we could
> find guidelines on compojure tests, or is this another one of those "the
> trail is yours to blaze" situations?

Higher-level tests? Do you mean unit testing?

> Could you use a dynamic variable to do this without passing an argument
> around all over the place? If not, I'd vote for dropping it in favour of
> xmlpp.

I'm not sure what you mean by a dynamic variable, I'm afraid, unless
you mean putting it in a top-level var. The html function recurses
through a tree of data. The indentation for each node will be
different, and depends on the indentation level of its ancestors. If
you didn't pass the indentation layer as an argument, you'd have to
increment it when you went down a level, and decrement it when you
left. You'd also have to guarantee the evaluation took place
sequentially. I don't think that's any better than passing an
argument, myself.

- James

James Reeves

Jan 9, 2009, 4:53:36 AM
to Compojure
On Jan 9, 1:14 am, "Mark McGranaghan" <mmcgr...@gmail.com> wrote:
> I think that dropping the indented output makes sense; Firebug has
> been all I've needed to make sense of HTML output. It also makes
> maintaining the HTML library much easier (I know because I tried to
> write a pretty-printing 'interpreter' to complement clj-html's
> 'compiler' but dropped it because it didn't seem worth the cost) and
> reduces the odds of subtle bugs like <pre> rendering.
>
> Dropping the indentation also makes a dual compiler/interpreter system
> more viable, in which one could switch between the flexibility of an
> 'interpreting' HTML library like Compojure's and the speed of a
> 'compiling' library like clj-html. The markup code given to these two
> libraries is very similar* so using two libraries like this in the
> same project could make sense. I don't think that purely-interpreted
> view templates are viable on high-traffic sites, so it's nice to know
> that there is at least the possibility of using a more performant
> library in critical templates. Likewise, I'm not sure that I want to
> be limited to compiled templates for all my view code, especially
> things like form helpers where interpreted templates using lots of
> helpers might be more convenient.

You're right on that score, I think. In hindsight it seems obvious,
but at the time I didn't figure that compiling templates had a couple
of disadvantages. But if clj-html is a drop-in replacement for
compojure.html, then users get the best of both worlds.

On that score, I notice you used functions like map-str, whilst mine
were called str-map. Is there any particular reason for the change?
map-str reads in English order "map, then str", and in composition
order "map . str", which may be why you preferred it, whilst I was
thinking of "(str (map ...))" when I wrote str-map.

The reason I ask is that if we used the same notation, it would be
easier to drop clj-html into Compojure. I'm not really too fussed
about which way round it is, but I was curious as to whether you had
any strong reason for the change.

> Sorry if I strayed a bit from the original question of indentation,
> but I thought this was a good time to share these thoughts.

Not at all; I'm glad you did.

- James

Mark McGranaghan

Jan 9, 2009, 10:45:25 AM
to comp...@googlegroups.com
> You're right on that score, I think. In hindsight it seems obvious,
> but at the time I didn't figure that compiling templates had a couple
> of disadvantages. But if clj-html is a drop-in replacement for
> compojure.html, then users get the best of both worlds.

I've just finished a fairly extensive writeup analyzing clj-html.core
and compojure.html. I try to compare their syntax, understand their
potential for interoperability, and quantify the performance tradeoff
between the two:

http://gist.github.com/45136

If you're using either of these html libraries I strongly encourage
you to check this out.

> On that score, I notice you used functions like map-str, whilst mine
> were called str-map. Is there any particular reason for the change?

Some reasons:
* I wanted domap-str and map-str to be consistent with each other and
Clojure's do* convention, so str-map would imply dostr-map which
seemed awkward.
* I was thinking about mapcat, where the outermost operator is on the
right of the function name (but perhaps there are cases in Clojure
core where it is the opposite?)
* In general I like the -str suffix for functions that return or
operate on strings, and -map for those that return or operate on
maps.
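
For readers unfamiliar with the helpers being discussed, they might look roughly like this. These definitions are assumptions inferred from the names and from the "(str (map ...))" description in the thread, not copies of either library's source.

```clojure
(defn map-str
  "Map f over coll and concatenate the results into one string."
  [f coll]
  (apply str (map f coll)))

(defmacro domap-str
  "do*-style variant: bind each item of coll to sym, evaluate body,
  and concatenate the results into one string."
  [[sym coll] & body]
  `(map-str (fn [~sym] ~@body) ~coll))

(println (map-str #(str "<li>" % "</li>") ["a" "b"]))
(println (domap-str [item ["a" "b"]] (str "<li>" item "</li>")))
```

With this naming, the do-prefixed macro reads like doseq or dotimes, which is the consistency argument made in the first bullet.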

I'm still on the fence about clj-html.core's helper functions - I may
rename them or move them out of core. Also, as I suggest in the Gist,
I may need to add some other helpers elsewhere.

Thanks for your thoughts,
- Mark

Phil Hagelberg

Jan 9, 2009, 12:36:09 PM
to comp...@googlegroups.com
James Reeves <weave...@googlemail.com> writes:

> On Jan 9, 12:54 am, Phil Hagelberg <p...@hagelb.org> wrote:
>> Yeah, sure. We would need output from higher-level tests to use it where
>> appropriate, but that's not hard. Speaking of higher-level tests, the
>> example app is a little short on them; is there anywhere else we could
>> find guidelines on compojure tests, or is this another one of those "the
>> trail is yours to blaze" situations?
>
> Higher-level tests? Do you mean unit testing?

I guess I'm used to having my test suite split up into lower level unit
tests and then integration tests that are about how the pieces fit
together. But because Compojure views are just functions, (rather than
in most Ruby systems where they're separate files in another language)
you don't need any special tools to test them.

Still, it would be nice to have test-helper functions that can do things
like set up fake requests and make it easy to fake sessions. You can
just def those in your tests of course, but I wouldn't know what kinds
of things they would need to contain to really be a viable replacement
for the jetty ones. Functions to test that a given HTML rendering
contains what you're expecting would be great too, but that would
require implementing a jQuery-style CSS-selector engine, which is
probably a fair amount of work.

>> Could you use a dynamic variable to do this without passing an argument
>> around all over the place? If not, I'd vote for dropping it in favour of
>> xmlpp.
>
> I'm not sure what you mean by a dynamic variable, I'm afraid, unless
> you mean putting it in a top-level var.

I was thinking you could use binding; that way the indentation level
could be based on the previous value, but it would automatically
decrement when you return from a function. You could bind it to zero
within <pre>s. But maybe that's not feasible in this case; I haven't
looked at it closely.
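
A minimal sketch of the binding idea, assuming a simplified tree of `[:tag & children]` vectors and strings with no attribute maps. The names \*indent\* and render-bound are hypothetical. Current Clojure versions require the var to be declared ^:dynamic, which was not needed in 2009.

```clojure
(require '[clojure.string :as str])

;; Current indentation level; rebound for the duration of each nested call.
(def ^:dynamic *indent* 0)

(defn- pad [] (apply str (repeat *indent* "  ")))

(defn render-bound
  "Render a node, reading the indentation level from *indent* (sketch)."
  [node]
  (if (string? node)
    ;; str/join realises the lazy map while the binding is still active.
    (str/join "\n" (map #(str (pad) %) (str/split-lines node)))
    (let [[tag & children] node
          open  (str "<" (name tag) ">")
          close (str "</" (name tag) ">")]
      (if (= tag :pre)
        ;; Bind the level to zero inside <pre>, as suggested above.
        (str (pad) open
             (binding [*indent* 0]
               (str/join "\n" (mapv render-bound children)))
             close)
        ;; The level is restored automatically when binding exits.
        (str (pad) open "\n"
             (binding [*indent* (inc *indent*)]
               (str/join "\n" (mapv render-bound children)))
             "\n" (pad) close)))))

(println (render-bound [:div [:p "hi"] [:pre "a\n  b"]]))
```

The children are rendered with mapv rather than map so that nothing lazy escapes the binding form. A lazy sequence realised after binding exits would see the outer level again.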

In any case, converting it to pretty-printed after the fact is fine for
me. That way you don't incur the overhead of all these calculations
unless you really need it.

-Phil

James Reeves

Jan 9, 2009, 3:11:24 PM
to Compojure
On Jan 9, 5:36 pm, Phil Hagelberg <p...@hagelb.org> wrote:
> James Reeves <weavejes...@googlemail.com> writes:
> > Higher-level tests? Do you mean unit testing?
>
> it would be nice to have test-helper functions that can do things
> like set up fake requests and make it easy to fake sessions. You can
> just def those in your tests of course, but I wouldn't know what kinds
> of things they would need to contain to really be a viable replacement
> for the jetty ones.

Your servlet definition should isolate you entirely from any Jetty or
Java specific classes. If you want to fake parameters, well they're
just a map of keywords and strings. If you want to fake a session,
it's just a (ref {}). Once you vacate defservlet, the idea is that
you're back in Clojure land.
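
As a sketch of what that looks like in a test: the view function greeting-page below is hypothetical, and the only things taken from the message are that parameters are a map of keywords to strings and that a session is a (ref {}).

```clojure
(defn greeting-page
  "Hypothetical view: takes a params map and a session ref and returns
  the page as nested vectors."
  [params session]
  ;; Count visits in the session, starting from zero when the key is absent.
  (dosync (alter session update :visits (fnil inc 0)))
  [:p (str "Hello, " (:name params) "! Visit #" (:visits @session))])

;; Faking a request needs nothing beyond plain Clojure data.
(let [params  {:name "Phil"}
      session (ref {})]
  (println (greeting-page params session))
  (println @session))
```

No servlet container or mocking library is involved, so such tests can run straight from the REPL.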

The servlet could be tested by pushing through HttpServletRequest
proxies, or use a Java mocking tool to achieve the same effect. But as
the servlet is rather small, I'd be tempted to leap straight into end-
to-end testing with a tool like Watir, HtmlUnit, JWebUnit, Selenium or
whatever.

> Functions to test that a given HTML rendering contains what you're
> expecting would be great too, but that would require implementing
> a jQuery-style CSS-selector engine, which is probably a fair amount
> of work.

For testing HTML, I'd either parse it into a DOM and use XPath, or
keep it in vector format and test it that way.
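
Both options can be sketched briefly. The XPath route assumes the rendered page is well-formed XML and uses only the JDK's javax.xml classes. The helper names xpath-text and contains-node? are made up for this example.

```clojure
(import '(javax.xml.parsers DocumentBuilderFactory)
        '(javax.xml.xpath XPathFactory)
        '(org.xml.sax InputSource)
        '(java.io StringReader))

(defn xpath-text
  "Parse a well-formed XML string and return the string value of the
  first node matching the XPath expression."
  [^String xml ^String expr]
  (let [doc (-> (DocumentBuilderFactory/newInstance)
                (.newDocumentBuilder)
                (.parse (InputSource. (StringReader. xml))))]
    (-> (XPathFactory/newInstance)
        (.newXPath)
        (.evaluate expr doc))))

(defn contains-node?
  "True if the nested-vector page contains the given node anywhere."
  [page node]
  ;; Walk every vector in the tree; `rest` skips the tag keyword.
  (boolean (some #{node} (tree-seq vector? rest page))))

(println (xpath-text "<div><p class=\"greeting\">Hello</p></div>"
                     "//p[@class='greeting']"))
(println (contains-node? [:div [:h1 "Title"] [:p "Hello"]] [:p "Hello"]))
```

Testing the vector form avoids rendering and parsing altogether, at the price of not exercising the renderer itself.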

> I was thinking you could use binding; that way the indentation level
> could be based on the previous value, but it would automatically
> decrement when you return from a function. You could bind it to zero
> within <pre>s.

That's a clever solution, and I'm envious I didn't think of it myself!
But it would still require some work, and as nice as formatted HTML is
to have, it's not really an itch I want to scratch. If there had been
a huge outcry over this, I'd probably have reconsidered, but if there
are no strong objections, I'm all for cutting it out.

And of course, nothing's stopping anyone who wants to from forking the
compojure.html library and developing it as an alternative rendering
engine :)

- James

James Reeves

Jan 10, 2009, 9:18:47 AM
to Compojure

James Reeves

Jan 10, 2009, 9:40:17 AM
to Compojure
On Jan 9, 3:45 pm, "Mark McGranaghan" <mmcgr...@gmail.com> wrote:
> I've just finished a fairly extensive writeup analyzing clj-html.core
> and compojure.html. I try to compare their syntax, understand their
> potential for interoperability, and quantify the performance tradeoff
> between the two:
>
> http://gist.github.com/45136

I'm surprised at how much quicker clj-html is. I'll see if removing
the formatting from compojure.html yields any faster benchmarks.

Supporting multiple classes using the CSS syntax is also something I
should put into compojure.html.

Thank you very much for the write-up. If it's okay, I might
summarize it for the Compojure wiki. For heavy traffic pages, clj-html
seems a highly desirable library to use.

> Some reasons:
> * I wanted domap-str and map-str to be consistent with each other and
> Clojure's do* convention, so str-map would imply dostr-map which
> seemed awkward.

Agreed.

> * I was thinking about mapcat, where the outermost operator is on the
> right of the function name (but perhaps there are cases in Clojure
> core where it is the opposite?)

if-let also seems to follow this rule (let [x y] (if x ...)).

> * In general I like the -str suffix for functions that return or
> operate on strings, and -map for those that return or operate on
> maps.

That sounds sensible too.

Your reasoning seems good, and more in line with clojure.core. I'll
change the functions in Compojure to match.

- James

Mark McGranaghan

Jan 10, 2009, 10:25:38 AM
to comp...@googlegroups.com
> I'm surprised at how much quicker clj-html is. I'll see if removing
> the formatting from compojure.html yields any faster benchmarks.

> Your reasoning seems good, and more in line with clojure.core. I'll
> change the functions in Compojure to match.

Sounds good James, glad you found the write-up helpful.

I mentioned earlier that I had implemented a (non-indenting)
'interpreting' HTML library as part of earlier work with clj-html.
Well I pulled that code up and tried it out again, and to my surprise
it seems to work quite well. I posted augmented performance figures to
the Gist; this interpreter seems to be about 6 times faster than the
Compojure one that has to deal with indentation. I've also pushed the
implementation code to the 'interpreted' branch at my clj-html repository
at GitHub - mostly so that you can check it out if you want; you can
find the relevant code at the bottom of the core.clj file.

Thanks for continued input; let me know if you have any other ideas
about how we can improve the "html library story" for Compojure users.

- Mark

James Reeves

Jan 10, 2009, 4:58:59 PM
to Compojure
On Jan 10, 3:25 pm, "Mark McGranaghan" <mmcgr...@gmail.com> wrote:
> I mentioned earlier that I had implemented a (non-indenting)
> 'interpreting' HTML library as part of earlier work with clj-html.
> Well I pulled that code up and tried it out again, and to my surprise
> it seems to work quite well. I posted augmented performance figures to
> the Gist; this interpreter seems to be about 6 times faster than the
> Compojure one that has to deal with indentation.

I went ahead and cut out the special formatting from Compojure. The
changes are in the github repository.

It turns out that removing the formatting from compojure.html cuts the
benchmark time by about 33%, so compojure.html is now just under 4
times slower than clj-html(i):

compojure.html : "Elapsed time: 4235.298133 msecs"
clj-html.core(i) : "Elapsed time: 1156.970401 msecs"
clj-html.core(c) : "Elapsed time: 157.695019 msecs"
StringBuilder : "Elapsed time: 112.887447 msecs"

compojure.html : "Elapsed time: 3297.444214 msecs"
clj-html.core(i) : "Elapsed time: 825.355256 msecs"
clj-html.core(c) : "Elapsed time: 75.37306 msecs"
StringBuilder : "Elapsed time: 40.47839 msecs"
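
The "Elapsed time" strings are in the format printed by clojure.core/time. A harness along the following lines would produce such lines. The thread does not give the iteration count or the template being rendered, so both are placeholders here, and bench is a hypothetical helper name.

```clojure
(defn bench
  "Print a label, then time n calls of the zero-argument function f."
  [label n f]
  (print label ": ")
  (time (dotimes [_ n] (f))))

;; Placeholder workload standing in for a real HTML render call.
(bench "StringBuilder" 100000
       #(str (doto (StringBuilder.)
               (.append "<p>")
               (.append "hi")
               (.append "</p>"))))
```

From the figures above, 4235.3 / 1157.0 is about 3.7 and 3297.4 / 825.4 is about 4.0, which matches the "just under 4 times" estimate.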

I then tried running the compojure.html unit tests over
clj-html.core(i). There were a few minor problems:

1. The unit tests required the attribute hash to be sorted
2. compojure.html supports strings as tags as well as symbols and
keywords
3. compojure.html automatically escapes invalid characters in
attribute values
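
To make the three points concrete, here is a sketch of an opening-tag renderer with those behaviours. It is an illustration under assumptions and not the fix that was applied to clj-html: the names escape-attr and open-tag are hypothetical, and the set of escaped characters is a guess.

```clojure
(require '[clojure.string :as str])

(defn escape-attr
  "Escape characters that are unsafe inside a double-quoted attribute."
  [v]
  (-> (str v)
      (str/replace "&" "&amp;")   ; must come first, before other entities
      (str/replace "<" "&lt;")
      (str/replace ">" "&gt;")
      (str/replace "\"" "&quot;")))

(defn open-tag
  "Render an opening tag. `tag` may be a string, symbol, or keyword,
  since `name` accepts all three. Attributes are emitted in sorted order."
  [tag attrs]
  (str "<" (name tag)
       (apply str (for [[k v] (sort-by (comp name key) attrs)]
                    (str " " (name k) "=\"" (escape-attr v) "\"")))
       ">"))

(println (open-tag "a" {:title "x \"y\" & z" :href "/"}))
```

Sorting and escaping both add work for every attribute of every tag, which is one plausible reason for the slowdown reported below.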

Putting in fixes for these resulted in clj-html.core(i) benchmarks
more than doubling:

compojure.html : "Elapsed time: 3133.183865 msecs"
clj-html.core(i) : "Elapsed time: 1800.562675 msecs"

compojure.html : "Elapsed time: 3110.445262 msecs"
clj-html.core(i) : "Elapsed time: 1686.866103 msecs"

Though there may be room to improve the efficiency of my quick fixes.

- James