On 11/21/2011 05:19 AM, Roedy Green wrote:
> jTextArea.setText( someHTML ); takes FOREVER for an even moderately
> long document.
>
> It is parsing the HTML and turning it into some sort of tree structure
> for rendering.
>
> It would be nice if:
>
> 1. you could pre-parse the HTML and feed it the digested tree quickly
> to the JTextArea.
>
> 2. There were methods you could use to build a tree directly without
> going through HTML. The whole process of composing and rendering
> complex documents would be much faster.
>
> 3. Some browsers were taught to eat this stuff and render compact
> compressed pre-parsed pages very quickly.
>
> 4. Gradually text-HTML pages would disappear to be replaced by such
> trees that CAN'T have syntax errors, at least not ones caused by
> webdesigner error.

Parsing HTML is rarely time-consuming in itself. In my experience,
something like nu.validator.htmlparser.sax.HtmlParser parses HTML about
as quickly as one could parse XML.

JTextArea HTML rendering has been broken from the beginning. I haven't
used it for a couple of years now, but I don't remember reading anywhere
that much effort has been put into it lately.

I would guess that most of the time is spent rendering the HTML, not
parsing it. Markup containing a lot of (nested) tables, in particular,
can easily bring a poorly written renderer to a grinding halt. That
description certainly fits JTextArea as I remember it.
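
One rough way to check the parse-versus-render guess is to time the
parser on its own, with no component and no layout involved. Below is a
minimal sketch, assuming the JDK's built-in
javax.swing.text.html.parser.ParserDelegator (the parser Swing's
HTMLEditorKit uses by default) rather than nu.validator, so no extra jar
is needed. The class name ParseTiming, the helper names and the
synthetic nested-table input are made up for illustration; they are not
taken from the original poster's code or data.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class ParseTiming {

    /**
     * Builds a synthetic page: one outer table whose rows each hold a
     * nested one-cell table. The shape is an assumption for the test.
     */
    public static String buildHtml(int rows) {
        StringBuilder sb = new StringBuilder("<html><body><table>");
        for (int i = 0; i < rows; i++) {
            sb.append("<tr><td><table><tr><td>cell ").append(i)
              .append("</td></tr></table></td></tr>");
        }
        return sb.append("</table></body></html>").toString();
    }

    /**
     * Parses the markup without rendering anything and returns the
     * number of <td> start tags the parser reported.
     */
    public static int countCells(String html) {
        final int[] cells = {0};
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TD) {
                    cells[0]++;
                }
            }
        };
        try {
            // Third argument: ignore any charset declared in the markup.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cells[0];
    }

    public static void main(String[] args) {
        String html = buildHtml(5000);
        long start = System.nanoTime();
        int cells = countCells(html);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(cells + " cells parsed in " + elapsedMs + " ms");
    }
}
```

If the number printed here is small compared with the time setText()
takes on the same markup, the parser is probably not the bottleneck.
This measures parsing only; it says nothing about where in the
rendering the rest of the time goes.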

I hope you are not advocating a regression from text-based protocols
back to binary crap with your last point? All things considered,
XHTML + CSS is pretty optimal for the task at hand.