Thanks for the 2 patches

Jonathan Revusky

May 26, 2008, 10:15:18 AM

to bria...@gmail.com, kawadd...@googlegroups.com

Brian,

Thanks for the patches. I applied them. I was very happy to see that you
were in the code.

As for the generifying of the code, you saw that it is far from
complete. There are still unnecessary casts all over the place and older
loops that could be rewritten much more tersely and nicely with the
enhanced for loop and so on. In fact, if you want to continue doing
that, I have no objection at all and, if you want, I'll add you
as a developer on the project, so you can commit that stuff.
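
For illustration, the kind of cleanup meant here might look like the following. This is only a sketch: the class and method names are made up and are not taken from the codebase. The first method is the pre-generics style with a raw List and a cast, the second is the same logic with a typed List and the enhanced for loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; nothing here comes from the actual project.
class GenerifyExample {

    // Old style: raw List, index-based loop, explicit cast on every element.
    @SuppressWarnings({"rawtypes"})
    static int totalLengthOld(List names) {
        int total = 0;
        for (int i = 0; i < names.size(); i++) {
            String name = (String) names.get(i);
            total += name.length();
        }
        return total;
    }

    // Generified: the element type is declared once, so the cast disappears
    // and the enhanced for loop replaces the index bookkeeping.
    static int totalLengthNew(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("Token");
        names.add("Production");
        System.out.println(totalLengthOld(names) + " " + totalLengthNew(names));
    }
}
```

Both methods return the same result; the second just says less to get there.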

One thing I would point out though is that there is code that will
eventually be replaced by templates, like pretty much any method that
contains a lot of ostr.println(...) things so there is probably little
point in fixing up those methods. (A lot of them have names like
DumpXXXX and so on.) As you can see, the templatization is only partial
so far. It's really a bear to go through and systematically replace all
that println stuff, largely because that code is just so hard to read. I
keep confusing the {} delimiters in the generating code with the
delimiters in the generated code.
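
A small sketch of the readability problem, with made-up names (this is not code from the project, and the "template" here is just a plain format string standing in for a real template engine). The first method emits a snippet via ostr.println, so the generator's own braces interleave with the braces being printed; the second keeps the generated text in one place.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical example class; names are assumptions for illustration only.
class PrintlnVersusTemplate {

    // println style: braces in string literals (generated code) sit right
    // next to braces that belong to the generator itself.
    static String viaPrintln(String methodName, boolean withGuard) {
        StringWriter buffer = new StringWriter();
        PrintWriter ostr = new PrintWriter(buffer);
        ostr.println("public void " + methodName + "() {");
        if (withGuard) {
            ostr.println("    if (done) {");
            ostr.println("        return;");
            ostr.println("    }");
        }
        ostr.println("}");
        ostr.flush();
        return buffer.toString();
    }

    // Template style: the generated text reads top to bottom as one piece.
    // %n expands to the platform line separator, matching println.
    static final String GUARDED_METHOD =
          "public void %s() {%n"
        + "    if (done) {%n"
        + "        return;%n"
        + "    }%n"
        + "}%n";

    static String viaTemplate(String methodName) {
        return String.format(GUARDED_METHOD, methodName);
    }

    public static void main(String[] args) {
        System.out.print(viaPrintln("DumpTokens", true));
        System.out.print(viaTemplate("DumpTokens"));
    }
}
```

The two methods produce identical text; only the second lets you see the shape of the output at a glance.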

Anyway, thanks again. :-)

JR

brianegge

May 28, 2008, 8:41:18 AM

to KawaDD Development

Yep, I know the println stuff will be going away. I figured I'd
submit a few small patches before doing anything important.

I've used JavaCC w/ Freemarker for code generation before, and found
the two work quite well together, so I'm keen to see Freemarker be
integrated with a parser.

WRT static options, and other optimizations, Tom C did a number of
time/space tests in his JavaCC book. My opinion is if time/space is
important, one can probably get the most improvements by optimizing
the grammar. I'm happy to see methods like 'reInit' go away.

Jonathan Revusky

May 28, 2008, 11:37:48 PM

to kawadd...@googlegroups.com

On Wed, May 28, 2008 at 2:41 PM, brianegge <bria...@gmail.com> wrote:
>
> Yep, I know the println stuff will be going away.

Yes, it stands to reason, but I thought it best to tell you just in case. :-)

> I figured I'd
> submit a few small patches before doing anything important.

Hmm, any itches to scratch?

It should eventually be possible to actually do some interesting
things once this cleanup that is under way is really more or less
completed. It's still partial, but I have to think I'm past the
halfway point on it. If I put the same energy in that I have up to
this point, I think it will really be possible to think about ways to
enhance the tool.

Here's one little metric I just checked. I did a grep just now for
ostr.println in the code and it's down to 656 matches, from 3295 in
the JavaCC CVS. Just getting all that stuff out of the java code, at
a minimum, allows you to see what is left, and it gives you some
ability to start looking at the actual machinery. Without that, you
really just get scroll blindness paging through all that endless
ostr.println() stuff.

> I've used JavaCC w/ Freemarker for code generation before, and found
> the two work quite well together, so I'm keen to see Freemarker be
> integrated with a parser.

What I find is that bringing a template engine into the picture
provides (for me at least) some basic way of thinking about what the
tool does. It reads in the grammar, it builds up certain data
structures, and then, it applies those data structures to a set of
templates, thus generating the code for the lexer and parser. In the
existing codebase, when you read the code, there is this terrible
confusion (at least for me) about what the relationships are between
different parts of the code. At various points, I was just
experimenting with changing the order of execution of different
pieces, just to see whether the tool still worked -- you know, to
deduce which pieces depended on which other pieces having run before
them.

Anyway, once you think about it this way, that the tool builds up a
representation of the grammar and that's merged with a set of
templates, then there is a certain eureka moment, because you realize
that javacc, jjtree, and jjdoc are all basically the same thing. I
mean, jjdoc simply is a different set of templates that generate an
HTML (or other) kind of output. JJTree, in principle, is just having
templates that include tree-building code. So, in essence, having a
tree built (or not) is just equivalent to turning on (or off) certain
sections of the template. At least that's the way it should be, I
think. And basically, I think that jjtree as a separate tool should
just melt away. After all, jjtree's grammar is exactly javacc's
grammar, but with extra optional tree-building annotations.

Finally, it seems to me (though maybe I'm crazy) that, by default, a
tool like this might as well just do all three things: generate a
parser, put default tree-building actions in there, and also generate
navigable docs. Generating docs is just a question of merging the same
data you use for the parser/lexer with a different template (or
possibly templates).
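
To make the idea concrete, here is a minimal sketch under stated assumptions. GrammarModel, renderParser and renderDocs are hypothetical names, and plain string building stands in for a real template engine; none of this is the tool's actual design. The point is only that one model feeds several outputs, and that tree building is a section switched on or off.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: one grammar model, several "templates".
class OneModelManyTemplates {

    // Stand-in for the data structures built up from reading the grammar.
    static final class GrammarModel {
        final String parserName;
        final List<String> productions;

        GrammarModel(String parserName, List<String> productions) {
            this.parserName = parserName;
            this.productions = productions;
        }
    }

    // "Parser template": tree-building code is an optional section.
    static String renderParser(GrammarModel grammar, boolean treeBuilding) {
        StringBuilder out = new StringBuilder();
        out.append("class ").append(grammar.parserName).append(" {\n");
        for (String production : grammar.productions) {
            out.append("  void ").append(production).append("() {\n");
            if (treeBuilding) {
                out.append("    openNode(\"").append(production).append("\");\n");
            }
            out.append("    // ... match tokens ...\n");
            if (treeBuilding) {
                out.append("    closeNode();\n");
            }
            out.append("  }\n");
        }
        out.append("}\n");
        return out.toString();
    }

    // "Doc template": same model, different output format.
    static String renderDocs(GrammarModel grammar) {
        StringBuilder out = new StringBuilder("<ul>\n");
        for (String production : grammar.productions) {
            out.append("  <li><a name=\"").append(production).append("\">")
               .append(production).append("</a></li>\n");
        }
        return out.append("</ul>\n").toString();
    }

    public static void main(String[] args) {
        GrammarModel grammar =
            new GrammarModel("CalcParser", Arrays.asList("Expression", "Term"));
        System.out.print(renderParser(grammar, true));
        System.out.print(renderDocs(grammar));
    }
}
```

Nothing in the model changes between the three outputs; only the template (and one flag) does.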

>
> WRT static options, and other optimizations, Tom C did a number of
> time/space tests in his JavaCC book.

Well, almost none of the execution options in JavaCC strike me as being
very worthwhile. You know, it's like, okay, you can have a static
parser, and you avoid virtual method invocation overhead, and maybe it
makes your code 5% faster or maybe even 10%, except now it's not
thread-safe. Or you can reduce memory usage a little tiny bit by
setting KEEP_LINE_COLUMN=false, and then your Token objects don't have
beginColumn, endLine, etc. in them. Like, okay, my parser runs in a
little bit less memory except that now when anybody sends a stack
trace and I need to troubleshoot, there is no location information.
When is a trade-off like that going to be worth it? So, I'm thinking
about whacking that one too.

So those options have basically no real value, but they make the tool
look more daunting.
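
A simplified illustration of why static state costs you thread safety (and independent instances generally). The two scanner classes below are hypothetical and far smaller than anything the tool generates. With static fields there is exactly one input and one position for the whole JVM, so a second caller that reinitializes the scanner silently redirects the first caller; you can see that even without threads.

```java
// Hypothetical sketch; not generated code.
class StaticVersusInstance {

    // "Static parser" style: all state lives in static fields.
    static final class StaticScanner {
        private static String input;
        private static int pos;

        static void ReInit(String text) {
            input = text;
            pos = 0;
        }

        static char next() {
            return input.charAt(pos++);
        }
    }

    // Instance style: each scanner owns its input and position.
    static final class InstanceScanner {
        private final String input;
        private int pos;

        InstanceScanner(String text) {
            this.input = text;
        }

        char next() {
            return input.charAt(pos++);
        }
    }

    public static void main(String[] args) {
        StaticScanner.ReInit("ab");
        char first = StaticScanner.next();
        StaticScanner.ReInit("xy");          // a second caller starts a new parse
        char second = StaticScanner.next();  // the first caller now reads from "xy"
        System.out.println("static: " + first + second);

        InstanceScanner a = new InstanceScanner("ab");
        InstanceScanner b = new InstanceScanner("xy");
        System.out.println("instance: " + a.next() + b.next() + a.next());
    }
}
```

With two real threads the interleaving is unpredictable rather than scripted, which is worse. The instance version has no such coupling.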

> My opinion is if time/space is
> important, one can probably get the most improvements by optimizing
> the grammar.

The thing is that even if your lexer or parser code really is a
bottleneck in some larger system, dragging the whole thing down (and
that might be pretty rare anyway), it does not seem to me that the
extra bit you could get out of, say, reusing the same XXXParser and
XXXParserTokenManager objects, as opposed to just creating new ones,
would make a crucial difference.

> I'm happy to see methods like 'reInit' go away.

Well, so far, we think alike. I just whacked all the ReInit() methods
in the codebase. :-)

Cheers,

JR
