FreeCC progress - Cleanup moving towards standard naming conventions

Jonathan Revusky

Oct 31, 2008, 6:34:49 AM

to freecc...@googlegroups.com

A new FreeCC release (0.9.2) will almost certainly be available at some
point next week. On the technical level, it will be fairly incremental
in nature. My goal for this release will be to nail down certain naming
conventions for FreeCC. I'm writing this note to give people a chance to
have some input into this. Here are the various strands, not necessarily
in order of importance:

1. My intention is to establish the convention that a grammar file
called Foo.jj will generate a parser class called FooParser and a lexer
(a.k.a. TokenManager) class called FooLexer. I consider
FooParserTokenManager to be long-winded and rather ugly. The convention
of FooParser/FooLexer can nonetheless be overridden via the new options
PARSER_CLASS and LEXER_CLASS. So, for example, if you want to continue
calling your lexer class FooParserTokenManager, you simply add the line
LEXER_CLASS="FooParserTokenManager" to your options at the top of the
grammar.
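
To make the defaulting rule in point 1 concrete, here is a rough sketch in Java. The class NamingSketch and its methods are made up for illustration and are not FreeCC's actual code; the only assumption taken from the above is that Foo.jj yields FooParser/FooLexer unless PARSER_CLASS or LEXER_CLASS says otherwise.

```java
// Hypothetical illustration only; not FreeCC's actual implementation.
final class NamingSketch {

    /** Strips any directory and the extension: "grammars/Foo.jj" -> "Foo". */
    static String baseName(String grammarFile) {
        String name = grammarFile.substring(grammarFile.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? name : name.substring(0, dot);
    }

    /** Uses the PARSER_CLASS value if one was given, else the FooParser convention. */
    static String parserClassName(String grammarFile, String parserClassOption) {
        return parserClassOption != null ? parserClassOption : baseName(grammarFile) + "Parser";
    }

    /** Uses the LEXER_CLASS value if one was given, else the FooLexer convention. */
    static String lexerClassName(String grammarFile, String lexerClassOption) {
        return lexerClassOption != null ? lexerClassOption : baseName(grammarFile) + "Lexer";
    }

    public static void main(String[] args) {
        System.out.println(parserClassName("Foo.jj", null));                   // prints FooParser
        System.out.println(lexerClassName("Foo.jj", null));                    // prints FooLexer
        System.out.println(lexerClassName("Foo.jj", "FooParserTokenManager")); // prints FooParserTokenManager
    }
}
```

The point of the sketch is just that the override is consulted first and the convention is the fallback, so a grammar with no naming options at all still gets predictable class names.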

2. The JavaCC constructs PARSER_BEGIN....PARSER_END and TOKEN_MGR_DECLS
: {...} will be deprecated in favor of the new INJECT_CODE construct
described here:

http://code.google.com/p/freecc/wiki/CodeInjection

I think it is pretty clear once you look at code injection that there is
no reason to have these separate constructs for injecting code into the
generated parser and lexer classes. PARSER_BEGIN/PARSER_END and
TOKEN_MGR_DECLS are really just specific cases of code injection and may
as well be specified using the same INJECT_CODE construct.

The deprecated constructs will continue to work, but the preferred way
of writing TOKEN_MGR_DECLS, for example, will be
INJECT_CODE(LEXER_CLASS) : ....

3. The whole NODE_PREFIX business is going to be deprecated. I am
referring to the feature in JJTree where the node classes (when
MULTI=true) are, by default, called ASTFoo, ASTBar, etcetera. It is
admittedly a question of personal taste, but I find this to be rather
ugly and annoying. I have devoted some thought to this and I cannot find
any real purpose to this. In a language with no namespaces, like C,
let's say, this kind of ASTXXX naming pattern makes some sense because
you avoid name conflicts. However, it makes little sense in Java. That
is what packages are for. So, the preferred coding convention in FreeCC
will be just to specify a NODE_PACKAGE="foo.bar.mynodes" and the various
nodes will be generated in that package and there is no naming conflict
issue.

Well, it should be clear that what I'm talking about are defaults. If
you actually like the older convention, you just put
NODE_PREFIX="AST" in your grammar options and it will work as before.
So, basically, this just amounts to changing the default of NODE_PREFIX
from "AST" to the empty string. Technically, this is beyond trivial, of
course, but the thing is that, as a practical matter, 98% of users stick
with the defaults in a tool, so it is important to establish a sane
default configuration. That's the focus of this coming release.
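
As a rough illustration of what changing the default amounts to, here is a small Java sketch. The helper nodeClassName is hypothetical and not FreeCC's real code; it only shows how a prefix (empty by default) and a NODE_PACKAGE value could combine into the name of a generated node class.

```java
// Hypothetical illustration only; not FreeCC's actual implementation.
final class NodeNamingSketch {

    /**
     * Builds the fully qualified node class name for a production.
     * nodePrefix models NODE_PREFIX (null or "" means no prefix) and
     * nodePackage models NODE_PACKAGE (null or "" means the default package).
     */
    static String nodeClassName(String production, String nodePrefix, String nodePackage) {
        String simpleName = (nodePrefix == null ? "" : nodePrefix) + production;
        if (nodePackage == null || nodePackage.isEmpty()) {
            return simpleName;
        }
        return nodePackage + "." + simpleName;
    }

    public static void main(String[] args) {
        // New default: no prefix, nodes kept apart by their package.
        System.out.println(nodeClassName("Foo", "", "foo.bar.mynodes"));    // prints foo.bar.mynodes.Foo
        // Older convention, still available by setting NODE_PREFIX="AST".
        System.out.println(nodeClassName("Foo", "AST", "foo.bar.mynodes")); // prints foo.bar.mynodes.ASTFoo
    }
}
```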

So, my current focus prior to the next release is just to document the
above conventions and tweak all the various included examples so that
they follow those conventions.

In other matters, I think that I've decided that this will be the last
release that will generate Java code that is compatible with JVMs older
than 1.5.x. You know, I just recently hit this page on Wikipedia:

http://en.wikipedia.org/wiki/Java_version_history

Apparently J2SE 5.0 was released in September of 2004. 4 years ago. That
is really a very long time in this field. In fact, apparently, J2SE 5.0
is not even supported officially by Sun. The supported J2SE version is
J2SE 6.0, and that has already been out for 2 years. But what's more is
that if you really have to target an older JVM, there is the
retrotranslator tool that, in my experience, works extremely well. So,
really, given all that, it does just seem silly and pointless to saddle
oneself with a commitment to maintain backward compatibility with JDK
1.4. I have to think that FreeCC will mostly be adopted for new
projects, and really, who is specifically starting a new project where
compatibility with JDK 1.4 is a requirement? There may be cases but...
not many...

Anyway, that also means that the following release after this coming
one, which will be 0.9.3, will almost certainly do away with
XXXConstants and use type-safe enums, and also that the node classes
will be reworked to use generics.

Anyway, that's where things are at now. I encourage anybody who has an
opinion about the above to express it. It is preferable to do so on the
freecc-devel list. You can subscribe here:

http://groups.google.com/group/freecc-devel

I am cc'ing this to the javacc users list because it seems right to give
members of the broader javacc user community the opportunity to have
some input in the decisions being made at this specific point. For
example, if you disagree with some of the above and have better ideas,
then there is a window of opportunity to have some influence, but of
course, at some point soon, that window will close.

Regards,

Jonathan Revusky
--
FreeCC Parser Generator http://code.google.com/p/freecc
lead developer, FreeMarker project, http://freemarker.org/

Michael Norman

Oct 31, 2008, 8:52:11 AM

to freecc...@googlegroups.com

my comments inline ...

On Fri, Oct 31, 2008 at 6:34 AM, Jonathan Revusky <rev...@gmail.com> wrote:

[MWN - some lines deleted ...]

> A new FreeCC release (0.9.2) will almost certainly be available at some point next week
>
> 1. My intention is to establish the convention that a grammar file
> called Foo.jj will generate a parser class called FooParser and a lexer
> (a.k.a. TokenManager) class called FooLexer. I consider
> FooParserTokenManager to be long-winded and rather ugly. The convention
> of FooParser/FooLexer can nonetheless be overridden via the new options
> PARSER_CLASS and LEXER_CLASS.

I agree 100% ... and agree as well with the use of overrides that can maintain
backwards compatibility with old JavaCC naming conventions.

> 2. The JavaCC constructs PARSER_BEGIN....PARSER_END and TOKEN_MGR_DECLS
> : {...} will be deprecated in favor of the new INJECT_CODE construct
> described here:
>
> http://code.google.com/p/freecc/wiki/CodeInjection

Is this a 'guice'-style of injection?

> ...
> 3. The whole NODE_PREFIX business is going to be deprecated.

yeah!

> In other matters, I think that I've decided that this will be the last release that
> will generate java code that is compatible with JVM's older than 1.5.x.

It is a hard decision to make (we just went through this at work).
Question - how long would the 0.9.2 stream be supported? Perhaps some statement
about the types of bugs that would be fixed in the < 1.5 stream.
New > 1.5 development would be in a 1.X stream?

> Regards,
>
> Jonathan Revusky
> --
> FreeCC Parser Generator http://code.google.com/p/freecc
> lead developer, FreeMarker project, http://freemarker.org/

Thanks for all your hard work,

Mike Norman,
lurker extraordinaire ... and now that I've 'popped-up', I'm going back into my gopher hole ;-)

Jonathan Revusky

Oct 31, 2008, 3:59:02 PM

to freecc...@googlegroups.com

On Fri, Oct 31, 2008 at 1:52 PM, Michael Norman <mwno...@gmail.com> wrote:
> my comments inline ...
> On Fri, Oct 31, 2008 at 6:34 AM, Jonathan Revusky <rev...@gmail.com> wrote:
>>
>> [MWN - some lines deleted ...]
>>
>> A new FreeCC release (0.9.2) will almost certainly be available at some
>> point next week
>>
>> 1. My intention is to establish the convention that a grammar file
>> called Foo.jj will generate a parser class called FooParser and a lexer
>> (a.k.a. TokenManager) class called FooLexer. I consider
>> FooParserTokenManager to be long-winded and rather ugly. The convention
>> of FooParser/FooLexer can nonetheless be overridden via the new options
>> PARSER_CLASS and LEXER_CLASS.
>
> I agree 100% .. and agree as well with the use of overrides that can
> maintain backwards compatibility with old Javacc naming conventions.

Well, people should be able to call things whatever they want, I
suppose, but I think that 98% of people will just follow the default
XXXParser/XXXLexer convention, and will be better off for it. I think
that practically speaking, when it comes to configuration versus
convention, convention is a lot more practical. Aside from the
annoyance of maintaining big configuration files, there is just the
basic advantage that if everybody follows certain patterns, then when
they look at somebody else's project that uses a tool like FreeCC,
they know where to look for things. You know, it's like somebody
downloads this thing and the source code is in the src directory and
the launch scripts are in the bin directory. Stuff like that.

>
>>
>> 2. The JavaCC constructs PARSER_BEGIN....PARSER_END and TOKEN_MGR_DECLS
>> : {...} will be deprecated in favor of the new INJECT_CODE construct
>> described here:
>>
>> http://code.google.com/p/freecc/wiki/CodeInjection
>
> Is this a 'guice'-style of injection?

I don't know. I don't know what that is. You'd have to tell me whether
it is or not. (Probably it isn't...)

>>
>> ...
>> 3. The whole NODE_PREFIX business is going to be deprecated.
>
> yeah!

Yeah, well, I think it's pretty obvious that the whole AST node prefix
thing must come from having copied the naming convention from some
tool that generated C code. I mean, if you're generating C, and your
Foo production causes a function called Foo() to be generated, then I
guess you can't generate a struct called Foo because the name is
already used for the function. But in Java, even aside from the fact
that you can put all the generated nodes in their own package, even
with everything in the same package, the names of classes within a
package and the names of methods within a specific class are already
completely separate namespaces. So when you go from C to Java, the
whole ASTXXX thing is completely unnecessary.
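
Here is a minimal Java sketch of that point. The names Foo and NamespaceSketch are invented for the example and this is not code that FreeCC generates; it just shows that a class and a method can share a name without any prefix.

```java
// Hypothetical node class named after a grammar production.
class Foo {
    final String image;

    Foo(String image) {
        this.image = image;
    }
}

// Hypothetical stand-in for a generated parser with a production method Foo().
final class NamespaceSketch {

    // Legal Java: the method Foo() and the class Foo do not clash, because
    // type names and method names are resolved in separate namespaces.
    static Foo Foo(String input) {
        return new Foo(input.trim());
    }

    public static void main(String[] args) {
        Foo node = Foo("  hello ");
        System.out.println(node.image); // prints hello
    }
}
```

The compiler tells the two apart from context: Foo followed by an argument list in an expression is a method call, while Foo after new or in a declaration is the type.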

>
>> In other matters, I think that I've decided that this will be the last
>> release that will generate java code that is compatible with JVM's older
>> than 1.5.x.
>
> It is a hard decision to make (we just went through this at work).
> Question - how long would the 0.9.2 stream be supported?

Well, my thinking on this was simply to point people to
retrotranslator. Check it out at http://retrotranslator.sf.net .
Basically, this allows you to work with the current Java language and
when you need to deploy your classes in a 1.4 JVM, you just run
retrotranslator over your classes and it generates a .jar file
compatible with older JVMs. So I figured that if there really was a
lot of demand for this, I could just bundle retrotranslator with
freecc and have a separate launch script, freecc-retro, that runs
freecc and then compiles the classes and retrotranslates them.

> Perhaps some statement
> about the types of bugs that would be fixed in the < 1.5 stream.
> New > 1.5 development would be in a 1.X stream?

Well, I don't really anticipate maintaining a separate stream of
development that is JDK 1.4 compatible. I am open to persuasion, but
my honest sense of things is that it's not worth it. Now, since the
code generation is all template-based, it will be feasible to simply
maintain the older set of templates that generate 1.4 compatible
source code. But I think my position is that if you or anybody else
wants to take ownership of that, and maintain separate templates for
generating 1.4 source, and keep them in sync with the project as
it develops, then, by all means, welcome aboard. But let's be honest,
people can tell me that it is important to maintain these templates
that can generate 1.4 source, but if it comes to them doing the work,
then, you know, it's bound to be a different story. ;-) Actually,
maintaining separate 1.4-compatible templates is not even a huge
amount of work probably. OTOH, it's not very interesting work and I am
doing this open source stuff mostly just for the fun of it, so... :-)
Though aside from that, I don't think that my devoting my time to that
is even in the interest of the (eventual) user community, because the
extra effort maintaining 1.4 compatible code generation templates is
time that would otherwise be spent actually adding interesting new
functionality to the tool.

Anyway, I think the reality is that if Sun itself is not officially
supporting JDK 1.4 any more, given their resources, it's a bit much to
think that I should. :-) Finally, it really seems to me that being
able to run modern Java code on the older JVMs is a niche need that
is being catered to extremely well by Taras Puchko (he's the author of
retrotranslator) and I think that henceforth, that should be the real
answer. Standard FAQ: "Can I use my FreeCC generated parser on a 1.4
JVM?" Answer: "Yes. Generate your .jar file in the normal way and then
use retrotranslator to generate a 1.4 compatible jar and deploy that".
And actually, if you replace the "my FreeCC generated parser" in the
above with your employer's product, whatever it is, the same answer
probably serves. So, if your higher-ups there aren't aware of this,
you ought to point it out to them, I'd say.

>

> Thanks for all your hard work,
>
> Mike Norman,
> lurker extraordinaire ... and now that I've 'popped-up', I'm going back into
> my gopher hole ;-)

Well, thanks for showing up. Otherwise, one does just get this
"shouting into the void" sort of feeling when you write these things
and there is no response. It's to be expected when you start off a new
project from scratch. I'm pretty optimistic that matters will improve.
It takes a while for word to get out.

Regards,

JR

Attila Szegedi

Nov 3, 2008, 7:31:17 AM

to freecc...@googlegroups.com

On 2008.10.31., at 11:34, Jonathan Revusky wrote:

> In other matters, I think that I've decided that this will be the last
> release that will generate java code that is compatible with JVM's older
> than 1.5.x. You know, I just recently hit this page on Wikipedia:
>
> http://en.wikipedia.org/wiki/Java_version_history
>
> Apparently J2SE 5.0 was released in September of 2004. 4 years ago. That
> is really a very long time in this field. In fact, apparently, J2SE 5.0
> is not even supported officially by Sun.

Java 5 is in EOL *transition* period, but until the transition ends
(October 30, 2009, in a year), it is supported. Latest 5.0 release was
in July this year (after it entered EOL transition), and they'll
continue to release versions of it until next October. According to
<http://java.sun.com/javase/downloads/index_jdk5.jsp>:
"J2SE 5.0 is in its Java Technology End of Life (EOL) transition
period. The EOL transition period began April 8th, 2008 and will
complete October 30th, 2009, when J2SE 5.0 will have reached its End
of Service Life (EOSL)."

1.4.x, OTOH, incidentally reached its end-of-life a few days ago
(October 30, 2008) :-)

> The supported J2SE version is
> J2SE 6.0, and that has already been out for 2 years. But what's more is
> that if you really have to target an older JVM, there is the
> retrotranslator tool that, in my experience, works extremely well. So,
> really, given all that, it does just seems silly and pointless to saddle
> oneself with a commitment to maintain backward compatibility with JDK
> 1.4. I have to think that FreeCC will mostly be adopted for new
> projects, and really, who is specifically starting a new project where
> compatibility with JDK 1.4 is a requirement? There may be cases but...
> not many...
> Anyway, that also means that the following release after this coming
> one, which will be 0.9.3, will almost certainly do away with
> XXXConstants and use type-safe enums, and also that the node classes
> will be reworked to use generics.

I'm in two minds about this. On one hand, I can see why you would want
that -- nicer, more modern output, right?
On the other hand, I'm not sure what exactly that buys you. The
generated code is just that, generated code; something intermediate
generated from the grammar source and then fed to javac. A developer
writing the grammar file and then using the generated parser doesn't
benefit much from the fact it's utilizing Java 5 features, does he?

Anyway, if you see real benefits for the developer in this, sure, go
ahead.

Attila.

Jonathan Revusky

Nov 3, 2008, 1:45:13 PM

to freecc...@googlegroups.com

Attila Szegedi wrote:
> On 2008.10.31., at 11:34, Jonathan Revusky wrote:
>
>> In other matters, I think that I've decided that this will be the last
>> release that will generate java code that is compatible with JVM's older
>> than 1.5.x. You know, I just recently hit this page on Wikipedia:
>>
>> http://en.wikipedia.org/wiki/Java_version_history
>>
>> Apparently J2SE 5.0 was released in September of 2004. 4 years ago. That
>> is really a very long time in this field. In fact, apparently, J2SE 5.0
>> is not even supported officially by Sun.
>
> Java 5 is in EOL *transition* period, but until the transition ends
> (October 30, 2009, in a year), it is supported. Latest 5.0 release was
> in July this year (after it entered EOL transition), and they'll
> continue to release versions of it until next October. According to
> <http://java.sun.com/javase/downloads/index_jdk5.jsp>:
> "J2SE 5.0 is in its Java Technology End of Life (EOL) transition
> period. The EOL transition period began April 8th, 2008 and will
> complete October 30th, 2009, when J2SE 5.0 will have reached its End
> of Service Life (EOSL)."

Maybe the Wikipedia page is wrong. Or maybe I misread it. OTOH, it
doesn't particularly affect my view of the situation. The key fact is
still that J2SE 5.0 has had a GA release for 4 years now. That is really
a very very long time. At least IMHO.

If you download the latest release of JavaCC, and look under their
example java grammars, the README file says that the Java 1.0.2 grammar
is the really well tested stable grammar but they've got a 1.1 grammar
and they also include an "experimental" grammar for the JDK 1.5 "draft
proposal". Or something like that. Really nutty basically. As if anybody
is interested in a Java 1.0.2 grammar, which would not even handle inner
classes, for example.

I just deleted all the pre-1.5 Java grammars because I just cannot see
why anybody would be interested. The Java 1.5 parser can parse older
Java code anyway, and if you want to prevent people from using 1.5
constructs, you just stick in actions that throw exceptions on the newer
1.5 constructs.
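
Roughly, such an action could call a small guard like the one sketched here. This is a hypothetical helper, not something shipped with FreeCC or JavaCC, and it throws a plain IllegalStateException where a real grammar would more likely throw the generated ParseException.

```java
// Hypothetical guard that grammar actions could call; not part of FreeCC or JavaCC.
final class LanguageLevelSketch {

    private final boolean allowJava5;

    LanguageLevelSketch(boolean allowJava5) {
        this.allowJava5 = allowJava5;
    }

    /**
     * Meant to be called from the action attached to a 1.5-only production
     * (generics, enums, annotations and so on). Does nothing when 1.5
     * constructs are allowed and throws otherwise.
     */
    void requireJava5(String construct, int line) {
        if (!allowJava5) {
            throw new IllegalStateException(
                construct + " at line " + line + " is not allowed before Java 1.5");
        }
    }

    public static void main(String[] args) {
        // Accepted silently when 1.5 constructs are allowed.
        new LanguageLevelSketch(true).requireJava5("generic type", 3);
        try {
            new LanguageLevelSketch(false).requireJava5("enum declaration", 7);
        } catch (IllegalStateException e) {
            // prints: enum declaration at line 7 is not allowed before Java 1.5
            System.out.println(e.getMessage());
        }
    }
}
```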

>
> 1.4.x, OTOH, incidentally reached its end-of-life few days ago
> (October 30, 2008) :-)

And this really would make anybody question whether supporting it is now
worthwhile...

>
>> The supported J2SE version is
>> J2SE 6.0, and that has already been out for 2 years. But what's more is
>> that if you really have to target an older JVM, there is the
>> retrotranslator tool that, in my experience, works extremely well. So,
>> really, given all that, it does just seems silly and pointless to saddle
>> oneself with a commitment to maintain backward compatibility with JDK
>> 1.4. I have to think that FreeCC will mostly be adopted for new
>> projects, and really, who is specifically starting a new project where
>> compatibility with JDK 1.4 is a requirement? There may be cases but...
>> not many...
>> Anyway, that also means that the following release after this coming
>> one, which will be 0.9.3, will almost certainly do away with
>> XXXConstants and use type-safe enums, and also that the node classes
>> will be reworked to use generics.
>
> I'm in two minds about this. On one hand, I can see why would you want
> that -- nicer, more modern output, right?
> On the other hand, I'm not sure what exactly does that buy you. The
> generated code is just that, generated code; something intermediate
> generated from the grammar source and then fed to javac. A developer
> writing the grammar file and then using the generated parser doesn't
> benefit much from the fact it's utilizing Java 5 features, does he?

That's true for generated code that is only used internally.
Nonetheless, the parser machinery is exposing an API -- in particular if
you are using the tree-building functionality. I suspect that all the
tree-building node API could benefit significantly from the use of
generics. Also, all the static final int constants in XXXConstants could
become type-safe enums. Aside from the type safety advantages, an Enum
is an object that can have methods, i.e. you can inject code inside
them. So, even though I admit I haven't fleshed it out, I have the
intuition that some much cleaner code patterns could be available this way.
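
Just to give a flavor of the kind of thing I mean, here is a small Java 5 sketch. All of the names (TokenKind, SketchNode, childrenOfType) are invented for the example; none of this is existing or planned FreeCC output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replacement for int constants such as those in XXXConstants.
enum TokenKind {
    IDENTIFIER(false), NUMBER(false), PLUS(true), MINUS(true);

    private final boolean operator;

    TokenKind(boolean operator) {
        this.operator = operator;
    }

    // An enum is an object, so it can carry behavior as well as a value.
    boolean isOperator() {
        return operator;
    }
}

// Hypothetical node base class with a generic accessor for typed children.
class SketchNode {
    private final List<SketchNode> children = new ArrayList<SketchNode>();

    void addChild(SketchNode child) {
        children.add(child);
    }

    // Returns only the children of the requested type, with no casts at the call site.
    <T extends SketchNode> List<T> childrenOfType(Class<T> type) {
        List<T> result = new ArrayList<T>();
        for (SketchNode child : children) {
            if (type.isInstance(child)) {
                result.add(type.cast(child));
            }
        }
        return result;
    }
}

class NumberNode extends SketchNode {
}

final class EnumGenericsSketch {
    public static void main(String[] args) {
        System.out.println(TokenKind.PLUS.isOperator()); // prints true

        SketchNode root = new SketchNode();
        root.addChild(new NumberNode());
        root.addChild(new SketchNode());
        List<NumberNode> numbers = root.childrenOfType(NumberNode.class);
        System.out.println(numbers.size()); // prints 1
    }
}
```

With int constants, nothing stops you from passing an arbitrary int where a token kind is expected; with the enum, the compiler catches that, and helper methods like isOperator() live next to the values they describe.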

>
> Anyway, if you see real benefits for the developer in this, sure, go
> ahead.

It may be marginal. I haven't done anything yet in terms of generating
1.5 specific code.

Two loose ends I didn't mention are:

(1) Changing TokenMgrError to LexicalException (naming consistent with
ParseException). (I've already done this.)

(2) What should the standard extension for FreeCC grammar files be? I
suppose .fcc would be reasonable. Is there some other well known file
type that ends that way? Or it could be the longer .freecc even.

So far I have left the files as .jj and .jjt. The extension is not
actually used anyway, so it's just a convention. But I would change all
the examples in the distro to that naming.