For starters, kind of trivially, I figured I'd just adopt .freecc as the
default extension for FreeCC files. So I've changed all the various
filenames in the examples to that. Of course, that doesn't really have
any backward compatibility implications or anything.
Now, finally, I think that, even at the cost of a tiny bit more backward
incompatibility with JavaCC, I prefer to smooth out certain things,
since it will be better for the long haul. So, in that vein, I renamed
XXXParserTokenManager to XXXLexer and TokenMgrError to LexicalException.
I am considering renaming SimpleNode to BaseNode, which is really more
of an apt description of what it is. SimpleNode isn't necessarily
simple, since you can inject as much code into it as you want... Really,
the class contains the base functionality to implement the Node
interface. I think BaseNode is a better name for that.
As for the new statements in FreeCC, INJECT_CODE and INCLUDE_GRAMMAR,
they are both rather verbose. (See
http://code.google.com/p/freecc/wiki/CodeInjection and
http://code.google.com/p/freecc/wiki/GrammarInclusion for a description
of these things.)
I am considering changing INJECT_CODE to just INJECT (or maybe INSERT is
better) and INCLUDE_GRAMMAR to just INCLUDE. It's still early enough in
the game to do things like this. (I don't want to be frivolously
renaming things too much as things move forward obviously, so I'd like
to nail it down now.) I recently realized that not only is
INCLUDE_GRAMMAR rather verbose but it's also a misnomer, since the file
you include does not have to stand on its own as a grammar (though
obviously I am not going to change that to INCLUDE_GRAMMAR_FRAGMENT...
;-)) I foresee that one possibility that could frequently happen in real
life is that the included file might only contain code injections, for
example. That got me to thinking about when you would use an include
mechanism to keep injected code snippets in a separate file and when you
wouldn't. I tend to think that if you're just adding a getter/setter or
two to a Node object, then you probably want to put it in the main
grammar file quite near the related BNF production. OTOH, if you are
adding a lot more code, especially if it's algorithmically involved, you
probably want to maintain it separately as an INCLUDE.
It finally occurred to me that you might well want to be able to include
Java source files, and have FreeCC read in the Java source and inject
the declarations in it into the generated source file. So, you could
well just want to have a FooParser.java that contains some methods that
will be injected in the _generated_ FooParser.java. The main advantage
is that you would be able to edit that Java source in a Java-aware
editor or IDE. Of course, if there were a FreeCC Eclipse plugin that
allowed you to edit embedded Java snippets inside a FreeCC grammar with
the same facilities you have for editing .java files (syntax
highlighting, navigation, code folding, etc.) then this would not be so
important, but... we don't have that... yet.
Anyway, then the question becomes whether to introduce two separate
INCLUDE statements, one for source files and the other for grammar files
(or grammar file fragments more accurately) or whether this should just
be deduced from the file extension. I'm tending towards the latter, just
inferring the file type that is being included from the extension, and
if it becomes necessary to have the option of specifying, there is no
problem in introducing an optional specifier. So, if for some reason,
you want to keep some Java source code in a foo.bar file, you would have
the option of specifying it via something like
INCLUDE ("foo.bar" ; java-src)
except 99% of the time, the Java code has the .java extension, so you
could just write:
INCLUDE ("foo.java")
and the tool would infer the content type of the file. But I mean to say
that the explicit file type specifier could be added later with no
backward compatibility issues if there was a proven real-world need for it.
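To make the inference idea concrete, here is a minimal sketch in Java. It is not FreeCC code: the class name, the enum, the `resolve` method, and the "grammar" specifier are all hypothetical names invented for illustration. Only the "java-src" specifier and the .java extension come from the discussion above. The sketch assumes an explicit specifier, when present, takes priority over the extension.

```java
import java.util.Locale;

// Hypothetical sketch: none of these names come from FreeCC itself.
final class IncludeKindSketch {

    // The two kinds of file an INCLUDE might pull in.
    enum IncludeKind { GRAMMAR, JAVA_SOURCE }

    // Decide how to treat an included file. An explicit specifier (may be
    // null) wins; otherwise the kind is inferred from the file extension.
    static IncludeKind resolve(String fileName, String specifier) {
        if (specifier != null) {
            switch (specifier) {
                case "java-src":
                    return IncludeKind.JAVA_SOURCE;
                case "grammar": // assumed name, not from the original proposal
                    return IncludeKind.GRAMMAR;
                default:
                    throw new IllegalArgumentException(
                            "Unknown include type: " + specifier);
            }
        }
        // No specifier: anything ending in .java is Java source,
        // everything else is treated as a grammar (fragment).
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".java") ? IncludeKind.JAVA_SOURCE : IncludeKind.GRAMMAR;
    }

    public static void main(String[] args) {
        System.out.println(resolve("foo.java", null));
        System.out.println(resolve("foo.bar", "java-src"));
        System.out.println(resolve("common.freecc", null));
    }
}
```

Because the specifier is optional and defaults to "infer from the extension", adding it later would not change the meaning of any existing INCLUDE statement, which is the backward compatibility point made above.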
Anyway, things are at a stage where all these dangling loose ends are
being dealt with. At the moment, there is a window of opportunity open
to have some input into this, but that window will shut at some point.
If anybody wants to discuss any of this stuff, they can do so on the
freecc-devel list. You can subscribe to it here:
http://groups.google.com/group/freecc-devel
Regards,
Jonathan Revusky
--
FreeCC Parser Generator http://code.google.com/p/freecc
lead developer, FreeMarker project, http://freemarker.org/
On Thu, Nov 6, 2008 at 2:49 PM, Andy Streich <trin...@gmail.com> wrote:
> I'm brand new to the FreeCC idea and this group, but have been a JavaCC user
> for a long time.
Have you been using JavaCC alone, or do you also use JJTree?
Anyway, I wouldn't say there is really a "FreeCC idea" as opposed to
a "JavaCC idea". This is really just the resumption of work on JavaCC,
which, you know as a longtime user, is basically inactive
development-wise. In a more sane world, the FreeCC project wouldn't
exist, since all this work would just be in the JavaCC project. The
fork was necessary because the people in control of the JavaCC project
were just impossible.
Actually, most of the work I've done so far in FreeCC has been a
massive code cleanup/refactoring. It's in a state where, as we become
aware of desirable new features, they can be implemented fairly
easily. That wasn't possible with the JavaCC codebase, since it is
really so horribly entangled.
> With that caveat, I just wanted to say that the name
> changes look good to me
That's good to know. BTW, what do you think of FooLexer versus
FooTokenizer? Maybe Tokenizer is a better name than lexer, because for
the absolute newbie it says that this is where the Tokens come from.
Even XXXTokenSource is not too bad a name. TokenManager is a very odd
name because, aside from being such a mouthful, this object does not
in any sense "manage" the Tokens. It just spits them out is all. This
is all picky stuff, of course, but, you know, people have a hard time
as it is developing a conceptual model of how a tool like this works.
There is no need to make it worse by naming things in confusing ways.
Oh, and feel free to opine on:
INJECT_CODE
vs.
INJECT
vs.
INSERT
(The code injection feature is described here:
http://code.google.com/p/freecc/wiki/CodeInjection )
> (I don't have any JavaCC compatibility concerns as
> long as FreeCC emits Java code that doesn't depend on external libraries).
I'm inferring when you say Java code, you mean the current version of
the Java language. Do you see any particular need for the code to
compile with JDK version < 1.5?
> Also want to note that in the end we want FreeCC plugins for Eclipse *and*
> NetBeans.
Yeah, sure, but I don't know when this will happen. Do you have any
experience writing NetBeans plugins?
Thanks,
JR
At the moment, if you just use JavaCC functionality, FreeCC presents
fairly little difference with JavaCC. I mean, in terms of things that
a user notices. Under the hood it is very different. OTOH, the JJTree
side already has a lot of new functionality in FreeCC.
Actually, as it turns out, I didn't do anything with the JJTree code.
I just ended up throwing it out. You see, once I had refactored
JavaCC to use templates for code generation, it was fairly simple to
enhance the templates so that they conditionally put in the
tree-building code. With the code generation based on templates,
re-implementing the JJTree functionality as part of FreeCC was
literally a one-day project. And, from there, it was quite easy to
enhance the JJTree functionality in various fairly common-sensical
sorts of ways.
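As a rough illustration of what "conditionally putting in the tree-building code" can look like, here is a minimal sketch in Java. It is not FreeCC's template code: it uses a plain StringBuilder instead of a real template engine, and the names `renderProduction`, `openNodeScope`, and `closeNodeScope` are hypothetical, chosen only for this example.

```java
// Hypothetical sketch of template-style generation with an optional
// tree-building section. Not taken from the FreeCC sources.
final class ConditionalTemplateSketch {

    // Emit the source of a parser method for one production. The
    // tree-building lines are included only when treeBuilding is true.
    static String renderProduction(String name, boolean treeBuilding) {
        StringBuilder out = new StringBuilder();
        out.append("void ").append(name).append("() {\n");
        if (treeBuilding) {
            out.append("    openNodeScope(\"").append(name).append("\");\n");
        }
        out.append("    // ... parsing logic for ").append(name).append(" ...\n");
        if (treeBuilding) {
            out.append("    closeNodeScope();\n");
        }
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Same generator, two outputs: plain parser vs. tree-building parser.
        System.out.print(renderProduction("Expression", false));
        System.out.print(renderProduction("Expression", true));
    }
}
```

The point of the sketch is that a single generator covers both cases, so the tree-building feature becomes a conditional section of the same template rather than a separate tool.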
Probably, in terms of the JavaCC core functionality, what I will look
at, at some point soon, is introducing some conventions/patterns for
error handling/recovery/backtracking sorts of issues. It's not
impossible to do these things with JavaCC, there's even a document
about doing error handling, but it's all extremely kludgy and ad hoc
to my taste. I have the feeling that this can be improved quite a bit,
and may not be extremely difficult to do from this point.
Of course, the really big thing I want to do eventually is to have
alternative versions of the templates for generating other languages
besides Java. But probably I'll turn to that when most of the current
low-hanging fruit is picked.
>
>>
>> Anyway, I wouldn't say there is really a "FreeCC idea" as opposed to
>> a "JavaCC idea". This is really just the resumption of work on JavaCC,
>> which, you know as a longtime user, is basically inactive
>> development-wise. In a more sane world, the FreeCC project wouldn't
>> exist, since all this work would just be in the JavaCC project. The
>> fork was necessary because the people in control of the JavaCC project
>> were just impossible.
>
> Thus the beauty of FLOSS. Glad to see you've taken the initiative to push
> things forward.
Well, FLOSS does allow anybody who is willing and able to do some work
on something to do it, and to make it publicly available. Of course,
there will be cases where one person or group wants to take things in
one direction and other people want to move in a different direction.
I think it should be a basic open source principle, however, that a
fork should only occur when there is a legitimate, good-faith
difference of opinion about how development should proceed. It can't
just be that one side wants to do something and the other people want,
quite literally, to do nothing. Another way of putting this is that
one's leadership/management of a FLOSS project cannot consist solely
in nixing any new ideas anybody proposes. One doesn't have to agree
with everything or even anything that is proposed, but then you have
to have some ideas yourself. (IMHO.... :-))
I don't honestly know whether the above is a common FLOSS principle
that anybody has ever set down formally anywhere. If not, it should
be. One would think it was such common sense that you wouldn't need to
state it, but apparently not. The basic problem with the situation --
I mean a fork taking place like this when there is no legitimate
technical reason for it -- is that, yes, I can do the work and make it
available, but, at least in the near term, far fewer people will get
the benefits of that work than would be the case if there was no fork.
This is because, after having existed for over a decade, JavaCC has
name recognition. Nobody knows about FreeCC and that will be the case
for at least a good while longer.
>>
>> Actually, most of the work I've done so far in FreeCC has been a
>> massive code cleanup/refactoring.
>
> ...which is extremely useful for new projects.
>
>>
>> It's in a state where, as we become
>> aware of desirable new features, they can be implemented fairly
>> easily.
>
> That's wonderful. I haven't yet looked into the QA side of FreeCC and
> will have to find the time to do so. With this fork, it's naturally
> important to anyone using the new version that the QA level is being
> maintained or improved. Trust that it is. Just stating the obvious.
Well, I think that, as a practical matter, what gives people a high
level of confidence in well known open source projects is in fact that
they are well known and widely used. This is kind of recursive in
nature, of course. But I mean, if you take FreeMarker as an example,
it is used so widely in so many different contexts, with so few real
problems (the vast majority of freemarker-related problems reported
actually have nothing to do with freemarker) that I think a reasonable
person would feel a pretty high level of confidence in depending on
it. The FreeMarker project does not really have a great test suite on
the other hand. It's probably about good enough to catch really
glaring sorts of problems in a build, but plenty of more subtle bugs
get through our test suite on occasion.
I think it's the extremely high level of real-world usage of
FreeMarker that means that bugs have a very low life expectancy. Well,
that, and the fact that we do tend to fix the bugs. For example, our
main "competitor", Apache Velocity also has a very high level of
usage, so bugs are exposed, but basically they never fix them. Very
typically, they'll just say that whatever it is is a documented
behavior, therefore it's not a bug.
As for FreeCC, it still has most of the test suite that was part of
JavaCC. In fact, I've gradually adjusted it so that the grammars in
there use FreeCC conventions. OTOH, it's quite a poor test suite,
much poorer than the FreeMarker one even. It is only good enough to
catch very glaring problems in a build. Most of the time, where I've
introduced bugs in development, at least anything that was at all
subtle or tricky, it was caught when I tried to use it to rebuild
freemarker. Or in other cases, I found the bug when I tried to
bootstrap the build to rebuild FreeCC itself.
In any case, it is quite true that the probability of any new
significant bugs being introduced in JavaCC is basically zero, but
that's only because the project is stagnant. That also means that if
you ever have any requirement for a feature in JavaCC that is not
currently implemented, it never will be. Never.
>> ... what do you think of FooLexer versus
>> FooTokenizer? Maybe Tokenizer is a better name than lexer, because for
>> the absolute newbie it says that this is where the Tokens come from.
>
> IMO people using a tool like this must know terms like "lexical analysis".
Well, in a sense, you're right about that. OTOH, I do not want to
assume a very high level of theoretical understanding on the part of
people who show up. In particular, with this kind of tool, it seems to be
mostly Ph.D. academics who work on these projects and maintain them.
And, to be perhaps brutally honest, I think this is a factor that
makes these tools less generally useful in the real world than they
could be. One thing that continually surprises me is that you see a
lot of projects with hand-written parsers. And often, the people
involved look like they're pretty competent. I think there is a clear
phenomenon that parser generators are a technology that is rather
under-utilized out there compared to what they should be.
One part of this is because the documentation, its terminology, is
often so inaccessible to anybody who doesn't have that kind of
academic background, and assumes far too much prior knowledge. The other
reason is that, I suspect that academics tend to be more interested in
demonstrating some theoretical innovation or other, rather than
producing (and maintaining) a very polished, practical sort of tool.
Anyway, my thinking on this leads me to think that there really is
some value in terms of having a parser generator project with a
relentlessly pragmatic, real world usage, sort of focus. Of course,
having a relentlessly pragmatic approach encompasses a whole set of
things, but I think one part of it is not assuming too much about the
prior knowledge of potential users.
> And I'm not a fan of "-izer" constructs when alternatives exist, so I prefer
> FooLexer.
Notwithstanding the above, I think I'll stick with FooLexer. I also
renamed INCLUDE_GRAMMAR to INCLUDE, and INJECT_CODE to INJECT. I
actually had to tweak the FreeMarker grammar to adjust to these
changes, since there was a token named INCLUDE, which is now a
reserved word. The fix is simple enough. I changed it to _INCLUDE.
>
>>
>> Even XXXTokenSource is not too bad a name. TokenManager is a very odd
>> name because, aside from being such a mouthful, this object does not
>> in any sense "manage" the Tokens. It just spits them out is all.
>
> Agreed. Time for it to go.
>
>>
>> This
>> is all picky stuff, of course, but, you know, people have a hard time
>> as it is developing a conceptual model of how a tool like this works.
>> There is no need to make it worse by naming things in confusing ways.
>
> I think naming is quite important in any code that's going to be maintained
> over time and even more so of course in a public API. So I think you are
> right to put some effort into it and I appreciate your soliciting feedback.
> Smart move. (Or maybe I think it's smart because I like the changes you are
> making. ;-)
>
>>
>> Oh, and feel free to opine on:
>>
>> INJECT_CODE
>> vs.
>> INJECT
>> vs.
>> INSERT
>
> I like INJECT. Adding "_CODE" seems redundant and implies the existence of
> other things to inject besides code. I prefer INJECT over INSERT as well,
> but can't say why other than it feels more appropriate.
I think that INJECT is what it will be then. These naming changes are
already in SVN and will be in the next release.
>
>>
>> Do you see any particular need for the code to
>> compile with JDK version < 1.5?
>
> No. Now's the perfect time to cut the ties to pre-1.5. As you've said
> elsewhere 1.5 has been out for a very long time. Anyone needing to do
> maintenance on a pre-1.5 project can use JavaCC. Most people moving to
> FreeCC will be doing so on new projects. (Besides I want to put another
> arrow in the quivers of those engineers trying to convince their managers to
> allow them to move on. The language features introduced in 1.5 are
> important.)
Also, if you absolutely have to deploy on an older JVM, you can use
retrotranslator. It works quite well.
>
>>
>> > Also want to note that in the end we want FreeCC plugins for Eclipse
>> > *and*
>> > NetBeans.
>>
>> Yeah, sure, but I don't know when this will happen. Do you have any
>> experience writing NetBeans plugins?
>
> A little bit. Maybe this is something I can help with down the road a
> piece, after I'm out of my current coding frenzy. I'll want an NB plugin
> for my own use.
I think that NetBeans 6 has a new thing called Schliemann (apparently
Schliemann, the German archaeologist who dug up Troy, knew one heck
of a lot of languages fluently, like learning a new one every few
years for the entire length of his adult life...) where, if you write
a grammar for whatever language -- in a syntax really quite similar to
JavaCC it turns out -- a lot of stuff can work pretty much immediately
-- syntax highlighting and code folding certainly and I can't recall
what else. And that's better than nothing definitely. To move beyond
that, you need to start learning the various NetBeans APIs, I guess.
And that's bound to be a pretty significant intellectual investment.
But it seems that for fairly little effort, just having a Schliemann
file for FreeCC could make NetBeans a fairly reasonable editor for
editing
FreeCC files. Not super-duper, but the low-hanging fruit should be
picked first obviously. And of course, any editor that could
syntax-highlight etcetera FreeCC files would also work on JavaCC
files, since it's a superset basically...
Oh well, 'nuff said for now. Thanks again for the feedback.
JR
>
> All the best,
>
> Andy