Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[Caml-list] ocamlyacc, menhir, dypgen, camp4, elkhound

59 views
Skip to first unread message

Christophe Raffalli

unread,
Mar 16, 2007, 4:53:02 PM3/16/07
to caml...@inria.fr

Hi,

The programs listed in the subject of this mail are parser generators
for OCaml ...
There may be others, and there is also the possibility to write parsers
by hand using stream pattern matching, parser combinators, etc ...
(the later technics are not covered here)

It would be nice to have a small comparison table to help people making
a decision ? Being not neutral
(dypgen was developped by a student at ENS lyon under my direction) I
think I should not do it myself ...

But I gave it a try. Here is a first draft (This table is not correct
yet and should not be used to make a decision ;-)

http://www.lama.univ-savoie.fr/~raffalli/ocaml-parsing.html

The html produced by ooffice is poor and the calculator example is not
well presented due to an ooffice bug (I think).

Questions

- Can you help me complete the table (missing lines, columns,
inaccurate or missing information) ?
- Can you provide the best possible calculator example for the other
parser as a text file ?
- Can you provide a really difficult but short grammar to make a better
comparison ?
- Other Idea welcome
- If someone neutral wants to volonteer to host this comparison table, I
will be very please to send him the source of the table.

Christophe Raffalli

_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

Aleksey Nogin

unread,
Mar 17, 2007, 1:14:31 PM3/17/07
to Christophe Raffalli
Christophe,

I am not an expert in camlp4, but I probably know enough to be able to
fill in a few blanks in the last column of your table. My answers are
based on "old" 3.09 camlp4 and might be different for the new 3.10 one.

Handled grammar: LL(1)

Reentrant: Yes, but with caveats(*1)

Extensible (an action can change the grammar): Yes(*2)

Parametrized non terminal: No (???, *3)

Splitting the definition of a non terminal: Yes

Grammars parametrized by Ocaml modules: ??? (Yes? - *4)

Partial action: No

Ambiguous grammars: Yes (???, *5)

Exception to reject a rule: No (???)

Priority: Using levels (= a partial order as a relation) + associativity
direction

Debugging grammars: hard

---

(*1) In 3.09, reentrant parsing works, except the error messages might
end up pointing to the wrong place in the source stream (supposed to be
fixed in 3.10, AFAIK). However, there is no good way of passing a state
to the grammar actions. As a result, one often has to resort to using a
global ref to hold the state, which, obviously, kills the ability to be
reentrant.

(*2) For example, see the pa_macro where the action for the DEFINE and
UNDEF expressions change the grammar accordingly.

(*3) As far as I understand, only tokens
can be parameterized, although that might have changed in 3.10.

(*4) I am not sure which capability exactly is meant here. In Camlp4,
the grammar definitions/extensions can be done inside a body of a
functor. The actual grammar is created when the functor is applied.

(*5) Not sure what is meant here. In camlp4, unfortunately, there is no
way to test the grammar for ambiguities - it accepts an ambiguous
grammar without any warnings.

Aleksey

Tom

unread,
Mar 17, 2007, 6:22:36 PM3/17/07
to Christophe Raffalli
Possibly a difficult grammar is the grammar of OCaml value definition...

structure_item:
LET rec_flag let_bindings
;
rec_flag:
/* empty */
| REC
;
val_ident:
LIDENT
| LPAREN operator RPAREN
;
let_bindings:
let_binding
| let_bindings AND let_binding
;
let_binding:
val_ident fun_binding
| pattern EQUAL seq_expr
;
fun_binding:
strict_binding
| type_constraint EQUAL seq_expr
;
strict_binding:
EQUAL seq_expr
| labeled_simple_pattern fun_binding
;

Indeed there are some "unknown" items, but the difficult part here is
trying to parse the function declaration with
variable number of parameters, using no (slow) recursion.

Hm... maybe, I don't know, I found this both hard to think of and hard
to understand.

- Tom

0 new messages