Strategy for Simple bbcode Translator


Steve Ross

Jun 27, 2012, 5:13:09 PM
to pe...@googlegroups.com
I'm a total n00b, so pardon the naivety of my question. I have a grammar that is working ok in pegjs, but I'm trying to work out a number of things. The first is to decide whether to just get a raw parse tree and then deal with emitting the HTML from the calling code.

Here's my concern: emitters written as actions are context-aware. When traversing the parse tree afterwards, I don't see how you stay context-aware, so emitting inline might be better (or not? That's my question.)

As an example:

document
   = (uri_tag / email_tag / open_tag / close_tag / new_line / text)*

email_tag
   // Longer form first: PEG choice is ordered, so the short form would otherwise always win.
   = ("[email=" l:email_identifier "]" r:text "[/email]")
     { return '<a href="mailto:' + l.join('') + '">' + r.join('') + '</a>'; }
   / ("[email=" l:email_identifier "]")
     { return '<a href="mailto:' + l.join('') + '">' + l.join('') + '</a>'; }

This emits HTML and is aware of the special cases for the variants on how email can be specified in (loathsome) bbcode. I suspect that the parser is meant to be just that, with actions as sugar, and that emitting is meant to happen in client code. But I can't find any examples of that.

Please excuse the long and wandering question.

Thanks,

Steve

David Majda

Jul 29, 2012, 10:29:51 AM
to sxr...@gmail.com, pe...@googlegroups.com
Hi!

2012/6/27 Steve Ross <sxr...@gmail.com>:
Emitting desired output directly from parsing actions is a valid
technique. It is useful mainly for small and simple parsers where
processing some kind of parser output in a separate piece of code
would just add needless complexity.

Generating an abstract representation of the input and generating the
output from it as a separate step outside the parser is more suitable
for bigger parsers or parsers that require more processing/transformations
(and thus more code). In such cases, separating the concerns can help
readability and maintainability of the code. Moreover, you can write
the output-emitting code in pure JavaScript, so you can use all the
features your editor/IDE provides for it that may not be available for
PEG.js grammars.
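
A minimal sketch of such a separate output step, in plain JavaScript. The node shapes (`type`, `children`, `value`, `address`) and the function names `emit` and `escapeHtml` are assumptions made up for illustration; they are not part of PEG.js. The tree is written by hand here, but it stands in for whatever the parser would return.

```javascript
// Hypothetical node shapes; nothing here is provided by PEG.js itself.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Walk the tree recursively and build the HTML string.
function emit(node) {
  switch (node.type) {
    case "document":
      return node.children.map(emit).join("");
    case "text":
      return escapeHtml(node.value);
    case "newline":
      return "<br>";
    case "email": {
      // With no label, fall back to showing the address itself.
      const label = node.children.length > 0
        ? node.children.map(emit).join("")
        : escapeHtml(node.address);
      return '<a href="mailto:' + escapeHtml(node.address) + '">' + label + "</a>";
    }
    default:
      throw new Error("Unknown node type: " + node.type);
  }
}

// Hand-built tree standing in for parser output.
const tree = {
  type: "document",
  children: [
    { type: "text", value: "Contact: " },
    {
      type: "email",
      address: "a@example.com",
      children: [{ type: "text", value: "Alice & Bob" }]
    },
    { type: "newline" },
    { type: "email", address: "b@example.com", children: [] }
  ]
};

const html = emit(tree);
```

If the output of a node depends on where it sits in the tree, the walker can pass the parent node or a context object as an extra argument to `emit`, so a tree traversal does not have to lose context.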

Note that I wouldn't recommend using PEG.js's native tree format as
input to the output generation code. It is much better to create a more
abstract representation, e.g. a tree of JavaScript objects. Look at
the "examples" directory in PEG.js to see what such parsers can look like.
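
As a rough illustration of a grammar whose actions return plain JavaScript objects instead of HTML, here is a cut-down sketch based on the grammar from the question. The node shapes and the definitions of `email_identifier` and `text` are assumptions (the original definitions were not posted), and only the email tag is handled, so a stray "[" would make the parse fail.

```
// Sketch only: node shapes and the two helper rules are assumptions.
document
  = nodes:(email_tag / text)*
    { return { type: "document", children: nodes }; }

email_tag
  = "[email=" addr:email_identifier "]" label:text "[/email]"
    { return { type: "email", address: addr, children: [label] }; }
  / "[email=" addr:email_identifier "]"
    { return { type: "email", address: addr, children: [] }; }

email_identifier
  = chars:[^\]]+ { return chars.join(""); }

text
  = chars:[^\[\n]+ { return { type: "text", value: chars.join("") }; }
```

The calling code then receives one object tree and can decide separately how to turn each node type into HTML.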

--
David Majda
Entropy fighter
http://majda.cz/