I'm working on a similar problem, and this is the first thread I've seen discussing it. Where I ended up with my grammar was to match the pieces and build an AST by returning an object from each rule. So, I'd have a tree like:
1 type: open_tag, value: 'color'
2 type: text, value: 'textex'
2 type: open_tag, value: 'italic'
2 type: text, value: 'ttext'
1 type: close_tag, value: 'italic'
1 type: open_tag, value: 'italic'
2 type: text, value: 'ttext'
1 type: close_tag, value: 'italic'
0 type: text, value: 'ext " is better'
With that syntax tree, I'm able to write an emitter that works from the AST instead of emitting from inside the parser. I don't know if my approach is a good one or a bonehead one, but it relies on writing each nonterminal rule like this:
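    document
      = (text / tagged_text)*

    tagged_text
      = open_tag text close_tag

    open_tag
      = "<" t:[a-zA-Z0-9]+ (" /")? ">" { return { type: 'open_tag', value: t.join('') }; }

    close_tag
      = "</" t:[a-zA-Z0-9]+ ">" { return { type: 'close_tag', value: t.join('') }; }

    // The text rule wasn't spelled out above; this is an assumed minimal version.
    text
      = chars:[^<]+ { return { type: 'text', value: chars.join('') }; }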
Obviously, this doesn't handle matching begin and end tags, so the grammar I've spelled out only works if the tags are sensibly nested. Still, you could emit stuff from that tree, I guess.
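To make the "emit stuff from that tree" idea a bit more concrete, here is a rough sketch of an emitter that walks nodes shaped like the ones listed above. The function name emit and the HTML_TAGS mapping are made up for illustration, and the stack check is just one possible way to catch tags the grammar lets through unmatched. It flattens the input first, on the assumption that the parser hands back nested arrays for sequences.

```javascript
// Hypothetical mapping from tag names in the tree to output markup.
const HTML_TAGS = new Map([
  ['italic', 'em'],
  ['bold', 'strong'],
  ['color', 'span'],
]);

// Walk a (possibly nested) list of { type, value } nodes and build a string.
// A stack of open tag names catches mismatched or unclosed tags,
// which the grammar itself does not check.
function emit(nodes) {
  const open = [];
  let out = '';
  for (const node of nodes.flat(Infinity)) {
    if (node.type === 'text') {
      out += node.value;
    } else if (node.type === 'open_tag') {
      open.push(node.value);
      out += '<' + (HTML_TAGS.get(node.value) || node.value) + '>';
    } else if (node.type === 'close_tag') {
      const expected = open.pop();
      if (expected !== node.value) {
        throw new Error(
          'closing tag "' + node.value + '" does not match open tag "' + expected + '"'
        );
      }
      out += '</' + (HTML_TAGS.get(node.value) || node.value) + '>';
    } else {
      throw new Error('unknown node type: ' + node.type);
    }
  }
  if (open.length > 0) {
    throw new Error('unclosed tag: ' + open[open.length - 1]);
  }
  return out;
}
```

Feeding it an open_tag 'italic', a text 'hi', and a close_tag 'italic' gives back '<em>hi</em>'; a close_tag that doesn't match the most recent open_tag throws instead of emitting broken output.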
I'm trying to present an alternative solution while at the same time asking whether mine makes any sense. Also, I've noticed that with PEG.js there is no such thing as a partially OK parse, so the syntax matching is strict. Either the input text matches the grammar, or the tree is not populated and an exception is thrown. This is not how browsers work -- they try to do something sensible (or at least they try not to break horribly) on a syntax error.
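One way to soften the all-or-nothing behaviour is to catch the exception and fall back to something harmless. This is only a sketch: parseLeniently is a name I made up, parser is assumed to be a PEG.js-generated parser object with a parse(input) method that throws on failure, and treating the whole input as one text node is just the crudest fallback I could think of.

```javascript
// `parser` is assumed to be a PEG.js-generated parser: parse(input)
// either returns the parse result or throws an error on a syntax problem.
function parseLeniently(parser, input) {
  try {
    return { ok: true, nodes: parser.parse(input) };
  } catch (err) {
    // Fallback (an assumption, not PEG.js behaviour): keep the raw input
    // as a single text node so downstream code still has something to emit.
    return {
      ok: false,
      error: err.message,
      nodes: [{ type: 'text', value: input }],
    };
  }
}
```

That doesn't recover the well-formed part of the input the way a browser would, but it does keep a bad post from taking the whole page down with it.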
Hope this helps and sorry for jumping in with my own junk here.
Steve