[ANN] pigeon: a PEG parser generator for Go


Martin Angers

Apr 13, 2015, 7:19:49 AM
to golan...@googlegroups.com
I wrote a parser generator (a la yacc) based on PEGs for Go. It was inspired by the PEG.js project.


and the introductory blog post: http://0value.com/A-PEG-parser-generator-for-Go


Martin

Kurt Jung

Apr 14, 2015, 7:38:14 AM
to golan...@googlegroups.com
Excellent work! It's really nice to see a PEG parser generator for Go. I hope to see it in the standard library someday.

-- Kurt

andrewc...@gmail.com

Apr 14, 2015, 4:33:29 PM
to golan...@googlegroups.com
Very cool.

Why use a special arrow character, btw? Is there an alternative syntax for normal keyboards?

Martin Angers

Apr 14, 2015, 4:35:02 PM
to andrewc...@gmail.com, golan...@googlegroups.com
Oh sure, there are four alternatives for the rule definition operator:

'=', '<-' and the two left-arrow Unicode code points.


Peter Kleiweg

Apr 15, 2015, 7:31:37 AM
to golan...@googlegroups.com
What happens when an expression returns an error? Does that make the match fail, and the parser backtrack?

With 'and' and 'not' expressions, what is the meaning of the boolean return value?

Martin Angers

Apr 15, 2015, 7:50:27 AM
to golan...@googlegroups.com
No. An action code block is always called when a match succeeds, and it cannot turn that match into a failure. Returning an error doesn't change that; it just adds the error to the list of errors encountered. That can be useful for returning a specific error when you know for sure that backtracking cannot match anything else, which is what pigeon does in its own grammar for unclosed literals, for example.

For & and !, the boolean decides whether it is a match or not (& is a match if the code block returns true, ! is a match if it returns false). This mirrors the non-code-block versions of the & and ! operators.

Peter Kleiweg

Apr 15, 2015, 8:57:57 AM
to golan...@googlegroups.com
How do I convert a labeled terminal to a string? Something like this:

    Term <- term:[a-z]+ space:[ \t\r\f\n]* {
         // termString is term as string
         return termString, nil
    }

Whatever I try, I get "interface conversion: interface is []interface {}". Why isn't it a []byte, just like c.text?

Peter Kleiweg

Apr 15, 2015, 9:03:36 AM
to golan...@googlegroups.com
You might want to give all generated names for "internal" use a common, documented prefix, to prevent name clashes with user-defined names in future releases of your software.

Martin Angers

Apr 15, 2015, 10:09:50 AM
to Peter Kleiweg, golan...@googlegroups.com
When the expression is a sequence (Term = label:('a' 'b' 'c')) or a repetition, as in your case (Term = term:[a-z]+), the value is a []interface{} because the label holds multiple separate matches. `c.text`, on the other hand, is the whole match associated with the code block, as a []byte (e.g. it could be "abc\n\t" in your case).

I agree that this is confusing; I will try to document it better.

Good point about the prefix to avoid name clashes; I will think about that.
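
One way to get a string out of such a label is to walk the []interface{} and concatenate the pieces. The sketch below assumes each element of the repetition is a []byte holding one matched character; `joinMatches` is a hypothetical helper name, not something pigeon provides, and it returns an error rather than panicking if that assumption does not hold.

```go
package main

import (
	"bytes"
	"fmt"
)

// joinMatches is a hypothetical helper: it concatenates the value of a
// labeled repetition into one string. It assumes every element is a []byte
// and reports anything else as an error instead of panicking.
func joinMatches(v interface{}) (string, error) {
	parts, ok := v.([]interface{})
	if !ok {
		return "", fmt.Errorf("expected []interface{}, got %T", v)
	}
	var buf bytes.Buffer
	for _, p := range parts {
		b, ok := p.([]byte)
		if !ok {
			return "", fmt.Errorf("expected []byte element, got %T", p)
		}
		buf.Write(b)
	}
	return buf.String(), nil
}

func main() {
	// Stand-in for what the label `term` would hold after matching "abc".
	term := []interface{}{[]byte("a"), []byte("b"), []byte("c")}
	s, err := joinMatches(term)
	fmt.Println(s, err)
}
```

In the grammar from the question, the action body could call such a helper on `term` and return the result. Using `c.text` directly would not give the same string there, since it also covers the trailing whitespace matched by `space`.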
