Re: [Google Groups] Question about tree structure

Dave Dolan

Oct 1, 2012, 3:11:12 PM

to gold-pars...@googlegroups.com

This is the expected behavior for the engine. What you probably want, though, is the "Trim Reductions" option, which eliminates the unnecessary nonterminal nodes from the tree. I believe there is an icon for it somewhere in the toolbar.
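To illustrate the idea, here is a minimal Python sketch of trimming applied as a pass over a finished tree. It is not the engine's code: the `Node` class, `trim`, `chain`, and `render` are hypothetical names made up for this example, and it assumes that trimming means dropping any reduction whose only child is another nonterminal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical parse-tree node: a reduction (rule head) or a terminal."""
    label: str
    children: List["Node"] = field(default_factory=list)
    is_terminal: bool = False


def trim(node: Node) -> Node:
    """Collapse every reduction whose only child is another reduction."""
    while (not node.is_terminal and len(node.children) == 1
           and not node.children[0].is_terminal):
        node = node.children[0]
    node.children = [trim(child) for child in node.children]
    return node


def chain(labels, leaf):
    """Wrap `leaf` in single-child reductions, outermost label first."""
    node = leaf
    for label in reversed(labels):
        node = Node(label, [node])
    return node


def render(node, depth=0):
    """Return the tree as a list of indented lines."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def value(text):
    # <Negate Exp> -> <Power Exp> -> <Value> -> literal, as in the dump below
    return chain(["<Negate Exp>", "<Power Exp>", "<Value>"],
                 Node(text, is_terminal=True))


# The <Expression> subtree for "2*3", in simplified form.
mult = Node("<Mult Exp>", [value("2"),
                           Node("*", is_terminal=True),
                           Node("<Mult Exp>", [value("3")])])
expression = chain(["<Expression>", "<And Exp>", "<Not Exp>", "<Compare Exp>",
                    "<Shift Exp>", "<Concat Exp>", "<Add Exp>",
                    "<Modulus Exp>", "<Int Div Exp>"], mult)

trimmed = trim(expression)
print("\n".join(render(trimmed)))
```

In this sketch the ten-level chain above the multiplication collapses to a single <Mult Exp> node with three children: a <Value> holding 2, the '*' token, and a <Value> holding 3.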

On Oct 1, 2012 2:49 PM, "Darkcobra" <darkc...@gmail.com> wrote:

Let me start off by saying that I've never used anything like this before. I figured I'd start with something simple, parsing some code into a viewable tree. So I downloaded:

1) The "Visual Basic Parse Tree" project from the Engines category.
2) The "Visual Basic.NET" grammar.
3) And the "GOLD Parser Builder", to recompile the grammar (since the engine wouldn't accept the precompiled grammar).

Then I fed the engine some simple test code:

module module1
sub main()
a=2*3
end sub
end module

Looking at the output, the tree structure is intelligible, but there appears to be an incredible amount of useless information. The following clip of the tree for the line "a=2*3" is an excellent example. Useful nodes that show what a token is are nested deep within a majority of seemingly useless nodes that show what the token isn't. As a complete newbie, I'm not sure what to make of this. Is there a flaw in the grammar or engine? Or is this expected behavior from a parser?

| | | | | | +-<Statements> ::= <Statement> <Statements>
| | | | | | | +-<Statement> ::= <Non-Block Stm> <NL>
| | | | | | | | +-<Non-Block Stm> ::= <Variable> <Assign Op> <Expression>
| | | | | | | | | +-<Variable> ::= <Identifier> <Argument List Opt> <Method Calls>
| | | | | | | | | | +-<Identifier> ::= ID
| | | | | | | | | | | +-a
| | | | | | | | | | +-<Argument List Opt> ::=
| | | | | | | | | | +-<Method Calls> ::=
| | | | | | | | | +-<Assign Op> ::= '='
| | | | | | | | | | +-=
| | | | | | | | | +-<Expression> ::= <And Exp>
| | | | | | | | | | +-<And Exp> ::= <Not Exp>
| | | | | | | | | | | +-<Not Exp> ::= <Compare Exp>
| | | | | | | | | | | | +-<Compare Exp> ::= <Shift Exp>
| | | | | | | | | | | | | +-<Shift Exp> ::= <Concat Exp>
| | | | | | | | | | | | | | +-<Concat Exp> ::= <Add Exp>
| | | | | | | | | | | | | | | +-<Add Exp> ::= <Modulus Exp>
| | | | | | | | | | | | | | | | +-<Modulus Exp> ::= <Int Div Exp>
| | | | | | | | | | | | | | | | | +-<Int Div Exp> ::= <Mult Exp>
| | | | | | | | | | | | | | | | | | +-<Mult Exp> ::= <Negate Exp> '*' <Mult Exp>
| | | | | | | | | | | | | | | | | | | +-<Negate Exp> ::= <Power Exp>
| | | | | | | | | | | | | | | | | | | | +-<Power Exp> ::= <Value>
| | | | | | | | | | | | | | | | | | | | | +-<Value> ::= IntLiteral
| | | | | | | | | | | | | | | | | | | | | | +-2
| | | | | | | | | | | | | | | | | | | +-*
| | | | | | | | | | | | | | | | | | | +-<Mult Exp> ::= <Negate Exp>
| | | | | | | | | | | | | | | | | | | | +-<Negate Exp> ::= <Power Exp>
| | | | | | | | | | | | | | | | | | | | | +-<Power Exp> ::= <Value>
| | | | | | | | | | | | | | | | | | | | | | +-<Value> ::= IntLiteral
| | | | | | | | | | | | | | | | | | | | | | | +-3
| | | | | | | | +-<NL> ::= NewLine
| | | | | | | | | +-

--
You received this message because you are subscribed to the Google Groups "GOLD Parsing System" group.
To view this discussion on the web visit https://groups.google.com/d/msg/gold-parsing-system/-/NhGloj05M94J.
To post to this group, send email to gold-pars...@googlegroups.com.
To unsubscribe from this group, send email to gold-parsing-sy...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/gold-parsing-system?hl=en.

Peter Frazer

Oct 1, 2012, 4:05:22 PM

to gold-pars...@googlegroups.com

Hello Darkcobra,

If you are only interested in the sequence of tokens, then there are indeed a lot of 'seemingly useless nodes'. The tokens, or lexemes, are the string of terminal symbols from the input, and producing them is the job of a lexer or scanner. What the 'seemingly useless nodes' give you is the structure of the input in terms of the constructs of the grammar. It is this information that is usually crucial in determining how to process the input further.

Suppose you want to evaluate the expression. If the grammar encodes a (possibly complex) operator precedence, then knowing the structure makes this a whole lot easier than having just a sequence of tokens. For example, the structure makes it possible to construct a correct sequence of binary operators that can be pushed and popped on a stack to evaluate the expression.
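As a rough sketch of that idea in Python (not code from any GOLD engine): assume the parse tree has already been reduced to nested `(left, operator, right)` tuples with integer leaves. That representation, and the names `to_postfix` and `evaluate`, are assumptions made for this example. A post-order walk emits operands before their operator, and a stack then evaluates the resulting sequence.

```python
import operator

# Hypothetical operator table; only a few binary operators for illustration.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def to_postfix(tree):
    """Post-order walk: left operand, right operand, then the operator."""
    if isinstance(tree, int):
        return [tree]
    left, op, right = tree
    return to_postfix(left) + to_postfix(right) + [op]


def evaluate(postfix):
    """Evaluate a postfix sequence by pushing operands and popping for operators."""
    stack = []
    for item in postfix:
        if isinstance(item, int):
            stack.append(item)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[item](left, right))
    return stack.pop()


# The nesting records the precedence: in 1+2*3 the multiplication sits deeper
# in the tree, so it is emitted and evaluated before the addition.
tree = (1, "+", (2, "*", 3))
postfix = to_postfix(tree)
result = evaluate(postfix)
print(postfix, result)
```

With only the flat token sequence 1 + 2 * 3 you would have to re-derive the precedence yourself; here the tree shape already carries it.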

Sometimes the grammar has layers that you do not need. In that case it is very easy to write a boolean skipNode(token) function that compares the token with a list of those you wish to skip.
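A minimal sketch of such a function in Python, assuming each tree node is a `(label, children)` tuple. That layout, the contents of `SKIP`, and the helper `visible_lines` are all hypothetical, chosen only for this example. Skipped nodes are left out of the output, and their children are shown at the skipped node's level.

```python
# Hypothetical set of rule names to hide; adjust to the grammar in use.
SKIP = {"<And Exp>", "<Not Exp>", "<Compare Exp>"}


def skip_node(label):
    """Return True if a node with this label should be hidden."""
    return label in SKIP


def visible_lines(node, depth=0):
    """Return indented lines for the tree, omitting skipped nodes."""
    label, children = node
    lines = []
    if skip_node(label):
        # Hide this node and promote its children to its own depth.
        for child in children:
            lines.extend(visible_lines(child, depth))
        return lines
    lines.append("  " * depth + label)
    for child in children:
        lines.extend(visible_lines(child, depth + 1))
    return lines


tree = ("<Expression>", [
    ("<And Exp>", [
        ("<Not Exp>", [
            ("<Mult Exp>", [("2", []), ("*", []), ("3", [])]),
        ]),
    ]),
])

lines = visible_lines(tree)
print("\n".join(lines))
```

Unlike a blanket trim, a skip list lets you decide rule by rule which layers matter to your own processing.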

I am also fairly new to this so someone else may have a better reply.

Peter.


Darkcobra

Oct 1, 2012, 5:55:09 PM

to gold-pars...@googlegroups.com

That did the trick, thank you!

Darkcobra

Oct 1, 2012, 6:23:53 PM

to gold-pars...@googlegroups.com

The thing that was confusing me is that it generated a structure which represented the precedence of not only the operators used in an expression, but also many that weren't used. The "Trim Reductions" option as suggested above cleared that up.

But your explanation was clear, and provided me with some other useful information. Thanks!
