Recursive parsing of TCL scripts

Jan 10, 2017, 10:13:16 AM
to jtcl-project
I am trying to recursively parse a series of TCL scripts and capture every command in a Java object tree. I have been able to parse a script by repeatedly calling the "command" option of TclParser.cmdProc, which returns the following information for each command:

<commentRange> <commandRange> <restRange> <parseTree>

where <parseTree> is defined as a list representation of the parse tree where each node is a list in the form:

<type> <range> <subTree>

Simple use case
On parsing a simple command such as "set idx 1", the <subTree> is a list containing 3 nodes where each node is of type "simple" (Node1 = "set", Node2 = "idx", Node3 = "1"). Each node has its own <subTree> containing a single node of type "text".
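
For illustration, here is a minimal sketch of how one such node could be held on the Java side. The class name ParseNode, its fields, and the simpleWord helper are all made up for this example and are not part of JTcl; it also assumes that a <range> is a start index plus a length.

```java
import java.util.List;

/** One parse-tree node in the form <type> <range> <subTree> (hypothetical class). */
final class ParseNode {
    final String type;              // e.g. "simple" or "text"
    final int start;                // assumed: index of the first character of the range
    final int length;               // assumed: number of characters in the range
    final List<ParseNode> subTree;  // child nodes, empty for a leaf

    ParseNode(String type, int start, int length, List<ParseNode> subTree) {
        this.type = type;
        this.start = start;
        this.length = length;
        this.subTree = subTree;
    }

    /** Returns the slice of the script that this node's range covers. */
    String text(String script) {
        return script.substring(start, start + length);
    }

    /** Builds a "simple" word whose subtree is a single "text" node over the same range. */
    static ParseNode simpleWord(int start, int length) {
        ParseNode text = new ParseNode("text", start, length, List.of());
        return new ParseNode("simple", start, length, List.of(text));
    }
}
```

With this shape, "set idx 1" would be three simpleWord nodes covering the ranges (0,3), (4,3) and (8,1).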

Slightly more involved use cases
When parsing commands that have more of a tree structure (such as while loops, "if" statements, etc.) the <subTree> still only returns "simple" nodes, each of which again has a <subTree> containing a single "text" node.

The issue I have is that each "simple" node may contain multiple TCL commands (e.g. the entire body of an "if" statement), expressions (e.g. the expression of an "if" statement), etc. I haven't been able to work out how to continue to recursively parse through an entire TCL script to the point where I can build a Java object tree where the leaves of the tree are individual commands (without command arguments), command arguments, operands, etc.

How can I go beyond the first level of parsing using the "command" option? I cannot simply recursively call this as not all returned nodes are complete commands. Any direction on this is much appreciated.


Tom Poindexter

Jan 10, 2017, 11:27:26 PM
to jtcl-project
Hi Kat,

Parsing Tcl ahead of time with the parser extension does require
recursive parsing, and knowledge of which commands have arguments that
are themselves bodies of Tcl code.

You might have a look at TSP, my Tcl to Java (and Tcl to C!) compiler:

TSP uses the parser command for all of its parsing. TSP makes the
assumption that core Tcl commands ('if', 'while', 'set', etc.) have not
been redefined. That way, I can always assume which commands have
arguments that are a body of code that requires recursive parsing.
Parsing Tcl command substitution (e.g. 'set a [cmd x y]') requires
recursion, but the parser returns a parse tree node of 'command' for
this situation, so you know that it's a Tcl command body.
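
As a rough Java sketch of the command substitution case: walk a word's subtree and pull out the script inside every node of type 'command', which can then be fed back to the parser. TreeNode and CommandSubstitutions are hypothetical names, not TSP or JTcl code, and the sketch assumes a range is a start index plus a length that includes the surrounding brackets.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal parse-tree node; the class and field names are illustrative only. */
final class TreeNode {
    final String type;
    final int start;
    final int length;
    final List<TreeNode> subTree;

    TreeNode(String type, int start, int length, List<TreeNode> subTree) {
        this.type = type;
        this.start = start;
        this.length = length;
        this.subTree = subTree;
    }
}

final class CommandSubstitutions {
    /** Depth-first search for "command" nodes; returns the script text inside each one. */
    static List<String> find(TreeNode node, String script) {
        List<String> found = new ArrayList<>();
        collect(node, script, found);
        return found;
    }

    private static void collect(TreeNode node, String script, List<String> found) {
        if (node.type.equals("command")) {
            // Assumption: the range covers "[" and "]", so drop one character at each end.
            found.add(script.substring(node.start + 1, node.start + node.length - 1));
            return; // the extracted text is a script of its own and is parsed separately
        }
        for (TreeNode child : node.subTree) {
            collect(child, script, found);
        }
    }
}
```

Each string returned by find would then be parsed as a script in its own right, which is where the recursion comes in.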

See the file tcl-parse.tcl, which parses a body of Tcl commands. As
commands are recognized by the compiler, it's up to each command that
may have recursion (if, while, for, etc.) to parse those string
arguments as Tcl commands. Note that tcl-parse.tcl contains a lot of
code specific to TSP.
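
The same idea can be sketched on the Java side. This is not how TSP is written (TSP is Tcl); it is only an illustration of recursing into the arguments known to be bodies, under the assumption that core commands have not been redefined. CommandSplitter is a hypothetical interface standing in for whatever wraps the parser's "command" option and returns each command as a list of words with the outer braces removed; CommandNode and ScriptTreeBuilder are likewise made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** One command in the object tree: its words plus the commands nested in its bodies. */
final class CommandNode {
    final List<String> words;
    final List<CommandNode> children = new ArrayList<>();

    CommandNode(List<String> words) { this.words = words; }

    String name() { return words.get(0); }
}

/** Hypothetical wrapper around the parser: one word list per command in the script. */
interface CommandSplitter {
    List<List<String>> split(String script);
}

final class ScriptTreeBuilder {
    // Assumption: core commands are not redefined. Values are the word indexes holding a body.
    private static final Map<String, int[]> BODY_WORDS = Map.of(
            "while", new int[] {2},       // while test body
            "for", new int[] {1, 3, 4},   // for start test next body
            "foreach", new int[] {3},     // simple form only: foreach var list body
            "proc", new int[] {3});       // proc name args body

    private final CommandSplitter splitter;

    ScriptTreeBuilder(CommandSplitter splitter) { this.splitter = splitter; }

    /** Parses a script and recurses into every argument known to be a body. */
    List<CommandNode> build(String script) {
        List<CommandNode> result = new ArrayList<>();
        for (List<String> words : splitter.split(script)) {
            if (words.isEmpty()) continue;
            CommandNode node = new CommandNode(words);
            for (int i : bodyIndexes(words)) {
                if (i < words.size()) node.children.addAll(build(words.get(i)));
            }
            result.add(node);
        }
        return result;
    }

    private static List<Integer> bodyIndexes(List<String> words) {
        List<Integer> out = new ArrayList<>();
        String name = words.get(0);
        if (name.equals("if")) {
            // if expr ?then? body ?elseif expr ?then? body ...? ?else? ?body?
            int i = 1;
            while (i < words.size()) {
                i++;                                                  // skip the condition
                if (i < words.size() && words.get(i).equals("then")) i++;
                if (i < words.size()) out.add(i++);                   // then-body
                if (i < words.size() && words.get(i).equals("elseif")) { i++; continue; }
                if (i < words.size() && words.get(i).equals("else")) i++;
                if (i < words.size()) out.add(i);                     // else-body
                break;
            }
        } else {
            for (int i : BODY_WORDS.getOrDefault(name, new int[0])) out.add(i);
        }
        return out;
    }
}
```

The table only covers a handful of commands and the simplest forms of each; any command not listed is treated as a leaf, so its arguments are kept as plain words.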

Even parsing single, non-control commands may require recursive
parsing to resolve array variables, backquoted characters, variable
substitution inside of strings, and the like. I have attached a small
program that simply prints the parse subtree for various Tcl syntax
constructs (at least all the ones I could think of). Run it as 'jtcl

Parsing expressions is another matter, see tcl-expr.tcl. This code is
also specific to TSP, literally producing equivalent expressions
in C and Java, again subject to assumed constraints.

Best regards,