Tree representations of programs (Was... Anybody there?)

ira_d._baxter_[idbaxter@semdesigns.com]

unread,

Jan 23, 2001, 4:43:03 PM1/23/01

to comp-lan...@moderators.isc.org

Tree representations (as data structures) of programs are good because they
organize the program
according the structure of the programming language phrases.
(This far better than organizing the program according to text characters).

This allows a program to be manipulated in a manner which is consequence
of the language syntax; i.e., you can swap one statement with its follower
easily and relatiably when you represent them as trees, but you cannot do
that
if the program is represented as text.

Having said that, how something is represented is really a function of
what you want to do with it. If you want to compute information
flows in a program, a data flow representation of it is far better
than a tree (is better than the text). An interesting point is,
that most "advanced" representations (flow graphs) practically
require to you build the tree representation anyway, because
you need the basic language structure to figure what things
flow between :-}

Check the compiler textbooks; the Aho&Ullman "Dragon" book on compilers
is really pretty good.

You might be interested in the DMS Reengineering Toolkit,
which parses many langauges to trees, and then provides tools
to easily analyze/modify/port those trees;
the results can be prettyprinted. We like to think of DMS
as being the "ultimate" tool for handling tree representations
(and others) for programs.
See http://www.semdesigns.com/Products/DMS/DMSToolkit.html.

Since you asked this question in comp.lang.visual, I should address
the "visual" aspect. I think there is very *little* value is showing
such trees to people, except perhaps to educate them as
to what the language rules are. After all, the trees are
usually derived from a perfectly good text-based language :-}

Arguably, there is some utility of graphical trees to represent expressions.
But I think they tend to take up a lot of screen real estate,
and the text equivalent is generally far more compact.
YMMV.

--
Ira D. Baxter, Ph.D.,CTO email: idba...@semdesigns.com
Semantic Designs, Inc. web: http://www.semdesigns.com
12636 Research Blvd. C-214 voice: (512) 250-1018 x140
Austin, TX 78759-2200 fax: (512) 250-1191

<Daniel Goodburn [good...@ozemail.com.au]> wrote in message
news:94jvh7$gfo$1...@xsdev1.blackrock.com...
> Hi,
>
> Although no one seems to use this newsgroup it seems to be the most
> appropriately named group for my question so here goes.
>
> I'm looking for references that discusses the pros and cons of tree
> representations of programs. Why are trees good for expressing programs or
> why are they used so generally? It seems like a question that would have
> been answered ages ago, but I can't find any references that explain this
> straightout.
>
> Regards,
>
> Daniel
>

paul_morrison_[paul.morrison@home.com]

unread,

Jan 24, 2001, 5:00:22 PM1/24/01

to comp-lan...@moderators.isc.org

Ira, D., Baxter, [idba...@semdesigns.com] wrote:

> If you want to compute information
> flows in a program, a data flow representation of it is far better
> than a tree (is better than the text). An interesting point is,
> that most "advanced" representations (flow graphs) practically
> require to you build the tree representation anyway, because
> you need the basic language structure to figure what things
> flow between :-}
>

Sorry, I don't understand this bit - I have been using data flows for 30 years
(see my home page currently at http://paulmorrison.affordablehost.com/fbp/ and
we only use tree structures _within_ individual components - where they are
indeed very appropriate. You could also represent certain data structures
passing across the connections as trees. You cannot build a tree representation
of a true data flow - but maybe I misunderstood :-)

ira_d._baxter_[idbaxter@semdesigns.com]

unread,

Jan 25, 2001, 11:37:56 AM1/25/01

to comp-lan...@moderators.isc.org

<Paul Morrison [paul.m...@home.com]> wrote in message
news:94nj9m$on8$1...@xsdev1.blackrock.com...

Agreed. I think you'll agree, that to get data flows, you typically need
to build a tree (in essence, parse
the program), and then compute flow information from that.
I did not mean to imply that the trees themselves represent flows.

-- IDB

paul_morrison_[paul.morrison@home.com]

unread,

Jan 26, 2001, 7:50:49 AM1/26/01

to comp-lan...@moderators.isc.org

Ira, D., Baxter, [idba...@semdesigns.com] wrote:

> I think you'll agree, that to get data flows, you typically need
> to build a tree (in essence, parse
> the program), and then compute flow information from that.

Would you say that this approach is used more in heavily computational
applications? Our work has mostly been in the business area, where we
feel it is very natural to express the logic in terms of transformations
applied to data streams. So our problem has almost been the reverse: in
the absence of "engines" that think that way, applications had to be
converted from that type of thinking to single-threaded logic (which
normally implies a tree structure).
Michael Jackson talks about this in his book - IFIRC it entails turning
all the components inside-out except one, which becomes the main-line or
root of the tree (shudder!). Now we have a few such engines (the
latest being written in Java), so this very awkward transformation is no
longer necessary.

ira_d._baxter_[idbaxter@semdesigns.com]

unread,

Jan 26, 2001, 10:12:51 AM1/26/01

to comp-lan...@moderators.isc.org

<Paul Morrison [paul.m...@home.com]> wrote in message

news:94rrr9$m32$1...@xserv1.blackrock.com...

> Ira, D., Baxter, [idba...@semdesigns.com] wrote:
>
> > I think you'll agree, that to get data flows, you typically need
> > to build a tree (in essence, parse
> > the program), and then compute flow information from that.
>
> Would you say that this approach is used more in heavily computational
> applications? Our work has mostly been in the business area, where we
> feel it is very natural to express the logic in terms of transformations
> applied to data streams. So our problem has almost been the reverse: in
> the absence of "engines" that think that way, applications had to be
> converted from that type of thinking to single-threaded logic (which
> normally implies a tree structure).

If you start with the assumption of data flows as your basic primitive,
and you compose them with operators, then I suppose data flow
information is nearly free. (And many visual languages represent
these flows as visual "expression trees").
Parsing isnt really necessary in this case.

I was thinking (and mostly deal with) of non-visual langauges.
For "business", that would include COBOL (hiss, boo, oooodles of code,
...hmm)
For getting data flow from COBOL (or any other procedural langauge,
regardless of whether you think it is heavily computational or not),
you simply have to start by parsing it. Worse, you have
compute flows in individual programs, then you have to parse
the JCL that glues a series of COBOL programs together
into an application system, and integrate the COBOL and JCL
flows.

When you done, you can display the result visually.

-- IDB