Tree reading

Martin DeMello

unread,

Jun 12, 2006, 5:39:27 AM6/12/06

to

The problem is simple enough - I want to represent a tree in the form
[a, b, [c, d, [e, f], [g, h], [i, j]], [k, l]]

where any given array can contain a sequence of atoms followed by a
sequence of arrays, corresponding to nodedata and children respectively.
However, the code to read it got a bit ugly - can anyone improve on
this:

def import_tree_at(parent, tree)
ary = tree.find {|i| Array === i}
p = ary ? tree.index(ary) : nil
if p.nil?
tree.each_with_index {|d, i| parent[i] = d}
else
tree[0...p].each_with_index {|d, i| parent[i] = d}
tree[p..-1].each {|d|
node = @store.append(parent)
import_tree_at(node, d) if Array === d
}
end
end

(If nothing else, I think it shows the need for an Array#index {|i|
pred?(i)}) to replace the index(find {pred}) double-pass.)

martin

Robert Klemme

unread,

Jun 12, 2006, 8:28:04 AM6/12/06

to

I don't fully understand what your code does. I'm especially wondering
where from it gets the input. I'd probably create a proper tree class
with references to parent and children and do the import on top of that.
If you need to, you can still create nested arrays from that.

14:21:23 [~]: irbs -r enumerator
>> %w{aa bb cc}.to_enum(:each_with_index).find {|a,i| a == "bb"}
=> ["bb", 1]

Thusly

>> el, idx = %w{aa bb cc}.to_enum(:each_with_index).find {|a,i| a == "bb"}
=> ["bb", 1]
>> el
=> "bb"
>> idx
=> 1

Cheers

robert

Martin DeMello

unread,

Jun 12, 2006, 9:35:26 AM6/12/06

to

Robert Klemme <bob....@gmx.net> wrote:

> Martin DeMello wrote:
> >
> > def import_tree_at(parent, tree)
> > ary = tree.find {|i| Array === i}
> > p = ary ? tree.index(ary) : nil
> > if p.nil?
> > tree.each_with_index {|d, i| parent[i] = d}
> > else
> > tree[0...p].each_with_index {|d, i| parent[i] = d}
> > tree[p..-1].each {|d|
> > node = @store.append(parent)
> > import_tree_at(node, d) if Array === d
> > }
> > end
> > end
>

> I don't fully understand what your code does. I'm especially wondering
> where from it gets the input.

It's from a Gtk::SimpleTree class I'm writing; the point is to have the
input in as lightweight a form as possible. Here's some sample client
code:

# define the tree
s = Gtk::SimpleTree.new([String, String, Integer],
["First Name", 0],
["Last Name", 1,
{:weight => Pango::FontDescription::WEIGHT_BOLD}],
["Age", format_age, {:foreground => "red"}],
["Age", simple_age])

# define the initial data
treedata = [
['Maria', 'Incognito'],
['Jane', 'Average', 1962,
['Janinita', 'Average', 1985]]]

s.import_tree(treedata)

> I'd probably create a proper tree class
> with references to parent and children and do the import on top of that.
> If you need to, you can still create nested arrays from that.

The point is to enter the initial tree inline; there's already a proper
tree class at the back end.

> >> el, idx = %w{aa bb cc}.to_enum(:each_with_index).find {|a,i| a == "bb"}
> => ["bb", 1]
> >> el
> => "bb"
> >> idx
> => 1

Ah, very nice! I keep forgetting about the new stuff from enumerator.

martin

Robert Klemme

unread,

Jun 12, 2006, 3:55:07 PM6/12/06

to

Ok, I though you were going to parse the tree from some representation.
If you still feel your code is awkward you can still create a helper
class for the import process. Just an idea. The one thing that struck
me odd about your code was this array referencing and indexing. But
that's probably necessary because your input is already a tree.

Kind regards

robert

Martin DeMello

unread,

Jun 12, 2006, 5:07:49 PM6/12/06

to

Robert Klemme <bob....@gmx.net> wrote:
>
> Ok, I though you were going to parse the tree from some representation.
> If you still feel your code is awkward you can still create a helper
> class for the import process. Just an idea. The one thing that struck
> me odd about your code was this array referencing and indexing. But
> that's probably necessary because your input is already a tree.

The awkwardness is specifically this: in C what I'd do is scan through
the array element by element, checking that the element was not an array
and adding it to the node's propertes. If I find an array, I set the
reading_children flag, and continue iterating, passing every array to
the recursive call and ignoring everything that isn't an array (just for
some extra safety). It's nice because it only involves a single pass
through the array.

I can do the same thing in ruby, but only by essentially writing C code
with a ruby syntax, conversely, attempts to write it in clean, idiomatic
ruby keep adding hidden constants to the running time (probably doesn't
matter in practice, but it's unsatisfying). What I *want* to write is

ix = ary.while {|i| !(Array === i)} {|i| add i to node data}
ary[ix..-1].select {|i| Array === i} {|i| recursively call self with i}

Which would be possible if methods could take more than one block - I
want one block for the predicate and one to be yielded to if the
predicate returns true. I could still approximate it with enumerators or
lambdas, I guess, but it feels weird to write Array#while and rewrite
Array#select and then only use them once.

martin

Message has been deleted

Jens Auer

unread,

Jun 13, 2006, 4:11:06 AM6/13/06

to

Robert Klemme wrote:
> Ok, I though you were going to parse the tree from some representation.
> If you still feel your code is awkward you can still create a helper
> class for the import process. Just an idea.

In case you want to build a helper class, you should have a look at the
Builder pattern (Gamma et al.). I would prefer it to construct the data
structure.

Robert Klemme

unread,

Jun 13, 2006, 7:33:31 AM6/13/06

to

Martin DeMello wrote:
> Robert Klemme <bob....@gmx.net> wrote:
>> Ok, I though you were going to parse the tree from some representation.
>> If you still feel your code is awkward you can still create a helper
>> class for the import process. Just an idea. The one thing that struck
>> me odd about your code was this array referencing and indexing. But
>> that's probably necessary because your input is already a tree.
>
> The awkwardness is specifically this: in C what I'd do is scan through
> the array element by element, checking that the element was not an array
> and adding it to the node's propertes. If I find an array, I set the
> reading_children flag, and continue iterating, passing every array to
> the recursive call and ignoring everything that isn't an array (just for
> some extra safety). It's nice because it only involves a single pass
> through the array.
>
> I can do the same thing in ruby, but only by essentially writing C code
> with a ruby syntax, conversely, attempts to write it in clean, idiomatic
> ruby keep adding hidden constants to the running time (probably doesn't
> matter in practice, but it's unsatisfying). What I *want* to write is
>
> ix = ary.while {|i| !(Array === i)} {|i| add i to node data}
> ary[ix..-1].select {|i| Array === i} {|i| recursively call self with i}

Why can't you just do this?

def parse(node, arr)
arr.each do |elem|
if Enumerable === elem
node.add_child_node( parse( Node.new( node ), elem ) )
else
node.add_child_data( elem )
end
end
node
end

> Which would be possible if methods could take more than one block - I
> want one block for the predicate and one to be yielded to if the
> predicate returns true. I could still approximate it with enumerators or
> lambdas, I guess, but it feels weird to write Array#while and rewrite
> Array#select and then only use them once.

I guess I'm overlooking something probably because I don't know the the
tree implementation you are using.

Kind regards

robert

Martin DeMello

unread,

Jun 14, 2006, 1:14:45 AM6/14/06

to

Robert Klemme <bob....@gmx.net> wrote:
> Martin DeMello wrote:

> > matter in practice, but it's unsatisfying). What I *want* to write is
> >
> > ix = ary.while {|i| !(Array === i)} {|i| add i to node data}
> > ary[ix..-1].select {|i| Array === i} {|i| recursively call self with i}
>
> Why can't you just do this?
>
> def parse(node, arr)
> arr.each do |elem|
> if Enumerable === elem

String includes Enumerable :(

> node.add_child_node( parse( Node.new( node ), elem ) )
> else
> node.add_child_data( elem )
> end
> end
> node
> end

But yeah, that's essentially what I ended up using, with a flag to note
that once you see the first child, you stop accepting node data. I just
disike having an entire loop body wrapped in a switch.

> I guess I'm overlooking something probably because I don't know the the
> tree implementation you are using.

It's not an implementation problem, just the fact that 'flagged'
behaviour in a loop is so ugly to code, and I was wondering if people
had come up with better ideas.

Note that trying to design a nice Array#while (or #take_while if you
prefer) really does present an interface design problem - either
1. you want a block for the predicate and a block to yield values to, or
2. you want to return the collection and the index at which it stopped

#1 can be got around by using foo(lambda {}) {} and #2 by returning the
collection and letting the user perform the extra step of retval.length
to see where it stopped, but both are (in the strictest sense of the
word) less than ideal.

martin

Martin DeMello

unread,

Jun 14, 2006, 1:28:16 AM6/14/06

to

billmcn <bil...@gmail.com> wrote:
> Check out http://rubyforge.org/projects/treebank/.

Sounded interesting, but all there was was a gem, and it died with
Install required dependency fsa? [Yn] y
ERROR: While executing gem ... (Gem::GemNotFoundException)
Could not find fsa (> 0.0.0) in the repository

martin

Robert Klemme

unread,

Jun 15, 2006, 8:58:58 AM6/15/06

to

Martin DeMello <martin...@yahoo.com> wrote:
> Robert Klemme <bob....@gmx.net> wrote:
>> Martin DeMello wrote:
>>> matter in practice, but it's unsatisfying). What I *want* to write
>>> is
>>>
>>> ix = ary.while {|i| !(Array === i)} {|i| add i to node data}
>>> ary[ix..-1].select {|i| Array === i} {|i| recursively call self
>>> with i}
>>
>> Why can't you just do this?
>>
>> def parse(node, arr)
>> arr.each do |elem|
>> if Enumerable === elem
>
> String includes Enumerable :(

Oh yes, my bad.

>> node.add_child_node( parse( Node.new( node ), elem ) )
>> else
>> node.add_child_data( elem )
>> end
>> end
>> node
>> end
>
> But yeah, that's essentially what I ended up using, with a flag to
> note that once you see the first child, you stop accepting node data.
> I just disike having an entire loop body wrapped in a switch.
>
>> I guess I'm overlooking something probably because I don't know the
>> the tree implementation you are using.
>
> It's not an implementation problem, just the fact that 'flagged'
> behaviour in a loop is so ugly to code, and I was wondering if people
> had come up with better ideas.

I don't find the solution above ugly. I'm not sure why you find it ugly. I
rather dislike the solution where you have to do array indexing because it's
less efficient.

> Note that trying to design a nice Array#while (or #take_while if you
> prefer) really does present an interface design problem - either
> 1. you want a block for the predicate and a block to yield values to,
> or

I'm not sure what exactly you want to achieve with Enumerable#while, but you
know that you can stop an iteration with "break" do you?

>> %w{aa bb cc dd}.each {|w| p w; break if /^b+$/ =~ w}
"aa"
"bb"
=> nil

Note that you can rewrite the method from your first posting simpler and
more efficiently as

def import_tree_at(parent, tree)
idx = tree.each_with_index {|e,i| break i if Array === e}
idx = tree.size unless Numeric === idx
tree.each_with_index do |d, i|
if i < idx

parent[i] = d
else

node = @store.append(parent)
import_tree_at(node, d) if Array === d
end
end

end

> 2. you want to return the collection and the index at which it stopped
>
> #1 can be got around by using foo(lambda {}) {} and #2 by returning
> the collection and letting the user perform the extra step of
> retval.length to see where it stopped, but both are (in the strictest
> sense of the word) less than ideal.

There is an easier and more elegant solution for #2: return multiple values.

coll, pos = arr.whatever

Kind regards

robert

Martin DeMello

unread,

Jun 21, 2006, 3:23:14 PM6/21/06

to

Robert Klemme <bob....@gmx.net> wrote:

> Martin DeMello <martin...@yahoo.com> wrote:
> >
> > It's not an implementation problem, just the fact that 'flagged'
> > behaviour in a loop is so ugly to code, and I was wondering if people
> > had come up with better ideas.
>
> I don't find the solution above ugly. I'm not sure why you find it ugly. I
> rather dislike the solution where you have to do array indexing because it's
> less efficient.

I guess it's just a matter of aesthetics. If I have a loop the body of
which is enclosed in an if/then or switch statement, I prefer if at all
possible to render the branching code invisible, either by pulling it
upwards into the method running the loop, or doing some sort of
polymorphic dispatching within the block, or whatever.

martin

Robert Klemme

unread,

Jun 21, 2006, 4:53:05 PM6/21/06

to

Hm, that really sounds aesthetic to me. The situation where I agree is to
try to pull decisions out of the code that are the same for every iteration
because it's usually more efficient to have two loops with differing code
instead of one with an if then else.

Thanks for explaining!

Kind regards

robert

Martin DeMello

unread,

Jun 23, 2006, 5:02:24 AM6/23/06

to

Robert Klemme <bob....@gmx.net> wrote:
>
> Hm, that really sounds aesthetic to me. The situation where I agree is to
> try to pull decisions out of the code that are the same for every iteration
> because it's usually more efficient to have two loops with differing code
> instead of one with an if then else.

That was my initial motivation, especially for code like

flag = false
each {
if flag
do stuff
else
do other stuff
check condition
set flag
end
}

but now i just don't like it at all :)

martin