vnodes, tnodes, and unknownAttributes


Terry Brown

Feb 25, 2008, 6:14:49 PM
to leo-e...@googlegroups.com
As always, I'm thinking about cleo, for which there are some
overdue improvements that I might get around to tackling when the bzr
system is set up.

But I'm wondering if there are broader issues here, and if a general
rather than cleo specific solution might be useful.

unknownAttributes (uAs) can be bound to vnodes and / or tnodes, it's up
to the application (script, plug-in, whatever) to decide which location
to use / search. So if vnodes A and B are 'clones' of tnode C, there
are three different places to store/retrieve uAs.
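
To make the three locations concrete, here is a minimal sketch. It uses stand-in classes rather than Leo's real ones: `TNode`, `VNode`, and the `priority` / `color` keys are assumptions made up for illustration.

```python
# Toy stand-ins for Leo's node classes (hypothetical, not Leo's real API).
class TNode:
    def __init__(self, headline):
        self.headline = headline
        self.unknownAttributes = {}   # shared by every clone

class VNode:
    def __init__(self, t):
        self.t = t                    # the tnode this vnode points to
        self.unknownAttributes = {}   # private to this vnode

c_node = TNode("C")
a, b = VNode(c_node), VNode(c_node)   # A and B are clones of C

# Three distinct places to store a uA:
c_node.unknownAttributes["priority"] = 1   # seen via both A and B
a.unknownAttributes["color"] = "pink"      # seen only via A
b.unknownAttributes["color"] = "white"     # seen only via B

print(a.t.unknownAttributes is b.t.unknownAttributes)
print(a.unknownAttributes, b.unknownAttributes)
```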

Clones are often used to make indexes or short lists of nodes that are
spread around in other places, a common example being an "active tasks"
node containing clones of nodes on which work is currently being done.

What I struggle with is that some attributes should be universal for
the (t)node in any context, and others only make sense in some contexts.
For example you might color code the backgrounds of cloned nodes in an
active task list to indicate urgency. But when you're looking at the
node in its "primary" location you don't want to be distracted with
"why is that one node pink?" type noise.

Thinking about it I suspect there's no general solution, these are
application specific issues.

A complicated solution might be for cleo or whatever to try and
interpret the context in which the node's displayed, and selectively
use / ignore attributes based on that.

A simpler solution might involve, when changing uAs on a vnode,
iterating the other clones and updating common attributes without
affecting attributes which aren't shared.

NOTE TO SELF: updates could be on in-common keys or restricted to
in-common key/value pairs.
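
The two update policies in the note could be sketched roughly as follows. Plain dicts stand in for each clone's v.unknownAttributes, and the function name `propagate` and its `match_values` flag are hypothetical, not anything in Leo or cleo.

```python
def propagate(source, clones, key, value, match_values=False):
    """Set key=value on source, then update clones that share the key.

    match_values=False: update every clone that already has `key`.
    match_values=True: update only clones whose old value for `key`
    equalled the source's old value (a shared key/value pair).
    """
    missing = object()
    old = source.get(key, missing)
    source[key] = value
    for ua in clones:
        if ua is source or key not in ua:
            continue   # attributes that aren't shared stay untouched
        if match_values and (old is missing or ua[key] != old):
            continue
        ua[key] = value

# Each dict stands in for one clone's v.unknownAttributes.
v1 = {"priority": 2, "color": "pink"}
v2 = {"priority": 2}
v3 = {"priority": 5}

propagate(v1, [v1, v2, v3], "priority", 3, match_values=True)
after_pairs = [dict(v) for v in (v1, v2, v3)]   # v3 keeps its own 5

propagate(v1, [v1, v2, v3], "priority", 4)      # key-only policy
after_keys = [dict(v) for v in (v1, v2, v3)]
print(after_pairs, after_keys)
```

In both modes the `color` key never leaks onto the other clones, because they did not carry it to begin with.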

Maybe I'll write code for the last approach, and maybe that could move
into the core as it's general rather than cleo specific.

Cheers -Terry

derwisch

Feb 26, 2008, 7:16:31 AM
to leo-editor


On 26 Feb., 00:14, Terry Brown <terry_n_br...@yahoo.com> wrote:
> A simpler solution might involve, when changing uAs on a vnode,
> iterating the other clones and updating common attributes without
> affecting attributes which aren't shared.

What's the situation at the moment then? Do changes on vnodes
attributes propagate to all clones or not? I was under the impression
that not, but I may be mistaken (in which case an ugly workaround for
my problem at hand is needed).

Edward K. Ream

Feb 26, 2008, 9:16:02 AM
to leo-e...@googlegroups.com
On Mon, Feb 25, 2008 at 5:14 PM, Terry Brown <terry_...@yahoo.com> wrote:

What I struggle with is that some attributes should be universal for
the (t)node in any context, and others only make sense in some contexts.

I think about this issue as follows. Tnodes are the data (model), vnodes are the visual representation of the data.  So attach uA's to tnodes unless the *actual position of the node on the screen* is important, in which case attach uA's to vnodes.

BTW, vnodes used to mean "visual" nodes.  They don't really mean that exactly any more now that tnodes are the root of shared subtrees, but the general idea remains.

Edward

Edward K. Ream

Feb 26, 2008, 9:28:31 AM
to leo-e...@googlegroups.com
On Tue, Feb 26, 2008 at 6:16 AM, derwisch <johannes...@med.uni-heidelberg.de> wrote:
 
Do changes on vnodes attributes propagate to all clones or not?

Let's be clear: attributes *never* propagate.  tnode attributes are shared by all the cloned (v)nodes v1 and v2 (v1 != v2) such that v1.t == v2.t.

To repeat: vnodes represent nodes *on the screen*.  For any attribute x, there is no necessary relationship between v1.x and v2.x if v1 != v2.  The one and only exception is that v1.t == v2.t (v1 != v2) if v1 and v2 are clones of each other.

The situation is complicated.  Because subtrees of cloned vnodes are shared, it can often be the case that v1 == v2 even though v1 and v2 **appear** to be different nodes on the screen.  For example, suppose the tree looks like

a(1)
  - b
    - c
a(2)
  - b
    - c

Here a(1) and a(2) are clones of each other, so if v1 and v2 represent a(1) and a(2) we have v1 != v2 but v1.t == v2.t.  Furthermore v1.t._firstChild == vb, the vnode representing the 'b' node.  Thus, v2.t._firstChild == vb, because v1.t == v2.t.

Thus, all the 'b' nodes are the *same* vnode, as are all the 'c' nodes.  In the old terminology (before shared subtrees), all the 'b' nodes were said to be 'joined' to each other, and all the 'c' nodes were joined to each other.  In the new (since Leo 4.2) shared subtree world, the 'joined' terminology isn't edifying: 'joined' nodes are the *same* node.
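
The a/b/c example can be replayed with a small model. This is only a sketch: the `children` list is a simplified stand-in for the real `_firstChild` linkage, and both classes are toys, not Leo's.

```python
class TNode:
    def __init__(self, headline):
        self.headline = headline
        self.children = []        # simplified stand-in for t._firstChild etc.

class VNode:
    def __init__(self, t):
        self.t = t

ta, tb, tc = TNode("a"), TNode("b"), TNode("c")
vb, vc = VNode(tb), VNode(tc)
ta.children.append(vb)            # children hang off the tnode
tb.children.append(vc)

v1, v2 = VNode(ta), VNode(ta)     # a(1) and a(2): clones of each other

print(v1 is v2)                                # distinct vnodes
print(v1.t is v2.t)                            # one shared tnode
print(v1.t.children[0] is v2.t.children[0])    # the very same 'b' vnode
```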

Edward

Terry Brown

Feb 26, 2008, 10:03:28 AM
to leo-e...@googlegroups.com
On Tue, 26 Feb 2008 08:28:31 -0600
"Edward K. Ream" <edre...@gmail.com> wrote:

> To repeat: vnodes represent nodes *on the screen*.

This seems like something of an oversimplification to me. The topology
of vnodes defines the context in which you're referencing a tnode.
Consider:

Urgent fixes
    update_main
    delete_all(1)
    init_stage2
...
...
class Fred
    __init__
    delete_all(2)
    andSoOn

So the delete_all nodes are clones. One's a reference to the
delete_all code as something urgently needing fixing, the other's a
reference to the delete_all code as a member of the Fred class. The
different contexts are more than just where they are on the screen.

So you might want delete_all(1) color coded to display urgency, but not
see that color at delete_all(2), which is fine, uAs on vnodes work that
way. But there are probably attributes you want present on all
references to delete_all, fine, put them on the tnode, and maybe some
attributes you want on just a subset of references to delete_all, which
means iterating the list of clones and updating appropriately. Seems
to me the current implementation gives plenty of flexibility.

Shoot, oversimplification is 3 letters too long to be a great Scrabble
word :-)

I've attached a diagram of structure I made some time ago, it's not
new, I think it was correct. The pink things are vnodes, the others
tnodes, BTW. I find myself wondering what would happen if vnodes had
their own list of 'children' rather than taking them from their tnode.
The simple answer, of course, is that everything would break :-)

Cheers -Terry

leoStruct.png

Edward K. Ream

Feb 26, 2008, 1:16:52 PM
to leo-e...@googlegroups.com
On Tue, Feb 26, 2008 at 9:03 AM, Terry Brown <terry_...@yahoo.com> wrote:

> To repeat: vnodes represent nodes *on the screen*.

This seems like something of an oversimplification to me.  The topology
of vnodes defines the context in which you're referencing a tnode.

Yes, you are correct.  I misspoke.  Vnodes *used to* correspond to nodes on the screen, but that is ancient history.

The diagram you present looks about right, but I typically don't think in terms of the underlying data structures, and I certainly don't recommend thinking about Leo outlines that way. In fact, I recommend *ignoring* all ivars of the vnode and tnode classes, with the single exception that vnodes have a t ivar that points to the vnode's tnode.

Let me try to cut through the confusion by looking at Leo outlines in an entirely different way.  Rather than worrying about vnodes and tnodes, let us turn our attention to positions.  This was Bernhard Mulder's great insight: *however* we represent Leo trees (that is, Leo's DAG's), what we typically want to do is traverse the tree in various ways.

Consider this common fragment:

for p in c.allNodes_iter():
    << do something with p >>

This will visit each "node" exactly once, where by "node" we mean what would appear on the screen if all nodes were expanded.  The **crucial** point to notice is that "node" in this sense corresponds *neither* to the internal vnode structure nor to the internal tnode structure.  Indeed, we shall visit descendants of cloned nodes once per each clone.  (And no, this statement is not correct when a "node" has multiple cloned parents).

But the point is this: the c.allNodes_iter iterator *is* an accurate representation of a traversal of the Leo DAG.  It *doesn't matter* if vnodes are visited more than once. And if you would like, for some strange reason, to focus on vnodes or tnodes instead of positions, you can use c.all_vnodes_iter() or c.all_tnodes_iter().  Finally, Leo has iterators that traverse vnodes and tnodes exactly once, namely
c.all_unique_vnodes_iter() and c.all_unique_tnodes_iter()

In short, no matter how you want to traverse a Leo outline, you can do so without knowing *anything* about the internal data structures.  This is as it should be, and it is the iterator-centric view of Leo outlines that I highly recommend.
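
The difference between visiting every on-screen "node" and visiting each underlying node once can be shown on a toy DAG. The sketch below does not use Leo's iterators; `all_positions` and `all_unique_nodes` are hypothetical names that only mimic the idea.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def all_positions(roots):
    """Yield one path per on-screen node (all nodes expanded)."""
    def walk(node, path):
        path = path + (node,)
        yield path
        for child in node.children:
            yield from walk(child, path)
    for root in roots:
        yield from walk(root, ())

def all_unique_nodes(roots):
    """Yield each underlying node object exactly once."""
    seen = set()
    for path in all_positions(roots):
        node = path[-1]
        if id(node) not in seen:
            seen.add(id(node))
            yield node

c = Node("c")
b = Node("b", [c])
a = Node("a", [b])
roots = [a, a]    # 'a' appears twice at the top level: a(1) and a(2)

names = [p[-1].name for p in all_positions(roots)]
unique = [n.name for n in all_unique_nodes(roots)]
print(names, unique)
```

The position-style walk reports six entries for three underlying nodes, which is the "once per clone" behaviour described above.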

Edward

P.S. Having said all this, there are times when it might indeed be useful to understand just when iterators will visit vnodes and tnodes.  In particular, it could be useful to know exactly how often iterators will visit a vnode, so that one can use v.unknownAttributes effectively.  For example, one can imagine doing the following so that it would be possible to mark joined nodes independently of each other:

1. Associate a mark with a vnode/visited count.

2. Alter Leo's tree drawing algorithm so that a mark is drawn by a node only if a) the vnode is marked and b) the present visited count of the node matches.

Perhaps this is a fanciful example: mostly you must resign yourself to the fact that absent such "heroic" hacks all vnodes joined to each other *will* be drawn the same.
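
For what it's worth, the two steps could look roughly like this on a toy outline. Everything here is assumed for illustration: nodes are plain dicts, `draw` is a hypothetical stand-in for the tree-drawing pass, and marks are stored as (node, visit number) pairs.

```python
def draw(roots, marks):
    """Return (headline, marked) rows for a fully expanded outline.

    marks is a set of (id(node), visit_number) pairs, so a shared node
    can be marked on its 2nd appearance without marking its 1st.
    """
    visits = {}   # id(node) -> how many times drawn so far
    rows = []

    def walk(node):
        count = visits.get(id(node), 0) + 1
        visits[id(node)] = count
        rows.append((node["h"], (id(node), count) in marks))
        for child in node["children"]:
            walk(child)

    for root in roots:
        walk(root)
    return rows

b = {"h": "b", "children": []}
a = {"h": "a", "children": [b]}
rows = draw([a, a], marks={(id(b), 2)})   # a(1) and a(2) share 'b'
print(rows)
```

Note that the visit number depends on traversal order, so in this sketch moving a clone would change which appearance carries the mark.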

EKR

Edward K. Ream

Feb 26, 2008, 2:04:02 PM
to leo-editor
On Feb 26, 12:16 pm, "Edward K. Ream" <edream...@gmail.com> wrote:

> The **crucial** point is that "node" in this sense corresponds *neither* to the internal
> vnode structure nor to the internal tnode structure.

I am beginning to wonder whether the distinction between vnodes and
tnodes is really needed. I am reluctant to even ask this question
because a) the present code works and b) tnodes are, in fact, a nice
"little" optimization: tnodes represent the part that is shared
between cloned nodes.

I wouldn't even "think" this question if it were not for the perennial
conceptual problems arising from questions like:

1. Should vnodes have gnx's?

2. Should we attach uA's to vnodes or to tnodes?

In other words, all this confusion arises from the "little"
optimization. Indeed, it is natural to think of DAG's as being
composed of "just plain nodes" (neither vnodes nor tnodes, just
nodes). By departing from this (seemingly natural) representation,
Leo has created ongoing conceptual problems.

OTOH, tnodes solve a *lot* of otherwise tricky internal problems, so
careful thought is needed.

Without tnodes, cloned "just plain nodes" must contain *copies* of all
data structures that used to be in tnodes. Updating those structures
is possible, using the equivalent of v.t._vnodeList, but it would be a
significant rewrite of the internals.

Let's look again at the two questions above.

1. Should "just plain nodes" have gnx's? Obviously yes, but should
cloned nodes have the same gnx or different gnx? My guess is that
they should have different gnx's (so different clones could have
different marks, for example). However, cloned nodes also need a gnx
representing the fact that they are clones of each other--something
unique and immutable, which is exactly what a gnx is.

2. Should we attach uA's to vnodes or to tnodes? If we attach a uA to
"just plain node"'s uA field, the uA obviously gets associated with
just the single node. OTOH, assuming all cloned "just plain node"'s
share a common gnx, cloned nodes could *also* have a "shared" uA
field.

Very interesting. Point 2 says that the situation is pretty much the
same whether or not we have tnodes. t.uA corresponds to
node.sharedUA; v.uA corresponds to node.uA.

Point 1 asks, should vnodes have gnx's? This is nasty. It touches on
some of the biggest implementation hacks in all of Leo--specifically
the so-called "hidden machinery". I simply can't say for sure at
present.

In short, tnodes still seem useful, and the question about whether
vnodes should (or even *can*) have gnx's remains. To paraphrase a
quip about quantum mechanics, if you think you understand gnx's for
vnodes, then you don't understand gnx's for vnodes :-)

Edward

Edward K. Ream

Feb 26, 2008, 2:30:16 PM
to leo-editor
On Feb 26, 1:04 pm, "Edward K. Ream" <edream...@gmail.com> wrote:

> To paraphrase a quip about quantum mechanics, if you think you understand gnx's for
> vnodes, then you don't understand gnx's for vnodes :-)

Well isn't this interesting. I just had an entirely new thought about
Leo trees. Joined nodes are visited more than once during a complete
tree traversal--in fact, once per every permutation of cloned ancestor
nodes. Suppose we associate every permutation with some unique id:
the specific permutation of cloned ancestor nodes would be natural.
The Aha is that we can then use that id to identify traversal-specific
uA's! In other words, vnodes or tnodes could have multiple uA's: one
for each visit in a complete traversal.

I'm not saying that such uA's would be immensely valuable. In fact, I
suspect they would have limited value. For example, they are *not*
going to overturn The Great Graph Aha. But still, it's a cute
thought.
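
A rough sketch of traversal-specific uA's, under one loudly stated assumption: the unique id for a visit is taken to be the child-index path from a hidden root, which is just one possible encoding of "which cloned ancestors led here". The `visit_uas` field and `visits` function are hypothetical.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.visit_uas = {}   # hypothetical: one uA dict per traversal visit

def visits(node, key=()):
    """Yield (node, key); key is the child-index path from the root."""
    yield node, key
    for i, child in enumerate(node.children):
        yield from visits(child, key + (i,))

b = Node("b")
a = Node("a", [b])
root = Node("root", [a, a])       # a(1) and a(2) are the same object

for node, key in visits(root):
    if node is b:
        node.visit_uas[key] = {"seen_under": "a(%d)" % (key[0] + 1)}

print(b.visit_uas)
```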

Edward

P.S. I am starting to suspect that any different internal
representation of Leo DAG's will be, in some sense, isomorphic to the
present shared subtree representation. If that is the case, there
will be little or no reason to change the internals.

EKR

derwisch

Feb 26, 2008, 3:15:27 PM
to leo-editor


On 26 Feb., 20:04, "Edward K. Ream" <edream...@gmail.com> wrote:
> 2. Should we attach uA's to vnodes or to tnodes.

In the application I am working on, it is absolutely vital to be able
to attach uA's to both tnodes and vnodes.

I am trying to write an editor for the CDISC Operational Data Model
(http://www.cdisc.org/models/odm/v1.2/ODM1-2-0.html), which is
conceived for the recording of data for clinical trials. As some of
the data fields will be reused (such as a lab measurement pre and post
treatment; you don't want to redefine the lab measurement between
timepoints), the strict tree hierarchy of XML is broken by introducing
Object IDs which can be referenced from elsewhere: ItemRef elements
can sit within ItemGroupDef elements and point to ItemDef elements
sitting elsewhere.

This is a 1:1-correspondence to the DAG structure found in Leo.
Additionally, though, --Ref nodes as well as --Def nodes may have
attributes. The most prominently used attribute which is position
dependent is the attribute called "Mandatory (= Yes|No)", and indeed,
a lab measurement may be required at screening but optional at follow-
up. So while tnodes may have been selected out of programming
convenience, they offer a perfect isomorphism for the problem domain
at hand: --Ref attributes in ODM correspond to vnode uAs and --Def
attributes to tnode uAs. Shared trees are fine; the same concept is
used in ODM.

Therefore, simplifying the node concept in Leo would have me up in
arms.

Edward K. Ream

Feb 26, 2008, 3:49:36 PM
to leo-e...@googlegroups.com
On Tue, Feb 26, 2008 at 2:15 PM, derwisch <johannes...@med.uni-heidelberg.de> wrote:

Therefore, simplifying the node concept in Leo would have me up in arms.

Oh good :-)  Really, the last thing I want to do is mess around with Leo's data model.

Edward

Edward K. Ream

Feb 26, 2008, 5:05:08 PM
to leo-editor
On Feb 26, 1:04 pm, "Edward K. Ream" <edream...@gmail.com> wrote:

> In short, tnodes still seem useful, and the question about whether
> vnodes should (or even *can*) have gnx's remains. To paraphrase a
> quip about quantum mechanics, if you think you understand gnx's for
> vnodes, then you don't understand gnx's for vnodes :-)

This quip got the better of me. In fact, the situation isn't so hard.
Here is the relevant passage from LeoDocs.leo:

QQQ
**Important**: Plugins must *not* use v.unknownAttributes inside
``@thin`` trees. Indeed Leo uses **hidden machinery** to write
t.unknownAttributes. Leo does *not* write t.unknownAttributes to thin
derived files. Instead Leo writes a representation of all
t.unknownAttributes contained in the @thin tree to a special xml
attribute called descendentTnodeUnknownAttributes in the <v> element
corresponding to the @thin node. Yes, this is complicated, but it
works. Leo can *not* write v.unknownAttributes in @thin trees because
**only tnodes have gnx's in thin derived files**. In effect, vnodes
are anonymous.
QQQ

This accurately describes the present situation, but suppose vnodes
*did* have gnx's. Couldn't we extend Leo's hidden machinery so as to
be able to write v.uA and v.gnx for the descendants of @thin nodes? I
don't see why not. In fact, we can easily prove this is possible.
Indeed, we can put the required info in the
descendentTnodeUnknownAttributes fields! No, I'm not going to do
that: it's just an existence proof.

So here are the proposed changes:

1. Add a vx attribute to all <v> elements in .leo files. The value
of this attribute will be a gnx, exactly as for the tx attribute in
all <t> elements.

2. Extend the hidden machinery so that the gnx's for vnodes for
descendants of @thin nodes get remembered. Important: just as we can
not absolutely guarantee that descendentTnodeUnknownAttributes fields
will always be restored, neither can we absolutely guarantee that the
corresponding hidden machinery will be able to restore gnx's for
vnodes. In practice, this is very seldom a problem, but information
can be dropped if the derived file being read does not correspond to
the info written by the hidden machinery.

BTW, Leo 4.5 is the right time to make these changes, because changes
are already planned for Leo's xml file format and corresponding read/
write code. As usual, the changes will be upward compatible with
older versions: Leo 4.5 will be able to read .leo files from all
previous versions, but the converse is not true. That is, once Leo
4.5 writes a .leo file, older versions of Leo will not be able to read
that file. This should be of little or no concern to Leo's users.

Edward

Stephen P. Schaefer

Feb 27, 2008, 1:04:05 PM
to leo-e...@googlegroups.com
On Tue, 2008-02-26 at 11:04 -0800, Edward K. Ream wrote:

> Without tnodes, cloned "just plain nodes" must contain *copies* of all
> data structures that used to be in tnodes. Updating those structures
> is possible, using the equivalent of v.t._vnodeList, but it would be a
> significant rewrite of the internals.
>

Before asking the following question, I must note that there is so much
code making good use of the vnode/tnode distinction that I think it
would be a bad idea to change Leo's internals.

The question is: why would "just plain nodes" need to have copies of
data structures? The only distinction I can think of in "just plain
leo" between one vnode and another is whether it is expanded or not, and
that has never been an essential feature for me (that is, if I
expanded/collapsed one vnode and all its clones instantly
expanded/collapsed as well, I'd be neither annoyed nor pleased).
Consider the following data structure for a node of a DAG:

node_data # header, body
ordered_list_of_children # references to other nodes

That data structure is also sufficient for a general graph, and contains
no optimizations or constraints appropriate to a DAG. Aside from being
a conceptual underpinning, such a simplification might help with design
of alternative access to the Leo data, e.g., as represented in a
database or via AJAX.
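
A minimal sketch of that two-field structure, assuming nothing beyond what is listed above. The `has_cycle` helper is an addition for illustration: it shows that the structure itself permits cycles, so the DAG constraint has to be checked separately.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)   # compare by identity: a clone is the same object
class Node:
    headline: str
    body: str = ""
    children: list = field(default_factory=list)   # references to other nodes

def has_cycle(node, on_path=()):
    """True if following children from node ever revisits an ancestor."""
    if any(node is seen for seen in on_path):
        return True
    return any(has_cycle(child, on_path + (node,)) for child in node.children)

shared = Node("shared")
root = Node("root", children=[Node("a", children=[shared]),
                              Node("b", children=[shared])])
dag_ok = not has_cycle(root)      # sharing a child is fine in a DAG

shared.children.append(root)      # nothing in the structure forbids this
now_cyclic = has_cycle(root)
print(dag_ok, now_cyclic)
```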

Or perhaps I'm missing something obvious; if so, please do enlighten me.

- Stephen


Edward K. Ream

Feb 27, 2008, 1:50:05 PM
to leo-e...@googlegroups.com
On Wed, Feb 27, 2008 at 12:04 PM, Stephen P. Schaefer <ssch...@acm.org> wrote:

why would "just plain nodes" need to have copies of data structures?

Without tnodes, there would be nothing to carry the outline or body strings.  The copies would only be needed when two "just plain nodes" were clones of each other.

I think we are all agreed at present that vnodes and tnodes are here to stay.

Edward

Edward K. Ream

Apr 14, 2008, 9:44:54 AM
to leo-editor
On Feb 26, 3:15 pm, derwisch <johannes.hues...@med.uni-heidelberg.de>
wrote:

> I am trying to write an editor for the CDISC Operational Data Model
> (http://www.cdisc.org/models/odm/v1.2/ODM1-2-0.html), which is
> conceived for the recording of data for clinical trials. As some of
> the data fields will be reused (such as a lab measurement pre and post
> treatment; you don't want to redefine the lab measurement between
> timepoints), the strict tree hierarchy of XML is broken by introducing
> Object IDs which can be referenced from elsewhere: ItemRef elements
> can sit within ItemGroupDef elements and point to ItemDef elements
> sitting elsewhere.

> This is a 1:1-correspondence to the DAG structure found in Leo.

Unlikely. And even if true, it is misleading.

> Therefore, simplifying the node concept in Leo would have me up in arms.

There is a much easier way, one that sidesteps the uA issue, and is
more Leonine. I'm not sure exactly what the way is, but I know for
sure it exists :-)

The general form of the solution would be:

- Use a high-level tool such as ElementTree to massage the input data
into a Python data structure.
- Map that data structure into a Leo outline.

The essence of your problem is that some parts of data are shared, and
others aren't. Well, you don't have to cram all the data into a single
Leo node, regardless of whether we are talking about the old world or
the new world. Instead, create "trial nodes" with children. Some of
those children will be clones, and therefore shared. Other children
will be unique.

This kind of problem is very easily solved in Leo. Your task is to
write a script to convert the input data into a Leo outline that uses
clones properly. You can prototype a node structure by hand. Try it,
you'll see it's much easier than you imagine.

To put it another way: your Leo outline doesn't have to slavishly
mirror the format of the .xml file: scripts (or commands created by
@button or @command nodes) can read and write the .xml file from
whatever Leo outline you please.
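
As a sketch of the first step only (massaging the input into a Python data structure), something like the following might serve. The XML fragment is made up, and the attribute names `OID`, `ItemOID`, and `Name` are assumptions based on the description earlier in the thread, not a validated ODM document. Mapping the result onto real outline nodes is left out.

```python
import xml.etree.ElementTree as ET

# Made-up fragment; element and attribute names are assumptions.
ODM = """
<MetaDataVersion>
  <ItemGroupDef OID="IG.SCREEN">
    <ItemRef ItemOID="IT.LAB" Mandatory="Yes"/>
  </ItemGroupDef>
  <ItemGroupDef OID="IG.FOLLOWUP">
    <ItemRef ItemOID="IT.LAB" Mandatory="No"/>
  </ItemGroupDef>
  <ItemDef OID="IT.LAB" Name="Lab measurement"/>
</MetaDataVersion>
"""

root = ET.fromstring(ODM)

# One shared dict per ItemDef: the data that would live in a cloned node.
defs = {d.get("OID"): dict(d.attrib) for d in root.findall("ItemDef")}

outline = []   # plain nested data, ready to be mapped onto an outline
for group in root.findall("ItemGroupDef"):
    children = []
    for ref in group.findall("ItemRef"):
        children.append({
            "shared": defs[ref.get("ItemOID")],           # same object each time
            "local": {"Mandatory": ref.get("Mandatory")}, # unique to this group
        })
    outline.append({"headline": group.get("OID"), "children": children})

print(outline)
```

Here the shared part plays the role of a cloned child and the local part the role of a unique child, in line with the "trial nodes with children" suggestion.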

HTH.

Edward

Edward K. Ream

Apr 14, 2008, 10:32:19 AM
to leo-editor
On Feb 26, 3:15 pm, derwisch <johannes.hues...@med.uni-heidelberg.de>
wrote:

> Therefore, simplifying the node concept in Leo would have me up in arms.

I am so glad you posted this, along with the detailed notes about what
you are wanting to do. In answering your question, I am now more
convinced than ever that unified nodes are a big step forward.

1. The old dual-node scheme was an open invitation to massive confusion
and bad style.

As I understand it, you propose to create clones of data, like this:

- a(1)
- a(2)

And then you distinguish between a(1) and a(2) using
v.unknownAttributes. Imo this is very bad style. You can not be
blamed: the fault is Leo's for giving you two flavors of uA's.

A much better style would be the following. It is based on the
observation that view nodes are *not* clones, they *contain* clones.
So the organization would be:

- trial (summary view)
  - common trial data (clone)
- trial view 1
  - common trial data (clone)
  - data local to trial 1
- trial view 2
  - common trial data (clone)
  - data local to trial 2

Furthermore, you can create clones of the 'trial view 1' and 'trial
view 2' nodes and put those clones anywhere you like in the outline.
My guess is that this kind of organization gives you much more
flexibility than you had before. You can attach a uA to any of the
nodes, and there will be no need ever to distinguish what the uA
contains based on the location of the node, or whether it is a clone
or not, or on any other criterion except what the node *is*.

So we see that distinguishing between vnodes and tnodes naturally
leads people to *bad* style. This kind of mistake will simply not be
possible to make in the unified node world.

2. Point 1 also shows why I am not enthusiastic about the graph world,
even if some low-level impediments will be removed in the unified-node
world. Indeed, **views do not exist in the graph world**. Or rather,
if they do exist, they will be a contrived combination of hard-to-
understand iters and specialized conventions. Imo, the essence of
understanding and manipulating data is the creation of arbitrarily
many views on the data. This is true regardless of the scale of the
problem: it is true for the human genome project, or for any other
project. No exceptions.

Edward

derwisch

Apr 14, 2008, 3:29:05 PM
to leo-editor
Edward, I am very glad you are sharing your thoughts in this depth.
At the same time I still feel stymied as I just sent in the slides,
where it is basically stated that Leo's data structure basically
reflects ODM, that Leo has been developed for 10 years and has matured,
is stable, unlikely to undergo big changes etc. It seems like the
removal to Google Groups and Launchpad has re-vitalised the
project (not that it ever felt dead anyway) and spurred not only
development, but the thinking about the general data model. Again,
I appreciate that you are helping existing projects to not fall by
the wayside. Of course I always saw some redundancy in the
current data model, but I mostly failed to distinguish between p
and vnodes.

On 14 Apr., 16:32, "Edward K. Ream" <edream...@gmail.com> wrote:
> On Feb 26, 3:15 pm, derwisch <johannes.hues...@med.uni-heidelberg.de>
> wrote:
>
[...]
> As I understand it, you propose to create clones of data, like this:
>
> - a(1)
> - a(2)
>
> And then you distinguish between a(1) and a(2) using
> v.unknownAttributes. Imo this is very bad style. You can not be
> blamed: the fault is Leo's for giving you two flavors of uA's.

A rose by any other name. If you look at the ODM you will see
that there is a distinction between --Ref and --Def elements, and
that some attributes are peculiar to the reference and some to
the definition. You may refer to it as bad style but I was elated
to see this structure mirrored by Leo.

>
> A much better style would be the following. It is based on the
> observation that view nodes are *not* clones, they *contain* clones.
> So the organization would be:
>
> - trial (summary view)
>   - common trial data (clone)
> - trial view 1
>   - common trial data (clone)
>   - data local to trial 1
> - trial view 2
>   - common trial data (clone)
>   - data local to trial 2
>

It is quite obvious to me that you can still model DAGs with
the data model to be. You just need auxiliary nodes, somehow
like the --Ref nodes, which currently can be abstracted away.

The question is, how do I hide the auxiliary nodes from the
user. I would really like to preserve the tree view which is
for instance seen in the screenshot of this offer: http://www.xml4pharma.com/SDTM-ETL/

The other suggestion was outlined by you and Terry in the
parallel thread: to somehow annotate the position during a
traversal of the tree. You call that easy, but then you are a
programmer.

Edward K. Ream

Apr 14, 2008, 5:52:48 PM
to leo-e...@googlegroups.com
On Mon, Apr 14, 2008 at 2:29 PM, derwisch <johannes...@med.uni-heidelberg.de> wrote:

Edward, I am very glad you are sharing your thoughts in this depth.
At the same time I still feel stymied as I just sent in the slides,
where it is basically stated that Leo's data structure basically
reflects ODM, that Leo has been developed for 10 years and has matured,
is stable, unlikely to undergo big changes etc.

Oh my.  Just when the old way appeared perfect for you, Leo gets a makeover.

A rose by any other name. If you look at the ODM you will see
that there is a distinction between --Ref and --Def elements, and
that some attributes are peculiar to the reference and some to
the definition. You may refer to it as bad style but I was elated
to see this structure mirrored by Leo.

Well, there is no standing still.  We will have the unified node world, so we must solve your problem, and cleanly.

I am afraid I still don't understand the problem.  What you present to the user is just another view. It need not contain nodes that you don't want to show.

I think it is imperative to distinguish the data as contained in the .xml from the views on the data presented to the user.  If necessary, scripts can always put a subset of data into nodes destined to be seen by the user.

So the general idea is: put data that must be contained in multiple views in a node that can then be cloned and added to those views.  Very easily done in a script.

I guess the overall point is that *you* (and your scripts) are in complete control of what data gets put where.  If the .xml mirrors your desires, fine.  If not, a simple script should suffice to put the data exactly where you want it in the outline.

Note that scripts in @script nodes will be executed when you open a .leo file. Generally speaking, such scripts are dangerous: You must enable @script nodes in the mod_scripting plugin.  If you don't like the security implications of @script nodes, you can prototype your scripts with @button nodes, and then you can move your startup scripts into a plugin.

Again, the point is that you and your scripts should be able to make the nodes presented to the user look *exactly* how you want them to look, whether or not the nodes are clones.

The question is, how do I hide the auxiliary nodes from the
user.

You have lots of options.  You can put them in a chapter.  You can put them in a node that is "out of the way" and not usually viewed.

If you *really* want to hide data, then you can put the data in a uA in any node you please.

A plugin could augment the tree-drawing code so it doesn't show nodes whose headlines start with @hidden.  It would take just a few lines of code to do this.  You might want to add show-hidden-nodes or hide-hidden-nodes commands :-)
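
The filtering rule itself is simple to sketch, though hooking it into Leo's actual tree-drawing code is not shown here. In this toy version nodes are (headline, children) tuples, `visible` is a hypothetical helper, and hiding a node is assumed to hide its whole subtree as well.

```python
def visible(nodes):
    """Yield (level, headline) for nodes to draw, skipping @hidden subtrees."""
    def walk(node, level):
        headline, children = node
        if headline.startswith("@hidden"):
            return                      # skip the node and everything under it
        yield level, headline
        for child in children:
            yield from walk(child, level + 1)
    for node in nodes:
        yield from walk(node, 0)

outline = [
    ("ItemGroup: Screening", [
        ("Lab measurement", []),
        ("@hidden ref data", [("Mandatory=Yes", [])]),
    ]),
    ("@hidden scratch", []),
]
shown = list(visible(outline))
print(shown)
```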

 
I would really like to preserve the tree view which is
for instance seen in the screenshot of this offer: http://www.xml4pharma.com/SDTM-ETL/

I see no reason why you couldn't do this.
 
The other suggestion was outlined by you and Terry in the
parallel thread: to somehow annotate the position during a
traversal of the tree. You call that easy, but then you are a
programmer.

I strongly suggest not going in this direction.  It's really, really ugly, and it will not be easy to do correctly.

I just don't see how I can be wrong about this.  In my eyes, you are misusing clones so that they appear differently in different contexts.

Yes, the new world would require change to your code at precisely the time you thought it was completely stable.  Of course, if you must, you can base your product on the old world.  It's always a valid option.  But I strongly believe that a little invention on your part will open the door to a much more flexible solution in the long run.

Edward

Terry Brown

Apr 14, 2008, 6:15:31 PM
to leo-e...@googlegroups.com
On Mon, 14 Apr 2008 16:52:48 -0500

"Edward K. Ream" <edre...@gmail.com> wrote:

> > The other suggestion was outlined by you and Terry in the
> > parallel thread: to somehow annotate the position during a
> > traversal of the tree. You call that easy, but then you are a
> > programmer.
>
> I strongly suggest not going in this direction. It's really, really
> ugly, and it will not be easy to do correctly.

Actually I wasn't suggesting "annotating the position", but rather just
iterating through p.stack (which in unified node world will make more
sense) to determine context.

But now that you mention "annotating the position" it seems that there
might be uses for callbacks invoked by a position as it moves through
the tree. I'm not sure it would achieve anything you couldn't do by
iterating p.stack, but it might save time.

A
  B
    C
      D
  E

If a position gets to D and it has called a registered callback at each
step (with itself as an argument, no doubt) it could already contain
information collected from A, B, and C on the way, instead of
collecting information from A, then A and B, then A and B and C, then
A-D, as would be the case with iterating p.stack.

There'd need to be some invocation of the callback as the position
moved back up, so that B,C,D info. could be dumped when you get to E.
Of course this is all assuming too much about how iterators work with
positions without checking the code, but I think it's interesting in
the context of, er, context.
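
A rough sketch of the callback idea, not of how Leo's positions or iterators actually work. It assumes the example tree is A > B > C > D with E as a second child of A; `Node`, `walk`, `on_enter` and `on_leave` are hypothetical names:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for an outline node."""
    name: str
    children: List["Node"] = field(default_factory=list)


def walk(node: Node,
         enter: Callable[[Node], None],
         leave: Callable[[Node], None]) -> None:
    """Depth-first walk calling enter() on the way down and leave() on the
    way back up, so context is built incrementally instead of being
    recomputed from the full ancestor stack at every node."""
    enter(node)
    for child in node.children:
        walk(child, enter, leave)
    leave(node)


# Assumed shape of the example: A > B > C > D, and E directly under A.
tree = Node("A", [Node("B", [Node("C", [Node("D")])]), Node("E")])

context: List[str] = []           # ancestors of the node being visited
seen: Dict[str, List[str]] = {}   # node name -> context when it was entered


def on_enter(node: Node) -> None:
    seen[node.name] = list(context)
    context.append(node.name)


def on_leave(node: Node) -> None:
    context.pop()                 # dump this node's info on the way back up


walk(tree, on_enter, on_leave)
```

Here `seen["D"]` holds what was collected from A, B and C, while `seen["E"]` holds only A because the B, C, D information was dropped on the way back up.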

> I just don't see how I can be wrong about this. In my eyes, you are
> misusing clones so that they appear differently in different contexts.

Cleo does a similar thing, although changing it probably wouldn't have
any noticeable impact on cleo. I think using vnodes to distinguish
context has been an obvious, even if unintended, choice, up to now.
Positions have evolved to more important and sophisticated entities
over time.

Cheers -Terry

thyrsus

Apr 14, 2008, 6:31:54 PM
to leo-editor
In current leo, the data that could be associated with a vnode within
the leo file is the following (taken from leoNodes.py):

# Archived...
clonedBit = 0x01 # True: vnode has clone mark.

# not used = 0x02
expandedBit = 0x04 # True: vnode is expanded.
markedBit = 0x08 # True: vnode is marked
orphanBit = 0x10 # True: vnode saved in .leo file, not derived file.
selectedBit = 0x20 # True: vnode is current vnode.
topBit = 0x40 # True: vnode was top vnode when saved.

Arbitrary further attributes could be added by plugins.
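
For readers unfamiliar with status bits, a small sketch of how such flags are combined and tested. The bit values are copied from the excerpt above; the constant names, `BIT_NAMES` and `describe` are made up for illustration and are not part of leoNodes.py:

```python
# Bit values copied from the leoNodes.py excerpt quoted above.
CLONED_BIT = 0x01
EXPANDED_BIT = 0x04
MARKED_BIT = 0x08
ORPHAN_BIT = 0x10
SELECTED_BIT = 0x20
TOP_BIT = 0x40

BIT_NAMES = [
    (CLONED_BIT, "cloned"),
    (EXPANDED_BIT, "expanded"),
    (MARKED_BIT, "marked"),
    (ORPHAN_BIT, "orphan"),
    (SELECTED_BIT, "selected"),
    (TOP_BIT, "top"),
]


def describe(status: int) -> list:
    """Return the names of all bits that are set in status."""
    return [name for bit, name in BIT_NAMES if status & bit]


status = EXPANDED_BIT | MARKED_BIT   # set two bits
status |= SELECTED_BIT               # set one more
status &= ~MARKED_BIT                # clear the marked bit again
names = describe(status)
```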

The proposal for "unified nodes" moves this data to the tnode. E.g.,
the clonedBit is derived from the tnode (i.e., length of "parent"
list > 1). The expanded bit becomes an attribute of the tnode instead
of the vnode, and as I had foreseen, all nodes with the same tnode get
expanded and contracted in concert.

In the new regime, it may be beneficial to continue attaching saved
information for the directed link between one node and another,
although perhaps in a sparser manner than currently: instead of
putting the entire expanded tree in this new form of leo file, "link"
data could default to "unexpanded, unmarked, unselected, nottop", and a
link data item could be written for the exceptions. What identifies a
link? *Not* the full path from the (implicit) root node, that is,
*not* a position. Instead, only the node from which it proceeds
(either a normal node or the inherent "root" node) and the index of
the child to which it proceeds (never the inherent root node). Nodes
are uniquely identified by their gnx, so such an element would look
like

<link from="gnx1" cx="4" a="EMOTV" unknAtrr1="foo" />

Leo would check that the gnx1 node is defined within the graph
and that there are a sufficient number of children for that node, or
report an error in the .leo file.
Nodes would look like the current tnodes, but with child lists. As I
noted earlier, a node is defined by

node_data # header, body
ordered_list_of_children # references to other nodes

which could be represented by an element similar to the current <t>,
but with the newly consolidated data: the gnx, the child list, the
headline, and the body:

<node tx="gnx1" cl="gnx2,gnx3,gnx3" h="The headline">The body
</node>

Alternatively,

<node tx="gnx1" cl="gnx2,gnx3,gnx3">
<h>The headline</h>
<b>The body
</b>
</node>

Leo would need to assure that child nodes were defined, or generate a
warning that the leo file was corrupt and that it had created minimal
nodes to satisfy, something like "Missing Headline" and an empty
body. I leave it as an exercise for the reader to find the algorithm
assuring that the nodes describe a DAG and not a generalized graph ;-}
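
One possible answer to the exercise, sketched under assumptions: the `<node>` elements follow the first form proposed above, they are wrapped in a made-up `<nodes>` root element so the text parses as one XML document, and undefined children are reported separately and otherwise treated as leaves. The function names are hypothetical:

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SAMPLE = """<nodes>
  <node tx="gnx1" cl="gnx2,gnx3,gnx3" h="Root">body 1</node>
  <node tx="gnx2" cl="gnx3" h="Child">body 2</node>
  <node tx="gnx3" cl="" h="Shared child">body 3</node>
</nodes>"""


def read_children(xml_text: str) -> Dict[str, List[str]]:
    """Map each node's gnx (the tx attribute) to its ordered child list."""
    graph: Dict[str, List[str]] = {}
    for elem in ET.fromstring(xml_text).iter("node"):
        cl = elem.get("cl", "")
        graph[elem.get("tx")] = [g for g in cl.split(",") if g]
    return graph


def missing_children(graph: Dict[str, List[str]]) -> List[str]:
    """Return child gnx values that are referenced but never defined."""
    return sorted({c for kids in graph.values() for c in kids
                   if c not in graph})


def is_dag(graph: Dict[str, List[str]]) -> bool:
    """Three-colour depth-first search: reaching a node that is still on
    the current path (GREY) means the child lists contain a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {gnx: WHITE for gnx in graph}

    def visit(gnx: str) -> bool:
        colour[gnx] = GREY
        for child in graph.get(gnx, []):
            state = colour.get(child, BLACK)  # undefined child: a leaf
            if state == GREY:
                return False
            if state == WHITE and not visit(child):
                return False
        colour[gnx] = BLACK
        return True

    return all(colour[gnx] != WHITE or visit(gnx) for gnx in graph)


graph = read_children(SAMPLE)
```

A repeated child such as `gnx3,gnx3` does not count as a cycle here; only a path that leads back to one of its own ancestors does.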

- Stephen

On Apr 14, 3:29 pm, derwisch <johannes.hues...@med.uni-heidelberg.de>
wrote:

Terry Brown

Apr 14, 2008, 7:05:30 PM
to leo-e...@googlegroups.com
On Mon, 14 Apr 2008 15:31:54 -0700 (PDT)
thyrsus <ssch...@acm.org> wrote:

> <link from="gnx1" cx="4" a="EMOTV" unknAtrr1="foo" />

The issue of whether links carry no attributes at all, or only a
(perhaps implicit) "direction" attribute, or any number of attributes,
is interesting.

If I'm understanding, the current unified nodes proposal is that links
carry no attributes (apart from the trivial implicit direction, either
child list or parent list). So links have nothing interesting to tell
us except where they point. And they can just be a Python list of
references to nodes.

Once you let them carry additional attributes they have to be objects in
their own right, like vnodes. The alternative is to use "intermediate"
regular nodes to carry link related information when required.

For links carrying arbitrary attributes, where upper case are nodes and
lower case are links, you could represent something as:

A-+-a-B
  |
  +-b-C
  |
  +-c-D

where a, b, and c can tell you things about how A relates to B, C, and
D.

Vs. links that carry no attributes, again upper case are nodes and lower
case links:

A-+-x-M-x-B
  |
  +-x-N-x-C
  |
  +-x-O-x-D

I use 'x' for all the links because while they're all separate links,
they're all just Python list entries. M, N, and O are additional nodes
used to carry link information.

I think the latter's the unified node plan for cases where you need to
store information about links.

Not trying to illuminate anyone, just working through my understanding
to see if it's on the right page.

Cheers -Terry

thyrsus

Apr 14, 2008, 7:18:16 PM
to leo-editor

On Apr 14, 7:05 pm, Terry Brown <terry_n_br...@yahoo.com> wrote:
> Not trying to illuminate anyone, just working through my understanding
> to see if it's on the right page.

Nonetheless, I now believe I understand Edward's proposal much
better. Thanks.

- Stephen

Edward K. Ream

Apr 14, 2008, 8:07:17 PM
to leo-e...@googlegroups.com
On Mon, Apr 14, 2008 at 4:52 PM, Edward K. Ream <edre...@gmail.com> wrote:

Of course, if you must, you can base your product on the old world.  It's always a valid option.

On second thought, this isn't nearly good enough for you.  No way do I want to consign one of Leo's most important users to obsolete status.

Instead, my mission is to solve your transition problem in a way that clearly works better for you than the old way.  I want you to be an enthusiastic supporter of unified nodes.  Nothing less will be good enough for me.

Edward

Edward K. Ream

Apr 15, 2008, 8:01:35 AM
to leo-editor
On Apr 14, 7:07 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> On Mon, Apr 14, 2008 at 4:52 PM, Edward K. Ream <edream...@gmail.com> wrote:
>
> Of course, if you must, you can base your product on the old world. It's
>
> > always a valid option.
>
> On second thought, this isn't nearly good enough for you. No way do I want
> to consign one of Leo's most important users to obsolete status.
>
> Instead, my mission is to solve your transition problem in a way that
> clearly works better for you than the old way. I want you to be an
> enthusiastic supporter of unified nodes. Nothing less will be good enough
> for me.

Johannes, when I awoke this morning, I realized that your objections
are much more important than I realized at first. This may be an
opportunity in disguise, or it may spell big trouble for the unified-
node world. More invention is required, perhaps theoretical, perhaps
practical.

Several questions appear, at different levels of the design/
implementation hierarchy:

1. Perhaps most importantly, in the unified-node world, are Leo
outlines a proper representation of a DAG? If so, what is the graph-
theoretical correspondence between the graph and Leo nodes?

That is, suppose two nodes in a directed graph "point to" a shared
node. What is the corresponding outline in the unified-nodes world?
Does the notion of cloned node "shift" in the unified-node world?
That is, do different kinds of nodes get a clone mark?

2. How do clones get a clone mark? :-) This morning I realized that
the code for setting the clone mark has no chance of working: another
data structure (a node ivar) will be needed. Could this data
structure be used to disambiguate different instances of the *same*
node?

3. Are cloned nodes properly updated in synch? That would seem to be
guaranteed, because cloned nodes are the *same* node, but now not even
that seems certain. Or rather, it's not clear that the same nodes get
cloned marks in the new world.

I was led to these questions by considering another question. In the
unified-node world, could we create a notion of "joined" nodes that
are *different nodes* such that some or all aspects of the different
nodes get updated in unison. For instance, we could conceive of
different nodes whose headlines Leo guarantees to be the same. Or
body text. Or children. Or uA's. This kind of joining is a
frequently requested feature. Is it easier in the unified-node
world? Or does it, as seems more likely, lead us back to the same
problems as existed in the ancient Leo world where no subtrees were
shared?

4. Can we create a natural mapping from CDISC Operational Data Model
to unified nodes? Would that mapping use clones in the same way? Is
there a mapping that uses clones in a different way, but would be just
as useful or, as seems likely to me, even more useful? I'll be
looking at this model in much more detail now that the stakes seem so
much higher.

In short, the initial clarity has turned murky again. As always, this
is a sign that more invention is needed. I believe the driving force
behind the new inquiry will be a comparison of how DAG's are
represented in the old and new world. For sure, we can not commit to
the unified-node world until all such issues are resolved to
everyone's satisfaction. Including you, Johannes :-)

I'll be exploring these issues from the top down, theoretically, and
from the bottom up, by debugging the unified-node code. It may be
that there will be surprises when I create clones in the new world.
Those surprises could highlight further problems, or suggest clearer
solutions.

Edward

Edward K. Ream

Apr 15, 2008, 8:24:55 AM
to leo-editor


On Feb 26, 3:15 pm, derwisch <johannes.hues...@med.uni-heidelberg.de>
wrote:
It seems to me that if we could create *another* perfect isomorphism
between ODM attributes and Leo outlines we could get on with the
unified-node world. This was the essence of my initial, too-hasty
reply. This new isomorphism would require new code for your editor,
but that, by itself, would not invalidate the unified-node world.

Part of the problem for me is that the screen shots do not tell me
enough about your situation. Can users modify data in your editor?
Can they change its organization? How does your data convert from
ODM to a Leo outline? How does your editor draw tables using the Leo
outline? That kind of thing.

Hmm. Would it be possible for me to use your editor and peek at the
code? That might give some crucial insights that words could not
easily deliver.

Edward

derwisch

Apr 15, 2008, 9:38:24 AM
to leo-editor
On Apr 15, 2:01 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> >  No way do I want
> > to consign one of Leo's most important users to obsolete status.

I hope there are more "important" (by any metric) users of Leo than
myself.

> 1. Perhaps most importantly, in the unified-node world, are Leo
> outlines a proper representation of a DAG?  If so, what is the graph-
> theoretical correspondence between the graph and Leo nodes?

My take on it is as follows:

The current implementation represents a vertex- and edge-labelled
DAG. Unified nodes lose the possibility to label edges.

> I was lead to these question by considering another question.  In the
> unified-node world, could we create a notion of "joined" nodes that
> are *different nodes* such that some or all aspects of the different
> nodes get updated in unison.  For instance, we could conceive of
> different nodes whose headlines Leo guarantees to be the same.  Or
> body text.  Or children.  Or uA's.  This kind of joining is a
> frequently requested feature.  Is it easier in the unified-node
> world?  Or does it, as seems more likely, lead us back to the same
> problems as existed in the ancient Leo world where no subtrees were
> shared?

I suspect most of the requests stem from misplaced analogies about
object inheritance or parameterised macros.

> 4. Can we create a natural mapping from CDISC Operational Data Model
> to unified nodes?  

Very obviously, one can always map a vertex- and edge-labelled graph
to a vertex-labelled graph by substituting "edge, annotated vertex,
edge" for each annotated edge.
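
A small sketch of that substitution, with made-up vertex names and attribute keys. The `aux0`, `aux1`, ... naming scheme and the function name `reify_edges` are assumptions, and real code would need to make sure the auxiliary names cannot collide with existing vertices:

```python
from typing import Dict, List, Tuple

# Edge-labelled form: (parent, child, edge attributes).
LabelledEdge = Tuple[str, str, Dict[str, str]]


def reify_edges(edges: List[LabelledEdge]):
    """Replace each annotated edge parent -> child with
    parent -> auxiliary vertex -> child, moving the edge attributes
    onto the auxiliary vertex.

    Returns (plain_edges, vertex_labels)."""
    plain_edges: List[Tuple[str, str]] = []
    vertex_labels: Dict[str, Dict[str, str]] = {}
    for index, (parent, child, attrs) in enumerate(edges):
        aux = f"aux{index}"              # hypothetical naming scheme
        vertex_labels[aux] = dict(attrs)
        plain_edges.append((parent, aux))
        plain_edges.append((aux, child))
    return plain_edges, vertex_labels


edges = [
    ("A", "B", {"note": "first"}),
    ("A", "C", {"note": "second"}),
    ("D", "C", {"note": "third"}),   # C is shared by A and D
]
plain, labels = reify_edges(edges)
```

Each original edge becomes two unlabelled edges plus one labelled auxiliary vertex, so the shared vertex C keeps its two distinct annotations.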

If by "natural" you mean "intuitive to the user", I would tend to say
"no". The intermediate (or "auxiliary") nodes should ideally not be
visible to the user. The CDISC Operational Data Model is represented
in XML, and it uses the crutch of --Ref and --Def elements and OIDs
to embed shared subtrees and DAG structure in the tree structure
of XML. This is OK for a data interchange format to be read by
machines, less than OK for me when I write XSLT filters, and not
so nice for human reading. It is like reading separate tables of a
relational database and jumping from table to table looking for
matching foreign keys. Leo in its future form solves the DAG problem.
Leo in its present form solves the annotation of edges.

> I'll be
> looking at this model in much more detail now that the stakes seem so
> much higher.

This is most considerate of you and much more than I can expect
from a free software project.

> For sure, we can not commit to
> the unified-node world until all such issues are resolved to
> everyone's satisfaction.  Including you, Johannes :-)

If this is the case, I'll warm-heartedly embrace the change.

derwisch

Apr 15, 2008, 10:01:34 AM
to leo-editor


On Apr 14, 11:52 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
(I wrote:)
>> The question is, how do I hide the auxiliary nodes from the
>
> > user.
>
> You have lots of options.  You can put them in a chapter.

OK; I can flesh out all non-intermediate nodes and present them to
the user as clones. But then I have the clone mark on every node,
where I would like the clone mark to be significant to the user.

The thing is that the --Ref attributes (in CDISC ODM) are few and not
too often made use of. In an earlier version I had combined --Ref and
--Def attributes in tnode uA's (until I rejoicingly noticed that
tnodes and vnodes can be annotated separately). There is a possibility
that all --Ref attributes can go into unknownAttributes of the mother
node, but that may mean that these uA's have to be updated whenever a
child node is deleted or moved. But then I always wanted to know
about plugins and hooks anyway.


Edward K. Ream

Apr 15, 2008, 11:20:18 AM
to leo-e...@googlegroups.com
On Tue, Apr 15, 2008 at 8:38 AM, derwisch <johannes...@med.uni-heidelberg.de> wrote:

I hope there are more "important" (by any metric) users of Leo than
myself.

No, there aren't :-)
 
> 1. Perhaps most importantly, in the unified-node world, are Leo
> outlines a proper representation of a DAG?  If so, what is the graph-theoretical correspondence between the graph and Leo nodes?

My take on it is as follows:

The current implementation represents a vertex- and edge-labelled
DAG. Unified nodes lose the possibility to label edges.

Interesting.  Before reading this I came up with a general solution that seems to relate to this idea directly.

Suppose we make the vnode-tnode relationship explicit as follows.  Consider an *organizer node*, the analog of the vnode, containing a child node, which may be cloned or not.  Like this:

- vnode (organizer node, aka node A)
  tnode (contained node, aka node B)

To simulate cloning the organizer node in the *old* world, we create a new node, say node A-prime, clone node B and move the clone to be a child of node A-prime.  We get:

- vnode A
  - tnode B (clone)
- vnode A-prime
  - tnode B (clone)

Now here is today's aha.  To make this work, **we simply hide B**.  That is, we set a "hidden" attribute in tnode B, so Leo's draw code never shows it. What we see on the screen is:

- vnode A
- vnode A-prime

but the *contents* of A and A-prime are the same.

BTW, my first thought was to hide the organizer node, but that leads instantly to complications when we try to move the tnode.  So the Aha could be stated: *hide the tnode*.  This is pleasing because indeed tnodes *are* hidden in the old world.

Naturally, it will be easy to augment Leo's tree-drawing code to support "bits" such as hide-this-node, hide-this-node's-children, etc.

I suspect most of the requests [for joined nodes] stem from misplaced analogies about object inheritance or parameterised macros.

I agree.  I never paid much attention to them :-)  However, we might want to provide support for joined headlines and/or body text.  Joined subtrees are never going to happen, and now we can see why: it is possible to simulate such things using unified nodes.  But linked headlines and body text allow a complete simulation of what your application looked like previously.

Clone marks

Now is the time to mention another piece of the puzzle.  In order to set the clone mark, (unified) nodes must contain another ivar.  Provocatively, let us call this ivar n.vUa. (It will probably be called something like n.cloneInfo).  vUa will contain zero or more dictionaries.  We add a dict to vUa when creating or cloning a node, and delete a dict when deleting a node.  A node n is cloned iff len(n.vUa) > 1.

Leo's file-read code will place entries in n.vUa when encountering a <v> element.  If the <v> element has an unknownAttributes attribute, then the corresponding dict is added to the created node's vUa list.  Otherwise, an empty dict will be added.  This solves several problems at once: it honors the old file format, and it causes clone marks to be set correctly by the read code.
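
A blue-sky sketch of the bookkeeping just described, not actual Leo code. Only the ivar name `vUa` comes from the text; the `Node` class and its method names are hypothetical:

```python
from typing import Dict, List, Optional


class Node:
    """Hypothetical unified node; only the vUa bookkeeping is sketched."""

    def __init__(self, headline: str) -> None:
        self.headline = headline
        self.vUa: List[Dict[str, object]] = []   # one dict per occurrence

    def add_occurrence(
            self, attrs: Optional[Dict[str, object]] = None) -> Dict[str, object]:
        """Called when the node is created or cloned, or when the read
        code meets a <v> element that refers to this node."""
        entry = dict(attrs) if attrs else {}
        self.vUa.append(entry)
        return entry

    def remove_occurrence(self, entry: Dict[str, object]) -> None:
        """Called when one occurrence of the node is deleted."""
        # Compare by identity: two occurrences may hold equal dicts.
        for i, candidate in enumerate(self.vUa):
            if candidate is entry:
                del self.vUa[i]
                return
        raise ValueError("entry does not belong to this node")

    def is_cloned(self) -> bool:
        return len(self.vUa) > 1


n = Node("shared")
first = n.add_occurrence()                     # <v> without unknownAttributes
second = n.add_occurrence({"color": "pink"})   # <v> with unknownAttributes
cloned_after_read = n.is_cloned()
n.remove_occurrence(first)
cloned_after_delete = n.is_cloned()
```

Which entry belongs to which place in the outline is exactly the context question left to application code; this sketch sidesteps it by handing the entry back to the caller.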

Imo, we are now very close to being able to solve all problems surrounding the (old) v.unknownAttributes data.

1. The file read code will preserve v.unknownAttributes data.  Of course, application/plugin code will have to determine which element of n.vUa to use, based on context, but Johannes's code already must do that.  There will be a hook to allow user code to choose, again based on context, which element of n.vUa will be deleted when deleting a node.

2. Hidden child (t)nodes will allow Leo to simulate the old vnode/tnode relationship directly, and joined headlines and/or body text will complete the "simulation".  Application/plugin code (called by drawing events) could even simulate the clone mark if desired.

Conclusions

It appears possible to transform an old-world Leo outline into an equivalent new-world Leo outline in such a way that a) the outlines appear identical to the user and b) the data content of the outlines are isomorphic: with *explicit* vnode/tnode relationships modeling the former implicit vnode/tnode relationships.

Furthermore, the inventions needed to model vnode/tnode relationships explicitly will have other uses.

n.vUa is a natural way to associate context-dependent data with a *single* node.  Naturally, application code must find ways of doing this, presumably by looking up the tree for application-defined context nodes.  This kind of context search is *so* much simpler than position-based approaches because positions change when nodes move.

Moving to the unified-node world will require real changes to Johannes's actual outline structure.  But that is simply inevitable.  Unless I am mistaken, however, the changes can be done automatically by revising the read code in Johannes's application.

Being able to model vnode/tnode relationships explicitly is another big advantage to the unified-node world.  The unified-node world is one of the most important Aha's in Leo's history.  We simply must make it work.

Edward

Edward K. Ream

Apr 15, 2008, 11:27:34 AM
to leo-e...@googlegroups.com

OK; I can flesh out all non-intermediate nodes and present them to
the user as clones. But then I have the clone mark on every node,
where I would like the clone mark to be significant to the user.

The thing is that the --Ref attributes (in CDISC ODM) are few and not
too often made use of. In an earlier version I had combined --Ref and
--Def attributes in tnode uA's (until I rejoicingly noticed that
tnodes and vnodes can be annotated separately). There is a possibility that all --Ref attributes can go into unknownAttributes of the mother node, but that may mean that these uA's have to be updated whenever a
child node is deleted or moved. But then I always wanted to know
about plugins and hooks anyway.

I'm glad you have these options.  I'll be interested to know whether the approaches you just mentioned are easier for you than simulating the vnode/tnode relationships directly.

Edward

Terry Brown

Apr 15, 2008, 1:31:15 PM
to leo-e...@googlegroups.com
On Tue, 15 Apr 2008 10:20:18 -0500

"Edward K. Ream" <edre...@gmail.com> wrote:

> Clone marks
>
> Now is the time to mention another piece of the puzzle. In order to
> set the clone mark, (unified) nodes must contain another ivar.
> Provocatively, let us call this ivar n.vUa. (It will probably be
> called something like n.cloneInfo). vUa will contain zero or more
> dictionaries. We add a dict to vUa when creating or cloning a node,
> and delete a dict when deleting a node. A node n is cloned iff
> len(n.vUa) > 1.

I would have thought, if unified nodes have both a children list and a
parent list, which I think they do according to a previous email, a
node n is cloned iff len(n.parents) > 1?

I can see how n.vUa might be constructed on read, but I don't see how
that works on write. Would it not just appear that that dict belongs to
the node, i.e. is that node's uA dict?

On second thoughts, perhaps I get it, the keys in n.vUa identify which
parent brought us to this node, they're sort of a parent specific uA?

Is that easier than doing something based on p.stack[-1]... assuming
that's a reference to the parent that brought us to this node? You can
only pick an entry from n.vUa if you know the context (position) and so
which parent is relevant?

Cheers -Terry

Edward K. Ream

Apr 15, 2008, 2:01:49 PM
to leo-e...@googlegroups.com
On Tue, Apr 15, 2008 at 12:31 PM, Terry Brown <terry_...@yahoo.com> wrote:
 

I would have thought, if unified nodes have both a children list and a
parent list, which I think they do according to a previous email, a
node n is cloned iff len(n.parents) > 1?

That was my first thought too.  But two clones can share the same parent.
 

I can see how n.vUa might be constructed on read, but I don't see how
that works on write, would it not just appear that that dict belongs to the node, i.e. is that nodes uA dict?

On second thoughts, perhaps I get it, the keys in n.vUa identify which
parent brought us to this node, they're sort of a parent specific uA?

I'm not sure.  This is all blue-sky stuff.

Edward

Terry Brown

Apr 15, 2008, 2:26:13 PM
to leo-e...@googlegroups.com
On Tue, 15 Apr 2008 13:01:49 -0500
"Edward K. Ream" <edre...@gmail.com> wrote:

> > I would have thought, if unified nodes have both a children list
> > and a parent list, which I think they do according to a previous
> > email, a node n is cloned iff len(n.parents) > 1?
>
> That was my first thought too. But two clones can share the same
> parent.

Yes... but they must also each have at least one additional parent in
order to be clones? So len(n.parents) > 1 holds?

I've been becoming increasingly unsure about the definition of "clone"
in the unified node world. At least, I see no problem mapping
current Leo's "clones" to the unified node world, but in the discussion
using the term clone, I'm not sure we're on the same page.

Data:

A-+
  |
  +-B
  |
  +-C-+
  |   |
  |   +-F
  |
  +-D-+
      |
      +-E
      |
      +-(to C)

Alternative view, same data (this is a DAG, just assume
arrowheads as needed):

A-+
  |
  +-B
  |
  +--------------+
  |              |
  |              |
  |              |
  +-D-+          +-C-+
      |          |   |
      +-E        |   +-F
      |          |
      +----------+

Leo tree widget presentation:

A-+
  |
  +-B
  |
  +-C*-+
  |    |
  |    +-F
  |
  +-D-+
      |
      +-E
      |
      +-C*-+
           |
           +-F

Narrative:

Node C is "cloned". There is only one C node. "*" is the Leo clone
indicator. Node C occurs in the children lists of both A and D. Node
C has two entries in its parents list, i.e. A and D. There is only
one node F, and it has only node C on its parent list. In the
vnode/tnode world you'd say the Fs are joined (I think?), but they're
the same node, so this is redundant.
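
The narrative can be sketched in a few lines of Python. This is an illustration of the idea only: `Node`, `add_child`, `is_cloned` and `render` are hypothetical names, not Leo's API, and the clone test is the `len(n.parents) > 1` rule discussed above:

```python
from typing import List


class Node:
    """Hypothetical unified node with child and parent lists."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List["Node"] = []
        self.parents: List["Node"] = []

    def add_child(self, child: "Node") -> None:
        # Record the link in both directions.
        self.children.append(child)
        child.parents.append(self)

    def is_cloned(self) -> bool:
        return len(self.parents) > 1


def render(node: Node, depth: int = 0) -> List[str]:
    """Expand the DAG into the lines a tree widget would show; a shared
    node is listed once per path that reaches it."""
    mark = "*" if node.is_cloned() else ""
    lines = ["  " * depth + node.name + mark]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


a, b, c, d, e, f = (Node(x) for x in "ABCDEF")
a.add_child(b)
a.add_child(c)
c.add_child(f)
a.add_child(d)
d.add_child(e)
d.add_child(c)      # the "(to C)" link: C now has parents A and D

lines = render(a)
```

F shows up twice in `lines` although only one F object exists and it carries no clone mark, which matches the point that the two Fs are the same node.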

Cheers -Terry

Edward K. Ream

Apr 15, 2008, 2:34:02 PM
to leo-e...@googlegroups.com
On Tue, Apr 15, 2008 at 1:26 PM, Terry Brown <terry_...@yahoo.com> wrote:


I've been becoming increasingly unsure about the definition of "clone"
in the unified node world.  

Yes, it's tricky.  In the unified world, we are, in effect, saying that exactly the same node can appear in multiple places in an outline.  But the present shared subtrees scheme makes that statement a little less than enlightening.  Sometimes the code is the only way to make sense of what is going on.

Edward

derwisch

Apr 16, 2008, 9:19:37 AM
to leo-editor


On Apr 15, 8:01 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> That was my first thought too.  But two clones can share the same parent.

Ouch. That's where the DAG analogue breaks, or even the analogue with
graphs in general. Didn't think of that.

In order to maintain the analogy, would it be possible to disallow
"identical twins", so to say? Obviously this would mean getting rid of
the Clone Node function and having "Paste node as clone" as the normal
way of producing clones. Or are there use cases for having multiple
clones as siblings?

On a related note, how would unified-node versions of Leo handle
legacy files? Will there be an import function? If you'd be willing to
make a change that radical, the import function would then merge
identical twins to a single node (and give a warning about it).

I am boldly proposing this for three reasons: Out of some kindergarten
spite ("you destroy my card house, so I'm destroying someone else's"),
because I am right now seeing no use for identical twins, and because
I think it is a very good idea to keep the analogy to some very well
described and explored mathematical structure.



Edward K. Ream

Apr 16, 2008, 9:26:26 AM
to leo-e...@googlegroups.com
On Wed, Apr 16, 2008 at 8:19 AM, derwisch <johannes...@med.uni-heidelberg.de> wrote:

On Apr 15, 8:01 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> That was my first thought too.  But two clones can share the same parent.

Ouch. That's where the DAG analogue breaks, or even the analogue with
graphs in general. Didn't think of that.

In order to maintain the analogy, would it be possible to disallow
"identical twins", so to say? Obviously this would mean getting rid of
the Clone Node function and having "Paste node as clone" as the normal
way of producing clones. Or are there use cases for having multiple
clones as siblings?

Almost anything is possible, but I'm not going to consider doing this.

On a related note, how would unified-node versions of Leo handle
legacy files? Will there be an import function? If you'd be willing to
make a change taht radical, the import function would then merge
idenical twins to a single node (and give a warning about it).

The unified-node version of Leo isn't going to happen.  The reason I shall finish the code is because very similar code could be used in the graph world with the present vnode/tnode scheme.  Debugging now, while the code is fresh in my mind, will save time later.

I'm not sure in detail how the unified-node version of Leo will handle legacy files.  The sax read code will, of course, remain basically unchanged: what will change is the post-parser code that converts the sax-nodes into Leo nodes.  It has been my experience that this conversion process is straightforward.
 

I am boldly proposing this for three reasons: Out of some kindergarten
spite ("you destroy my card house, so I'm destroying someone else's"),
because I am right now seeing no use for identical twins, and because
I think it is a very good idea to keep the analogy to some very well
described and explored mathematical structure.

Your house will remain intact for the foreseeable future.  Whatever happens to Leo's file format, it will support legacy formats.

Edward

derwisch

Apr 16, 2008, 9:37:28 AM
to leo-editor


On Apr 16, 3:26 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> Your house will remain intact for the foreseeable future.  Whatever happens
> to the Leo's file format, it will support legacy formats.

I am feeling like having parried a match ball served by Andy Roddick.

Once your plans have become more concrete, could you draw up the data
structure you are getting at?
http://groups.google.com/group/leo-editor/attach/6b67112ede526575/leoStruct.png?hl=en&part=2&view=1
is most helpful.

Terry Brown

Apr 16, 2008, 10:08:45 AM
to leo-e...@googlegroups.com
On Wed, 16 Apr 2008 06:19:37 -0700 (PDT)
derwisch <johannes...@med.uni-heidelberg.de> wrote:

> > That was my first thought too.  But two clones can share the same
> > parent.
>
> Ouch. That's where the DAG analogue breaks, or even the analogue with
> graphs in general. Didn't think of that.
>
> In order to maintain the analogy, would it be possible to disallow
> "identical twins", so to say? Obviously this would mean getting rid of
> the Clone Node function and having "Paste node as clone" as the normal
> way of producing clones.

Ah, now I understand the "two clones can share the same parent" bit, I
didn't get it until now. But is there anything, in the most general of
generalized graphs, that says you can't have two links between nodes?
It seems unified node world could just have B appear in A's children
list twice. Obviously there are limited reasons for doing this,
basically to support the usually transient state created by the Clone
Node function, and maybe to make an entry visible at two places in a
long list. But I don't see how it's a problem - it's working now,
basically.

I think it's a shame unified node world isn't happening, because I
don't think the v/tnode system has any advantages and I think the
v/tnode system is much less intuitive.

Of course, if we get a sane p.stack and vnodes get parents and children
iterators that yield other vnodes we're basically there, is that how
things are going to fall out?

Cheers -Terry

Edward K. Ream

Apr 16, 2008, 11:01:12 AM
to leo-e...@googlegroups.com
On Wed, Apr 16, 2008 at 9:08 AM, Terry Brown <terry_...@yahoo.com> wrote:
 

Ah, now I understand the "two clones can share the same parent" bit, I
didn't get it until now.  But is there anything, in the most general of
generalized graphs, that says you can't have two links between nodes?

Not to my knowledge: general means general :-)  However, afaik clones are not a part of traditional graph "lore".  That is, we could say that clones simply are not an issue: any node could have multiple links into that node.  But if we wanted to represent clones directly in a graph, we would need a notation that indicates that two apparently separate nodes are clones.  Something like A == A.

So this means that clones are something pretty special to Leo.

It seems unified node world could just have B appear in A's children
list twice.  Obviously there a limited reasons for doing this,
basically to support the usually transient state created by the Clone
Node function, and maybe to make an entry visible at two places in a
long list.  But I don't see how it's a problem - it's working now,
basically.

Correct.

> I think it's a shame unified node world isn't happening, because I
> don't think the v/tnode system has any advantages and I think the
> v/tnode system is much less intuitive.

I am more sanguine.  The recent "little" aha says that tnodes should be considered subsidiary.  In other words, most users should be able to pretend they don't exist.  That's not quite the case now.  To make that a reality we could a) retire or deprecate the tnode iters, b) ensure that all tnode getters/setters have analogs in the position and/or vnode classes.  With these changes in place, tnodes would be relegated to "hidden helper" status.
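
A minimal sketch of what "hidden helper" status could look like, assuming hypothetical `TNode` and `VNode` classes and hypothetical accessor names `h` and `b` (none of these are taken from leoNodes.py). Every getter/setter on the tnode has an analog on the vnode, so calling code never has to touch the tnode directly:

```python
class TNode:
    """Hypothetical shared data holder: one per set of clones."""

    def __init__(self, headline="", body=""):
        self.headline = headline
        self.body = body


class VNode:
    """Hypothetical vnode whose accessors simply forward to its tnode."""

    def __init__(self, tnode):
        self._t = tnode  # the hidden helper; callers are not expected to use it

    @property
    def h(self):
        return self._t.headline

    @h.setter
    def h(self, value):
        self._t.headline = value

    @property
    def b(self):
        return self._t.body

    @b.setter
    def b(self, value):
        self._t.body = value


shared = TNode("task", "original text")
v1, v2 = VNode(shared), VNode(shared)  # two clones sharing one tnode
v1.b = "edited through v1"
print(v2.b)  # the edit is visible through the other clone
```

The sharing still happens in the tnode, but a script written against `v.h` and `v.b` would keep working if the tnode were later folded into the vnode.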

> Of course, if we get a sane p.stack and vnodes get parents and children
> iterators that yield other vnodes we're basically there, is that how
> things are going to fall out?

Yes, a "sane" p.stack turns out to be the path to the graph world, and to much simpler code in leoNodes.py.  You could call this one of the most important fallouts of all the recent chit-chat :-)
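
One possible reading of a "sane" stack, sketched under assumptions: a position is nothing more than the sequence of child indices followed from the root, and nodes are plain objects with a `children` list. The names `OutlineNode`, `node_at` and `parent_of` are hypothetical and are not Leo's API.

```python
class OutlineNode:
    """Hypothetical unified node (illustration only)."""

    def __init__(self, headline):
        self.headline = headline
        self.children = []


def node_at(root, stack):
    """Follow a stack of child indices down from root and return the node reached."""
    node = root
    for index in stack:
        node = node.children[index]
    return node


def parent_of(stack):
    """The parent's position is the same stack minus its last index."""
    return stack[:-1]


root = OutlineNode("root")
a, b, task = OutlineNode("A"), OutlineNode("B"), OutlineNode("task")
root.children = [a, b]
a.children = [task]
b.children = [task]  # the same node reached by a second path: a clone

pos1 = (0, 0)  # root -> A -> task
pos2 = (1, 0)  # root -> B -> task
print(node_at(root, pos1) is node_at(root, pos2))  # True: one node, two positions
print(node_at(root, parent_of(pos1)).headline)     # A
print(node_at(root, parent_of(pos2)).headline)     # B
```

In this sketch the node itself never needs to know which parent it is "under"; that information lives entirely in the stack.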

However, as much as I like simple code, I shall be in no hurry to make this massive change to Leo's internals.  It's not needed now because the graph world has lower priority than any of the present 4.5 projects.

Edward

thyrsus

Apr 16, 2008, 11:10:23 AM
to leo-editor
When two clones have the same parent (e.g., immediately after the
clone node operation) wouldn't they list the same parent twice in the
parent list?

- Stephen

On Apr 16, 9:19 am, derwisch <johannes.hues...@med.uni-heidelberg.de>
wrote:

Edward K. Ream

Apr 16, 2008, 11:23:52 AM
to leo-e...@googlegroups.com
On Wed, Apr 16, 2008 at 10:10 AM, thyrsus <ssch...@acm.org> wrote:

> When two clones have the same parent (e.g., immediately after the
> clone node operation) wouldn't they list the same parent twice in the
> parent list?

Not at present.  At present, the code that sets and uses n.parents is buggy, and I'm not sure what it is supposed to do.  The n.clonesList ivar is now the way to detect clonedness.  I'm not even sure whether the .parents ivar is needed.  It's useless to ask more questions until I understand the code better myself :-)

Edward

derwisch

Apr 16, 2008, 11:56:32 AM
to leo-editor


On Apr 16, 5:01 pm, "Edward K. Ream" <edream...@gmail.com> wrote:
> On Wed, Apr 16, 2008 at 9:08 AM, Terry Brown <terry_n_br...@yahoo.com>
> wrote:
> > Ah, now I understand the "two clones can share the same parent" bit, I
> > didn't get it until now.  But is there anything, in the most general of
> > generalized graphs, that says you can't have two links between nodes?
>
> Not to my knowledge: general means general :-)

Sure, there are generalisations that allow multiple edges between the
same pair of nodes, but you lose some properties, for instance the
ability to represent the graph by a plain (0/1) adjacency matrix.
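
To make the adjacency-matrix point concrete, here is a small sketch with hypothetical helper names. A 0/1 matrix can only say whether a link A -> B exists, so a second link is lost; a matrix of counts keeps the multiplicity:

```python
def adjacency_matrix(n, edges):
    """0/1 matrix: entry [i][j] only records whether some edge i -> j exists."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
    return matrix


def count_matrix(n, edges):
    """Multigraph variant: entry [i][j] counts the edges i -> j."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] += 1
    return matrix


# Node 0 is A, node 1 is B; B is linked from A twice.
edges = [(0, 1), (0, 1)]
print(adjacency_matrix(2, edges))  # [[0, 1], [0, 0]]
print(count_matrix(2, edges))      # [[0, 2], [0, 0]]
```

Neither matrix records where in A's child list the two links sit, so the ordering of children would still need a separate structure.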

On the other hand, Leo's structure is more complex than graphs, for
instance children of nodes maintain the information of their order.

Note that I am not looking at the actual code, but the children of a
parent in a graph could be represented by a dictionary or an unordered
collection, not an ordered list.

>  However, afaik clones are
> not a part of traditional graph "lore".  That is, we could say that clones
> simply are not an issue: any node could have multiple links into that node.
> But if we wanted to represent clones directly in a graph, we would need a
> notation that indicates that two apparently separate nodes are clones.
> Something like A == A.

My view of clones is that they are the same entity, only referenced
from different positions. The distinction between tnode uA's and vnode
uA's would be the distinction between annotating vertices and
annotating edges, both of which are frequent generalisations of
classic graphs.
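
A sketch of the vertex/edge distinction, using plain dictionaries and made-up keys (this is not how Leo stores uAs). Attributes attached to the vertex follow the node everywhere; attributes attached to an edge apply only where the node is shown through that particular link, which is the "pink in the active task list only" case from the start of the thread:

```python
# Vertex annotations: one dict per node, shared by every clone (tnode-style uA).
vertex_ua = {"task": {"priority": 1}}

# Edge annotations: one dict per (parent, child index) link (vnode-style uA).
edge_ua = {
    ("active tasks", 0): {"background": "pink"},
    ("project", 3): {},
}


def attributes(node, parent, index):
    """Merge the node-wide attributes with those of the link being displayed."""
    merged = dict(vertex_ua.get(node, {}))
    merged.update(edge_ua.get((parent, index), {}))
    return merged


print(attributes("task", "active tasks", 0))  # {'priority': 1, 'background': 'pink'}
print(attributes("task", "project", 3))       # {'priority': 1}
```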

>
> So this means that clones are something pretty special to Leo.
>

No.

> > It seems unified node world could just have B appear in A's children
> > list twice.  Obviously there are limited reasons for doing this,
> > basically to support the usually transient state created by the Clone
> > Node function, and maybe to make an entry visible at two places in a
> > long list.  But I don't see how it's a problem - it's working now,
> > basically.
>
> Correct.

As long as the notion of order is preserved, there is a clear
distinction. If the list of parents contained tuples of (parent, order
number), you could mark identical twins as distinct according to those
tuples. You'd then have a list of length > 1 and, to come back to the
original question, a reason to set the clone mark.
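
The (parent, order number) idea can be sketched as follows, assuming a hypothetical `LinkedNode` class and helper names; this is an illustration, not the code in leoNodes.py:

```python
class LinkedNode:
    """Hypothetical node whose parent list stores (parent, child index) tuples."""

    def __init__(self, headline):
        self.headline = headline
        self.children = []
        self.parents = []


def link(parent, child):
    """Append child to parent's children and record where it was placed."""
    parent.children.append(child)
    child.parents.append((parent, len(parent.children) - 1))


def is_cloned(node):
    """Set the clone mark when the node is linked from more than one place."""
    return len(node.parents) > 1


a, b = LinkedNode("A"), LinkedNode("B")
link(a, b)
print(is_cloned(b))  # False: a single link
link(a, b)           # identical twins: same parent, different order number
print(is_cloned(b))  # True
print([(p.headline, i) for p, i in b.parents])  # [('A', 0), ('A', 1)]
```

The cost of this layout is that inserting or deleting a sibling would require renumbering the stored indices, which the sketch does not attempt.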

>
> > I think it's a shame unified node world isn't happening, because I
> > don't think the v/tnode system has any advantages and I think the
> > v/tnode system is much less intuitive.
>
> I am more sanguine.  The recent "little" aha says that tnodes should be
> considered subsidiary.  In other words, most users should be able to pretend
> they don't exist.  That's not quite the case now.  To make that a reality we
> could a) retire or deprecate the tnode iters, b) ensure that all tnode
> getters/setters have analogs in the position and/or vnode classes.  With
> these changes in place, tnodes would be relegated to "hidden helper" status.
>

This sounds to me as if you'll eventually drop tnode uA's. I am
assuming an alert position again.

Edward K. Ream

Apr 16, 2008, 12:28:35 PM
to leo-e...@googlegroups.com
> This sounds to me as if you'll eventually drop tnode uA's. I am
> assuming an alert position again.

I'll keep things exactly as they are for now, and will certainly give people plenty of notice if I even start to consider such a change.

Edward

thyrsus

Apr 17, 2008, 12:57:35 PM
to leo-editor
This has the feel of some kind of permission problem. The following
just succeeded, but it looks like it created a new branch named
~sschaefer/trunk, which is apparently separate from ~leo-editor-team/
trunk. Is there something I need to do to join the "leo-editor-team"?

- Stephen

thyrsus

Apr 17, 2008, 1:01:46 PM
to leo-editor
Forgot to include the line I used:

bash-3.2$ bzr push bzr+ssh://ssch...@bazaar.launchpad.net/~sschaefer/leo-editor/trunk
Created new branch.
[sps@thyrsus-laptop ~]$

Terry Brown

Apr 17, 2008, 1:29:28 PM
to leo-e...@googlegroups.com
On Thu, 17 Apr 2008 09:57:35 -0700 (PDT)
thyrsus <ssch...@acm.org> wrote:

>
> This has the feel of some kind of permission problem. The following
> just succeeded, but it looks like it created a new branch named
> ~sschaefer/trunk, which is apparently separate from ~leo-editor-team/
> trunk. Is there something I need to do to join the "leo-editor-team"?

https://launchpad.net/~leo-editor-team

Should be a join team button there I think.

Cheers -Terry

thyrsus

Apr 17, 2008, 3:00:48 PM
to leo-editor
That did it, thanks! The changes are now in the trunk.

- Stephen

On Apr 17, 1:29 pm, Terry Brown <terry_n_br...@yahoo.com> wrote:
> On Thu, 17 Apr 2008 09:57:35 -0700 (PDT)
>

Edward K. Ream

Apr 18, 2008, 9:09:28 AM
to leo-e...@googlegroups.com
On Thu, Apr 17, 2008 at 2:00 PM, thyrsus <ssch...@acm.org> wrote:

> That did it, thanks!  The changes are now in the trunk.

I merged these changes into my trunk.  All unit tests pass. I like this code a lot. 

I am considering eliminating the lambda arguments. True, the present code is elegant, but it creates one or more function calls for every position returned.  So I may optimize the code for speed by 'hard coding' the values returned by the lambdas into multiple copies of the code.  This should make the new iters faster than the old for all outlines.
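
The trade-off can be illustrated with a sketch. The iterators below are hypothetical stand-ins, not the code that was merged: `walk` takes a function argument and calls it once per result, while `walk_nodes` and `walk_headlines` are the "hard coded" copies with that call removed.

```python
class Node:
    """Hypothetical outline node (illustration only)."""

    def __init__(self, headline, children=()):
        self.headline = headline
        self.children = list(children)


def walk(node, select):
    """Generic iterator: `select` is called once for every node visited."""
    yield select(node)
    for child in node.children:
        yield from walk(child, select)


def walk_nodes(node):
    """Hard-coded copy of walk(node, lambda n: n): no per-node function call."""
    yield node
    for child in node.children:
        yield from walk_nodes(child)


def walk_headlines(node):
    """Hard-coded copy of walk(node, lambda n: n.headline)."""
    yield node.headline
    for child in node.children:
        yield from walk_headlines(child)


root = Node("root", [Node("a", [Node("a1")]), Node("b")])
print(list(walk(root, lambda n: n.headline)))  # ['root', 'a', 'a1', 'b']
print(list(walk_headlines(root)))              # ['root', 'a', 'a1', 'b']
```

Both forms produce the same results; whether the hard-coded copies are measurably faster would have to be timed on real outlines.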

Do you have any objections to this plan?

Edward

thyrsus

Apr 18, 2008, 10:13:29 AM
to leo-editor
I have only the mildest of aesthetic objections. My timings from
before and after introducing the lambda arguments differed only in the
"noise" range. On the other hand, when I tried using psyco, it
crashed very badly, and psyco has recently had problems with lambdas,
so if getting rid of the lambdas allowed psyco to work, that would be
an overwhelming reason to remove the lambdas.

- Stephen

On Apr 18, 9:09 am, "Edward K. Ream" <edream...@gmail.com> wrote:

Edward K. Ream

Apr 18, 2008, 2:51:47 PM
to leo-e...@googlegroups.com
On Fri, Apr 18, 2008 at 9:13 AM, thyrsus <ssch...@acm.org> wrote:

> I have only the mildest of aesthetic objections.  My timings from
> before and after introducing the lambda arguments differed only in the
> "noise" range.  On the other hand, when I tried using psyco, it
> crashed very badly, and psyco has recently had problems with lambdas,
> so if getting rid of the lambdas allowed psyco to work, that would be
> an overwhelming reason to remove the lambdas.

Thanks for this reply.  I'm not sure what I am going to do with this info: eventually I'd like to remove the distinction between tnode and vnode iters by using the unique keyword arg. But there is so much happening now that I'm happy to wait...
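
A sketch of how a single iterator with a unique keyword argument could cover both cases. The signature is an assumption (the thread does not show the real one), and `Item` and `iter_nodes` are hypothetical names:

```python
class Item:
    """Hypothetical outline node; a clone is the same object under several parents."""

    def __init__(self, headline, children=()):
        self.headline = headline
        self.children = list(children)


def iter_nodes(node, unique=False, _seen=None):
    """Yield nodes in outline order.

    unique=False: one result per position, so a cloned node shows up repeatedly.
    unique=True: each node object is yielded once; repeats and their subtrees are skipped.
    """
    if _seen is None:
        _seen = set()
    if unique:
        if id(node) in _seen:
            return
        _seen.add(id(node))
    yield node
    for child in node.children:
        yield from iter_nodes(child, unique, _seen)


task = Item("task")
root = Item("root", [Item("A", [task]), Item("B", [task])])

all_positions = [n.headline for n in iter_nodes(root)]
unique_nodes = [n.headline for n in iter_nodes(root, unique=True)]
print(all_positions)  # ['root', 'A', 'task', 'B', 'task']
print(unique_nodes)   # ['root', 'A', 'task', 'B']
```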

Edward

Ville M. Vainio

Jun 17, 2009, 5:29:08 AM
to leo-e...@googlegroups.com
On Tue, Feb 26, 2008 at 2:14 AM, Terry Brown<terry_...@yahoo.com> wrote:

> What I struggle with is that some attributes should be universal for
> the (t)node in any context, and others only make sense in some contexts.
> For example you might color code the backgrounds of cloned nodes in an
> active task list to indicate urgency.  But when you're looking at the
> node in it's "primary" location you don't want to be distracted with
> "why is that one node pink?" type noise.

(*bump*)

I am replying to this old thread because it's important.

As it appears, most of the advantages of vnode uAs (and vnodes in
general) seem to be misunderstandings (i.e. the assumption that a vnode
is somehow like a position, that is, specific to a tree position).

--
Ville M. Vainio
http://tinyurl.com/vainio

Edward K. Ream

Jun 17, 2009, 9:02:42 AM
to leo-e...@googlegroups.com
On Wed, Jun 17, 2009 at 4:29 AM, Ville M. Vainio <viva...@gmail.com> wrote:

> On Tue, Feb 26, 2008 at 2:14 AM, Terry Brown<terry_...@yahoo.com> wrote:
>
> > What I struggle with is that some attributes should be universal for
> > the (t)node in any context, and others only make sense in some contexts.
> > For example you might color code the backgrounds of cloned nodes in an
> > active task list to indicate urgency.  But when you're looking at the
> > node in it's "primary" location you don't want to be distracted with
> > "why is that one node pink?" type noise.
>
> (*bump*)

I'm beginning to agree that unifying nodes will be clearer.

BTW, there is never any such thing as a "primary" location.  All clones are exactly equivalent.

But to agree with you on your fundamental point: eliminating the distinction between vnodes and tnodes in Leo's core should help.

For example, I just now did a little research to determine why clones work in the unified scheme :-)  My first thought was that cloned nodes must be distinct in the unified scheme just as they are in the non-unified scheme.  The bogus "proof" was that the nodes will have different v.parents entries.  In fact, though, p.clone updates v.parents for the node, so cloning a node does *not* create a new node in the unified scheme, but only updates v.parents.
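
The point that cloning only updates parent links can be sketched like this. `UNode`, `clone` and `move` are hypothetical, and listing a parent once per link is an assumption of the sketch rather than a statement about Leo's v.parents:

```python
class UNode:
    """Hypothetical unified node (illustration only)."""

    def __init__(self, headline):
        self.headline = headline
        self.children = []
        self.parents = []  # assumption: one entry per link into this node


def insert_child(parent, index, child):
    parent.children.insert(index, child)
    child.parents.append(parent)


def clone(parent, index):
    """Link the existing child again, right after the original; no node is created."""
    node = parent.children[index]
    insert_child(parent, index + 1, node)
    return node


def move(old_parent, index, new_parent):
    """Move one occurrence of a node to the end of another parent's children."""
    node = old_parent.children.pop(index)
    node.parents.remove(old_parent)  # drops a single link to the old parent
    insert_child(new_parent, len(new_parent.children), node)


project, active, task = UNode("project"), UNode("active tasks"), UNode("task")
insert_child(project, 0, task)

twin = clone(project, 0)
print(twin is task)                        # True: same object, not a copy
move(project, 1, active)                   # move the second occurrence elsewhere
print([p.headline for p in task.parents])  # ['project', 'active tasks']
```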

So while eliminating the distinction between vnodes and tnodes should help in the long run, it will take me a while to switch mental gears :-)  To me, having tnodes be the unit of sharing is second nature.

Edward

Ville M. Vainio

Jun 17, 2009, 12:45:23 PM
to leo-e...@googlegroups.com
Doing some more archaeology...

On Tue, Feb 26, 2008 at 9:16 PM, Edward K. Ream<edre...@gmail.com> wrote:

>> This seems like something of an oversimplification to me.  The topology
>> of vnodes defines the context in which your referencing a tnode.
>
> Yes, you are correct.  I misspoke.  Vnodes *used to* correspond to nodes on
> the screen, but that is ancient history.

Ah, so this explains the existence of vnodes - they were analogous to
positions before, but this analogy was removed at some point
(effectively making vnodes redundant).

Edward K. Ream

Jun 17, 2009, 9:04:36 PM
to leo-e...@googlegroups.com

Yes, you could put it that way.  The key picture was fingers holding beans.  The fingers are positions; the beans, vnodes.  This allows the *combination* of positions and vnodes to do what was previously done by vnodes and tnodes.  Thus, the unified node world was born.

Edward
