
Would this solution be important? Is it already solved?


Mountain

Apr 24, 2006, 10:56:52 AM

I would like opinions on whether this solution could be important and
whether anyone knows if it is already solved. (I would also like to
know if anyone thinks it cannot be solved.)

First, we start with a general tree (any of the trees that my data
structures book calls rooted trees or unordered trees). Then we are
given two specific nodes (Na and Nb) that belong to that tree.

We would like to know if the second node (Nb) is an ancestor of the
first node (Na).

My question to the group is this:

If there was an algorithm for answering this question _without_
traversing the tree or doing any type of search or any
iterative/recursive operations, how important and useful would that be?
Obviously, it would save a lot of computing time.

Is there a known solution that does this? Or is there a proof that it
cannot be done?

Are there any citations I should be looking at? Any help is appreciated.

Willem

Apr 24, 2006, 11:08:09 AM

Mountain wrote:
) I would like opinions on whether this solution could be important and
) whether anyone knows if it is already solved. (I would also like to
) know if anyone thinks it cannot be solved.)
)
) First, we start with a general tree (any of the trees that my data
) structures book calls rooted trees or unordered trees). Then we are
) given two specific nodes (Na and Nb) that belong to that tree.
)
) We would like to know if the second node (Nb) is an ancestor of the
) first node (Na).
)
) My question to the group is this:
)
) If there was an algorithm for answering this question _without_
) traversing the tree or doing any type of search or any
) iterative/recursive operations, how important and useful would that be?
) Obviously, it would save a lot of computing time.

I think you're going to have to specify in a lot more detail what is and
is not allowed. For example, 'no iterative/recursive operations' excludes
all possible algorithms, except maybe hello world.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

wka...@yahoo.com

Apr 24, 2006, 11:17:07 AM

If I were faced with this problem (it hasn't happened so far in 20+ years of
programming), I would either implement the tree with parent pointers,
or maybe store the tree in an array where the entries whose indexes
are 2N and 2N+1 are the children in the tree of the entry whose array
index is N. But both of these approaches have drawbacks, so another
approach would be worth looking at. Don't think you'd be able to retire
in Tahiti on the patent royalties, though.
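
A minimal C sketch of the ancestor test under that 2N / 2N+1 array layout
(1-based indices assumed; it walks upward, so it's O(depth), not constant time):

#include <stdbool.h>
#include <stddef.h>

/* Tree stored heap-style in an array: node i has children 2*i and 2*i+1
 * (1-based indices).  b is a proper ancestor of a iff repeatedly halving
 * a's index reaches b's index. */
static bool is_ancestor_heap(size_t a, size_t b)
{
    if (b == 0 || b >= a)
        return false;        /* ancestors always have smaller indices */
    while (a > b)
        a /= 2;              /* step up one level */
    return a == b;
}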

Alf P. Steinbach

Apr 24, 2006, 11:28:13 AM

* Mountain:

> I would like opinions on whether this solution could be important and
> whether anyone knows if it is already solved. (I would also like to
> know if anyone thinks it cannot be solved.)
>
> First, we start with a general tree (any of the trees that my data
> structures book calls rooted trees or unordered trees). Then we are
> given two specific nodes (Na and Nb) that belong to that tree.
>
> We would like to know if the second node (Nb) is an ancestor of the
> first node (Na).
>
> My question to the group is this:
>
> If there was an algorithm for answering this question _without_
> traversing the tree or doing any type of search or any
> iterative/recursive operations, how important and useful would that be?

Not very useful, I think, since for a balanced tree the time to traverse
all the way up to the root is O(log n), where n is the number of nodes.


> Obviously, it would save a lot of computing time.

Some cases of Nb actually /not/ an ancestor of Na, for arbitrary Nb and
Na, can easily be determined in constant time.

AFAIK you can not determine in constant time that Nb /is/ an ancestor
node, in the general case.

However, for specific cases with limited number of nodes you can do
things like that. For example, for a balanced in-memory binary tree the
path to any node can be represented as a sequence of bits. With 32-bit
memory addresses this sequence will never be 32 bits or larger (the
nodes wouldn't fit in available memory), and so a simple check of common
start sequence, just a few machine code instructions, determines
ancestorship.
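
A minimal C sketch of that prefix check, assuming each node stores its packed
path and its depth (field names are hypothetical):

#include <stdbool.h>
#include <stdint.h>

/* Each node stores its left(0)/right(1) choices from the root, packed into
 * the low bits of 'path' with the earliest choice in the most significant
 * used bit, plus the number of choices.  b is a proper ancestor of a iff
 * b's path is a strict prefix of a's path. */
struct node_path {
    uint32_t path;   /* packed choices from the root            */
    unsigned len;    /* number of choices, i.e. the node's depth */
};

static bool is_ancestor_path(struct node_path a, struct node_path b)
{
    /* assumes depths stay below 32 so the shift is well defined */
    if (b.len >= a.len)
        return false;        /* an ancestor must be strictly higher up */
    return (a.path >> (a.len - b.len)) == b.path;
}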


> Is there a known solution that does this?

See above.


> Or is there a proof that it cannot be done?

For the general case the problem is to determine whether a path A of
N-way choices is a proper left substring of another such path B, with
the paths chosen arbitrarily.

If path A is longer than or equal in length to B, then A cannot be a proper
left substring of B; that's one of the cases of "not ancestor".

If path A's first element is different from path B's first element, then
A cannot be a proper left substring of B; that's another such case.

There may be other such negative cases.

Otherwise, to check this in constant time, one would have to encode any
path in a unique small value or fixed small set of values, such that
those encodings could be compared for left-substring relationship in
constant time, and I don't think that's possible.


> Are there any citations I should be looking at? Any help is appreciated.

Google.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Tom Widmer

Apr 24, 2006, 11:30:44 AM

Mountain wrote:
> I would like opinions on whether this solution could be important and
> whether anyone knows if it is already solved. (I would also like to
> know if anyone thinks it cannot be solved.)
>
> First, we start with a general tree (any of the trees that my data
> structures book calls rooted trees or unordered trees). Then we are
> given two specific nodes (Na and Nb) that belong to that tree.

Does a node contain a pointer to its parent? Define exactly what your
data structure looks like.

> We would like to know if the second node (Nb) is an ancestor of the
> first node (Na).
>
> My question to the group is this:
>
> If there was an algorithm for answering this question _without_
> traversing the tree or doing any type of search or any
> iterative/recursive operations, how important and useful would that be?
> Obviously, it would save a lot of computing time.

Well, if you've got a parent pointer in each node it is a trivial
O(depth) algorithm. If you don't have parent pointers, then an algorithm
that is better than O(n) might be of interest.
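
For reference, the trivial parent-pointer version is just this kind of walk
(the node layout here is hypothetical):

#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *parent;
    /* ... payload ... */
};

/* O(depth): walk up from Na; Nb is an ancestor iff we meet it before the root */
static bool is_ancestor(const struct node *na, const struct node *nb)
{
    for (const struct node *p = na->parent; p != NULL; p = p->parent)
        if (p == nb)
            return true;
    return false;
}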

> Is there a known solution that does this? Or is there a proof that it
> cannot be done?

It would be useful if you accurately defined the problem you are trying
to solve first, otherwise a proof is impossible.

Tom

Mountain

Apr 24, 2006, 11:47:30 AM

Here is some more detail:
Traversing the tree node-by-node is not allowed -- that is obviously a
recursive or iterative operation. The solution should use only the two
nodes that are presented (what I called nodes Na and Nb) to determine
if Nb is an ancestor of Na. The solution must be general - it should
work on any of the types of trees I mentioned.

The solution should be quick - ideally just a very simple/fast
computation of some type. It cannot be a search. This means the
solution cannot involve searching the tree itself, nor can it involve
searching a list inside each node, for example. It should not involve any
additional storage - meaning it cannot rely upon something like an
external index (the utilization of which would effectively be a search
anyway) or extra data related to each node.

The solution should be fundamentally simple, quick and not a gimmick.

Mountain

Apr 24, 2006, 11:59:21 AM

Hello Alf P. Steinbach. Thank you very much. Your reply was helpful. It
sounds like the solution I have in mind may not be completely unknown
or original. I'll work with it a bit more and see if it indeed
generalizes better or has any other especially redeeming qualities ;)

Mountain

Apr 24, 2006, 12:01:47 PM

>Don't think you'd be able to retire in Tahiti on the patent royalties, though.
>

Hehe. You're surely right on that! But it also doesn't sound like I'll
even be able to get a good paper or thesis out of it! :(

Ed Prochak

Apr 24, 2006, 12:10:28 PM

Google in comp.databases for Joe Celko's nested sets solution for
implementing hierarchical data in a relational database. I think that
data structure will fit your need. It's in his book TREES & HIERARCHIES
IN SQL.
(so if you came up with the same solution, good bye patent.)

HTH,
Ed

Willem

Apr 24, 2006, 12:34:50 PM

Mountain wrote:
) Here is some more detail:
) Traversing the tree node-by-node is not allowed -- that is obviously a
) recursive or iterative operation. The solution should use only the two
) nodes that are presented (what I called nodes Na and Nb) to determine
) if Nb is an ancestor of Na. The solution must be general - it should
) work on any of the types of trees I mentioned.
)
) The solution should be quick - ideally just a very simple/fast
) computation of some type. It cannot be a search. This means the
) solution cannot involve searching the tree itself, nor can it involve
) search a list inside each node, for example. It should not involve any
) additional storage - meaning it cannot rely upon something like an
) external index (the utilization of which would effectively be a search
) anyway) or extra data related to each node.

How many children can a node have ?

And where did you get this problem from ?

wka...@yahoo.com

Apr 24, 2006, 12:45:41 PM

Alf P. Steinbach wrote:
...

> However, for specific cases with limited number of nodes you can do
> things like that. For example, for a balanced in-memory binary tree the
> path to any node can be represented as a sequence of bits. With 32-bit
> memory addresses this sequence will never be 32 bits or larger (the
> nodes wouldn't fit in available memory), and so a simple check of common
> start sequence, just a few machine code instructions, determines
> ancestorship.
...

Another limited solution is to label each leaf node with a prime number,
and label every other node with the product of the labels of its children.
The limitations are the bits of precision of the label (the root node must
hold p(1) * p(2) * ... * p(n), where n is the number of leaf nodes and
p(i) is the i'th prime), and that each node must have at least 2 children.
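
A minimal C sketch of the resulting test, assuming the labels fit in a
machine word:

#include <stdbool.h>
#include <stdint.h>

/* Leaves carry distinct primes; every internal node carries the product of
 * its children's labels, so a node's label is the product of the primes of
 * the leaves below it.  Nb is then an ancestor of Na iff Na's label divides
 * Nb's label and the labels differ (the >= 2 children rule keeps labels
 * distinct). */
static bool is_ancestor_prime(uint64_t label_na, uint64_t label_nb)
{
    return label_nb != label_na && label_nb % label_na == 0;
}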

Alf P. Steinbach

Apr 24, 2006, 1:03:18 PM

* Mountain:

One thing to think of is the conflict with rebalancing the tree.

There's a trade-off in just about everything.

Arthur J. O'Dwyer

Apr 24, 2006, 1:49:25 PM

[Followups set to c.p.]

On Mon, 24 Apr 2006, Mountain wrote:
>
> I would like opinions on whether this solution could be important and
> whether anyone knows if it is already solved. (I would also like to
> know if anyone thinks it cannot be solved.)
>
> First, we start with a general tree (any of the trees that my data
> structures book calls rooted trees or unordered trees). Then we are
> given two specific nodes (Na and Nb) that belong to that tree.
>
> We would like to know if the second node (Nb) is an ancestor of the
> first node (Na).

Here's my attempt at a summary of what's been said in the various
subthreads, as much for my benefit as yours. :)

The problem doesn't seem terribly useful in general, although it is useful
for hierarchies, such as the example given in
http://www.barneyb.com/blog/archives/000570.jsp

            food
           /    \
        meat    fruit
         |      /   \
       beef  apple  pear

If we can solve the "is-ancestor-of" problem in O(1), then we can answer
questions such as "Is beef food?" and "Is an apple a fruit?" This could
be useful in typechecking object-oriented languages that use a lot of
subtyping and inheritance.

So, how can we implement this? Joe Celko's "nested sets" solution
looks like this:

            1.food.12
           /         \
      2.meat.5     6.fruit.11
         |         /        \
     3.beef.4  7.apple.8  9.pear.10

Each node has two fields "left" and "right", such that
parent.left < child.left and child.right < parent.right. Then checking
that Na is a descendant of Nb simply means checking that

(Nb.left < Na.left && Na.right < Nb.right)

This solution is easy to implement, and deleting a node is easy, but
adding new nodes is naively an O(n) operation, since the "left" and
"right" fields of about half the tree's nodes need to be updated.
However, for the OO-typechecking application, lookups are going to be
much more common than inserts, so we win.
Googling "Joe Celko nested sets" does indeed turn up useful information
on this method.
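
A minimal C sketch of that check, with hypothetical field names:

#include <stdbool.h>

/* Every node carries the left/right numbers from the nested-sets
 * numbering above. */
struct ns_node {
    int left;
    int right;
};

/* true iff na is a proper descendant of nb */
static bool is_descendant_ns(struct ns_node na, struct ns_node nb)
{
    return nb.left < na.left && na.right < nb.right;
}

With the numbering above, beef carries {3,4} and food carries {1,12}, so
"Is beef food?" is just the test 1 < 3 && 4 < 12.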

Another method was proposed by Alf Steinbach in this thread: In each
node, instead of a parent pointer, store the path from the root itself,
represented as a sequence of "left" and "right" directions. For example,
the path stored in the "apple" node above would be "right,left". (Note
that this method only really works for binary trees, or with extensions,
N-ary trees. It doesn't seem applicable to general rooted trees.)
Then, checking whether Na is a descendant of Nb boils down to checking
whether Nb.path is a prefix of Na.path, which can be done in O(lg N) time
naively, and is probably possible in much less time using bitwise
arithmetic. (See below.)
As Alf points out, O(lg N) is basically O(1) on a machine with 32-bit
addressing, since lg N can never exceed 32, and 32 is just a constant.
This means that the real-world
solution of simply storing a parent pointer in each node is basically
constant-time, too --- but the stored-path solution saves time if
dereferencing is expensive.

A third approach, which boils down to Alf's solution, is to notice that
Nb is an ancestor of Na iff the lowest common ancestor of Na and Nb is
Nb itself. Let's number the nodes in the order of an in-order traversal
of a complete binary tree, like this:

             food.100
            /        \
       meat.010     fruit.110
         /          /       \
     beef.001   apple.101  pear.111

Notice that Nx.number can also be described as "the path from the root
to Nx, followed by a 1 and right-padded with zeros." Then the paths of Na
and Nb diverge at the leftmost 1 bit of (Na.number XOR Nb.number), and
we can get their lowest common ancestor by computing

/* suppose Na = apple, Nb = fruit */
xor = Na.number ^ Nb.number; /* xor = 011 */
left = find_leftmost_one(xor); /* left = 010 */
pathmask = ~(2*left-1); /* pathmask = 100 */
path = (Na.number & pathmask) | left; /* path = 110 */

Now, since path == Nb.number, we see that indeed, Nb is the lowest common
ancestor of Na and Nb.
Unfortunately, actually /finding/ the leftmost 1 bit in a word isn't
possible in constant time using only the standard C operators. I think
Java provides a library function to do it quickly; and there is a hack
involving the FPU (convert to 'double', examine the exponent field) that
would work if you /really/ needed speed at the cost of portability.
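
One workaround for find_leftmost_one() on a fixed 32-bit word is to smear the
highest set bit downward with a bounded number of shifts and ORs; the operation
count is fixed only because the word width is. A sketch:

#include <stdint.h>

/* Isolate the highest set bit of x (returns 0 for 0). */
static uint32_t find_leftmost_one(uint32_t x)
{
    x |= x >> 1;            /* smear the top bit into every lower position */
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x - (x >> 1);    /* keep only the top bit */
}

With a helper like that, the four lines above run as written.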

More information on the "lowest common ancestor" algorithm is available
here: http://homepage.usask.ca/~ctl271/810/lowest_common_ancestor.pdf
The find-lowest-one FPU hack is documented here:
http://www.diku.dk/hjemmesider/ansatte/jyrki/Course/Performance-Engineering-1998/

HTH,
-Arthur

Martin Vejnár

Apr 24, 2006, 1:42:17 PM

A better solution might be to tag all nodes with a prime number and with
a product of all its (direct and indirect) parents. You can then check
whether (the supposed parent)'s product divides (the supposed child)'s
product.

However, with 32-bit integers, this can be implemented only for very
small trees unless you use some sort of arbitrary precision library,
which would, however, probably count as an iteration.

--
Martin

Arthur J. O'Dwyer

Apr 24, 2006, 2:11:13 PM

[Followups set to c.p.]

On Mon, 24 Apr 2006, Martin Vejnár wrote:
> wka...@yahoo.com wrote:
>> Alf P. Steinbach wrote:
>> ...
>>> However, for specific cases with limited number of nodes you can do
>>> things like that. For example, for a balanced in-memory binary tree the
>>> path to any node can be represented as a sequence of bits. With 32-bit
>>> memory addresses this sequence will never be 32 bits or larger (the
>>> nodes wouldn't fit in available memory), and so a simple check of common
>>> start sequence, just a few machine code instructions, determines
>>> ancestorship.
>> ...
>>
>> Another limited solution is to label each leaf node with a prime
>> number, and label all other nodes with the product of the labels of its
>> children.
>>
>> The limitations are the bits of precision of the label (in the root
>> node, must hold p(1) * p(2) * ... * p(n), n the number of leaf nodes,
>> p(i) the i'th prime), and that each node must have at least 2 children.
>
> A better solution might be to tag all nodes with a prime number and with a
> product of all its (direct and indirect) parents. You can then check whether
> (the supposed parent)'s product divides (the supposed child)'s product.

The "better solution" is exactly the same as the original solution,
except that you're requiring about twice as many prime numbers since
you tag the internal nodes as well as the leaves. And that cuts down the
size of the tree by a quadratic factor at least.
Observe:

            food.30
           /       \
       meat.2     fruit.15
         |        /      \
      beef.2   apple.3  pear.5

Only the leaf nodes need prime tags. Each internal node gets the product
of its children's tags.

> However, with 32-bit integers, this can be implemented only for very small
> trees

Using the original solution (using only as many primes as leaves), you
can get nine leaf nodes before overflowing the root node's 32-bit product.
Yeah, I'd say that's a very small tree. :) On the other hand, you can
have as many one-child internal nodes as you like, with no overhead ---
something the stored-path methods don't give you. So maybe this method
could find some highly specialized application. I'm not ruling it out.

-Arthur

Willem

Apr 24, 2006, 2:28:02 PM

Arthur wrote:
) The "better solution" is exactly the same as the original solution,
) except that you're requiring about twice as many prime numbers since
) you tag the internal nodes as well as the leaves. And that cuts down the
) size of the tree by a quadratic factor at least.
) Observe:
)
)             food.30
)            /       \
)        meat.2     fruit.15
)          |        /      \
)       beef.2   apple.3  pear.5
)
) Only the leaf nodes need prime tags. Each internal node gets the product
) of its children's tags.
)
)> However, with 32-bit integers, this can be implemented only for very small
)> trees
)
) Using the original solution (using only as many primes as leaves), you
) can get nine leaf nodes before overflowing the root node's 32-bit product.
) Yeah, I'd say that's a very small tree. :) On the other hand, you can
) have as many one-child internal nodes as you like, with no overhead ---
) something the stored-path methods don't give you. So maybe this method
) could find some highly specialized application. I'm not ruling it out.

Why use prime factors, why not just use bit masks ?
Each leaf gets assigned a bit, that way you can have 32 leaves instead of 9.

            food.111
           /        \
      meat.001    fruit.110
         |        /       \
     beef.001  apple.010  pear.100
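
A minimal C sketch of the resulting subset test (same single-child caveat as
the prime labels):

#include <stdbool.h>
#include <stdint.h>

/* Each leaf owns one bit; an internal node's mask is the OR of its
 * children's masks.  Nb is an ancestor of Na iff Na's mask is a strict
 * subset of Nb's mask. */
static bool is_ancestor_mask(uint32_t mask_na, uint32_t mask_nb)
{
    return mask_na != mask_nb && (mask_na & mask_nb) == mask_na;
}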

Martin Vejnár

Apr 24, 2006, 3:48:26 PM

Arthur J. O'Dwyer wrote:
> Martin Vejnár wrote:
>> wka...@yahoo.com wrote:
>>> Another limited solution is to label each leaf node with a prime
>>> number, and label all other nodes with the product of the labels of its
>>> children.
>>>
>>> The limitations are the bits of precision of the label (in the root
>>> node, must hold p(1) * p(2) * ... * p(n), n the number of leaf nodes,
>>> p(i) the i'th prime), and that each node must have at least 2 children.
>>
>> A better solution might be to tag all nodes with a prime number and
>> with a product of all its (direct and indirect) parents. You can then
>> check whether (the supposed parent)'s product divides (the supposed
>> child)'s product.
>
> The "better solution" is exactly the same as the original solution,
> except that you're requiring about twice as many prime numbers since
> you tag the internal nodes as well as the leaves. And that cuts down the
> size of the tree by a quadratic factor at least.

You're right, of course. For some reason, I tried to remove the "each
node must have at least 2 children" limitation, while there was none. My
bad :)

--
Martin

James Lehman

Apr 24, 2006, 5:30:58 PM

If each node maintains a linked list of pointers to its ancestors, then a
comparison of the first few elements of Na's and Nb's lists (up to the length
of the shorter one) would show whether they are the same or different.
Although that requires iteration or recursion.
Another way is to come up with some kind of character string code that more
or less does the same thing. But again, a string of characters requires
iteration at some point to read it or compare it to another. There really
isn't much of anything that a computer can do without at least some
iteration.

James. :o)


"Mountain" <dsh...@yahoo.com> wrote in message
news:1145890612.3...@g10g2000cwb.googlegroups.com...

CTips

Apr 24, 2006, 6:11:10 PM

Use DFS pre and post order numbers. IIRC, Na is an ancestor of Nb iff
pre(Na) < pre(Nb) and post(Na) > post(Nb). It should be in any
data-structures book.
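
A minimal C sketch, assuming the pre- and post-order numbers were stored in
arrays during one preprocessing traversal (array names hypothetical):

#include <stdbool.h>

/* pre[] and post[] are indexed by node id. */
static bool is_ancestor_dfs(const int pre[], const int post[], int na, int nb)
{
    /* Nb is a proper ancestor of Na iff the DFS enters Nb before Na
     * and leaves Nb after Na. */
    return pre[nb] < pre[na] && post[nb] > post[na];
}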

Mountain

Apr 24, 2006, 7:11:13 PM

Arthur J. O'Dwyer wrote:
> [Followups set to c.p.]

>
> Unfortunately, actually /finding/ the leftmost 1 bit in a word isn't
> possible in constant time using only the standard C operators.

What about just taking the base 2 log and casting the result to an int?
That's constant time, right?

3rd...@willets.org

Apr 24, 2006, 7:28:40 PM

See Harel and Tarjan for information on the constant-time lca algorithm.

wka...@yahoo.com

Apr 24, 2006, 9:04:08 PM

Strictly speaking, you can't get rid of this limitation so easily. If the
two nodes that you have to determine ancestry for have the same label,
you'd have to do some iteration (potentially) to determine who is whose
ancestor.

A slightly better solution is to give each leaf a unique mask of one bit,
with the parent mask being or'ed together from its children. Still need
2 children per parent min., but now you can have 32 leaves.

Mountain

Apr 24, 2006, 9:55:19 PM

Arthur J. O'Dwyer wrote:
> [Followups set to c.p.]

>
> Another method was proposed by Alf Steinbach in this thread: In each
> node, instead of a parent pointer, store the path from the root itself,
> represented as a sequence of "left" and "right" directions. For example,
> the path stored in the "apple" node above would be "right,left". (Note
> that this method only really works for binary trees, or with extensions,
> N-ary trees. It doesn't seem applicable to general rooted trees.)

This solution actually is applicable to general rooted trees because
they can be transformed into a binary tree easily in a preprocessing
step (with well-known algorithms).

Transformation into binary trees is the approach I was taking. Even
after transformation into a binary tree, the answer that is found (in
constant time) is applicable to the original tree.

I am of the opinion that the solutions discussed here are very useful
in a lot of areas. The area I had in mind is security authorization. In
a Gatekeeper pattern authorization checks might be done at a lot of
checkpoints between a request and a protected resource, and multiple
hierarchies may need to be searched before authorization can be made.
Being able to perform those hierarchical authorization checks
efficiently (without traversing a tree) means that software can make
better use of authorization and can do it in more places in the code.

On another point, is there any fundamental difference between the
'nested sets' solution and using a pre-order + post-order comparison
(as mentioned by "CTips")? They seem roughly equivalent to me.

Arthur J. O'Dwyer

Apr 25, 2006, 8:27:16 AM

On Mon, 24 Apr 2006, Mountain wrote:
> Arthur J. O'Dwyer wrote:
>> Another method was proposed by Alf Steinbach in this thread: In each
>> node, instead of a parent pointer, store the path from the root itself,
>> represented as a sequence of "left" and "right" directions. For example,
>> the path stored in the "apple" node above would be "right,left". (Note
>> that this method only really works for binary trees, or with extensions,
>> N-ary trees. It doesn't seem applicable to general rooted trees.)
>
> This solution actually is applicable to general rooted trees because
> they can be transformed into a binary tree easily in a preprocessing
> step (with well-known algorithms).

Hmm... The general-to-binary algorithm with which /I'm/ familiar tips
the tree on its side, turning

          food
        /   |   \
    bread  meat  fruit
      |          /    \
     rye     apple    pear

into

      food
      /
   bread--meat--fruit
     |            /
    rye      apple--pear

In this example, using the new tree, "bread" would be an ancestor of
"apple" and "apple" of "pear", neither of which is what we expected.
So this can't be what you mean.
Do you mean just adding dummy nodes to the tree as needed, e.g.

          food
         /    \
     bread     D1
       |      /  \
      rye   meat  fruit
                  /    \
              apple    pear

That obviously works, but I don't think it's fast or space-efficient.
In the worst case, a completely flat tree, you're doubling the size of
the tree by doing that.
(ObPuzzle: By what factor are you expanding the tree in the /average/
case, with this method? By what factor are you increasing the depth of
the tree?)

> I am of the opinion that the solutions discussed here are very useful
> in a lot of areas. The area I had in mind is security authorization. In
> a Gatekeeper pattern authorization checks might be done at a lot of
> checkpoints between a request and a protected resource, and multiple
> hierarchies may need to be searched before authorization can be made.

Yes, I think that's another good application... Well, actually, I'm
not too keen on "security hierarchies" anyway. In practice, aren't there
usually only a few levels? --- even 20 seems excessive to me. And if
you think you need more hierarchical levels than that, it might be a
sign that you should be handing out specific access rights in the first
place --- i.e., instead of

Alice is a Peon who can access her own files.
Bob is a Serf (> Peon) who can copy his own files' access rights.
Larry is a Printer (> Peon) and can access everyone's files.
Sam is a Superuser (> Serf) who can access everyone's files and
copy their access rights.

you would just have explicit lists of capabilities associated with
each user, and not impose any hierarchy. Then you wouldn't need any
efficient check for "is-ancestor-of"; you'd just need an efficient
check for "is-in", as in "Is this capability in Alice's list of
capabilities?" And that's a much more solved problem.


> On another point, is there any fundamental difference between the
> 'nested sets' solution and using a pre-order + post-order comparison
> (as mentioned by "CTips")? They seem roughly equivalent to me.

No difference; they are exactly the same, as far as I can tell.

-Arthur

Googmeister

Apr 25, 2006, 9:47:54 AM

Not sure I fully understand your problem, but it sounds
like it could be a special case of "least common ancestor."
If you're willing to do a linear amount of preprocessing,
you can answer any LCA query in constant time. This
is a famous result of Harel and Tarjan from the 1980s.

Arthur J. O'Dwyer

Apr 25, 2006, 10:57:33 AM

On Tue, 25 Apr 2006, Googmeister wrote:
>
> Not sure I fully understand your problem, but it sounds
> like it could be a special case of "least common ancestor."
> If you're willing to do a linear amount of preprocessing,
> you can answer any LCA query in constant time. This
> is a famous result of Harel and Tarjan from the 1980s.

D. Harel and R. E. Tarjan. Fast algorithms for finding nearest common
ancestors. SIAM J. Comput., 13:338-355, 1984.

This link to their paper works for me.
http://locus.siam.org/fulltext/SICOMP/volume-13/0213024.pdf

They observe that you can do the binary-tree LCA in constant time,
and then give a (really complicated) algorithm for turning general
trees into balanced binary trees in O(n) time. Which is actually
what I was wondering about in my last post in this thread --- I guess
such an algorithm /is/ "well-known" in some circles! :)

Harel and Tarjan also perform an optimization in which they throw
out a lot of intermediate links, pulling a lot of vertices up to become
children of the root, before doing the general-to-binary conversion.
I don't immediately see why, but I haven't read the whole paper yet.

-Arthur

Googmeister

Apr 25, 2006, 11:55:41 AM

Arthur J. O'Dwyer wrote:

Here's a simple linear time algorithm.

http://www.ics.uci.edu/~eppstein/261/BenFar-LCA-00.pdf

Also, Tarjan has an "off-line" version that might be relevant
depending on your application.

http://en.wikipedia.org/wiki/Tarjan's_off-line_least_common_ancestors_algorithm

Mountain

Apr 25, 2006, 12:18:14 PM

Arthur J. O'Dwyer wrote:
> Yes, I think that's another good application... Well, actually, I'm
> not too keen on "security hierarchies" anyway. In practice, aren't there
> usually only a few levels? --- even 20 seems excessive to me. And if
> you think you need more hierarchical levels than that, it might be a
> sign that you should be handing out specific access rights in the first
> place --- i.e., instead of
>
> Alice is a Peon who can access her own files.
> Bob is a Serf (> Peon) who can copy his own files' access rights.
> Larry is a Printer (> Peon) and can access everyone's files.
> Sam is a Superuser (> Serf) who can access everyone's files and
> copy their access rights.
>
> you would just have explicit lists of capabilities associated with
> each user, and not impose any hierarchy. Then you wouldn't need any
> efficient check for "is-ancestor-of"; you'd just need an efficient
> check for "is-in", as in "Is this capability in Alice's list of
> capabilities?" And that's a much more solved problem.
>

Setting aside the requirements issues that might or might not demand
certain depths of hierarchies, your last comment prompts me to question
you. After this discussion, it seems to me that flattening a hierarchy
and then searching a list is /not/ preferable.

Now that we know there are at least 3 decent ways (within the context
of this thread) for finding out if NodeB is an ancestor of NodeA in
constant time, why do you feel that searching a list ("is in") is
preferable? Is there a way to search a list in constant time that I'm
not aware of? For something that /could/ be hierarchical, why flatten
it out (which is a loss of information in my mind) and then search it,
when we have just discussed the fact that we can answer the question
using a tree structure in constant time?

Chris Uppal

Apr 25, 2006, 1:03:46 PM

Mountain wrote:

> I would like opinions on whether this solution could be important and
> whether anyone knows if it is already solved. (I would also like to
> know if anyone thinks it cannot be solved.)

If I understand you correctly then one fairly important application (in
the sense that it affects execution time enough for several papers to
have been written and published) is fast runtime sub-type checking in
object-oriented languages.

Given an object, you have its class immediately, but knowing whether
that class is a subclass of an arbitrary other one is not automatically
O(1).

Google for
fast subtype checking

-- chris

Arthur J. O'Dwyer

Apr 25, 2006, 1:25:17 PM

On Tue, 25 Apr 2006, Mountain wrote:
> Arthur J. O'Dwyer wrote:
>> Yes, I think that's another good application... Well, actually, I'm
>> not too keen on "security hierarchies" anyway. In practice, aren't there
>> usually only a few levels? --- even 20 seems excessive to me. And if
>> you think you need more hierarchical levels than that, it might be a
>> sign that you should be handing out specific access rights in the first
>> place --- i.e., instead of
>>
>> Alice is a Peon who can access her own files.
>> Bob is a Serf (> Peon) who can copy his own files' access rights.
>> Larry is a Printer (> Peon) and can access everyone's files.
>> Sam is a Superuser (> Serf) who can access everyone's files and
>> copy their access rights.
>>
>> you would just have explicit lists of capabilities associated with
>> each user, and not impose any hierarchy. Then you wouldn't need any
>> efficient check for "is-ancestor-of"; you'd just need an efficient
>> check for "is-in", as in "Is this capability in Alice's list of
>> capabilities?" And that's a much more solved problem.
>
> Setting aside the requirements issues that might or might not demand
> certain depths of hierarchies,

Yes. "Because my manager said so" is often an excellent reason
to use a hierarchy; I won't dispute that. (Said without sarcasm. :)

> your last comment prompts me to question
> you. After this discussion, it seems to me that flattening a hierarchy
> and then searching a list is /not/ preferable.
>
> Now that we know there are at least 3 decent ways (within the context
> of this thread) for finding out if NodeB is an ancestor of NodeA in
> constant time, why do you feel that searching a list ("is in") is
> preferable? Is there a way to search a list in constant time that I'm
> not aware of?

Hashtable. The naive implementation of "hashtable" is "array"; for
example, if you want to find out whether Bob has permission number 42,
just test 'permissions[Bob][42]' against zero. Optimizations to deal
with permissions-almost-nobody-has, permissions-almost-everybody-has,
users-with-all-permissions, and users-with-almost-no-permissions are
left as an exercise. :)
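
A minimal C sketch of such a flat, bit-packed capability check (all names
are hypothetical):

#include <stdbool.h>
#include <stdint.h>

/* One bit per (user, permission) pair, packed into 32-bit words. */
enum { NPERM = 128 };
typedef uint32_t capset[(NPERM + 31) / 32];

static bool has_permission(const capset caps, unsigned perm)
{
    return (caps[perm / 32] >> (perm % 32)) & 1u;
}

so "Does Bob have permission number 42?" becomes
has_permission(permissions[bob], 42).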

> For something that /could/ be hierarchical, why flatten
> it out (which is a loss of information in my mind)

Well, you're assuming that the information really is hierarchical
in the first place. One of the disadvantages I hoped would be implied
by my "Peon/Serf/Superuser" example is that every time somebody in
a hierarchical system needs a new privilege, you either have to kick
him up to the next-highest privilege level (thus granting him a whole
lot of privileges he doesn't need, which might be a security hole),
or else you have to introduce a brand-new security level, such as
"Printer" or "Serf-plus-one", and shoehorn it into the tree somewhere.
If you use capability lists from the beginning, you wouldn't have
those design problems.

Of course, if your application really /does/ want a hierarchical
design, then I guess we've seen how to do that. I was just expressing
my skepticism that many applications really do want hierarchies.

-Arthur

Mountain

Apr 25, 2006, 2:18:07 PM

Arthur J. O'Dwyer wrote:
> On Tue, 25 Apr 2006, Mountain wrote:
> > Is there a way to search a list in constant time that I'm
> > not aware of?
>
> Hashtable.

IIRC, the possibility of collisions means that hashtables are not
always searchable in constant time. Certain data (and hash functions,
etc.) could result in a linear search. Is there some general agreement
that this property of hashtables is overlooked when discussing the
general performance characteristics? I guess we are getting off topic,
but since I haven't been posting in this group before, I'd like to
understand the conventions.

Mountain

Apr 25, 2006, 3:52:25 PM

Arthur J. O'Dwyer wrote:
> I was just expressing
> my skepticism that many applications really do want hierarchies.

I agree. Those are valid real-world considerations.

However, when looking (superficially, so far) at permission models in
some operating systems, it seems Microsoft addressed valid real-world
concerns by implementing the "Active Directory" with "Organizational
Units" that support nesting. My assumption is that customers demanded
the ability to work with hierarchies. (That said, I don't claim to know
much about Windows or AD.)

But Microsoft's AD example, together with the requirements for my
current project, indicated to me that hierarchies cannot be easily
dismissed in the context of authorization.

And if a Gatekeeper security model (or some other model) is used where
multiple authorization checks are performed for a single request, an
efficient hierarchical authorization check seems to be a valid
real-world requirement.

I would certainly like to understand this issue in more detail and to
understand the pros and cons of hierarchical role-based authorization.

wka...@yahoo.com

Apr 26, 2006, 12:28:35 AM

Arthur J. O'Dwyer wrote:
> [Followups set to c.p.]
>
> On Mon, 24 Apr 2006, Mountain wrote:
...

> Another method was proposed by Alf Steinbach in this thread: In each
> node, instead of a parent pointer, store the path from the root itself,
> represented as a sequence of "left" and "right" directions. For example,
> the path stored in the "apple" node above would be "right,left". (Note
> that this method only really works for binary trees, or with extensions,
> N-ary trees. It doesn't seem applicable to general rooted trees.)

Or more generally, if the number of children of a node is always
less than or equal to M, let the label N of a node be M * Np + Nr,
where Np is the label of the parent (the root is 1) and Nr is the
0-based number of the node among the children of its parent. Then
Na is an ancestor of Nb iff Nb / P = Na, where P is the largest
power of M not exceeding Nb / Na. Finding P requires a binary search
of a table of the powers of M, so it's O(log Depth). But if M is
relatively small, and the tree is not static, this may be the better
solution.
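
A minimal C sketch of that check, with a precomputed power table (names
hypothetical; a linear scan stands in for the binary search):

#include <stdbool.h>
#include <stdint.h>

/* Root has label 1; a node with label L gives its children labels
 * M*L+0 .. M*L+(M-1).  Nb is a proper ancestor of Na iff stripping the
 * right number of base-M digits from Na's label yields Nb's label.
 * pow_m[i] holds M^i, with pow_m[0] = 1. */
static bool is_ancestor_mary(uint64_t na, uint64_t nb,
                             const uint64_t pow_m[], int npow)
{
    if (nb >= na)
        return false;               /* ancestors always have smaller labels */
    for (int i = 1; i < npow; i++)
        if (na / pow_m[i] == nb)    /* floor(na / M^i) is the i-th ancestor */
            return true;
    return false;
}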

Mountain

Apr 28, 2006, 3:45:50 PM

3rd...@willets.org wrote:
> See Harel and Tarjan for information on the constant-time lca algorithm.

Do you happen to have a full reference? I'm getting this:

Your search for +author:Harel, +author:Tarjan did not return any
results (on ACM).

Also, do you happen to know if their work applies to DAGs? Thanks

Mountain

Apr 28, 2006, 3:58:52 PM

I found the answer to the first part of my question. I still need to know
if it applies to DAGs (and I need to obtain the paper...).

Googmeister

Apr 28, 2006, 5:04:32 PM

No.

Note that reachability (transitive closure) on DAGs is no
easier (in theory) than transitive closure on general
digraphs since you can reduce the latter to the former
by contracting the strong components. Thus, being
able to answer reachability queries on DAGs in O(1)
time with only linear preprocessing would require a
significant algorithmic breakthrough.
