what's the tie-breaker lemma

Jim Newton

unread,

Jan 30, 2019, 9:56:00 AM1/30/19

to

I have three questions about topological sorting with respect to computing the
class precedence list in CLOS.

In AMOP and in the implementation of closette.lisp, there is a comment on the function
std-tie-breaker-rule, which says:
(There's a lemma that shows that this rule yields a unique result.)

Q1) Which lemma is this? Does anyone know?

It is not obvious to me why the tie breaker restriction in the CL spec, guarantees a
unique result.

For example, it says when topologically sorting the classes, you must always choose the
"minimum" element when there is only one, but when there are several minimum element you
choose the one whose which has a subclass farthest the the right in the list computed
so far.

Q2) What happens if there are two such classes which both have this right-most
element as subclass. Or why can't this ever happen?

The comment in closette.lisp claims this is "THE" standard algorithm for
topological sort. However, the standard algorithm taught to our students is
a DFS algorithm which numbers the elements as the recursion backtracks.
This is also the algorithm explained in "Introduction to Algorithms" by Thomas Cormen.

Q3) are these two algorithms equivalent? Can the DFS algorithm also use a tie-breaker
function such as the one described in the CL specification? If so, what is the "list so far"?

Barry Margolin

unread,

Jan 30, 2019, 12:40:48 PM1/30/19

to

In article <c87de0e5-44f2-4b5d...@googlegroups.com>,

Jim Newton <jimka...@gmail.com> wrote:

> I have three questions about topological sorting with respect to computing
> the
> class precedence list in CLOS.
>
> In AMOP and in the implementation of closette.lisp, there is a comment on the
> function
> std-tie-breaker-rule, which says:
> (There's a lemma that shows that this rule yields a unique result.)
>
> Q1) Which lemma is this? Does anyone know?
>
> It is not obvious to me why the tie breaker restriction in the CL spec,
> guarantees a
> unique result.
>
> For example, it says when topologically sorting the classes, you must always
> choose the
> "minimum" element when there is only one, but when there are several minimum
> element you
> choose the one whose which has a subclass farthest the the right in the list
> computed
> so far.
>
> Q2) What happens if there are two such classes which both have this
> right-most
> element as subclass. Or why can't this ever happen?

I'm just guessing, but maybe this can only happen if the class hierarchy
is inconsistent, e.g. it has a loop, which isn't allowed.

Have you tried to construct a set of classes where you run into this
ambiguity?

>
> The comment in closette.lisp claims this is "THE" standard algorithm for
> topological sort. However, the standard algorithm taught to our students is
> a DFS algorithm which numbers the elements as the recursion backtracks.
> This is also the algorithm explained in "Introduction to Algorithms" by
> Thomas Cormen.

And Wikipedia also describes DFS.

>
> Q3) are these two algorithms equivalent? Can the DFS algorithm also use a
> tie-breaker
> function such as the one described in the CL specification? If so, what is
> the "list so far"?

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

Kaz Kylheku

unread,

Jan 30, 2019, 3:20:43 PM1/30/19

to

On 2019-01-30, Jim Newton <jimka...@gmail.com> wrote:
> The comment in closette.lisp claims this is "THE" standard algorithm
> for topological sort. However, the standard algorithm taught to our
> students is a DFS algorithm which numbers the elements as the
> recursion backtracks. This is also the algorithm explained in
> "Introduction to Algorithms" by Thomas Cormen.

Let me just throw something into the pot here. The header comment
in GNU tsort says:

/* The topological sort is done according to Algorithm T (Topological
sort) in Donald E. Knuth, The Art of Computer Programming, Volume
1/Fundamental Algorithms, page 262. */

http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tsort.c

Maybe this is different from the Cormen one.

Spiros Bousbouras

unread,

Jan 30, 2019, 7:27:42 PM1/30/19

to

On Wed, 30 Jan 2019 06:55:56 -0800 (PST)
Jim Newton <jimka...@gmail.com> wrote:
> I have three questions about topological sorting with respect to computing the
> class precedence list in CLOS.
>
> In AMOP and in the implementation of closette.lisp, there is a comment on the function
> std-tie-breaker-rule, which says:
> (There's a lemma that shows that this rule yields a unique result.)
>
> Q1) Which lemma is this? Does anyone know?

Not me. But the reasoning is too simple to count as a separate lemma unless
they have in mind something more general.

> It is not obvious to me why the tie breaker restriction in the CL spec, guarantees a
> unique result.
>
> For example, it says when topologically sorting the classes, you must always choose the
> "minimum" element when there is only one, but when there are several minimum element you
> choose the one whose which has a subclass farthest the the right in the list computed
> so far.

"Minimal" would be a better term. A partially ordered set can have at most 1
minimum element but can have several minimal ones.

> Q2) What happens if there are two such classes which both have this right-most
> element as subclass. Or why can't this ever happen?

"4.3.5.1 Topological Sorting" says "direct subclass". To show that there
can't be 2 or more , we'll conclude a contradiction. So assume that in the
course of the algorithm , the remaining set S has distinct classes T1 and T2
such that they have as direct subclass a class T which exists in the
precedence list produced so far and that , according to the remaining set R
[which is originally defined in 4.3.5] , T1 and T2 do not have a predecessor
i.e. there is no class T3 such that { <T3 , T1> is in R or <T3 , T2> is in
R }. Let C_1 , ... , C_n be the direct superclasses of T in the order they
were given in the DEFCLASS form for T. Then there are integers i and j such
that C_i = T1 and C_j = T2 .Without loss of generality we can assume that i
< j .By the way the set R is defined , we know that at the start of the
algorithm it contained the pairs <T , C_1> , ... , <C_{n-1} , C_n> . Also by
the way the set S_T is defined , we know that at the start of the algorithm
it contained all superclasses of T so in particular C_1 , ... , C_n .Since at
the present state of execution , C_j does not have a predecessor in R , at
some point in the execution <C_{j-1} , C_j> was removed from R and by the
definition of the algorithm this could only have happened if C_{j-1} was
removed from S. From this follows that even earlier in the algorithm C_{j-1}
did not have a predecessor in R so C_{j-2} was removed from S even earlier.
Proceeding inductively it follows that C_i = T1 was removed from S earlier
in the algorithm which is a contradiction.

> The comment in closette.lisp claims this is "THE" standard algorithm for
> topological sort. However, the standard algorithm taught to our students is
> a DFS algorithm which numbers the elements as the recursion backtracks.
> This is also the algorithm explained in "Introduction to Algorithms" by Thomas Cormen.
>
> Q3) are these two algorithms equivalent? Can the DFS algorithm also use a tie-breaker
> function such as the one described in the CL specification? If so, what is the "list so far"?

What "the standard algorithm" means is anyone's guess but in general there is
more than one way to extend a partial ordering to a total one. So for example
the ordering

A
/ \
/ \
B C
\ /
\ /
D

has exactly 2 extensions : D < B < C < A and D < C < B < A .So there can't
be a unique algorithm. Whether some algorithm "stands out" enough to be
called "the standard algorithm" , I don't know.

I note also that it would make it easier to reason about the way these things
work in CL if it specified in the definition of DEFCLASS that "If in the list
of superclasses of the class being defined there exist superclasses C1 and C2
such that C1 = C2 or C1 is a superclass of C2 , an error shall be signalled"
and I don't think we would lose expressive power.

--
The report on "Death Star security failures" is finally out :
<c84729c3-22a1-46c0...@googlegroups.com>

Kaz Kylheku

unread,

Jan 30, 2019, 7:56:48 PM1/30/19

to

On 2019-01-30, Jim Newton <jimka...@gmail.com> wrote:

> I have three questions about topological sorting with respect to computing the
> class precedence list in CLOS.
>
> In AMOP and in the implementation of closette.lisp, there is a comment on the function
> std-tie-breaker-rule, which says:
> (There's a lemma that shows that this rule yields a unique result.)
>
> Q1) Which lemma is this? Does anyone know?

I think the lemma is basically left beats right. (Mnemonic: LE-mma.)

1. Firstly, if you have a:

(defclass x (a b c) ())

then X is is slightly-more-of-a-new-kind-of A than it is of B, than
it is of C.

So in some situation when an X is could be interpreted as A or B
(like two competing methods), A will get it.

Demo:

[1]> (defclass a () ())
#<STANDARD-CLASS A>
[2]> (defclass b () ())
#<STANDARD-CLASS B>
[3]> (defclass x (a b) ())
#<STANDARD-CLASS X>
[4]> (defmethod meth ((arg a)) 'a)
#<STANDARD-METHOD (#<STANDARD-CLASS A>)>
[5]> (defmethod meth ((arg b)) 'b)
#<STANDARD-METHOD (#<STANDARD-CLASS B>)>
[6]> (meth (make-instance 'x))
A

2. Secondly, there is another lemma related to dispatch: if you have an
ambiguity between two generic function arguments, the left one wins.
This doesn't speak to the class precedence any more.

[7]> (defmethod twoarg ((arg1 a) (arg2 b)) 'ab)
#<STANDARD-METHOD (#<STANDARD-CLASS A> #<STANDARD-CLASS B>)>
[8]> (defmethod twoarg ((arg1 b) (arg2 a)) 'ab)
#<STANDARD-METHOD (#<STANDARD-CLASS B> #<STANDARD-CLASS A>)>
[9]> (twoarg (make-instance 'x) (make-instance 'x))
AB

Above, the left argument of the generic function call is a better match
for the first method (in this case, because an X is more of an A
than it is B, but that subtlety is not relevant here).

However, the right argument of the call is a better match
for the second method, for the same reason.

If all argument positions were equal, the call would have to be
considered ambiguous and unresolveable.

It is resolved by the principle that left arguments are favored
over right ones.

I have a question of my own:

Why does Common Lisp have these rules for silently resolving ambiguities in
the object hierarchy ... but when symbols from two used packages
have the same names, it blows up in your face with an error?

Why can't it be that if we use package A and then use B, then B
simply shadows A?

If anything, this is less error prone than the implicit rules in
the object system, which can hide subtle bugs.

C++ has a kind of "static multiple dispatch" in its resolution of
overloaded functions. In C++, the algorithm is that first the overload
candidates are identified (the function overloads that match the call at
all). Then there must be a winner among all the candidates: a function
which, for each parameter, is at least as good a match as the other
functions are, and for at least one parameter, provides a strictly
better match than all the other candidates.

Spiros Bousbouras

unread,

Jan 30, 2019, 8:35:49 PM1/30/19

to

On Thu, 31 Jan 2019 00:56:42 +0000 (UTC)
Kaz Kylheku <157-07...@kylheku.com> wrote:

[...]

> [1]> (defclass a () ())
> #<STANDARD-CLASS A>
> [2]> (defclass b () ())
> #<STANDARD-CLASS B>
> [3]> (defclass x (a b) ())
> #<STANDARD-CLASS X>
> [4]> (defmethod meth ((arg a)) 'a)
> #<STANDARD-METHOD (#<STANDARD-CLASS A>)>
> [5]> (defmethod meth ((arg b)) 'b)
> #<STANDARD-METHOD (#<STANDARD-CLASS B>)>
> [6]> (meth (make-instance 'x))
> A
>
> 2. Secondly, there is another lemma related to dispatch: if you have an
> ambiguity between two generic function arguments, the left one wins.
> This doesn't speak to the class precedence any more.
>
> [7]> (defmethod twoarg ((arg1 a) (arg2 b)) 'ab)
> #<STANDARD-METHOD (#<STANDARD-CLASS A> #<STANDARD-CLASS B>)>
> [8]> (defmethod twoarg ((arg1 b) (arg2 a)) 'ab)

I think you mean 'ba .

> #<STANDARD-METHOD (#<STANDARD-CLASS B> #<STANDARD-CLASS A>)>
> [9]> (twoarg (make-instance 'x) (make-instance 'x))
> AB
>
> Above, the left argument of the generic function call is a better match
> for the first method (in this case, because an X is more of an A
> than it is B, but that subtlety is not relevant here).
>
> However, the right argument of the call is a better match
> for the second method, for the same reason.
>
> If all argument positions were equal, the call would have to be
> considered ambiguous and unresolveable.
>
> It is resolved by the principle that left arguments are favored
> over right ones.
>
> I have a question of my own:
>
> Why does Common Lisp have these rules for silently resolving ambiguities in
> the object hierarchy ... but when symbols from two used packages
> have the same names, it blows up in your face with an error?

I imagine because experience has shown that multiple inheritance is useful
and that it would lose some of its usefulness if the class hierarchy was not
allowed to have "diamonds".

> Why can't it be that if we use package A and then use B, then B
> simply shadows A?

What would be the use of that ?

> If anything, this is less error prone than the implicit rules in
> the object system, which can hide subtle bugs.

If one designs a class hierarchy , he must have a use for it and if he has
a use for it then he can test it and catch those bugs.

Jim Newton

unread,

Jan 31, 2019, 3:49:05 AM1/31/19

to

> Let me just throw something into the pot here. The header comment
> in GNU tsort says:

> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/tsort.c
>

Wow! that's a lot of code.

Jim Newton

unread,

Jan 31, 2019, 3:53:11 AM1/31/19

to

On Thursday, January 31, 2019 at 1:27:42 AM UTC+1, Spiros Bousbouras wrote:
> "4.3.5.1 Topological Sorting" says "direct subclass". To show that there
> can't be 2 or more , we'll conclude a contradiction. So assume that in the
> course of the algorithm , the remaining set S has distinct classes T1 and T2
> such that they have as direct subclass a class T which exists in the
> precedence list produced so far and that , according to the remaining set R
> [which is originally defined in 4.3.5] , T1 and T2 do not have a predecessor
> i.e. there is no class T3 such that { <T3 , T1> is in R or <T3 , T2> is in
> R }. Let C_1 , ... , C_n be the direct superclasses of T in the order they
> were given in the DEFCLASS form for T. Then there are integers i and j such
> that C_i = T1 and C_j = T2 .Without loss of generality we can assume that i
> < j .By the way the set R is defined , we know that at the start of the
> algorithm it contained the pairs <T , C_1> , ... , <C_{n-1} , C_n> . Also by
> the way the set S_T is defined , we know that at the start of the algorithm
> it contained all superclasses of T so in particular C_1 , ... , C_n .Since at
> the present state of execution , C_j does not have a predecessor in R , at
> some point in the execution <C_{j-1} , C_j> was removed from R and by the
> definition of the algorithm this could only have happened if C_{j-1} was
> removed from S. From this follows that even earlier in the algorithm C_{j-1}
> did not have a predecessor in R so C_{j-2} was removed from S even earlier.
> Proceeding inductively it follows that C_i = T1 was removed from S earlier
> in the algorithm which is a contradiction.
>

Thanks for the sketch of the proof. I wish comp.lang.lisp had a more expressive
typesetting concept which supports math notation. Anyway, I'll take a look at
your proof, and try to convince myself. Thanks again.

Kaz Kylheku

unread,

Jan 31, 2019, 12:04:29 PM1/31/19

to

Wut; it's just 550 lines of non-obfuscated C for the entire utility,
main() and all; it includes its own implementaton AVL tree insertion
and searching.

Spiros Bousbouras

unread,

Jan 31, 2019, 1:30:34 PM1/31/19

to

On Thu, 31 Jan 2019 00:53:07 -0800 (PST)
Jim Newton <jimka...@gmail.com> wrote:
> I wish comp.lang.lisp had a more expressive typesetting concept which
> supports math notation.

Actually usenet posts are just an envelope and one can put in one anything
one wants including attachments [although how many newsreaders can handle
attachments is a different story]. So one could have a piece of mathematics
as a PDF attachment and post it on comp.lang.lisp .Or one could have LaTeX
source , perhaps with an appropriate header [usenet standards allow one to
create one's own header fields beyond the standard ones] and a sufficiently
"clued up" newsreader could extract the source , invoke LaTeX to compile it
and present the result to the user. So basically usenet standards already
support math notation , it's the software which hasn't kept up [as far as I
know , I haven't really searched]. Anyway , my earlier post does use LaTeX
notation although it doesn't have everything required to be compiled as LaTeX
source. So for example "C_{j-1}" means "C" with subindex "j-1" .

--
C++ is sometimes called the language that will eventually turn even the
simplest of things into complex, expert-only beasts.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1300r0.pdf

paul wallich

unread,

Jan 31, 2019, 2:19:27 PM1/31/19

to

On 1/30/19 7:56 PM, Kaz Kylheku wrote:
[...]

> I have a question of my own:
>
> Why does Common Lisp have these rules for silently resolving ambiguities in
> the object hierarchy ... but when symbols from two used packages
> have the same names, it blows up in your face with an error?
>
> Why can't it be that if we use package A and then use B, then B
> simply shadows A?
>
> If anything, this is less error prone than the implicit rules in
> the object system, which can hide subtle bugs.

I'm not sure it is less error prone, unless you restrict yourself to
only one level of using. Once you start loading libraries that depend on
other libraries things can get opaque. Unless you're specifically
looking for conflicts, you could do a refactor that changes the order a
few levels in, and suddenly the shadowing would be reversed. Or a new
version of a library might use something it formerly didn't (or not use
something it formerly did) and suddenly your shadowing would change.

Sure, you could just read all the sources and keep track of everything
all of your dependencies export (and arrange your code to get the right
shadowing effects) but that seems like a recipe for subtle errors.

paul

Kaz Kylheku

unread,

Jan 31, 2019, 5:37:06 PM1/31/19

to

On 2019-01-31, paul wallich <p...@panix.com> wrote:
> On 1/30/19 7:56 PM, Kaz Kylheku wrote:
> [...]
>> I have a question of my own:
>>
>> Why does Common Lisp have these rules for silently resolving ambiguities in
>> the object hierarchy ... but when symbols from two used packages
>> have the same names, it blows up in your face with an error?
>>
>> Why can't it be that if we use package A and then use B, then B
>> simply shadows A?
>>
>> If anything, this is less error prone than the implicit rules in
>> the object system, which can hide subtle bugs.
>
> I'm not sure it is less error prone, unless you restrict yourself to
> only one level of using. Once you start loading libraries that depend on
> other libraries things can get opaque.

Not really; package use is not inherited!

If A uses B, and B uses C, then inside package B, we can use just FOO
for C:FOO. But inside A, FOO doesn't resolve to C:FOO, and neither does
B:FOO.

Quick demo with CLISP.

We are relying on the knowledge that CL-USER uses CL; when we are in
CL-USER, we can obviously use LIST.

But if we switch to package EXT, we can see that CL-USER:LIST ignores
the CL-USER package's use-list; it fails to look up:

[1]> (in-package :ext)
#<PACKAGE EXT>
EXT[2]> (cl-user:list 1 2 3)

*** - READ from
#<INPUT CONCATENATED-STREAM #<INPUT STRING-INPUT-STREAM>
#<IO TERMINAL-STREAM>>
: #<PACKAGE COMMON-LISP-USER> has no external symbol with name "LIST"
The following restarts are available:
ABORT :R1 Abort main loop

So really, package use is quite nicely localized; a conflict resolution
strategy based on order of use seems fairly inocuous.

The risk is that if symbols from "use A" are hidden by "use B", and package A
is upgraded so that it exports new symbols, those new symbols (which the
programmer wants to use) may be already hidden by existing B symbols.

Basically, there *is* a conflict there, even though undiagnosed, and it has to
be resolved somehow. Even after being resolved, it potentially re-appears in a
new manifestation when the packages are updated to new versions, yet
without a diagnostic to alert to the problem.

The same kind of could happen with a multiple inheritance object frame work and
ambiguous method calls though.

You fetch the new framework where some new inheritances and methods are added,
and silently, your calls are doing things you don't intend.

Your FOO, which you thought was a BAR, is now considered slightly more of a
XYZZY, and so your METHOD isn't called, oops!

Rob Warnock

unread,

Feb 1, 2019, 2:01:15 AM2/1/19

to

Kaz Kylheku <157-07...@kylheku.com> wrote:
+---------------

| Not really; package use is not inherited!
|
| If A uses B, and B uses C, then inside package B, we can use
| just FOO for C:FOO. But inside A, FOO doesn't resolve to C:FOO,
| and neither does B:FOO.

+---------------

Quite correct. Though note that B::FOO *does* resolve to C:FOO
[not that one should rely on internal symbols of another package ;-} ].

Which is why you sometimes see this pattern in B:

(use-package "C")
(export 'foo 'bar ...)

[Or, more commonly, with the equivalent :USE & :EXPORT sub-directives
in the DEFPACKAGE for B.]

After that, B:FOO *does* resolve to C:FOO.

Ugly, yes, but sometimes done in "utility" packages which provide
a thin wrapper around one or more other packages.

-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <http://rpw3.org/>
San Mateo, CA 94403

Barry Margolin

unread,

Feb 1, 2019, 11:20:47 AM2/1/19

to

In article <201901301...@kylheku.com>,

Kaz Kylheku <157-07...@kylheku.com> wrote:

> Why does Common Lisp have these rules for silently resolving ambiguities in
> the object hierarchy ... but when symbols from two used packages
> have the same names, it blows up in your face with an error?

CLOS has method combination and CALL-NEXT-METHOD, which allows
same-named methods inherited from multiple superclasses to be merged in
various ways. There's nothing analogous when packages have conflicts.