Serious problem with underive (for hierarchies), an attempt to fix, and request for code review.

17 views
Skip to first unread message

Rob Lachlan

unread,
Jun 15, 2010, 7:24:47 PM6/15/10
to Clojure
I think that the underive function for removing hierarchy
relationships between keywords is broken. I'll illustrate with an
example and describe what I think the problems are. I've got some
code for a function which (I hope!) performs underive correctly, and
I'd love it if people had a look.

Consider a hierarchy with child-parent relationships:

:c -> :p1, :c -> :p2, :p1 -> :a1, :p1 -> :a2, :p2 -> :a2

Which I'll illustrate with the world's worst ascii art:

a1 a2
| / |
| / |
| / |
p1 p2
| /
| /
| /
c

; creating this hierarchy:
user> (def h (reduce #(apply derive (cons %1 %2)) (make-hierarchy)
[[:p1 :a1] [:p1 :a2] [:p2 :a2] [:c :p2] [:c :p1]]))
#'user/h
user> h
{:parents {:c #{:p1 :p2}, :p2 #{:a2}, :p1 #{:a2 :a1}}, :ancestors {:c
#{:p1 :p2 :a2 :a1}, :p2 #{:a2}, :p1 #{:a2 :a1}}, :descendants {:p1
#{:c}, :p2 #{:c}, :a2 #{:p1 :p2 :c}, :a1 #{:p1 :c}}}



Now the underive:

user> (underive h :c :p1)
{:parent {:c #{:p2}, :p2 #{:a2}, :p1 #{:a2 :a1}}, :ancestors {:c
#{:p2}, :p2 #{:a2}, :p1 #{:a2 :a1}}, :descendants {:p1 #{}, :p2
#{:c}, :a2 #{:p1 :p2}, :a1 #{:p1}}}


Problems:

Most seriously, it is incorrect: :c should still show up as a
descendant of
:a2, since :c is a child of :p2, and :p2 a child of :a2. Note that
the
parent map is correct, so it is the ancestor and descendant maps which
are
inconsistent.

Also, notice that the key for the parent map has changed from :parents
to
:parent, which causes calls to the parents function to return nil.

Also, keys which map to empty sets are left in each of the maps
(parents,
descendants and ancestors), with the result that equality tests could
fail
in some cases in which they would be true.


Potential fix, and request for code review:

I've written an underive-new function which appears to fix these
problems,
at least in the tests that I've done.

http://pastebin.com/eRiz5ihw

My approach is to identify the sets of tags in the hierarchy whose
ancestors
or descendants may be affected, remove their ancestor or descendant
mappings,
then rebuild them. In a call (underive h child parent), the child and
all
the child's descendants will need to have their ancestor mappings
fixed.
Similarly, the parent and all of the parent's ancestors will need to
have
their descendant mappings rebuilt.

(As an aside, the problem here is that our hierarchy can be a DAG
rather
a tree, because multiple inheritance is allowed. And therefore, for
any
ancestor/descendant relationship it's comparatively expensive to
determine
whether a particular parent-child edge is essential or not.)

Next, the two sets of interest (child + child's descendants, and
parent +
parent's ancestors) are sorted topologically. For the child +
child's
descendants, we can now rebuild the ancestor mappings by taking a tag
from
the top of the set, setting its ancestors to the union of tag's
parents and
parents' ancestors, and repeating the process on the next tag in
topologically
sorted order until all tags are consumed. The parent + parent's
ancestors
can be done in a similar way, except that a children function is
required, and
the tags must be topologically sorted in the reverse direction.


Some remarks:

An alternative approach would be to just recompute the whole ancestor
and
descendant maps for the whole hierarchy. This would be simpler, but
for
large hierarchies would be unnecessarily expensive.

The topological sort function which I wrote avoids recursion. If we
wanted
to allow multiple recursion, it could be simplified a bit, but I
wanted to
avoid stack consumption.

Anyway, I'd appreciate any suggestions and criticisms, since this is
the
first clojure code of mine that anyone has looked at.

Rob Lachlan

unread,
Jun 15, 2010, 7:29:42 PM6/15/10
to Clojure
Oh God. What broken formatting. Sorry about that.

Rob Lachlan

unread,
Jun 15, 2010, 8:06:29 PM6/15/10
to Clojure
As an alternative, if we're willing to rebuild the entire hierarchy,
not just the subgraph that's affected by the underivation,
we can just do this:

(defn easy-underive [h child parent]
(let [oldParents (:parents h)
childsParents (disj (clojure.set/union #{} (child oldParents))
parent)
newParents (if (not-empty childsParents)
(assoc oldParents child childsParents)
(dissoc oldParents child))
derivation-seq (flatten (map #(cons (key %) (interpose (key %) (val
%)))
(seq newParents)))]
(reduce #(apply derive (cons %1 %2)) (make-hierarchy)
(partition 2 derivation-seq))))

It's less efficient, because it tosses out all of the information in
the ancestor and descendant sets. But it is a good deal simpler -- 10
lines of code instead of 60 or so. For small hierarchies this would
be fine. If anyone were to make large
hierarchies which had to be modified efficiently, though, I think
something like in the first message would be required.

Stuart Halloway

unread,
Jun 16, 2010, 10:54:37 AM6/16/10
to clo...@googlegroups.com
Hi Rob,

Thanks for tracking this down. If you will execute a CA [1], I would love to get a patch (with tests) that fixes this. I have created a ticket at [2] to track it.

I would prefer something along the lines of the simpler fix shown below, unless anybody pops up on this thread with a real-world use case for performance-critical underive.

Stu

[1] http://clojure.org/contributing
[2] https://www.assembla.com/spaces/clojure/tickets/382-fix-underive-for-multiple-inheritance

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

YD

unread,
Jun 16, 2010, 1:15:28 AM6/16/10
to Clojure
The problems are real. Well done!

YD

unread,
Jun 16, 2010, 1:39:06 AM6/16/10
to Clojure
Another apporach I think would be modifying the data structure of
hierarchy itself. The idea is to add a counter to ancestors and
descendants.
:ancestors: { :c #{ [:a1 1] [:a2 2] } }
So, the counter 2 on :a2 means how many paths can you get :c to
reach :a2. When the counter reaches 0, you can remove it completely.
Otherwise, decrement it.

This approach requires to change all the functions related to
hierarchy, and the data structure will be incompatible to the original
one. But the speed is gonna be fast.

Rob Lachlan

unread,
Jun 16, 2010, 6:10:01 PM6/16/10
to Clojure
Hi Stuart,

I'll be mailing the agreement later today, and I'll work on a patch
shortly. I've noticed that the unit tests file for multi-methods,
where tests for derive and underive would be located, is essentially
empty. So I thought I might put some tests in there for isa, parents,
ancestors, etc. while I'm about it.

I take your point on preferring a simpler version. I have a feeling
that:

-most people are using the global hierarchy
-most people are using the hierarchy essentially statically

which is a bit of a shame, since the hierarchy system is quite
powerful. With independent hierarchies, you can confidently juggle
several hierarchy relationships, and dynamically update them. I think
their uses go beyond multi-methods.

But for hierarchies which are largely static, and which will rarely
underive, a simpler underive in core will be good. I'll probably put
my complicated version together with some other convenience functions
for hierarchies in a library of my own, in case anyone has a need for
it.

Rob

p.s. Loved the book.


On Jun 16, 7:54 am, Stuart Halloway <stuart.hallo...@gmail.com> wrote:
> Hi Rob,
>
> Thanks for tracking this down. If you will execute a CA [1], I would love to get a patch (with tests) that fixes this. I have created a ticket at [2] to track it.
>
> I would prefer something along the lines of the simpler fix shown below, unless anybody pops up on this thread with a real-world use case for performance-critical underive.
>
> Stu
>
> [1]http://clojure.org/contributing
> [2]https://www.assembla.com/spaces/clojure/tickets/382-fix-underive-for-...

Rob Lachlan

unread,
Jun 16, 2010, 6:30:20 PM6/16/10
to Clojure
I like the reference counting idea, YD. I don't think that we want to
go that route, though, if underive will be called comparatively rarely
by most people. But it would be a good way to do it if that
performance were needed.

Rob
Reply all
Reply to author
Forward
0 new messages