Otherwise, I suppose nothing stops you from just creating trees via a
Multimap<Node, Node>, or a Map<Node, Pair<Node, Node>>, or whatever.
Dimitris
2010/1/14 JP <james.aus...@gmail.com>:
> --
> guava-...@googlegroups.com.
> http://groups.google.com/group/guava-discuss?hl=en
> unsubscribe: guava-discus...@googlegroups.com
>
> This list is for discussion; for help, post to Stack Overflow instead:
> http://stackoverflow.com/questions/ask
> Use the tag "guava".
>
>
Dimitris
2010/1/14 Tim Harsch <harsc...@gmail.com>:
Graph<T> extends Multimap<T, T>
and then implement all the usual graph and tree operations on it
(finding cycles, minimum path between nodes, and so forth).
--
Craig Berry
Software Engineer
Google Santa Monica - Second Street
Internal blog: http://big.corp.google.com/~cberry/musings/
By the way, since we obviously are in nitpicking mode, if someone does
prevent a source (the root) from being a destination, this condition
is still not sufficient to guarantee that a multimap represents a
tree. Otherwise, if you wrote "a source" in the sense "any possible
source"/"all sources", then given that condition, such a multimap
cannot model trees with depth>2, not that useful. :-)
2010/1/14 Tim Harsch <harsc...@gmail.com>:
On Thu, Jan 14, 2010 at 3:20 PM, Tim Harsch <harsc...@gmail.com> wrote:
> Actually I didn't imply "every multimap is a tree". I implied that a graph
> was possible from a Multimap if you didn't disallow it from being one, and
> therefore pointed out what lead me to think of graphs. I claimed you
> 'mentioned graphs' only as a joke, thus the little winky guy. :-)
--
Craig Berry
Software Engineer
Google (Santa Monica CA)
On Jan 14, 5:02 pm, Craig Berry <cbe...@google.com> wrote:
> Interesting. It would be cool to create a subclass
>
> Graph<T> extends Multimap<T, T>
>
> and then implement all the usual graph and tree operations on it
> (finding cycles, minimum path between nodes, and so forth).
>
>
>
> On Thu, Jan 14, 2010 at 1:41 PM, Tim Harsch <harschw...@gmail.com> wrote:
> > Actually you did ;-) If you don't prevent a source from being a destination
> > in a Multimap<Node, Node> you have a graph.
>
> > On Thu, Jan 14, 2010 at 10:48 AM, Dimitris Andreou <jim.andr...@gmail.com>
> > wrote:
>
> >> If you make a concrete proposal on what that "stab at trees" could
> >> look like, we could discuss upon that. (Btw, I didn't mention graphs).
>
> >> Dimitris
>
> >> 2010/1/14 Tim Harsch <harschw...@gmail.com>:
> >> > I disagree, I've always wondered why the collections framework doesn't
> >> > take
> >> > a stab at trees. Trees probably have far fewer types, and variations
> >> > than
> >> > graphs. Graphs probably don't belong in a collection framework, because
> >> > that framework could get quite large. It may be possible to make Trees
> >> > fairly generic and cover the scope of it fairly completely. I would
> >> > think
> >> > it would require some careful design though...
>
> >> > Anyway, I think it is a fundamental data structure that lacks attention.
> >> > I
> >> > recall making a search sometime back and the best I could find was
> >> > something
> >> > in the Swing framework, JTree maybe(?), anyway for obvious reasons I
> >> > didn't
> >> > like the idea of using it as a generic collection ...
>
> >> > On Thu, Jan 14, 2010 at 9:21 AM, Dimitris Andreou
> >> > <jim.andr...@gmail.com>
> >> > wrote:
>
> >> >> I think it's too general a domain. It's like asking for "a framework
> >> >> for developing any kind of linked node". If you need something
> >> >> specialized, I don't believe any framework will offer real value (and
> >> >> performance) over just directly creating the structure you need.
>
> >> >> Otherwise, I suppose nothing stops you from just creating trees via a
> >> >> Multimap<Node, Node>, or a Map<Node, Pair<Node, Node>>, or whatever.
>
> >> >> Dimitris
>
> >> >> 2010/1/14 JP <james.austin.pe...@gmail.com>:
It's easy to develop a data structure that solves your problem, or
perhaps even a significant class of problems. It's not so easy,
though, to solve 95% of use cases. I've found this to be especially
true with graphs.
Take MultiMap for example. That works if you only ever traverse in one
direction. I suppose you could model a graph by delegating to two
MuiltiMaps (or three if you wanted to allow both directed and
undirected edges), but then you really wouldn't want to extend
MultiMap. Or maybe a BiMap + MultiMap variant would work better.
But then, what if the user wants weighted edges, or to put some other
first-class object on them?
I also find that I need to modify graphs while traversing over them.
For example, say you're performing a depth-first traversal from A
(forgive the ASCII representation):
A --> {B, C, D}
B --> {D}
A (pre-order) traversal would normally return elements in this order:
A, B, D, C, D
What happens if I remove D the first time I encounter it (through B--
>D)? If using an ArrayList internally, I'll get a concurrent mod
exception when later advancing A's iterator. Or, if I use a snapshot
iterator to avoid the CME, then I'll see D later even though I already
removed it.
Or, maybe your data structure is observable. Then you run into the
problem of a listener modifying the data structure that they're
listening to, another common use case for my particular problem space.
And then again, you might want something truly concurrent, which is
much bigger problem.
Short answer is, general purpose graphs are tricky to implement.
Having said that, an explicit tree (or forest) type should be a lot
simpler. I would strongly recommend that tree and graph proposals be
kept distinct. I think it makes more sense to unify them through views
or wrappers than to have Tree extend Graph or something similar.
- Ray Conner
On Jan 14, 6:47 pm, Tim Harsch <harschw...@gmail.com> wrote:
> Right, and does... This is exactly what I do with multimaps. Though, it
> doesn't provide all the goodies that you mentioned like cycle detection and
> so forth... so be careful iterating. I guess 100% CPU utilization is as
> fair an error indicator as anything else ;-)
>
> On Thu, Jan 14, 2010 at 3:32 PM, Craig Berry <cbe...@google.com> wrote:
> > But every Multimap<T, T> can be construed as a graph, which might prove
> > useful.
>
> > On Thu, Jan 14, 2010 at 3:20 PM, Tim Harsch <harschw...@gmail.com> wrote:
> > > Actually I didn't imply "every multimap is a tree". I implied that a
> > graph
> > > was possible from a Multimap if you didn't disallow it from being one,
> > and
> > > therefore pointed out what lead me to think of graphs. I claimed you
> > > 'mentioned graphs' only as a joke, thus the little winky guy. :-)
>
> > --
> > Craig Berry
> > Software Engineer
> > Google (Santa Monica CA)
>
> > --
> > guava-...@googlegroups.com.
> >http://groups.google.com/group/guava-discuss?hl=en
> > unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>
I have had a look at the APIs for JUNG and Jgrapht and I believe the
API should be more simpler. To be able to be comfortable with their
API it imposes the developer should have a solid foundation of graph
theory, which the "average" developer usually doesn't possess.
Best,
James.
I will submit a proposal of how I envision the Tree/Graph API over the
weekend with implementations.
I will also write test cases to test and
analyse its behaviour and complexity. I agree with Ray that trees and
graphs should be kept distinct and I have this in mind.
I have had a look at the APIs for JUNG and Jgrapht and I believe the
API should be more simpler. To be able to be comfortable with their
API it imposes the developer should have a solid foundation of graph
theory, which the "average" developer usually doesn't possess.
Best,
James.
This list is for discussion; for help, post to Stack Overflow instead:
http://stackoverflow.com/questions/ask
Use the tag "guava".
As he mentioned, I am one of the people that created JUNG (as well as
a software engineer at Google, although that came a lot later).
Originally JUNG did not include Tree as a separate type, but as of
JUNG 2.0 it does.
You can certainly argue that JUNG's Tree is more full-featured than it
needs to be to represent many operations on a tree. Similarly, there
are all kinds of operations on, say, Map that I never use. This is
why things like AbstractMap--and AbstractGraph--exist: to enable
simple implementations, if you happen to not like any of the
implementations that JUNG provides.
That said, we--the JUNG team, that is--are always looking for
suggestions and feedback. I am, for instance, happy to entertain
suggestions for creating a simpler interface for tree implementations
(perhaps as a superinterface of the existing Tree interface, as I wish
that Guava's Function were of Map). While I hope to avoid major
restructurings at this point, JUNG is still very much a work in
progress.
Joshua
On Jan 24, 8:45 pm, Tim Harsch <harschw...@gmail.com> wrote:
> I'm only one user of the collections so I know it doesn't mean much for me
> to weigh in but hopefully some feedback can help guide you all in some way.
> Trees in my view, are very collection-like and can be simply modeled. It is
> probably quite do-able to tackle the full scope of them and their sub types
> to a large degree without immense effort (not sure same can be said for
> graphs), so I'm guessing it would not be an endless rabbit hole. I agree
> with James, in looking at Jung it seems too weighty of a Tree
> implementation. I'm not sure I agree with James when he says he would like
> to "propose what he envisions as a "Tree/Graph API"", which implies graphs
> should be added too. Perhaps graphs would be better to come along much
> later. But a nice tree collection would, in my opinion be a welcomed
> addition, and bless your team if you decide to take up that charge.
>
> PS I noticed Jakarta has TreeList, which uses a Tree internally which I
> think is not exposed as part of the public API. Looking a bit more it seems
> they have a few collections which then use TreeList under the hood. Not
> sure if Guava has plans for any similar collections, but a Guava Tree API
> may serve well to help implement some future non-tree collections.
>
> Just food for thought, and good luck with it and the rest of Guava no matter
> which way you go.
> Tim
>
> On Wed, Jan 20, 2010 at 7:39 AM, Kevin Bourrillion <kev...@google.com>wrote:
>
> > On Wed, Jan 20, 2010 at 6:08 AM, James Perry <james.austin.pe...@gmail.com
> > > wrote:
>
> >> I will submit a proposal of how I envision the Tree/Graph API over the
> >> weekend with implementations.
>
> > Note that even in the face of JUNG and Jgrapht's imperfections, whatever
> > they may be, I'm not inclined for Guava to start overlapping with them. I
> > see JUNG in particular as a library that gets the job done. Of course, I'm
> > biased by having a good working relationship with its maintainer Joshua
> > O'Madadhain, a Googler who has helped here and there with Guava!
>
> >> I will also write test cases to test and
> >> analyse its behaviour and complexity. I agree with Ray that trees and
> >> graphs should be kept distinct and I have this in mind.
>
> >> I have had a look at the APIs for JUNG and Jgrapht and I believe the
> >> API should be more simpler. To be able to be comfortable with their
> >> API it imposes the developer should have a solid foundation of graph
> >> theory, which the "average" developer usually doesn't possess.
>
> >> Best,
> >> James.
>
> >> --
> >> guava-...@googlegroups.com.
> >>http://groups.google.com/group/guava-discuss?hl=en
> >> unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>
>
> >> This list is for discussion; for help, post to Stack Overflow instead:
> >>http://stackoverflow.com/questions/ask
> >> Use the tag "guava".
>
> > --
> > Kevin Bourrillion @ Google
> > internal: http://go/javalibraries
> > external: guava-libraries.googlecode.com
>
> > --
> > guava-...@googlegroups.com.
> >http://groups.google.com/group/guava-discuss?hl=en
> > unsubscribe: guava-discus...@googlegroups.com<guava-discuss%2Bunsu...@googlegroups.com>
The Tree API has root interface, Tree<T>, and has the following
subinterfaces: ListTree<T>, SetTree<T> and SortedSetTree<T>. The data
structure behind the hood is a Map<T, Collection<T>> so it essentially
like Multimap but with some subtleties (different decorator behaviour
for returned collections, manage which keys are the root, etc.). A
parent node is stored as a key and its children are stored as the
value. These subinterfaces have implementations which are, so far,
ArrayListTree<T>, LinkedListTree<T>, HashSetTree<T> and TreeMapTree<T>
(I need to think of a better name for TreeMapTree!). An
ArrayListTree<T> and LinkedListTree<T> will allow duplicate elements
as children, HashSetTree<T> doesn't allow duplicate elements as
children and TreeMapTree<T> is the same as HashSetTree<T> but the
children are sorted. The API will offer immutable versions of these
implementations just like Guava data structures.
To create these structures is also very similar to how you would do it
in Guava. For example:
SetTree<Employee> hierarchy = HashSetTree<Employee>.create();
hierarchy.put(ceo, ImmutableList.of(salesDirector, techDirector, hrDirector);
hierarchy.put(techDirector, architect);
boolean hasChildren = hierarchy.hasChildren(ceo);
boolean hasParent = hierarchy.hasParent(architect);
I am still on the drawing board on the best way to accommodate
traversal needs where the developer can write a bespoke one or use an
existing one. I am thinking on the lines to pass in a Function<T,
Void> argument.
Cheers,
James.
James.
Once you specify an interface, it's relatively straightforward to
provide implementing classes through delegation to or building upon
various types of collections.
--
I also agree with Tim that trees and graphs are very collection-like.
This has influenced the design of the core JUNG interfaces and classes
quite a lot.
Regarding James' proposal:
* First, don't worry about my toes; internal discussions of the JUNG
team were once far more, um, spirited than anything anyone here is
likely to say about JUNG. Besides, feedback is always good. Bombs
away!
* Second, it's fine as far as it goes, but I think that the addressing
the question of what methods should be included is critical to getting
traction--more so, IMO, than figuring out the type hierarchy. (The
type hierarchy will sort of fall out once you start thinking about
methods/operations, in my experience.)
I encourage anyone that wants to come up with a Tree type to look at
how JUNG did it and throw darts at it. Yes, there's a lot of stuff in
Tree that it inherits from Graph that need not be included if you're
focusing solely on trees. So what do you want to throw out? Be
specific! Be brutal! A lot of thought went into its design and I
think that it should be useful as a point of departure, at least.
Joshua
PS: For what it's worth, when I attended a workshop on scientific data
visualization last year--which included some very opinionated
attendees that had come up with their own graph/network
interfaces--what we came out with as a proposed common interface was
basically the JUNG interface with a couple of minor name changes.
--
Joshua O'Madadhain: Information Scientist, Musician, Philosopher-At-Tall
"It's that moment of dawning comprehension that I live for" -- Bill Watterson
Anyway, I think the excellent JDSL library ought to be mentioned in
this discussion - though it clearly shows its age (no 1.5 features),
it is still implementing full-featured textbook data structures
without sacrificing performance as most attempts to graphs seem to do
(last time I checked, some 4 years ago, I recall it was like 4 times
faster than JUNG in graphs/trees).
Dimitris
2010/1/27 Joshua O'Madadhain <jr...@google.com>:
To make it fully configurable, I picture a Graph.visit method that
takes a starting node, a visitation strategy, and a visitor. The
strategy determines how to traverse the tree, and presents nodes to
the visitor. So you could pass in strategies for e.g. depth- or
breadth-first, leaves-only, or the like.
Hypergraphs are useful in any context in which you're dealing with
collections of groups, as opposed to collections of dyadic (directed
or undirected) relationships. I used them in the context of doing
research on coauthorship networks, for example.
There is a 1-1 mapping from hypergraphs to bipartite graphs, of
course, but hypergraphs are more efficient in terms of the number of
objects required and also easier conceptually (otherwise you have to
treat the two types of vertex differently).
> Anyway, I think the excellent JDSL library ought to be mentioned in
> this discussion - though it clearly shows its age (no 1.5 features),
> it is still implementing full-featured textbook data structures
> without sacrificing performance as most attempts to graphs seem to do
> (last time I checked, some 4 years ago, I recall it was like 4 times
> faster than JUNG in graphs/trees).
There was a big shift in JUNG as of the 2.0 release, which made things
considerably simpler and smaller (and hopefully faster). That said,
JUNG does need better data structures for visualization in particular,
and would no doubt benefit from performance tuning in some of its
internals. (I'd be curious to know what in particular you were
comparing.)
I don't insist that JUNG is the standard by which all other Java graph
libraries must be judged. I do think that it's helpful to have a
concrete proposal, and to compare it to other existing solutions
(JUNG, JGraphT, JDSL, whatever).
Joshua
http://139.91.183.8:3026/hudson/job/flexigraph/javadoc/gr/forth/ics/graph/path/package-summary.html
Class Traverser. Note that visiting is not a call-back, but
iterator-based (it is just easier to use local vars, as well as
break/continue/return). It depends heavily on "Path", an immutable
"rope" data structure (i.e. cheap O(1) append and prepend), so that at
each point, you don't only have the node you are at, but you can find
the whole path from the node you started. Also dfs/bfs/custom order
are supported - though these are no substitutes for the full-fledged
dfs/bfs algorithms.
Dimitris
2010/1/27 Craig Berry <cbe...@google.com>:
You can always create an implementation--or a subtype--that abstracts
away the edge types (and direct edge access, and all that). The other
methods are "in the way" in much the same way that (say) most of the
Object methods are "in the way" most of the time.
> Also, I see a Forest as a collection of trees. Not sure why JUNG makes a
> Tree extend from a Forest which also provides a Collection of Trees... I
> kind-of don't get that...
A Forest is a constrained Graph (which happens to be composed of a
collection of Trees); a Tree is a constrained Forest. (In JUNG 1.x we
considered Graph to be a constrained Hypergraph, which is true in a
sense but somewhat confusing for those unfamiliar with hypergraphs, so
we simplified the hierarchy in v. 2.)
> Also, since visiting each node of a tree (or graph as well) is not as
> straight forward as foreach loop, I would to see an event based approach...
> perhaps using a Visitor pattern, or closures, or some such... the key is
> that the user doesn't decide how to iterate the tree but what to do with
> each node (or filtered node) when it is encountered.
This is all perfectly possible with the JUNG types. This sort of
traversal wasn't much in demand by our users, so we just never got
around to providing one. If anyone would like to contribute one, that
would be great.
Joshua
You get preorder, postorder and inorder tree traversals. It's not a
biggie but it should be taken into account. I solved such a problem
using a visitor decorator within my own Tree / Graph implementation
within the open source project I work on. The current tree's are
simple, but they work for the moment.
Dimitris
2010/1/27 Luke Nezda <lne...@gmail.com>:
Tree.java: http://tinyurl.com/yayfqhr
ListTree.java: http://tinyurl.com/ye9quju
Best,
James.
SetTree.java
public interface SetTree<T> extends Tree<T> {
Set<T> leaves();
Set<T> root();
Set<T> children(T parent);
SetTree<T> graft(@Nullable SetTree<T> tree, @Nullable T parent);
SetTree<T> prune(@Nullable T parent);
int hashCode();
boolean equals(@Nullable Object obj);
}
SortedSetTree.java
public interface SortedSetTree<T> extends SetTree<T> {
SortedSet<T> leaves();
SortedSet<T> root();
SortedSet<T> children(T parent);
SortedSetTree<T> graft(@Nullable SortedSetTree<T> tree, @Nullable T parent);
SortedSetTree<T> prune(@Nullable T parent);
int hashCode();
boolean equals(@Nullable Object obj);
}
Traverse.java
public interface Traverse<T> {
void traverse(Function<T, ?> function);
}
2010/1/27 James Perry <james.aus...@gmail.com>:
--
Best,
James.
Thanks. I have a few comments. :)
I would avoid creating the Set, SortedSet, etc. interfaces; they don't
really add much for the overhead. Here are a couple of options:
(1) Tree<T, S extends Collection<T>> [or maybe even 'S extends Iterable<T>']
and then your methods look like this:
S leaves();
S children(T parent);
and so on. This means that you can create implementations based on
whatever implementation of Collection[/Iterable] that you like.
(2) Just declare them to return Collections and then return whatever
type you like. The implementation class can document what those are
returning, if you think it's appropriate.
This may sound funny coming from me, given all the method signatures
that Tree has in JUNG (ref:
http://jung.sourceforge.net/doc/api/edu/uci/ics/jung/graph/Tree.html),
but I recommend making the interface as small as possible, by not
including any operations that can be composed of other operations
(i.e., things that can be done by manipulations of the public API)
unless the externally driven version would be inherently less
efficient. For example:
* graft(), traverse(), asMap() can be static utility methods; they
don't require knowledge of the internal data structures and I wouldn't
expect them to be substantially faster in general (i.e., no guarantee
that the internal storage will be the appropriate Map)
* ditto leaves(), unless you use extra internal storage to keep track
of them (and not all implementations will want to do that)
* removeAll() and putAll() are syntactic sugar for the obvious loops
putRoot() is conceptually problematic. If you call putRoot() after
any other put*() operations, what happens? (JUNG has the same
problem. But it's worth considering if there's a better way. One
option: leave putRoot() out entirely and document that implementations
_must_ provide a root as a constructor arg. Another: leave putRoot()
out and use put(null, root) to start things off.)
You provide a way to return the height of the tree, but not the depth
of any vertex.
Nitpick: technically--in the graph-theoretical sense--what you're
defining is a directed rooted tree. (A 'tree' is just an acyclic
connected graph.) I called JUNG's directed rooted tree 'Tree' myself
anyway, of course, but you may want to consider whether you want
Hope this is helpful.
Joshua
> --
> guava-...@googlegroups.com.
> http://groups.google.com/group/guava-discuss?hl=en
> unsubscribe: guava-discus...@googlegroups.com
>
> This list is for discussion; for help, post to Stack Overflow instead:
> http://stackoverflow.com/questions/ask
> Use the tag "guava".
>
--