The code is part of a large Python library that I wrote and it would
be hard to string it out. So, I'd be happy to explain it more.
The aims of the algorithm are
1) Take a set points, let's say 500 of them for example. The points
are distinct from one another by their position in a 2D space, x and
y.
2) We then construct a MST of the set. The edges that connect the
points have a length since the points are in 2D space. In this case we
can also call the edges the weights, which help determine the MST.
Let's assume that the edges have lengths from 1 to 100.
3) The MST connects each and every point by a set of edges. Now, let's
throw away any edge that exceeds a length of 10. The result would show
something like the plot on the right hand side (follow link). The
black lines are edges that remain after moving edges larger than, for
example, 10. The blue lines are the edges that have been removed.
http://networkx-discuss.googlegroups.com/web/mst_and_broken_mst.png?gda=n4GL9EgAAABqQvvEw0AzKLQQpQGnIfbWCYSu6MlEdg_vFL9AqW_0laCE_Zhn-hcfg0nYfnSrnDglzhb83kORdLwM2moY-MeuGjVgdwNi-BwrUzBGT2hOzg&gsc=J5p-2SMAAAAni4LF7orUp1-ArCx8ir0F5d7BohKsWzM3CiDwwzdLiq0IoyLhPG2x5smOr2otMGI
Unlike before when the all the edges were still there, not all of the
points are connected now. They are broken up into sets of edge-
connected points.
4) Taking the data, which is comprised of points and their edges, I
can deduce which points form a distinct set from others.
Unfortunately, the algorithm I have written is slow in Python. This
part of the code is still being tested and it's not part of my library
yet. It's posted below
The code may be quite cumbersome, but it's what I have come up with
for the moment. I will look at the link you provided to see if it's
useful. Thank you for your help.
Cheers,
Eli
### Code Begin ###
def _super_intersection(edges):
# This function is used inside the cluster function below.
# It works by cycling through
group = set(edges[0])
index = np.array([0])
k = 0
while k < 100:
k += 1
i = 0
for edge in edges[1:]:
i += 1
edge = set(edge)
if group & edge:
group = group | edge
index = np.append(index, i)
index = np.unique(np.array(index))
return group, index
def cluster(self, gmin = 5):
# A 2xN array of node IDs
# gmin is the minimum number of points needed to be
# considered a cluster (the set of points that are connected).
# An edge is identified by two points. So the edges variable
# is an array, N x 2, where each element is a unique point
# identified by an ID number (integers in this case).
edges = self.edges
group_nodes = {}
for no, edge in enumerate(edges):
try:
group, indice = _super_intersection(edges)
id_no = no
edges = np.delete(edges,indice,0)
if len(group) >= gmin:
group_nodes[id_no] = list(group)
except:
self.group_nodes = group_nodes
### Code End ###
On Jul 28, 4:21 pm, Aric Hagberg <
ahagb...@gmail.com> wrote: