Currently NetworkX uses PyGraphViz or PyDot to read/write graphs in DOT
format. However, both options suffer from fatal drawbacks:
1) PyGraphViz is very hard, if not impossible, to install on MS Windows.
2) PyDot requires too much memory (for a digraph with some 5000 nodes and
100000 edges, PyDot consumed 2 GB and died).
Would it be possible for NetworkX to implement its own fast and efficient DOT
serialization procedures? The complete DOT file format specification is
probably not trivial, but for me a naive readdot/writedot (see below)
worked surprisingly well. I believe a slightly less naive version would suffice
for most users working with large graphs.
What do you think?
Andrey Paramonov
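import re
from collections import OrderedDict

import networkx as nx


def readdot(dotfile):
    """A (very) naive GraphViz DOT file reader.

    dotfile: file-like object,
    returns networkx.DiGraph."""
    def parseattrs(line):
        # Turn 'key1=value1, key2=value2' into an ordered mapping;
        # a missing attribute list yields an empty mapping.
        attrs = OrderedDict()
        if line:
            for rec in line.split(', '):
                # Split on the first '=' only, so values may contain '='.
                (key, value) = rec.split('=', 1)
                attrs[key] = value
        return attrs

    # [^"]* rather than [^"]+ so that an empty graph name ("") is accepted.
    head = re.compile(r'strict digraph ("?)([^"]*)(\1) {$')
    node = re.compile(r'([^-]+?)( \[(.+)\])?;$')
    edge = re.compile(r'([^-]+?)->([^-]+?)( \[(.+)\])?;$')
    foot = re.compile(r'}')
    # Iterate over the file lazily instead of loading all lines at once.
    for line in dotfile:
        m = head.match(line)
        if m:
            graph = nx.DiGraph(name=m.group(2))
        m = node.match(line)
        if m:
            # Attributes are unpacked as keywords; newer NetworkX releases
            # no longer accept a positional attribute dict.
            graph.add_node(m.group(1), **parseattrs(m.group(3)))
        m = edge.match(line)
        if m:
            graph.add_edge(m.group(1), m.group(2), **parseattrs(m.group(4)))
        m = foot.match(line)
        if m:
            return graph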
def writedot(graph, dotfile):
    """A (very) naive GraphViz DOT file writer.

    graph: networkx.DiGraph,
    dotfile: file-like object."""
    def writerec(rec, attrs):
        # Write one statement: 'rec [key1=value1, key2=value2];'
        dotfile.write(rec)
        if attrs:
            # str() lets non-string values (e.g. numeric weights) through.
            dotfile.write(' [' + ', '.join('='.join([key, str(value)])
                                           for key, value in
                                           attrs.items()) + ']')
        dotfile.write(';\n')

    dotfile.write('strict digraph "' + graph.name + '" {\n')
    for node, attrs in graph.nodes(data=True):
        writerec(node, attrs)
    for u, v, attrs in graph.edges(data=True):
        writerec('->'.join([u, v]), attrs)
    dotfile.write('}\n')
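One step toward the "slightly less naive" version could be quoting identifiers and attribute values, so that names containing spaces, dashes or quotes survive. The following is only a sketch of that idea, not a proposed NetworkX API: the names quote_id, format_attrs and write_dot_stream are hypothetical, the writer takes plain iterables rather than a graph object so that it runs with the standard library alone, and it does not handle HTML strings or values that end in a backslash.

```python
import io
import re

# Plain DOT identifiers need no quoting: letters, digits and underscores,
# not starting with a digit. Everything else is written as a quoted string.
_PLAIN_ID = re.compile(r'[A-Za-z_][A-Za-z_0-9]*')

# DOT keywords are case-insensitive and must be quoted when used as IDs.
_KEYWORDS = {'node', 'edge', 'graph', 'digraph', 'subgraph', 'strict'}


def quote_id(value):
    """Return value as a DOT ID, double-quoting it when necessary."""
    text = str(value)
    if _PLAIN_ID.fullmatch(text) and text.lower() not in _KEYWORDS:
        return text
    # Inside a quoted DOT string only the double quote needs escaping.
    return '"' + text.replace('"', '\\"') + '"'


def format_attrs(attrs):
    """Format a dict of attributes as ' [k=v, ...]', or '' if empty."""
    if not attrs:
        return ''
    pairs = ('%s=%s' % (quote_id(k), quote_id(v)) for k, v in attrs.items())
    return ' [' + ', '.join(pairs) + ']'


def write_dot_stream(name, nodes, edges, dotfile):
    """Write a strict digraph one statement at a time.

    nodes: iterable of (node, attrs); edges: iterable of (u, v, attrs).
    Nothing is accumulated in memory beyond the current statement.
    """
    dotfile.write('strict digraph %s {\n' % quote_id(name))
    for node, attrs in nodes:
        dotfile.write(quote_id(node) + format_attrs(attrs) + ';\n')
    for u, v, attrs in edges:
        dotfile.write(quote_id(u) + ' -> ' + quote_id(v)
                      + format_attrs(attrs) + ';\n')
    dotfile.write('}\n')


# Usage: write a tiny graph into an in-memory buffer and show the result.
buf = io.StringIO()
write_dot_stream('my graph',
                 [('a', {}), ('node 2', {'label': 'say "hi"'})],
                 [('a', 'node 2', {'weight': 3})],
                 buf)
print(buf.getvalue())
```

Because each statement is written as soon as it is produced, memory use stays flat regardless of graph size, which is the property that matters for the large graphs mentioned above. A matching reader would need to undo the quoting, which the naive regular expressions above do not attempt.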