Introducing graph: One data-structure to rule them all

16 views
Skip to first unread message

Dilawar Singh

unread,
Jul 10, 2014, 2:58:24 AM7/10/14
to pn...@googlegroups.com
A useful way to look at computer program is: there is data arranged in certain ways and functions try to rearrange/modify it. When we are satisfied with rearrangements and modifications, we stop. This is not something one writes in exam and will be considered correct but it is a useful picture.

This view of program laid emphasis on data vis-a-vis functions or algorithms. Often, functions are given primary importance, and data is considered secondary. This is why students know more about functions than they know about containers/iterators after doing  their first course in programming. Perhaps the real reason is that programming had been a CS forte so far. CS has its own data-structure course running in parallel along with "introduction to programming". Others borrows more from 'introduction to programming' and very little form 'data-structures'. This can lead to an impression that data-structures are not as important as functions. Perhaps this is not important when so much can be done just with arrays (Matlab/Scilab/C).

Container is an abstract concept: it holds data/data-structures. Merely holding data is not enough, we need to specify how data should be arranged so that we can access data in some definite way. Usually we don't know the exact location of a particular element and we have to "travel the container" to find it. The traveling part is accomplished using what is called iterators. Think C++ where each container (vector, list, set, map) comes with its own iterators. These containers are traversed sequentially: second comes after first, third after second etc. They allow us to store any type of data but we can only arrange it "along a line" or "in a plane" . They are not general enough to arrange data in any way we like.

Graphs can be used to arrange data in any order we like (not just in line or in a plane, but more). We usually attach data to nodes or edges. Edges defines the arrangement pattern and can also define the direction in which traveling is possible on them. This flexibility of arranging data makes them very attractive. Their presence is rather ubiquitous in all branches of natural sciences. All self-respecting programming languages which claim to "compute" comes with a heavy graph support in its libraries [1].  A good and lucid introduction to graph is "Graph theory: with applications to engineering and computer science" by Narsing Deo.

In C++, two prominent libraries are boost-graph and LEDA graph. LEDA came out of academia and it was the standard before the advent of boost. LEDA is not open-source anymore but a free-version is available. Compared to boost, it is easier to use and compiles faster. Its documentation is great. Boost, on the other hand, in under constant development and fairly advanced. It has more functionality than LEDA. Unfortunately, boost is rather heavily dependent on advanced C++ (Templates, Meta-programming, etc), and it makes it rather frightening to even a decent programmer of C++. If you plan to use graph in any of your academic project, prefer LEDA. With such a heavy dose of templates, the error message generated by compiler does not make much sense to most of us. If you are comfortable with templates and generic-programming in C++ then boost should be your first choice. Boost documentation is relatively poor than LEDA.

Python has networkx which is great for all practical purpose for graphs of modest size. It is pure-python which means it is very slow. Python graph-tool is another option. It is bindings to boost-graph and therefore it is as fast as boost graph. And you get Python ease and expressiveness for free. It is available in repositories of all major Linux and perhaps for Windows too. If you are not into programming languages but want to learn at least one for building stuff, Python can be a great choice.

To store graph, you can use graphviz format which is now extensively used. To visualize graph, you can use graphviz again. Programs 'dot', 'neato' etc. can be used to turn a graph file to a fine graph in various formats (png, eps, pdf, ps, jpg etc). See this collection of large graphs stored and plotted using graphviz. Another option is to store graph in graphml format which is XML. This might even help drawing figures for report.

Knuth made a graph-library called 'Stanford Graph Base' in C.


[1] Graphs which are also trees have a well-defined recursive nature. With some effort, directed acyclic graphs (DAG) can also be defined recursively. But a general graph is too costly to define recursively. That is why Graphs and functional programming languages, so reputed to handle recursion in a very elegant manner, do not mix easily.

Dilawar

Arunabha Sarkar

unread,
Jul 10, 2014, 2:03:43 PM7/10/14
to pn...@googlegroups.com
Hi Dilawar,

Can you giv an example/application of graph type data + structure or networkx. So far, the usual stuff like simulating Sieve of Eratothenes, getting Protein code from a DNA input code doesn't require anything beyond 'list'. Once amateurs like me will know the powers of 'graph', perhaps I can think about using it is potential problem.

Just some link redirecting to a good example showcasing the powers & usage of graph concept or networkx will be good enough to start with

Arunabha

Dilawar Singh

unread,
Jul 10, 2014, 2:30:17 PM7/10/14
to pn...@googlegroups.com

Gladly. For instance, let assume that I am simulating a network of Neuron where each neuron making connection with other neurons. The connection can be quite random. And I want to visualize it. Following is the real example from moose simulator: a network of 10 level purkinje neuron (approximated as a binary tree).


Some more examples to follow. Graphs shines when you are dealing with network of any type. 

Dilawar Singh

unread,
Jul 12, 2014, 12:54:29 AM7/12/14
to pn...@googlegroups.com

Using graph theory to analyze biological networks


http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3101653/



On Thursday, July 10, 2014 11:33:43 PM UTC+5:30, Arunabha Sarkar wrote:
Reply all
Reply to author
Forward
0 new messages