Major performance improvement to GraphChi C++

Aapo Kyrola

Oct 22, 2013, 2:58:41 AM
to graphchi...@googlegroups.com, graph...@googlegroups.com
Hi,

The latest GitHub version of GraphChi improves performance considerably. On my MacBook, at least, the connected components example application now runs almost twice as fast as before! On PageRank the improvement is smaller, perhaps 25%, but still significant. The speedup will probably vary with the graph and the application.

The trick is that the in-edges are now loaded in parallel (I don't know why I did not do this before...).
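
For readers curious about what loading in parallel can look like, here is a minimal sketch of the general idea. It is not the actual GraphChi code: `EdgeBlock`, `load_block` and `load_blocks_parallel` are hypothetical names made up for this illustration, and the "loading" just fills each block with synthetic values instead of reading from disk.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one block of in-edge data; not a GraphChi type.
struct EdgeBlock {
    std::vector<int> edge_values;
};

// Hypothetical loader. A real implementation would read the block from disk;
// here it fills the block with synthetic values so the sketch is runnable.
EdgeBlock load_block(std::size_t block_id, std::size_t edges_per_block) {
    EdgeBlock block;
    block.edge_values.resize(edges_per_block);
    std::iota(block.edge_values.begin(), block.edge_values.end(),
              static_cast<int>(block_id * edges_per_block));
    return block;
}

// Start one asynchronous task per block, then collect the results in order.
std::vector<EdgeBlock> load_blocks_parallel(std::size_t num_blocks,
                                            std::size_t edges_per_block) {
    assert(edges_per_block > 0);
    std::vector<std::future<EdgeBlock>> pending;
    pending.reserve(num_blocks);
    for (std::size_t id = 0; id < num_blocks; ++id) {
        pending.push_back(
            std::async(std::launch::async, load_block, id, edges_per_block));
    }
    std::vector<EdgeBlock> blocks;
    blocks.reserve(num_blocks);
    for (auto& task : pending) {
        blocks.push_back(task.get());  // waits until this block is loaded
    }
    return blocks;
}
```

Calling `load_blocks_parallel(4, 3)` returns four blocks in their original order, even though the loads themselves may finish in any order. How much this helps in practice depends on how well the disk handles concurrent reads.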

In addition, GraphChi now allows you to cache edge data. The "cachesize_mb" setting in the configuration controls how much memory is used for this, and it can be set on the command line as well. This can be useful if you have plenty of memory; otherwise it does not help much.
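
To illustrate why the cache only pays off with plenty of memory, here is a rough sketch of a cache limited by a megabyte budget. It is an assumption-laden toy, not how GraphChi implements its cache: the class `EdgeDataCache` and its methods are hypothetical, and only the name "cachesize_mb" comes from the actual setting.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical cache that keeps edge-data blocks in memory until a byte
// budget (given in megabytes, like "cachesize_mb") is used up.
class EdgeDataCache {
public:
    explicit EdgeDataCache(std::size_t cachesize_mb)
        : budget_bytes_(cachesize_mb * 1024 * 1024) {}

    // Try to keep a copy of a block. Returns false if the block does not fit
    // in the remaining budget, in which case it is simply not cached.
    bool insert(std::size_t block_id, const std::vector<char>& data) {
        if (blocks_.count(block_id) != 0) return true;  // already cached
        if (used_bytes_ + data.size() > budget_bytes_) return false;
        used_bytes_ += data.size();
        blocks_.emplace(block_id, data);
        assert(used_bytes_ <= budget_bytes_);
        return true;
    }

    // Returns the cached block, or nullptr on a miss (the caller would then
    // have to read the block from disk).
    const std::vector<char>* find(std::size_t block_id) const {
        auto it = blocks_.find(block_id);
        return it == blocks_.end() ? nullptr : &it->second;
    }

    std::size_t used_bytes() const { return used_bytes_; }

private:
    std::size_t budget_bytes_;
    std::size_t used_bytes_ = 0;
    std::unordered_map<std::size_t, std::vector<char>> blocks_;
};
```

With a budget of 1 MB and blocks of 600 KB each, only the first block fits and every later block is a miss. That is the small-memory case: most reads still go to disk, so the cache does not help much.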

Please let me know if you encounter any problems.

Aapo Kyrola
Ph.D. student, http://www.cs.cmu.edu/~akyrola
GraphChi: Big Data - small machine: http://graphchi.org
twitter: @kyrpov
