Datasize for Graphlab to be effective in distributed setting

Prashanth

Apr 14, 2013, 8:34:50 PM

to graph...@googlegroups.com

Hi,

We have written a program that finds the 3-hop neighbors of each vertex in a graph using GraphLab. We ran it on a small input of 10000 vertices, where each vertex connects to 100 other vertices (1 million edges). We observed that the program runs faster on a 2-node cluster than on an 8-node cluster (we are using Amazon EC2).

Are we doing something wrong, or is the data size too small?

I am attaching our program.

I used this command to run it on the cluster:
mpiexec -n <number_of_nodes> ./meng --graph=hdfs://`hostname`/graph_input.txt --saveprefix=graph_output --iterations=3 --ncpus=2

Thanks in advance,
Prashanth

meng.cpp

Danny Bickson

Apr 15, 2013, 5:06:19 AM

to graph...@googlegroups.com

Hi Prashanth,

Please take a look here: http://graphlab.org/fine-tuning-graphlab-performance/

and follow steps 1, 2, and 3 to verify that you have deployed GraphLab correctly.

If you still have performance issues, send us a detailed report as explained on the page above. Note that your problem is rather small for obtaining a meaningful speedup. You may need to increase the dataset size.

Best,

Dr. Danny Bickson

Project Scientist, Machine Learning Dept.

Carnegie Mellon University

--
You received this message because you are subscribed to the Google Groups "GraphLab API" group.
To unsubscribe from this group and stop receiving emails from it, send an email to graphlabapi...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Yucheng Low

Apr 15, 2013, 4:36:25 PM

to graph...@googlegroups.com

Hi,

Your task is extremely communication bound.
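As a rough way to see how much data the 3-hop sets could involve, here is a back-of-the-envelope sketch for the graph described in the original post (10000 vertices, 100 edges each). The function names and the 4-byte vertex ID are assumptions for illustration, and the bound ignores overlap between paths, so it is an upper limit rather than a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on the number of distinct vertices within `hops` hops of one
// vertex, assuming every vertex has `degree` out-edges and no two paths
// overlap. The result is clamped to n - 1 (every other vertex in the graph).
std::uint64_t k_hop_upper_bound(std::uint64_t n, std::uint64_t degree, int hops) {
  assert(n > 0 && hops >= 0);
  const std::uint64_t cap = n - 1;
  std::uint64_t total = 0;
  std::uint64_t layer = 1;
  for (int h = 0; h < hops && total < cap; ++h) {
    layer *= degree;  // degree^(h+1) vertices reachable at exactly this depth
    total += layer;
  }
  return total < cap ? total : cap;
}

// Raw bytes needed if every vertex stores the IDs of its whole k-hop set.
// bytes_per_id is an assumption (for example 4 for a 32-bit vertex ID);
// container overhead is not included.
std::uint64_t neighbor_id_bytes(std::uint64_t n, std::uint64_t degree, int hops,
                                std::uint64_t bytes_per_id) {
  return n * k_hop_upper_bound(n, degree, hops) * bytes_per_id;
}
```

With n = 10000 and degree 100, the bound already saturates at 9999 neighbors per vertex after two hops, so neighbor_id_bytes(10000, 100, 3, 4) comes to roughly 400 MB of raw IDs across the graph, before any std::set overhead.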

Which EC2 instances are you using? It is preferable to use the cluster compute instances, which have much faster interconnects.

Also, scanning through your code: we do have built-in serializers for std::set and std::map (applied recursively to nested containers), so the serialization code can be simplified quite a bit. For example, you should be able to save the neighbor_map directly using oarc << neighbor_map.
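To illustrate the idea of recursive container serialization, here is a self-contained toy sketch. ToyOArchive and ToyIArchive are hypothetical stand-ins written for this example and are not GraphLab's archive classes; the type chosen for neighbor_map is also an assumption, since the attached program is not shown in the thread. The point is only that once sets and maps know how to write themselves, save and load collapse to one statement each.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <set>
#include <type_traits>
#include <utility>
#include <vector>

// Toy archives (hypothetical, NOT GraphLab's oarchive/iarchive).
struct ToyOArchive { std::vector<char> buf; };
struct ToyIArchive {
  std::vector<char> buf;
  std::size_t pos;
  explicit ToyIArchive(std::vector<char> b) : buf(std::move(b)), pos(0) {}
};

// Arithmetic values are copied byte for byte.
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value, ToyOArchive&>::type
operator<<(ToyOArchive& a, const T& v) {
  const char* p = reinterpret_cast<const char*>(&v);
  a.buf.insert(a.buf.end(), p, p + sizeof(T));
  return a;
}
template <typename T>
typename std::enable_if<std::is_arithmetic<T>::value, ToyIArchive&>::type
operator>>(ToyIArchive& a, T& v) {
  assert(a.pos + sizeof(T) <= a.buf.size());  // guard against reading past the end
  std::memcpy(&v, a.buf.data() + a.pos, sizeof(T));
  a.pos += sizeof(T);
  return a;
}

// A set is written as its size followed by each element.
template <typename T>
ToyOArchive& operator<<(ToyOArchive& a, const std::set<T>& s) {
  a << static_cast<std::uint64_t>(s.size());
  for (const T& x : s) a << x;
  return a;
}
template <typename T>
ToyIArchive& operator>>(ToyIArchive& a, std::set<T>& s) {
  std::uint64_t n = 0;
  a >> n;
  s.clear();
  for (std::uint64_t i = 0; i < n; ++i) { T x{}; a >> x; s.insert(x); }
  return a;
}

// A map is written as its size followed by key/value pairs; the value may
// itself be a container, which is where the recursion comes from.
template <typename K, typename V>
ToyOArchive& operator<<(ToyOArchive& a, const std::map<K, V>& m) {
  a << static_cast<std::uint64_t>(m.size());
  for (const auto& kv : m) a << kv.first << kv.second;
  return a;
}
template <typename K, typename V>
ToyIArchive& operator>>(ToyIArchive& a, std::map<K, V>& m) {
  std::uint64_t n = 0;
  a >> n;
  m.clear();
  for (std::uint64_t i = 0; i < n; ++i) { K k{}; V v{}; a >> k >> v; m[k] = v; }
  return a;
}

// Hypothetical vertex data: the neighbor_map layout is assumed.
struct vertex_data {
  std::map<std::uint32_t, std::set<std::uint32_t>> neighbor_map;
  void save(ToyOArchive& oarc) const { oarc << neighbor_map; }
  void load(ToyIArchive& iarc) { iarc >> neighbor_map; }
};

// Serializes a small vertex_data and checks that loading restores it.
bool round_trip_ok() {
  vertex_data before;
  before.neighbor_map[1] = {2, 3};
  before.neighbor_map[2] = {3};
  ToyOArchive oarc;
  before.save(oarc);
  ToyIArchive iarc(oarc.buf);
  vertex_data after;
  after.load(iarc);
  return after.neighbor_map == before.neighbor_map;
}
```

In a real GraphLab program the archive types and container overloads come from the library, so only the two one-line save and load members would remain in user code.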

Are you using the repository version of GraphLab? If not, which version did you download?

A number of significant network optimizations landed some time in January/February.

Yucheng


Prashanth

May 6, 2013, 8:28:28 PM

to graph...@googlegroups.com

Hi Yucheng,

Thanks for the reply.
We have been using m1.xlarge instances so far. When we tested with a 2 GB dataset, we got the error shown below; I have attached the logs.
We will try using the cluster compute instances next.

--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 2059 on node ip-10-38-79-142.ec2.internal exited on signal 9 (Killed).
--------------------------------------------------------------------------

Any idea why we get the above error? It happens after the graph has been finalized, as the first iteration is starting.
We also noticed that the graph gets loaded into HDFS on every instance in the cluster. Should this happen?

Regards,

screenlog.0

Yucheng Low

May 6, 2013, 8:48:44 PM

to graph...@googlegroups.com

Hi,

It looks like your MPI is not set up properly.
It is running 2 completely independent instances of GraphLab. Thus each file is loaded twice; once on each machine.
That error means it probably ran out of memory after the first iteration.

See http://graphlab.org/fine-tuning-graphlab-performance/

Yucheng


Danny Bickson

May 7, 2013, 2:16:53 AM

to graph...@googlegroups.com

Hi Prashanth!

A good MPI tutorial can be found here: http://source.ggy.bris.ac.uk/mediawiki/index.php?title=Install_and_configure_MPI&redirect=no

Please verify that you have followed step 2 here: http://graphlab.org/fine-tuning-graphlab-performance/

If there are any issues, please email us again with the details and we will further help you investigate your problem.

Best,

Dr. Danny Bickson

Project Scientist, Machine Learning Dept.

Carnegie Mellon University

Prashanth

May 7, 2013, 11:47:52 AM

to graph...@googlegroups.com

Hi Danny,

I followed the instructions from your blog to run my program on an EC2 cluster: http://bickson.blogspot.com/2012/10/deploying-graphlabsparkmesos-cluster-on.html

Do I have to set up MPI explicitly if I am running on EC2?

As suggested, I will follow the links that you and Yucheng gave me. Thank you.

Prashanth

May 9, 2013, 9:02:15 PM

to graph...@googlegroups.com

Hi,

I followed the link http://graphlab.org/fine-tuning-graphlab-performance/ and verified that MPI is set up properly.
I had omitted the "machines" argument when running the program, and hence the graph was being loaded to HDFS on each machine separately.

I am attaching the logs of the program's execution. After the first iteration, I get the following error:
Connection to ec2-50-17-123-14.compute-1.amazonaws.com closed by remote host.
Traceback (most recent call last):
  File "./gl_ec2.py", line 736, in <module>
    main()
  File "./gl_ec2.py", line 616, in main
    \"""" % (opts.identity_file, proxy_opt, master), shell=True)
  File "/usr/lib/python2.7/subprocess.py", line 511, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'ssh -o StrictHostKeyChecking=no -i /home/prashanth/.ssh/graphlab.pem ubu...@ec2-50-17-123-14.compute-1.amazonaws.com "export PATH=$PATH:/opt/hadoop-1.0.1/bin;
        export CLASSPATH=$CLASSPATH:.:\`hadoop classpath\`;
        export JAVA_HOME=/usr/lib/jvm/java-6-sun;
        cat ~/machines
        mpiexec.mpich2 -f ~/machines -envlist CLASSPATH -n 7 /home/ubuntu/graphlabapi/release/toolkits/meng/set --graph=hdfs://\`head -n 1 ~/machines\`/input --iterations=3 --topic=0;
        "' returned non-zero exit status 255

Yucheng mentioned that this is due to running out of memory. But I am using m1.xlarge instances, which have 15 GB of RAM each, and I am running 7 slave instances.
My question is: whose memory is running out? Is all the intermediate data between iterations stored in the master's RAM, is it distributed among the slaves, or is it written to HDFS temporarily?

My program finds all the neighbors within 3 degrees of separation of every vertex. I am also attaching the program, with the changes to the load and save functions that Yucheng suggested. I also reduced the amount of data flowing between vertices to work around the out-of-memory issue.

Thanks a lot in advance,
Prashanth

myapp.cpp

Logs.txt

Haijie Gu

May 9, 2013, 9:14:48 PM

to graph...@googlegroups.com

Hi Prashanth,

The vertex data is distributed across all nodes (the master and all the slaves). It is still likely, though, that the 3-hop neighborhood of some vertex on some machine grows too big. I know the Facebook graph has a diameter of about 4; I am not sure what your case is, but if your graph is similar, each vertex_data essentially stores half of the entire graph. One way to verify this is to set a maximum size on the neighborhood set of each vertex.
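The idea of capping the neighborhood set can be sketched outside GraphLab as a plain breadth-first search that stops once a limit is reached. This is only a single-machine illustration under assumed names (Graph, capped_k_hop, make_ring are hypothetical); it is not how the attached vertex program is written, but the same size check could be applied wherever the program merges neighbor sets.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical in-memory adjacency list: g[u] holds the neighbors of u.
using Graph = std::vector<std::vector<int>>;

// Collects the vertices reachable from `source` in at most `hops` hops
// (excluding `source` itself), and stops early once `max_size` vertices
// have been collected. A result of exactly max_size elements suggests the
// true neighborhood may be larger.
std::set<int> capped_k_hop(const Graph& g, int source, int hops,
                           std::size_t max_size) {
  assert(source >= 0 && static_cast<std::size_t>(source) < g.size());
  std::set<int> seen;
  if (max_size == 0) return seen;
  std::vector<int> frontier{source};
  for (int h = 0; h < hops && !frontier.empty(); ++h) {
    std::vector<int> next;
    for (int u : frontier) {
      for (int v : g[u]) {
        if (v == source) continue;
        if (seen.insert(v).second) {  // newly discovered vertex
          next.push_back(v);
          if (seen.size() >= max_size) return seen;  // cap reached
        }
      }
    }
    frontier.swap(next);
  }
  return seen;
}

// Small test graph: a ring where vertex i is linked to i-1 and i+1.
Graph make_ring(int n) {
  Graph g(n);
  for (int i = 0; i < n; ++i) {
    g[i].push_back((i + 1) % n);
    g[i].push_back((i + n - 1) % n);
  }
  return g;
}
```

On a 10-vertex ring, the uncapped 3-hop set of vertex 0 has 6 members; with max_size set to 4 the search returns after collecting 4. Comparing memory use with and without such a cap is one way to check whether neighborhood growth is what exhausts memory.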

Best,

-jay
