Config params for the Java GraphChi Program


Wissam K.

May 20, 2014, 5:28:55 PM
to graphchi...@googlegroups.com
Hi Aapo / Danny,

How would you go about setting these configuration parameters in a GraphChi program written in Java:

  • execThreads: I have tried including -Dnum_threads=my_value, but it does not seem to take effect?!
  • niothreads
  • membudget: engine.setMemoryBudgetMb() seems to do the job!?
  • number of cores used on the machine

Is there a way to have the engine/Context echo back those values after execution, just to double-check that they have been taken into account?

Also, I am not quite sure I understand exactly what engine.setEnableDeterministicExecution() does.


Thank you very much, and thank you for GraphChi!

-Wissam

Aapo Kyrola

May 21, 2014, 1:50:29 PM
to graphchi...@googlegroups.com
Hi Wissam,

I am currently moving to the UK, so I don't have time to make changes to GraphChi. However, let me give you some idea of how you can do it yourself.

On May 20, 2014, at 2:28 PM, Wissam K. <wissam...@gmail.com> wrote:

  • execThreads: I have tried including -Dnum_threads=my_value, but it does not seem to take effect?!


It should take effect. Check that the parameter is in the right place on the command line: it should come before the class name.
There should be output like ":::::::: Using <nprocs> execution threads :::::::::", where <nprocs> is the number of execution threads in use.
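To illustrate why the position matters, here is a minimal sketch. It is not GraphChi's actual code: the class name ExecThreads and the fallback to the core count are assumptions made for illustration. The JVM only treats -Dname=value as a system property when it appears before the class name; anything after the class name is passed to main() as an ordinary argument.

```java
import java.util.Properties;

// Hypothetical helper, not part of GraphChi.
final class ExecThreads {

    // An explicit num_threads property wins; otherwise fall back to the core count.
    // (The fallback policy is an assumption for this sketch.)
    static int resolve(Properties props, int availableCores) {
        String raw = props.getProperty("num_threads");
        if (raw == null) {
            return availableCores;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        int n = resolve(System.getProperties(),
                        Runtime.getRuntime().availableProcessors());
        System.out.println("Using " + n + " execution threads");

        // A -D flag placed after the class name ends up here, not in the system properties.
        for (String arg : args) {
            if (arg.startsWith("-D")) {
                System.err.println("Ignored (passed as a program argument): " + arg);
            }
        }
    }
}
```

Running "java -Dnum_threads=8 ExecThreads" picks up the value, while "java ExecThreads -Dnum_threads=8" falls back to the core count and prints the warning.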

  • niothreads
    In the Java version, I/O is handled differently, so you cannot adjust this.

    However, you can adjust the number of loading threads (essentially, how many shards are being loaded in parallel) on line 240 of GraphChiEngine.java:
    loadingExecutor = Executors.newFixedThreadPool(4);
     
    ... change 4 to the number of loading threads you want.
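If you would rather not recompile for every experiment, one option is to read the pool size from a system property. This is only a sketch: the property name loading_threads and the class LoadingPool are made up for illustration and do not exist in GraphChi. In the engine you would replace the hard-coded 4 with a call like loadingThreads().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not part of GraphChi.
final class LoadingPool {

    // Reads -Dloading_threads=N (assumed name); defaults to 4 and never goes below 1.
    static int loadingThreads() {
        return Math.max(1, Integer.getInteger("loading_threads", 4));
    }

    // Same kind of pool as the original line, sized from the property.
    static ExecutorService create() {
        return Executors.newFixedThreadPool(loadingThreads());
    }
}
```

As with num_threads, the flag has to come before the class name on the command line.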

    • membudget: engine.setMemoryBudgetMb() seems to do the job!?

    Yes

    • number of cores used on the machine


    Runtime.getRuntime().availableProcessors()

    Is there a way to have the engine/Context echo back those values after execution just to double check that they have been taken into account?


    Should be quite easy...
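A rough sketch of what such an echo could look like. It does not rely on any GraphChi getters, since I am not certain which ones the engine exposes; instead you pass in the values you configured yourself, and the JVM supplies the core count and maximum heap. The class name ConfigEcho is made up for illustration.

```java
// Hypothetical helper, not part of GraphChi.
final class ConfigEcho {

    // Builds a one-line summary of the configured values plus what the JVM reports.
    static String describe(int execThreads, int memoryBudgetMb) {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMb = rt.maxMemory() / (1024L * 1024L);
        return "execThreads=" + execThreads
                + ", memoryBudgetMb=" + memoryBudgetMb
                + ", availableProcessors=" + rt.availableProcessors()
                + ", maxHeapMb=" + maxHeapMb;
    }

    public static void main(String[] args) {
        // Example values only; use whatever you passed to the engine.
        System.out.println(describe(4, 2000));
    }
}
```

Printing this line right after the run finishes makes it easy to compare against the "Using ... execution threads" message in the log.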



    Aapo Kyrola
    Ph.D., Carnegie Mellon University (2014)
    http://www.cs.cmu.edu/~akyrola
    GraphChi: Big Data - small machine: http://graphchi.org
    twitter: @kyrpov

    Wissam K.

    May 23, 2014, 5:43:56 PM
    to graphchi...@googlegroups.com
    Hi Aapo,

    Thanks for your reply. This is what I need.

    I am experimenting with the Twitter 2010 links dataset. When loading for execution, I get this error:

    java.lang.ArrayIndexOutOfBoundsException: 3
    at edu.cmu.graphchi.ChiVertex.addOutEdge(ChiVertex.java:217)
    at edu.cmu.graphchi.shards.SlidingShard.readNextVertices(SlidingShard.java:199)
    at edu.cmu.graphchi.engine.GraphChiEngine$4.run(GraphChiEngine.java:649)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)


    It happens only for a few memshards. I checked the GraphChi code, but there do not seem to be any issues with the logic.

    Any pointers?

    Thanks very much Aapo.

    Wissam K.

    May 23, 2014, 5:46:02 PM
    to graphchi...@googlegroups.com
    P.S. I thought this might put the error in a better context:

    2:15:38 PM engine loadBeforeUpdates - t:1 INFO:   Memshard: 50000000 -- 70000000
    2:15:47 PM engine loadBeforeUpdates - t:1 INFO:   Loading memory-shard finished.main
    java.lang.ArrayIndexOutOfBoundsException: 3
    at edu.cmu.graphchi.ChiVertex.addOutEdge(ChiVertex.java:217)
    at edu.cmu.graphchi.shards.SlidingShard.readNextVertices(SlidingShard.java:199)
    at edu.cmu.graphchi.engine.GraphChiEngine$4.run(GraphChiEngine.java:649)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
    2:15:52 PM engine loadBeforeUpdates - t:1 INFO:   Still waiting for loading, counter is: 37

    Aapo Kyrola

    May 24, 2014, 1:28:35 AM
    to graphchi...@googlegroups.com
    Hi,

    Have you made any changes to the code? This could be caused by a race condition. Alternatively, some of the preprocessed shards may be incorrect. It is a good idea to delete them all with "rm graphname.*" and then rerun.

    If you did make changes, send me a diff and I can have a look.

    Aapo

