Some simple and fundamental questions for newb onboarding


Matan Safriel

Sep 21, 2015, 2:35:47 PM

to Cassovary

Hi,

I looked through some of the code, examples, and GitHub issues, and would like to confirm my overall understanding of Cassovary.

Is it the case that the graph API can "only" store the id for each node, plus unlabeled edges? I saw some enhancement requests for changing this, yet it seems that some of them can be satisfied by pre-processing your graph in certain ways, or by employing lookup tables external to the graph; hence there is no need to change the core, IMO.
What have been the main optimizations making it especially efficient in terms of memory footprint? Did they substantially trade off with CPU usage? Are they as efficient on the Java 8 runtime?
I see several types of graphs in com.twitter.cassovary.graph, and have probably missed a definitive resource explaining the salient traits of each one. I am not even sure how a DynamicDirectedGraph differs from a DirectedGraph, or what is shared in a SharedArrayBasedDirectedGraph. Have I missed some piece of documentation? Can you point out rules of thumb for choosing between them, or what each one is optimized for?

Many thanks in advance for your comments!

All of this will greatly help me make a wise choice.

Thanks,

Matan

Pankaj Gupta

Sep 22, 2015, 1:06:33 AM

to twitter-...@googlegroups.com

Hi Matan,

Answers inline:

On Mon, Sep 21, 2015 at 11:35 AM, Matan Safriel <dev....@gmail.com> wrote:

Hi,

I looked through some of the code, examples, and GitHub issues, and would like to confirm my overall understanding of Cassovary.
Is it the case that the graph API can "only" store the id for each node, plus unlabeled edges? I saw some enhancement requests for changing this, yet it seems that some of them can be satisfied by pre-processing your graph in certain ways, or by employing lookup tables external to the graph; hence there is no need to change the core, IMO.

There is support for storing arbitrary data for nodes and edges in the form of Labels, but this support is recent and is currently wired in only for nodes. Finishing the wiring has been on the to-do list for a bit, and I am happy to help support your use case by doing more of it.

See https://github.com/twitter/cassovary/blob/master/cassovary-core/src/main/scala/com/twitter/cassovary/graph/labels/Label.scala

https://github.com/twitter/cassovary/blob/master/cassovary-core/src/main/scala/com/twitter/cassovary/graph/DirectedGraph.scala#L67
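Until labels are wired in for edges, the external lookup table mentioned in the question is a workable stopgap. Below is a minimal sketch of that idea in plain Java. It does not use Cassovary's Label API; the class `ExternalLabels` and all of its methods are hypothetical names made up for illustration, and it assumes node ids are ints.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical side table holding node and edge attributes outside the graph itself. */
final class ExternalLabels {
    private final Map<Integer, String> nodeLabels = new HashMap<>();
    private final Map<Long, String> edgeLabels = new HashMap<>();

    /** Packs a directed edge (src, dst) into a single long key. */
    private static long edgeKey(int src, int dst) {
        return ((long) src << 32) | (dst & 0xFFFFFFFFL);
    }

    void labelNode(int nodeId, String label) {
        nodeLabels.put(nodeId, label);
    }

    void labelEdge(int src, int dst, String label) {
        edgeLabels.put(edgeKey(src, dst), label);
    }

    /** Returns the node's label, or null if none was stored. */
    String nodeLabel(int nodeId) {
        return nodeLabels.get(nodeId);
    }

    /** Returns the edge's label, or null if none was stored. Direction matters. */
    String edgeLabel(int src, int dst) {
        return edgeLabels.get(edgeKey(src, dst));
    }
}
```

The graph itself keeps storing only ids; anything that needs an attribute looks it up by id (or by the source and destination pair) in the side table. Note that these maps box their keys, so for very large graphs a primitive-keyed map or a plain array indexed by node id would be the more memory-friendly choice.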

What have been the main optimizations making it especially efficient in terms of memory footprint? Did they substantially trade off with CPU usage? Are they as efficient on the Java 8 runtime?

Primarily by avoiding boxing wherever possible in storage (for example, using arrays of primitive ints to store edges). No, this did not substantially trade off with CPU usage. And yes, it is as efficient on Java 8; why would the Java 8 runtime be different?
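To make the boxing point concrete, here is a small sketch of edge storage in primitive int arrays, written in plain Java. It is not Cassovary's actual implementation; the class `PrimitiveAdjacencyGraph` and its layout (one offsets array plus one concatenated targets array) are assumptions chosen for illustration, and node ids are assumed to lie in [0, numNodes).

```java
import java.util.Arrays;

/** Illustrative read-only directed graph whose edges live in two primitive int arrays. */
final class PrimitiveAdjacencyGraph {
    // Out-edges of node v occupy targets[offsets[v]] .. targets[offsets[v + 1] - 1].
    private final int[] offsets;
    // Out-neighbor ids of all nodes, concatenated node by node.
    private final int[] targets;

    /** Builds the graph from an edge list where each entry is {src, dst}. */
    PrimitiveAdjacencyGraph(int numNodes, int[][] edges) {
        offsets = new int[numNodes + 1];
        for (int[] e : edges) {
            offsets[e[0] + 1]++;                 // count the out-degree of each source
        }
        for (int v = 0; v < numNodes; v++) {
            offsets[v + 1] += offsets[v];        // prefix sums turn counts into offsets
        }
        targets = new int[edges.length];
        int[] next = Arrays.copyOf(offsets, numNodes); // next free slot per node
        for (int[] e : edges) {
            targets[next[e[0]]++] = e[1];
        }
    }

    int outDegree(int node) {
        return offsets[node + 1] - offsets[node];
    }

    /** Returns a copy of the node's out-neighbors, in insertion order. */
    int[] outNeighbors(int node) {
        return Arrays.copyOfRange(targets, offsets[node], offsets[node + 1]);
    }
}
```

Each edge costs a single 4-byte int here, whereas a `List<Integer>` per node would hold a reference to a separate `Integer` object for every edge. Reads are plain array indexing, which is why this kind of layout generally does not cost extra CPU.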

I see several types of graphs in com.twitter.cassovary.graph, and have probably missed a definitive resource explaining the salient traits of each one. I am not even sure how a DynamicDirectedGraph differs from a DirectedGraph, or what is shared in a SharedArrayBasedDirectedGraph. Have I missed some piece of documentation? Can you point out rules of thumb for choosing between them, or what each one is optimized for?

The best way to browse them is to look at the comments right above the definitions of those traits. For example, DynamicDirectedGraph adds dynamic operations to the trait DirectedGraph.
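As a rough picture of that relationship, the sketch below shows a read-only graph interface and a second interface that extends it with mutation. This is a generic Java illustration, not Cassovary's Scala API: `ReadOnlyGraph`, `MutableGraph`, `HashMutableGraph`, and their methods are hypothetical names invented here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical read-only view: queries only. */
interface ReadOnlyGraph {
    boolean hasEdge(int src, int dst);
    int outDegree(int node);
}

/** Hypothetical dynamic variant: everything above, plus operations that change the graph. */
interface MutableGraph extends ReadOnlyGraph {
    void addEdge(int src, int dst);
    void removeEdge(int src, int dst);
}

/** Simple hash-based implementation; favors easy updates over compact storage. */
final class HashMutableGraph implements MutableGraph {
    private final Map<Integer, Set<Integer>> out = new HashMap<>();

    @Override
    public void addEdge(int src, int dst) {
        out.computeIfAbsent(src, k -> new HashSet<>()).add(dst);
    }

    @Override
    public void removeEdge(int src, int dst) {
        Set<Integer> neighbors = out.get(src);
        if (neighbors != null) {
            neighbors.remove(dst);
        }
    }

    @Override
    public boolean hasEdge(int src, int dst) {
        return out.getOrDefault(src, Collections.emptySet()).contains(dst);
    }

    @Override
    public int outDegree(int node) {
        return out.getOrDefault(node, Collections.emptySet()).size();
    }
}
```

Code that only reads can accept the narrower `ReadOnlyGraph` type, so it works with both a compact immutable graph and a mutable one. The usual rule of thumb is to pick the immutable, array-backed form when the graph is loaded once and then only queried, and the dynamic form when edges arrive or disappear at runtime.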

Many thanks in advance for your comments!

All of this will greatly help me make a wise choice.

Feel free to talk about what you are looking for and perhaps we can help.

cheers

Thanks,
Matan
