Spargel node degree


m.neuma...@gmail.com

Jun 1, 2014, 11:14:12 PM
to stratosp...@googlegroups.com
Hej,

I'm implementing PageRank using Spargel. In the send-messages step I need to divide the node value by the degree of the node.

Can I get the degree of a node somehow, or do I have to store it on the node? I tried the code below, but it throws an exception at runtime telling me it's illegal to iterate over the edges twice.

    public void sendMessages(String vertexId, Double newRank) {
        int numOutEdge = 0;

        for (Iterator<OutgoingEdge<String, Double>> iterator = getOutgoingEdges()
                .iterator(); iterator.hasNext();) {
            iterator.next();
            numOutEdge++;
        }
        for (OutgoingEdge<String, Double> edge : getOutgoingEdges()) {
            sendMessageTo(edge.target(), newRank * 1 / numOutEdge);
        }
    }


Cheers, Martin

Stephan Ewen

Jun 2, 2014, 2:43:23 AM
to stratosp...@googlegroups.com
Yes, the edges are "streamed". In theory that protects you from problems with large edge lists but in practice that is not really an issue.

I think you can simply buffer them in an array.
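
A rough sketch of the buffering idea, written as standalone Java so it runs without Spargel. The `oneShot` helper is a hypothetical stand-in for the streamed edge list (it only mimics the "traverse once" restriction), and `buffer` and `share` are illustrative names, not part of the Spargel API. Inside a real `sendMessages` the same pattern would presumably apply to `getOutgoingEdges()`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class BufferedEdges {

    /** Copies a one-shot edge stream into a list so it can be counted and re-read. */
    public static <E> List<E> buffer(Iterable<E> streamedEdges) {
        List<E> buffered = new ArrayList<>();
        for (E edge : streamedEdges) { // the single permitted pass over the stream
            buffered.add(edge);
        }
        return buffered;
    }

    /** Hypothetical stand-in for a streamed edge list: iterator() may be called only once. */
    public static <E> Iterable<E> oneShot(List<E> edges) {
        Iterator<E> it = edges.iterator();
        boolean[] used = {false};
        return () -> {
            if (used[0]) {
                throw new IllegalStateException("edges can only be traversed once");
            }
            used[0] = true;
            return it;
        };
    }

    /** Rank share each neighbour receives: rank divided by out-degree. */
    public static double share(double rank, int outDegree) {
        return rank / outDegree;
    }

    public static void main(String[] args) {
        Iterable<String> edges = oneShot(Arrays.asList("b", "c", "d", "e"));
        List<String> buffered = buffer(edges);           // first and only pass over the stream
        double msg = share(1.0, buffered.size());        // degree is now known
        for (String target : buffered) {                 // re-reading the buffer is fine
            System.out.println(target + " <- " + msg);
        }
    }
}
```

The buffer holds the whole edge list of one vertex at a time, so memory use is bounded by the largest out-degree in the graph.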

Martin Neumann

Jun 2, 2014, 6:18:08 AM
to stratosp...@googlegroups.com
I guess that will depend on the graph structure. A power-law graph (e.g. the internet web graph) might be a problem if you have to hold one of the supernodes in memory.

I will go with buffering for now; if I run into problems, I will store the degree as the node value.
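
For the fallback of storing the degree with the node, a minimal standalone sketch could look like the following. The `RankAndDegree` class and the `outDegrees` helper are made-up names for illustration and are not Spargel types; the assumption is that the out-degrees are computed once from the edge list before the iteration starts and then carried in the vertex value:

```java
import java.util.HashMap;
import java.util.Map;

public class DegreeAsValue {

    /** Hypothetical vertex value: the current rank plus the precomputed out-degree. */
    public static final class RankAndDegree {
        final double rank;
        final int outDegree;

        public RankAndDegree(double rank, int outDegree) {
            this.rank = rank;
            this.outDegree = outDegree;
        }

        /** Value sent to each neighbour; no edge counting needed at send time. */
        public double messageValue() {
            return rank / outDegree;
        }
    }

    /** Counts outgoing edges per source vertex in one pass over (source, target) pairs. */
    public static Map<String, Integer> outDegrees(String[][] edges) {
        Map<String, Integer> degrees = new HashMap<>();
        for (String[] edge : edges) {
            degrees.merge(edge[0], 1, Integer::sum);
        }
        return degrees;
    }

    public static void main(String[] args) {
        String[][] edges = {{"a", "b"}, {"a", "c"}, {"b", "c"}};
        Map<String, Integer> degrees = outDegrees(edges);
        RankAndDegree a = new RankAndDegree(1.0, degrees.get("a"));
        System.out.println("a sends " + a.messageValue() + " to each neighbour");
    }
}
```

With the degree in the vertex value, the edges only need to be traversed once per superstep, so supernodes never have to be buffered.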


--
You received this message because you are subscribed to a topic in the Google Groups "stratosphere-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stratosphere-dev/Ct3fbS4az_o/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stratosphere-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/stratosphere-dev.
For more options, visit https://groups.google.com/d/optout.

Fabian Hueske

Jun 2, 2014, 8:29:34 AM
to stratosp...@googlegroups.com
Yep, for nodes with very many connections you might run out of memory.

An alternative solution could be to fill a fixed-size buffer and write it to disk each time it fills up. That way most processing happens in memory; the heavy hitters might go to disk, but your program would not die from lack of memory. It's not an ideal solution either, because you have to write the code yourself, but it is roughly what the engine would have to do under the hood if a complete group doesn't fit into memory.
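
To make the spilling idea a bit more concrete, here is a small standalone sketch. `SpillingEdgeBuffer` is a hypothetical helper, not something Spargel provides, and it assumes edge targets are plain strings. It keeps at most `capacity` targets in memory, appends full buffers to a temp file, and can replay everything in insertion order:

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class SpillingEdgeBuffer implements AutoCloseable {
    private final int capacity;
    private final List<String> inMemory = new ArrayList<>();
    private Path spillFile;             // created lazily on the first spill
    private DataOutputStream spillOut;
    private long spilled;               // number of targets already written to disk

    public SpillingEdgeBuffer(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Adds a target; a full in-memory buffer is written to disk first. */
    public void add(String target) {
        if (inMemory.size() >= capacity) spill();
        inMemory.add(target);
    }

    private void spill() {
        try {
            if (spillOut == null) {
                spillFile = Files.createTempFile("edges", ".spill");
                spillOut = new DataOutputStream(
                        new BufferedOutputStream(Files.newOutputStream(spillFile)));
            }
            for (String t : inMemory) spillOut.writeUTF(t);
            spilled += inMemory.size();
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Total number of targets added, i.e. the degree. */
    public long size() { return spilled + inMemory.size(); }

    public boolean hasSpilled() { return spilled > 0; }

    /** Replays all targets in insertion order: spilled ones first, then the in-memory rest. */
    public void forEach(Consumer<String> action) {
        if (spillOut != null) {
            try {
                spillOut.flush();
                try (DataInputStream in = new DataInputStream(
                        new BufferedInputStream(Files.newInputStream(spillFile)))) {
                    for (long i = 0; i < spilled; i++) action.accept(in.readUTF());
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        for (String t : inMemory) action.accept(t);
    }

    @Override
    public void close() {
        try {
            if (spillOut != null) {
                spillOut.close();
                Files.deleteIfExists(spillFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (SpillingEdgeBuffer buf = new SpillingEdgeBuffer(2)) {
            for (String t : new String[] {"a", "b", "c", "d", "e"}) buf.add(t);
            double msg = 1.0 / buf.size();   // rank 1.0 divided by the degree
            buf.forEach(t -> System.out.println(t + " <- " + msg));
        }
    }
}
```

Vertices whose degree stays below `capacity` never touch the disk, so only the heavy hitters pay the I/O cost.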

