Usage of the EdgeTypeLoader in the API

oj

Jan 20, 2015, 6:51:10 PM
to spar...@googlegroups.com
Hi, 

I have successfully added 'User' nodes to a graph using the CSVReader and NodeTypeLoader classes.

Now I want to add edges using the EdgeTypeLoader class. Its usage is not very clear.

If I have a simple edges.csv file in the format someUserId, anotherUserId, where each row represents a friendship between the two users, how do I construct the EdgeTypeLoader object?

    int friendType = graph.newRestrictedEdgeType("FRIENDOF", UserTypeId, UserTypeId, false);
    CSVReader csv = new CSVReader();
    csv.setSeparator(",");
    csv.setStartLine(1);
    csv.open("edges.csv");

    AttributeList attrs = new AttributeList();
    Int32List attrPos = new Int32List();
    attrs.add(graph.findAttribute(friendType, "to"));
    attrPos.add(1);
    attrs.add(graph.findAttribute(friendType, "from"));
    attrPos.add(2);

    EdgeTypeLoader etl = new EdgeTypeLoader(csv, graph, friendType, attrs, attrPos, ?, ?, ?, ?);

I've tried several options but they seem to either throw an exception or throw an Invalid Attribute error.

Thanks!


c3po.ac

Jan 21, 2015, 5:29:00 AM
to spar...@googlegroups.com

Hi,

In your example, let's assume that the columns of "edges.csv" are these:

someUserId, anotherUserId, AnEdgeAttribute, AnotherEdgeAttribute, OneMoreEdgeAttribute



The EdgeTypeLoader declaration is like this:

EdgeTypeLoader(RowReader rowReader, Graph graph, int type, AttributeList attrs, Int32List attrsPos, int hPos, int tPos, int hAttr, int tAttr)


In your example:

  • type: The type of the edge to create (friendType in your code).
  • attrs: A list containing the attribute type of each EDGE attribute that you want to load (the attribute types of AnEdgeAttribute, AnotherEdgeAttribute and OneMoreEdgeAttribute).
  • attrsPos: A list where each item is the zero-based column index of the corresponding attribute in the "attrs" list (in the example: 2, 3, 4).
  • hPos: The index of the CSV column containing the node that should be the "head" of the new edge (in the example: 1, assuming the edge goes from someUserId to anotherUserId).
  • tPos: The index of the CSV column containing the node that should be the "tail" of the new edge (in the example: 0, assuming the edge goes from someUserId to anotherUserId).
  • hAttr: The attribute type of the NODE attribute whose values appear in the hPos column (in the example, the attribute type of anotherUserId).
  • tAttr: The attribute type of the NODE attribute whose values appear in the tPos column (in the example, the attribute type of someUserId).

If you don't have edge attributes, you can pass empty lists for attrs and attrsPos.
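
To make the position arguments concrete, here is a small plain-Java illustration of how hPos, tPos and attrsPos pick fields out of one CSV row. It does not call Sparksee at all: the class EdgeRowMapping, its nested EdgeRow type and the map method are made up for this sketch, and it assumes zero-based column indices as in the example above.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration only; none of these names belong to the Sparksee API. */
final class EdgeRowMapping {

    /** What a single CSV row contributes to one edge. */
    static final class EdgeRow {
        final String tailKey;          // value used to look up the tail (source) node
        final String headKey;          // value used to look up the head (destination) node
        final List<String> attrValues; // values for the edge attributes, in "attrs" order

        EdgeRow(String tailKey, String headKey, List<String> attrValues) {
            this.tailKey = tailKey;
            this.headKey = headKey;
            this.attrValues = attrValues;
        }
    }

    /** Selects the fields of one row using zero-based column positions. */
    static EdgeRow map(String[] row, int hPos, int tPos, int[] attrsPos) {
        List<String> values = new ArrayList<>();
        for (int pos : attrsPos) {
            values.add(row[pos].trim());
        }
        return new EdgeRow(row[tPos].trim(), row[hPos].trim(), values);
    }

    public static void main(String[] args) {
        // Columns: someUserId, anotherUserId, then three edge attributes.
        String[] row = "u1,u2,a,b,c".split(",");
        EdgeRow edge = map(row, 1, 0, new int[] {2, 3, 4});
        System.out.println(edge.tailKey + " -> " + edge.headKey + " " + edge.attrValues);
    }
}
```

With no edge attributes, attrsPos would simply be an empty array here, which mirrors passing empty lists to the loader.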

Best regards.


On Wednesday, January 21, 2015 at 0:51:10 UTC+1, oj wrote:

frank....@gmail.com

Feb 15, 2016, 6:52:18 AM
to Sparksee
Hi,
Based on your explanation and the API chapter of the User Manual (http://sparsity-technologies.com/UserManual/API.html#api),
I wrote the following class instantiation, which fails to compile:
EdgeTypeLoader *etl = new EdgeTypeLoader(relsCsv, *graph, relType, relAttrs, relAttrPos, 1, 0, type_t, type_t);


g++ -O3 -std=c++11 -I ../includes/sparksee CSVLoader.cpp -oCSVLoader -L ../lib/linux64 -lsparksee
CSVLoader.cpp: In function ‘int main(int, char**)’:
CSVLoader.cpp:83:104: error: expected primary-expression before ‘,’ token
   EdgeTypeLoader *etl = new EdgeTypeLoader(relsCsv, *graph, relType, relAttrs, relAttrPos, 1, 0, type_t, type_t);
                                                                                                        ^
CSVLoader.cpp:83:112: error: expected primary-expression before ‘)’ token
   EdgeTypeLoader *etl = new EdgeTypeLoader(relsCsv, *graph, relType, relAttrs, relAttrPos, 1, 0, type_t, type_t);
                                                                                                                ^

Please see the attached edgesToLoad.csv, nodesToLoad.csv and CSVLoader.cpp.

How do I construct etl correctly?
Frank
nodesToLoad.csv
CSVLoader.cpp
edgesToLoad.csv

c3po.ac

Feb 15, 2016, 8:48:15 AM
to Sparksee

Hi,

Before loading the nodes, all the attributes to be loaded must be created.

In your edge-loading sample code, the file name contains a space that you probably don't have in the real file name.

The "Friend" edge type name in the edges CSV should never be used, because the type of the edge is already specified in the EdgeTypeLoader constructor.

The relAttrs and relAttrPos arguments should contain the ids of the edge attributes that you want to load (strength and distance) and their column positions in the CSV file (3 and 4).

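The last two arguments of the EdgeTypeLoader constructor are the attribute types of the NODE attributes used to find the head and tail nodes. For each row, the loader looks up those nodes using the content of the columns hPos and tPos (the previous two arguments: 1 and 0 in your sample). They must therefore be attribute identifiers; passing the type name type_t there is what makes the compiler reject the line.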

I've attached a slightly modified version of your code.

Best regards


On Monday, February 15, 2016 at 12:52:18 UTC+1, frank....@gmail.com wrote:
CSVLoader.cpp

frank....@gmail.com

Feb 15, 2016, 9:24:05 AM
to Sparksee
Thank you very much for your kind explanation.
One last point to clarify.
You say the "Friend" edge type name in the edges CSV should never be used, because the type of the edge is specified in the EdgeTypeLoader constructor.
Does that mean edgesToLoad.csv has to look like this:
FirstUserID; SecondUserID; Relationship; Strength; Distance
1; 2; Rel; Strong; Close
2; 3; Rel; Weak; Distant
3; 4; Rel; Strong; Distant
1; 4; Rel; Weak; Distant

or like this (without the Relationship column)?
FirstUserID; SecondUserID; Strength; Distance
1; 2; Strong; Close
2; 3; Weak; Distant
3; 4; Strong; Distant
1; 4; Weak; Distant

Frank

frank....@gmail.com

Feb 15, 2016, 9:52:32 AM
to Sparksee
Regarding the point that the "Friend" edge type name in the edges CSV should never be used because the type of the edge is specified in the EdgeTypeLoader constructor, a major question remains: how do I deal with multiple types of relationship among nodes?

Example:
edgesToLoad2.csv :

FirstUserID; SecondUserID; Relationship; Strength; Distance
1; 2; Friend; Strong; Close
2; 3; Relative; Weak; Distant
3; 4; Spouse; Strong; Close
1; 4; Friend; Weak; Distant

For instance, if the CSV file contains three different types of edges, as above, how can we specify all of them in the EdgeTypeLoader constructor? More importantly, how can we assign each edge type to the right node pairs if we cannot take this information from the CSV file?

Frank

c3po.ac

Feb 15, 2016, 10:16:12 AM
to Sparksee
 
Hi,

The EdgeTypeLoader is a class built to load the edges of one specific edge type from a CSV file. That's the most common usage and the most efficient one if you have big CSV files.
You can skip columns (like the "friends" column in the edited sample code), but you can't skip rows or insert edges of different types.

If your CSV file contains information for different edge types, you can split it into multiple CSV files (one for each edge type) before loading them.
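
As a rough sketch of that splitting step, the following plain-Java program groups the rows of an edges file by the value of one column and can write one file per edge type. Nothing here is Sparksee API: the class name EdgeCsvSplitter, the output naming edges_<type>.csv and the choice of column 2 as the type column are assumptions based on the sample file earlier in the thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper; splits an edges CSV into one group of lines per edge type. */
final class EdgeCsvSplitter {

    /** Groups data rows by the value in typeColumn; every group starts with the header row. */
    static Map<String, List<String>> splitByType(List<String> lines, String separator, int typeColumn) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return groups;
        }
        String header = lines.get(0);
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] fields = line.split(Pattern.quote(separator), -1);
            if (typeColumn >= fields.length) {
                throw new IllegalArgumentException("Row has no type column: " + line);
            }
            String type = fields[typeColumn].trim();
            List<String> group = groups.get(type);
            if (group == null) {
                group = new ArrayList<>();
                group.add(header); // keep the header so a start line of 1 still applies
                groups.put(type, group);
            }
            group.add(line);
        }
        return groups;
    }

    /** Writes each group to dir/edges_<type>.csv (the naming scheme is an assumption). */
    static void writeGroups(Map<String, List<String>> groups, Path dir) throws IOException {
        for (Map.Entry<String, List<String>> entry : groups.entrySet()) {
            Files.write(dir.resolve("edges_" + entry.getKey() + ".csv"), entry.getValue());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Arrays.asList(
            "FirstUserID; SecondUserID; Relationship; Strength; Distance",
            "1; 2; Friend; Strong; Close",
            "2; 3; Relative; Weak; Distant",
            "3; 4; Spouse; Strong; Close",
            "1; 4; Friend; Weak; Distant");
        Map<String, List<String>> groups = splitByType(lines, ";", 2);
        for (Map.Entry<String, List<String>> entry : groups.entrySet()) {
            System.out.println(entry.getKey() + ": " + (entry.getValue().size() - 1) + " edge row(s)");
        }
        if (args.length > 0) {
            writeGroups(groups, Paths.get(args[0])); // optional: output directory as first argument
        }
    }
}
```

Each output file still contains the Relationship column; since columns can be skipped, it can simply be left out of the attribute positions when the file is loaded with a loader for that edge type.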

If you don't want to split the input files externally, then your best option is to build your own custom edge loader. You can use the same CSVReader (or read the file with any other method) and, for each row, create the edge and set its attribute values. You will need to write more code, but it will be the best way to load your CSV file.
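
The overall shape of such a custom loader might look like the sketch below. It is only an outline in plain Java: the graph operations are hidden behind a hypothetical EdgeSink callback, because this thread does not show the Sparksee calls for finding nodes, creating edges and setting attributes. A real implementation of the callback would perform those steps through the graph API. The column layout (tail, head, edge type, then attributes) is assumed from the edgesToLoad2.csv sample.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical row-by-row loader; the edge type is read from each row. */
final class CustomEdgeLoader {

    /** Hypothetical callback; a real implementation would wrap the graph calls. */
    interface EdgeSink {
        void addEdge(String edgeType, String tailId, String headId, Map<String, String> attributes);
    }

    /** Reads header plus data rows and reports one edge per row; returns the number of edges. */
    static int load(List<String> lines, String separator, EdgeSink sink) {
        if (lines.isEmpty()) {
            return 0;
        }
        String[] header = split(lines.get(0), separator);
        int loaded = 0;
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] fields = split(line, separator);
            if (fields.length != header.length || fields.length < 3) {
                throw new IllegalArgumentException("Malformed row: " + line);
            }
            // Columns 3 and up are edge attributes, named after the header.
            Map<String, String> attributes = new LinkedHashMap<>();
            for (int i = 3; i < fields.length; i++) {
                attributes.put(header[i], fields[i]);
            }
            // Column 0 = tail (source), column 1 = head (destination), column 2 = edge type.
            sink.addEdge(fields[2], fields[0], fields[1], attributes);
            loaded++;
        }
        return loaded;
    }

    private static String[] split(String line, String separator) {
        String[] parts = line.split(Pattern.quote(separator), -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "FirstUserID; SecondUserID; Relationship; Strength; Distance",
            "1; 2; Friend; Strong; Close",
            "2; 3; Relative; Weak; Distant");
        int count = load(lines, ";", (type, tail, head, attrs) ->
            System.out.println(type + ": " + tail + " -> " + head + " " + attrs));
        System.out.println(count + " edge(s) processed");
    }
}
```

Keeping the graph calls behind one small interface also makes the parsing logic easy to check on its own before wiring it to a real database.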

Best regards


On Monday, February 15, 2016 at 15:52:32 UTC+1, frank....@gmail.com wrote: