Adding edges to existing vertices

Adhitya Rajagopalan

Jul 22, 2016, 2:37:58 AM
to Aurelius
Hi,

I have a Titan 1.0 graph of around 120K vertices with 30 properties each (running with the Cassandra storage backend and Elasticsearch for indexing). Now I need to create edges between the vertices based on the properties of the vertices.
E.g., the vertices have a property called level, which has values like 1.1, 1.2, 1.3, ... and 1.1.1, 1.1.2, 1.2.1, 1.2.2, ...
I have to create edges between {1.1 and 1.1.1}, {1.1 and 1.1.2}, {1.2 and 1.2.1}, {1.2 and 1.2.2}, ...

How do I achieve this?
Thanks in advance

Yours,
Adhitya 

Daniel Kuppitz

Jul 23, 2016, 10:29:15 AM
to aureliu...@googlegroups.com
Given this sample graph:

g.addV(id, "1").
  addV(id, "2").
  addV(id, "2.1").
  addV(id, "1.1").
  addV(id, "1.2").
  addV(id, "1.1.1")

...the following query worked pretty well:

gremlin> g.V().as("a").V().as("b").filter(
             select("a","b").by(id).
               where("b",gt("a")).
               filter {
                 ids = it.get()
                 ids["a"].length() < ids["b"].length() &&
                 ids["b"].substring(0, ids["a"].length()).equals(ids["a"]) &&
                 ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/
               }).addE("link").from("a").to("b")
==>e[0][1-link->1.1]
==>e[1][1-link->1.2]
==>e[2][2-link->2.1]
==>e[3][1.1-link->1.1.1]


Of course, you can just use the level property instead of the id. By the way, I don't think it can be done without lambdas.

Cheers,
Daniel


--
You received this message because you are subscribed to the Google Groups "Aurelius" group.
To unsubscribe from this group and stop receiving emails from it, send an email to aureliusgraph...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/aureliusgraphs/cf28c523-fe1c-4609-be37-5e1d3a4a1f51%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Adhitya Rajagopalan

Jul 25, 2016, 5:09:58 AM
to Aurelius
This is the code that I have right now:
conf = new BaseConfiguration()

conf.setProperty("gremlin.graph","com.thinkaurelius.titan.core.TitanFactory")
conf.setProperty("storage.backend", "cassandra")
conf.setProperty("storage.hostname", "127.0.0.1")

conf.setProperty("storage.batch-loading", true)
conf.setProperty("storage.cassandra.keyspace", "graph")
graph = GraphFactory.open(conf)
g = graph.traversal()
m = graph.openManagement()
PropertyKey pk1= m.makePropertyKey("pk1").dataType(String.class).make();
PropertyKey pk2= m.makePropertyKey("pk2").dataType(String.class).make();
PropertyKey pk3= m.makePropertyKey("pk3").dataType(String.class).make();
PropertyKey pk4= m.makePropertyKey("pk4").dataType(String.class).make();
PropertyKey pk5= m.makePropertyKey("pk5").dataType(String.class).make();
PropertyKey CFGACH_LEVEL_CODE= m.makePropertyKey("CFGACH_LEVEL_CODE").dataType(String.class).make();
PropertyKey pk7= m.makePropertyKey("pk7").dataType(String.class).make();
PropertyKey pk8= m.makePropertyKey("pk8").dataType(String.class).make();
PropertyKey pk9= m.makePropertyKey("pk9").dataType(String.class).make();
PropertyKey pk10= m.makePropertyKey("pk10").dataType(String.class).make();
PropertyKey pk11= m.makePropertyKey("pk11").dataType(String.class).make();
m.commit()
batchSize = 10000
counter = new java.util.concurrent.atomic.AtomicLong()
cache = [:]
mutate = { ->
  if (0 == counter.incrementAndGet() % batchSize) {
    graph.tx().commit()
  }
}
new File("C:/Users/PJT755/data.csv").eachLine { def line ->
  def (id1, id2, id3, id4, id5, id6, id7, id8, id9, id10, id11) = line.split(",", 11)
  v = graph.addVertex("CFGACH_LEVEL_CODE", id6, "pk1", id1, "pk2", id2, "pk3", id3, "pk4", id4, "pk5", id5, "pk7", id7, "pk8", id8, "pk9", id9, "pk10", id10, "pk11", id11)
  mutate()
};
graph.tx().commit();

g.V().as("a").V().as("b").filter(
             select("a","b").by(CFGACH_LEVEL_CODE).
               where("b",gt("a")).
               filter {
                 ids = it.get()
                 ids["a"].length() < ids["b"].length() &&
                 ids["b"].substring(0, ids["a"].length()).equals(ids["a"]) &&
                 ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/
               }).addE("link").from("a").to("b")

I am using Titan 1.0.0 with storage backend as Cassandra.

Adhitya Rajagopalan

Jul 25, 2016, 5:09:58 AM
to Aurelius
Thanks a lot for your reply. When I run the code that you had posted, I get the following error:

No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.V() is applicable for argument types: () values: []
Possible solutions: is(java.lang.Object), by(groovy.lang.Closure), any(), min(), max(), sum()
Display stack trace? [yN]


How do I overcome this problem?



Daniel Kuppitz

Jul 25, 2016, 5:15:07 AM
to aureliu...@googlegroups.com
Ah, I always run into this trap. Titan doesn't support mid-traversal V()'s (yet). You can try this instead:

g.V().as("a").aggregate("x").select("x").unfold().as("b").filter(
  ...

Cheers,
Daniel



Adhitya Rajagopalan

Jul 25, 2016, 5:46:22 AM
to Aurelius
Even after changing the code to:
g.V().as("a").aggregate("x").select("x").unfold().as("b").filter(
             select("a","b").by("CFGACH_LEVEL_CODE").
               where("b",gt("a")).
               filter {
                 ids = it.get()
                 ids["a"].length() < ids["b"].length() &&
                 ids["b"].substring(0, ids["a"].length()).equals(ids["a"]) &&
                 ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/
               }).addE("link").from("a").to("b")

I get the same error. Should the property key be in quotes?

Adhitya Rajagopalan

Jul 25, 2016, 5:46:22 AM
to Aurelius
Sorry. This is the new error I got

No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.addE() is applicable for argument types: (java.lang.String) values: [link]



Daniel Kuppitz

Jul 25, 2016, 6:00:43 AM
to aureliu...@googlegroups.com
This should work in 3.0.x:

.addOutE("a", "link", "b")

...instead of  .addE("link").from("a").to("b") (just in case it wasn't obvious).


Cheers,
Daniel


f201...@dubai.bits-pilani.ac.in

Jul 25, 2016, 6:35:51 AM
to Aurelius
Hello Daniel,
How would the code change if the edges have to be created based on multiple vertex properties?
Also, please explain what this code does (right of the ==):
ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/

Yours
Shaswat

Daniel Kuppitz

Jul 25, 2016, 8:26:05 AM
to aureliu...@googlegroups.com
> How would the code change if the edges have to be created based on multiple vertex properties?

You mean for the by() part? You can use whatever you want, e.g. values("prop1","prop2") or valueMap("prop1","prop2"), whichever works best for you.

> Also, please explain what this code does (right of the ==):

It cuts the first property value off the front of the second, and then checks that the remainder starts with a period and contains no further periods.

For example:

First value:             AAA.BBB
Second value:            AAA.BBB.CCC
Remainder after the cut:        .CCC
Matches '\.[^.]*':       Yes

First value:              AAA.BBB
Second value:             AAA.BBB.CCC.DDD
Remainder after the cut:         .CCC.DDD
Matches '\.[^.]*':        No
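
The same check can be tried outside a traversal. The following is only a minimal sketch in plain Groovy; the closure name isDirectChild is made up for illustration and does not appear in the query, and startsWith stands in for the substring/equals comparison used there.

```groovy
// Hypothetical helper (not part of the original query): true when `child`
// is `parent` plus exactly one more ".segment".
def isDirectChild = { String parent, String child ->
    parent.length() < child.length() &&
    child.startsWith(parent) &&
    child.substring(parent.length()) ==~ /\.[^.]*/
}

// The two cases from the explanation above.
assert isDirectChild("AAA.BBB", "AAA.BBB.CCC")
assert !isDirectChild("AAA.BBB", "AAA.BBB.CCC.DDD")

// A shared prefix alone is not enough: the remainder "0" does not start with a period.
assert !isDirectChild("1.1", "1.10")
```

Note that ==~ requires the whole remainder to match the pattern, which is why a second period in the remainder makes the check fail.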

Cheers,
Daniel



SHASWAT ANANDHAN

Jul 26, 2016, 5:20:04 AM
to Aurelius
Hi Daniel,
Thank you for your explanation.
By multiple parameters I mean the following:
In addition to the level_code, if I have another vertex property key, say prop_key, edges should be established only if 'a.prop_key=b.prop_key' holds in addition to the level_code hierarchy above.

eg.   Suppose we have,
        Vertex1={(prop_key=2),......,(level_code=1.1),..}
        Vertex2={(prop_key=3),......,(level_code=1.1),..}
        Vertex3={(prop_key=2),......,(level_code=1.1.1),..}
        Vertex4={(prop_key=2),......,(level_code=1.1.2),..}
        Vertex5={(prop_key=2),......,(level_code=1.1.3),..}
        Vertex6={(prop_key=3),......,(level_code=1.1),..}





SHASWAT ANANDHAN

Jul 26, 2016, 5:20:04 AM
to Aurelius
In continuation to the previous reply, for the vertex set in the example,
            edges must be created between 
            {Vertex1,Vertex3} and {Vertex1,Vertex4} and {Vertex1,Vertex5}..... 

There should be NO edge between {Vertex2,Vertex3} etc.
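
As a rough illustration of the expected result, here is a sketch in plain Groovy that applies both conditions (equal prop_key, and level_code one level deeper) to the six sample vertices. Plain maps stand in for vertices, and the names vertices, isDirectChild and links are made up for this sketch; it is not Titan or Gremlin code.

```groovy
// Sample data copied from the example; plain maps stand in for vertices.
def vertices = [
    [name: "Vertex1", prop_key: "2", level_code: "1.1"],
    [name: "Vertex2", prop_key: "3", level_code: "1.1"],
    [name: "Vertex3", prop_key: "2", level_code: "1.1.1"],
    [name: "Vertex4", prop_key: "2", level_code: "1.1.2"],
    [name: "Vertex5", prop_key: "2", level_code: "1.1.3"],
    [name: "Vertex6", prop_key: "3", level_code: "1.1"]
]

// Hierarchy test: b's code is a's code plus exactly one more ".segment".
def isDirectChild = { String a, String b ->
    a.length() < b.length() && b.startsWith(a) && b.substring(a.length()) ==~ /\.[^.]*/
}

// Try every ordered pair (a, b) and keep those that pass both conditions.
def links = []
for (a in vertices) {
    for (b in vertices) {
        if (a.prop_key == b.prop_key && isDirectChild(a.level_code, b.level_code)) {
            links << "${a.name}->${b.name}".toString()
        }
    }
}

links.each { println it }
```

For this data only Vertex1 ends up with outgoing links (to Vertex3, Vertex4 and Vertex5); Vertex2 and Vertex6 share the parent level_code but have a different prop_key.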


Daniel Kuppitz

Jul 26, 2016, 5:36:00 AM
to aureliu...@googlegroups.com
You can just use another filter step:

g.V().as("a").aggregate("x").select("x").unfold().as("b").
  filter(select("a","b").by("prop_key").where("a", eq("b"))).
  filter(select("a","b").by("level_code").
           where("b",gt("a")).
           filter {
             ids = it.get()
             ids["a"].length() < ids["b"].length() &&
             ids["b"].substring(0, ids["a"].length()).equals(ids["a"]) &&
             ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/
           }).
  addOutE("a", "link", "b")

Cheers,
Daniel



SHASWAT ANANDHAN

Jul 26, 2016, 8:26:42 AM
to Aurelius
When I use this method, exactly 101 edges get created. I am sure that more than 10K edges should be created. 

Daniel Kuppitz

Jul 26, 2016, 8:29:48 AM
to aureliu...@googlegroups.com
Can you show the properties of 2 vertices that were not properly linked?

Cheers,
Daniel


SHASWAT ANANDHAN

Jul 29, 2016, 3:17:54 AM
to Aurelius
Dear Daniel,
These are the properties of 2 vertices between which an edge should have been created, but I find that no edge has been created between them.
gremlin> g.V(41087024).valueMap()
==>[CDI:[NULL], CPC:[AGB], CAS:[A], CNI:[ENG-208], CPO:[8], CR:[0], CDP:[NULL], CCI:[ENG-218], PART_NO:[N515A], CPS:[A], CSN:[N515A-3019], CPPN:[340-046-60-20], CN:[340-046-50-40], CNN:[0], CAN:[7200], CCC:[%], CFD:[2012-02-07 00:00:00.000], CZ:[210], CAO:[8], CPT:[O], COE:[8], CFGACH_LEVEL_CODE:[1.46.3.1], CLN:[3], CARF:[N], CRMF:[Y]]


gremlin> g.V(83017848).valueMap()
==>[CDI:[NULL], CPC:[AGBA], CAS:[A], CNI:[ENG-206], CPO:[8], CR:[0], CDP:[NULL], CCI:[ENG-208], PART_NO:[N515A], CPS:[A], CSN:[N515A-2020], CPPN:[340-046-50-40], CN:[CFM56-5B], CNN:[0], CAN:[7200], CCC:[%], CFD:[2012-02-07 00:00:00.000], CZ:[210], CAO:[8], CPT:[O], COE:[8], CFGACH_LEVEL_CODE:[1.46.3], CLN:[2], CARF:[N], CRMF:[N]]
  
Cheers
Shaswat

Daniel Kuppitz

Jul 29, 2016, 3:31:12 AM
to aureliu...@googlegroups.com
Hmm, that's odd, they clearly match the pattern. Are you sure that the transaction didn't fail? Maybe just do a count and see how many matches you get. If the number is higher than the number of edges that were created, we need to think about another strategy: maybe write the id pairs into a file, then process them afterwards, committing after every X new edges.

The count-query:

g.V().as("a").aggregate("x").select("x").unfold().as("b").
  filter(select("a","b").by("PART_NO").where("a", eq("b"))).
  filter(select("a","b").by("CFGACH_LEVEL_CODE").
           where("b", gt("a")).
           filter {
             ids = it.get()
             ids["a"].length() < ids["b"].length() &&
             ids["b"].substring(0, ids["a"].length()).equals(ids["a"]) &&
             ids["b"].substring(ids["a"].length()) ==~ /\.[^.]*/
           }).count()
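
A possible shape for the "collect the id pairs first, then process them in batches" idea, as a sketch only and in plain Groovy without a graph. It swaps the pairwise comparison for a lookup: each vertex is indexed by (PART_NO, level code), and the parent code is derived by dropping the last ".segment". The first two rows are taken from the valueMap() output earlier in the thread, the third row is invented, and the names rows, byKey, parentCode, pairs and the batch size are all assumptions.

```groovy
// Rows of id, PART_NO and CFGACH_LEVEL_CODE. The first two come from the
// valueMap() output in the thread; the third is a made-up row with another PART_NO.
def rows = [
    [id: 83017848L, partNo: "N515A", level: "1.46.3"],
    [id: 41087024L, partNo: "N515A", level: "1.46.3.1"],
    [id: 1L,        partNo: "OTHER", level: "1.46.3.2"]
]

// Index the rows by (PART_NO, level code) so a parent can be looked up directly.
def byKey = rows.groupBy { [it.partNo, it.level] }

// The parent code is the child's code without its last ".segment";
// a code without any period has no parent.
def parentCode = { String code ->
    int dot = code.lastIndexOf('.')
    dot < 0 ? null : code.substring(0, dot)
}

// Collect [parentId, childId] pairs.
def pairs = []
rows.each { child ->
    def parent = parentCode(child.level)
    if (parent != null) {
        (byKey.get([child.partNo, parent]) ?: []).each { p -> pairs << [p.id, child.id] }
    }
}

// Walk the pairs in fixed-size batches. With a real graph, each batch would
// add its edges and then call graph.tx().commit().
int batchSize = 1000   // assumed value, tune as needed
pairs.collate(batchSize).each { batch ->
    batch.each { pair -> println "link ${pair[0]} -> ${pair[1]}" }
}
```

The pairs could equally be written to a file between the two phases; the point of the split is that finding the matches and creating the edges no longer have to share one long-running transaction.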

Cheers,
Daniel

