Insert many nodes and ignore if already existing

Edoardo Barp

Jul 7, 2021, 3:06:50 PM
to Gremlin-users

Hello everyone,

I'm having trouble with a bulk insert that should skip the insert step for any node that already exists. My current solution doesn't check whether the node exists and always inserts. It looks like this:

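# Suppose objects is a list of dictionaries of shape [{"foo": "value", "bar": "value", ..}, ..]
from gremlin_python import statics
from gremlin_python.process.traversal import Column
statics.load_statics(globals())  # makes select, unfold, addV, ... available as bare names

(g.inject(objects).unfold().as_("objects")
    # insert foo and copy every key/value pair of the object onto it
    .addV("foo").as_("f_vertex")
    .sideEffect(
        select("objects").unfold().as_("kv")
        .select("f_vertex").property(select("kv").by(Column.keys), select("kv").by(Column.values))
    )
    # insert bar and link it to the foo vertex
    .addV("bar").as_("b_vertex")
    .sideEffect(
        select("objects").select("foo").as_("foos")
        .select("b_vertex").property("value", select("foos").unfold())
        .addE("test").from_(select("f_vertex")).to(select("b_vertex"))
    ).next())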

This works flawlessly. I'm now trying to make it so that the bar insert is conditional on the non-existence of the node. My current attempt looks like this:

(g.inject(objects).unfold().as_("objects")
    .addV("foo").as_("f_vertex")
    .sideEffect(
        select("objects").unfold().as_("kv")
        .select("f_vertex").property(select("kv").by(Column.keys), select("kv").by(Column.values))
    )
    .sideEffect(
        select("objects").select("bar").unfold().as_("bars")
        .V().has("bar", "value", select("bars").unfold())
        .coalesce(
            unfold(),
            addV("bar").property("value", select("bars").unfold())
        )
    ).next())

But then no bar nodes are added at all. I've tried several other similar combinations (pulling select outside of the sideEffect, playing with fold/unfold), but I can't seem to get it right.

I would be very grateful if someone could explain what's wrong and point me in the right direction!

Best

Tanner Sorensen

Jul 8, 2021, 10:49:00 AM
to Gremlin-users
I think the pattern you are looking for is the following:

g.V().has("person", "name", "marko").as("v0").fold().coalesce(unfold(), addV("bar")).iterate()

Example in Gremlin Console with the Modern graph:

gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = traversal().withEmbedded(graph)
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().has("person", "name", "marko").as("v0").fold().coalesce(unfold(), addV("bar")).iterate()
gremlin> g.V()
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
gremlin> g.V().has("person", "name", "janice").as("v0").fold().coalesce(unfold(), addV("bar")).iterate()
gremlin> g.V()
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
==>v[13]
gremlin> g.V(13).label()
==>bar
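
To see why the fold() matters in this pattern, here is a toy model in plain Python. It is not TinkerPop code: coalesce, fold and get_or_create are hypothetical stand-ins that only mimic how the steps treat a stream of traversers. coalesce() runs once per incoming traverser, so when has() matches nothing there is no traverser left and neither branch runs. fold() always emits exactly one traverser (an empty list when nothing matched), which gives the add branch something to fire on.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Vertex = Dict[str, str]


def coalesce(stream: Iterable, *branches: Callable[[object], Iterable]) -> Iterator:
    """For each incoming traverser, emit the results of the first branch that yields anything."""
    for traverser in stream:
        for branch in branches:
            results = list(branch(traverser))
            if results:
                yield from results
                break


def fold(stream: Iterable) -> Iterator[list]:
    """Collapse the stream into exactly one traverser holding a (possibly empty) list."""
    yield list(stream)


def get_or_create(graph: List[Vertex], name: str, use_fold: bool) -> List[Vertex]:
    # Stand-in for V().has("name", name): may be empty.
    matches = [v for v in graph if v["name"] == name]

    def add_v(_traverser: object) -> List[Vertex]:
        # Stand-in for addV("bar"); the name is stored so later lookups can find it.
        vertex = {"label": "bar", "name": name}
        graph.append(vertex)
        return [vertex]

    if use_fold:
        # The first branch plays the role of unfold(): re-emit the folded matches.
        return list(coalesce(fold(matches), lambda found: found, add_v))
    # Without fold(): an empty match list means coalesce never sees a traverser.
    return list(coalesce(matches, lambda v: [v], add_v))


graph = [{"label": "person", "name": "marko"}]
without_fold = get_or_create(graph, "janice", use_fold=False)  # nothing is added
with_fold = get_or_create(graph, "janice", use_fold=True)      # vertex is created
again = get_or_create(graph, "janice", use_fold=True)          # existing vertex is returned
```

In this model the call without fold leaves the graph untouched, which mirrors the "no bar nodes are added at all" symptom from the original question.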

Edoardo Barp

Jul 8, 2021, 12:42:59 PM
to Gremlin-users
Thanks for the suggestion. That works for a get-or-insert on a single property value ("name": "marko" in your example). However, what happens when there are many "person" nodes to check and insert in one batch?
For example:

# objects = [{"name":"marko"}, {"name": "polo"}]
graph.inject(objects).as("objects")
.V().has("person", "name", select("objects").select("name")).as("v0").fold()
.coalesce(
  unfold(),
  addV("person").property("name", select("objects").select("name"))
).iterate()

This doesn't work, how would you correct it?

Edoardo Barp

Jul 9, 2021, 5:24:00 AM
to Gremlin-users
Turns out it's simpler than it looks. The vertex lookup should go inside the coalesce, so that it runs once per injected object. Snippet that works:

objects = [{"name": "marko"}, {"name":"polo"}]

g.inject(objects).unfold()
.coalesce(
   V().has("person").has("name", select("name")),
   addV("person").property("name", select("name"))
)
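
The intended per-object behaviour of the snippet above can be sketched in plain Python. This is a toy model with a hypothetical upsert_people helper and a list standing in for the graph, not gremlin-python code: each unfolded object gets its own lookup, and a vertex is added only when that lookup comes back empty. One thing that may be worth verifying against a graph that already contains some of the vertices: the TinkerPop reference describes has(key, traversal) as keeping a traverser when the traversal yields a result from the property value, which is not the same as comparing the property with the traversal's result. If the lookup turns out never to match, a commonly suggested alternative is to label the unfolded map (for example as_("obj")) and compare with where(eq("obj")).by("name").by(select("name")). Treat that as an unverified suggestion rather than a confirmed fix.

```python
from typing import Dict, List

Vertex = Dict[str, str]


def upsert_people(graph: List[Vertex], objects: List[Dict[str, str]]) -> List[Vertex]:
    """Return one vertex per object, creating it only if no person with that name exists."""
    result: List[Vertex] = []
    for obj in objects:  # stands in for inject(objects).unfold()
        # First coalesce branch: look up by label and by the *value* of the name.
        matches = [
            v for v in graph
            if v["label"] == "person" and v["name"] == obj["name"]
        ]
        if matches:
            result.extend(matches)
        else:
            # Second coalesce branch: stand-in for addV("person").property("name", ...)
            vertex = {"label": "person", "name": obj["name"]}
            graph.append(vertex)
            result.append(vertex)
    return result


graph = [{"label": "person", "name": "marko"}]
objects = [{"name": "marko"}, {"name": "polo"}]
first = upsert_people(graph, objects)   # creates only "polo"
second = upsert_people(graph, objects)  # creates nothing
```

Running the batch twice is a quick way to check idempotence: the graph should have the same size after the second call as after the first.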
