Most efficient way to create hundreds of vertices in one query with different properties for each vertex

Huimin Yang

May 11, 2019, 12:55:13 AM
to Gremlin-users
I'm trying to create hundreds of vertices with different properties. For example, there is a Java class named 'person' with the attributes 'name', 'age' and 'sex'. I want to create 100 vertices with the label 'person', each with its own values for these properties. Some of the properties can be null, e.g. the person John has only two properties: name: 'John', age: '20'.

I tried using ObjectMapper to map the 100 'person' instances to objects, put these objects in a list, and used inject:
g.inject(list).unfold().as("a").addV("person").choose(__.select("a").select("name"), __.property("name", __.select("a").select("name")))

It works, but the choose step is quite expensive in our case, so I would like to avoid it.

Is there a better way to create a large number of vertices with different properties? Or is there a way to do this without inject, so that I can handle the null properties in Java?
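
One way the null handling could be done on the Java side, as a minimal sketch: each object is turned into a map that contains only its non-null properties, so the list handed to inject never carries a null value. The `Person` class below is a hypothetical stand-in for the 'person' class described above (with `age` assumed to be an Integer), and `PersonMaps` is an illustrative helper, not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the 'person' class from the question.
final class Person {
    final String name;
    final Integer age;  // may be null
    final String sex;   // may be null

    Person(String name, Integer age, String sex) {
        this.name = name;
        this.age = age;
        this.sex = sex;
    }
}

final class PersonMaps {
    // Copy only the non-null fields into an insertion-ordered map.
    static Map<String, Object> toPropertyMap(Person p) {
        Map<String, Object> props = new LinkedHashMap<>();
        if (p.name != null) props.put("name", p.name);
        if (p.age != null) props.put("age", p.age);
        if (p.sex != null) props.put("sex", p.sex);
        return props;
    }

    // Convert a whole batch; the result is the kind of list one could pass to inject().
    static List<Map<String, Object>> toPropertyMaps(List<Person> persons) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Person p : persons) {
            out.add(toPropertyMap(p));
        }
        return out;
    }
}
```

With this, John from the example above becomes a map with just "name" and "age", and the "sex" key is simply absent.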

Benjamin Ross

May 11, 2019, 2:30:11 AM
to gremli...@googlegroups.com
Why not generate a single giant query string by iterating through all 100 of your objects and appending to a string, like this:

String generateQuery(List<Person> persons)
{
    StringBuilder query = new StringBuilder();

    // Every script starts from the traversal source.
    query.append("g");

    for (Person p : persons) {
        query.append(".addV('person')");

        if (p.name != null) {
            // Quote the value; real code must also escape quotes inside p.name.
            query.append(".property('name','" + p.name + "')");
        }

        //... And so on 
    }

    return query.toString();
}

g.addV('person').property('name','bob')...
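
A slightly fuller sketch of the same idea, under stated assumptions: `PersonRow` is a hypothetical class with nullable fields, and `escape` is an illustrative helper (not part of any Gremlin API) that escapes backslashes and single quotes, since a name such as O'Brien would otherwise break the generated script.

```java
import java.util.List;

// Hypothetical input type; field names follow the question.
final class PersonRow {
    final String name;
    final Integer age;
    final String sex;

    PersonRow(String name, Integer age, String sex) {
        this.name = name;
        this.age = age;
        this.sex = sex;
    }
}

final class GremlinScriptSketch {
    // Escape backslashes first, then single quotes, for use inside a '...' string literal.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("'", "\\'");
    }

    // Build one script that adds a 'person' vertex per row, skipping null fields.
    static String generateQuery(List<PersonRow> persons) {
        StringBuilder query = new StringBuilder("g");
        for (PersonRow p : persons) {
            query.append(".addV('person')");
            if (p.name != null) {
                query.append(".property('name','").append(escape(p.name)).append("')");
            }
            if (p.age != null) {
                query.append(".property('age',").append(p.age).append(")");
            }
            if (p.sex != null) {
                query.append(".property('sex','").append(escape(p.sex)).append("')");
            }
        }
        return query.toString();
    }
}
```

Note that an empty list yields just "g", which is not a complete query, so callers would want to skip submission in that case. If the TinkerPop Java driver is in use, a string like this would typically be submitted through `Client.submit(String)`.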

--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/63f362e6-1c04-45a8-a157-bff03ae280d7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Huimin Yang

May 11, 2019, 10:47:25 AM
to Gremlin-users
Thanks Benjamin, but how can I turn this string into a query and send it to Gremlin Server?

Benjamin Ross

May 11, 2019, 11:08:30 AM
to gremli...@googlegroups.com
It depends on which database you're using and how you're connecting to it. Usually there is a WebSocket connection open, and you can send the string over that connection.

Huimin Yang

May 12, 2019, 2:33:04 AM
to Gremlin-users
Hmm, I saw in the official docs that the inject approach is the recommended one, though: http://tinkerpop.apache.org/docs/current/recipes/#long-traversals

However, is there a way to handle the case where some people lack certain fields, e.g. no 'age' or no 'name'?

Daniel Kuppitz

May 12, 2019, 9:59:22 AM
to gremli...@googlegroups.com
Hi Huimin,

Just as you already specify the values in property() dynamically, you can also specify the keys dynamically:

gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> 
gremlin> data = [["name": "Huimin Yang"],
                 ["name": "Daniel Kuppitz", "age": 37]]
==>[name:Huimin Yang]
==>[name:Daniel Kuppitz,age:37]
gremlin> 
gremlin> g.inject(data).unfold().as("m").
           addV("person").as("v").
           select("m").unfold().as("kv").
           select("v").
             property(select("kv").by(keys), select("kv").by(values)).iterate()
gremlin> 
gremlin> g.V().valueMap()
==>[name:[Huimin Yang]]
==>[name:[Daniel Kuppitz],age:[37]]
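
To make the mechanics explicit, here is a plain-Java model of what the traversal above does. It uses no TinkerPop classes; `VertexModel` and `DynamicKeysModel` are hypothetical names, and the comments map each loop to the step it is meant to mirror. Each input map is unfolded into its entries and every entry becomes one property, so a map without "age" simply produces no age property and no choose() is needed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a graph vertex: a label plus key/value properties.
final class VertexModel {
    final String label;
    final Map<String, Object> properties = new LinkedHashMap<>();

    VertexModel(String label) {
        this.label = label;
    }
}

final class DynamicKeysModel {
    static List<VertexModel> addPersons(List<Map<String, Object>> data) {
        List<VertexModel> created = new ArrayList<>();
        for (Map<String, Object> m : data) {                      // inject(data).unfold().as("m")
            VertexModel v = new VertexModel("person");            // addV("person").as("v")
            for (Map.Entry<String, Object> kv : m.entrySet()) {   // select("m").unfold().as("kv")
                v.properties.put(kv.getKey(), kv.getValue());     // property(key of kv, value of kv)
            }
            created.add(v);
        }
        return created;
    }
}
```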

Cheers,
Daniel



HadoopMarc

May 12, 2019, 11:36:57 AM
to Gremlin-users
Wonderful, this deserves a place in the recipes section that Huimin referenced. @Huimin, does Daniel's solution indeed perform better than your original solution based on the choose() step when applied to a sizable list of persons?

Best regards,    Marc

Huimin Yang

May 13, 2019, 2:09:47 AM
to Gremlin-users
Hi Daniel, Marc

Thanks for the solution. I'm using Neptune, and I ran a test comparing the choose-based approach with the one provided by @Daniel.

The difference is not huge, but @Daniel's solution is indeed faster.

On average it is 50 ms faster for 500 vertices, 300 ms faster for 2000 vertices, and 500 ms faster for 5000 vertices. I'd guess the difference would be more pronounced if the injected objects had more null fields, since choose would be even slower in that case.
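
Normalising those figures per vertex makes them easier to compare. The helper below is only illustrative arithmetic on the averages reported above; `SavingPerVertex` is a hypothetical name and no new measurements are involved.

```java
import java.util.Locale;

final class SavingPerVertex {
    // Average saving in milliseconds divided by the number of vertices in the batch.
    static double perVertexMs(double savedMs, int vertices) {
        return savedMs / vertices;
    }

    // Human-readable summary for one batch size.
    static String describe(double savedMs, int vertices) {
        return String.format(Locale.ROOT, "%d vertices: %.2f ms saved per vertex",
                vertices, perVertexMs(savedMs, vertices));
    }
}
```

For the three batch sizes this works out to about 0.10, 0.15 and 0.10 ms per vertex, so the saving stays in the same range as the batch grows.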

Thanks again for the suggestion!

Pavel Ershov

May 16, 2019, 5:30:33 PM
to Gremlin-users
Hi

Is it possible to convert this query to an upsert form? Something like this: https://stackoverflow.com/a/49758568

HasContainer accepts only a String key, so I can't use dynamic keys there.

Daniel Kuppitz

May 16, 2019, 7:46:27 PM
to gremli...@googlegroups.com
It's a bit more complicated if you want to make sure that index lookups are being used.

Let's start with a method that prepares the input data and executes the upsert traversal:

process = { g, persons ->
  data = persons.collectEntries {[it.name, it]}
  g.V().has("person", "name", within(data.keySet())).as("v").
    constant(data).select(select("v").by("name")).
    store("x").
      by(select("name")).
    unfold().filter(select(keys).is(neq("name"))).as("kv").
    select("v").
      property(select("kv").by(keys), select("kv").by(values)).
    count().    /* simplest way to reduce everything to a single traverser without wasting memory and CPU resources */
    constant(data.values()).unfold().
    filter(select("name").where(without("x"))).as("props").
    addV("person").as("v").
    select("props").unfold().as("kv").
    select("v").
      property(select("kv").by(keys), select("kv").by(values)).iterate()
}

Now let's do two iterations. The first one creates 2 person vertices; the second one leaves the first person untouched, modifies the second and adds a third.

Iteration #1

gremlin> g = TinkerGraph.open().traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> data = [["name": "Huimin Yang"],
                 ["name": "Daniel Kuppitz", "age": 36]]; []

gremlin> process(g, data)
gremlin> g.V().valueMap()
==>[name:[Huimin Yang]]
==>[name:[Daniel Kuppitz],age:[36]]

Iteration #2

gremlin> data = [["name": "Huimin Yang"],
                 ["name": "Daniel Kuppitz", "age": 37],
                 ["name": "Pavel Ershov"]]; []

gremlin> process(g, data)
gremlin> g.V().valueMap()
==>[name:[Huimin Yang]]
==>[name:[Daniel Kuppitz],age:[37]]
==>[name:[Pavel Ershov]]
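
The semantics of the two iterations can also be modelled in plain Java, without TinkerPop, which may make the two phases of process easier to follow: persons that already exist (matched by name) get their other properties overwritten, and persons whose name was not matched are added. This is only a sketch; `UpsertModel` and its `store` map are hypothetical in-memory stand-ins for the graph, and names are assumed to be unique.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UpsertModel {
    // Hypothetical in-memory "graph": person name -> properties of that person vertex.
    final Map<String, Map<String, Object>> store = new LinkedHashMap<>();

    void process(List<Map<String, Object>> persons) {
        // Index the input by name, like persons.collectEntries {[it.name, it]}.
        Map<String, Map<String, Object>> data = new LinkedHashMap<>();
        for (Map<String, Object> p : persons) {
            data.put((String) p.get("name"), p);
        }

        // Phase 1: update vertices that already exist and remember their names ("x").
        Set<String> matched = new LinkedHashSet<>();
        for (Map.Entry<String, Map<String, Object>> v : store.entrySet()) {
            Map<String, Object> incoming = data.get(v.getKey());
            if (incoming == null) {
                continue;  // vertex is not part of this batch
            }
            matched.add(v.getKey());
            for (Map.Entry<String, Object> kv : incoming.entrySet()) {
                if (!kv.getKey().equals("name")) {
                    v.getValue().put(kv.getKey(), kv.getValue());
                }
            }
        }

        // Phase 2: add a vertex for every input person whose name was not matched.
        for (Map.Entry<String, Map<String, Object>> e : data.entrySet()) {
            if (!matched.contains(e.getKey())) {
                store.put(e.getKey(), new LinkedHashMap<>(e.getValue()));
            }
        }
    }
}
```

Calling process twice with the data from iterations #1 and #2 leaves three entries in `store`, with the age for "Daniel Kuppitz" updated from 36 to 37.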

Cheers,
Daniel

