Adding vertices with automatically generated IDs

stell...@gmail.com

Sep 24, 2019, 10:37:28 AM
to JanusGraph users

I am using JanusGraph with Berkeley DB and found that adding even 10 vertices takes 6 seconds on my laptop.

I narrowed the problem down to a simple program that uses JanusGraph with the in-memory storage backend.

The profiler showed that the code 'g.addV(LABEL).property(KEY, value).next()' takes 87% of the main thread's time,
and that both StandardJanusGraphTx.addProperty() and StandardJanusGraphTx.addVertex() spend most of their time in
StandardJanusGraph.assignID(), which just waits in StandardIDPool.waitForIDBlockGetter().

I updated my program to use custom IDs, 'g.addV(LABEL).property(T.id, id).property(KEY, value).next()'.
Adding the vertices then took about 1 second and accounted for about 59% of the main thread's time.

However, the second approach requires custom ID management.

Is it possible to use automatically generated IDs without adding vertices taking this long?

I used JanusGraph version 0.4.0.

Thanks,
Alexander.

The code used for testing:
-----------  JanusGraphSample.java -----------------
package sample;

import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.T;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphTransaction;
import org.janusgraph.graphdb.database.StandardJanusGraph;
import org.janusgraph.graphdb.idmanagement.IDManager;

public class JanusGraphSample {

    private static final String LABEL = "SampleLabel";
    private static final String KEY = "SampleKey";
    private static final String VALUE = "SampleValue";
    private static final int N = 10;

    public static void main(String[] args) {
        addVerticesWithGeneratedIds();
        addVerticesWithCustomIds();
    }

    private static JanusGraph getJanusGraph(boolean customIds) {

        JanusGraphFactory.Builder builder = JanusGraphFactory.build()
                .set("storage.backend", "inmemory");

        if (customIds) {
            builder = builder.set("graph.set-vertex-id", "true");
        }

        return builder.open();
    }

    static void clearGraph(JanusGraph graph) {
        try (JanusGraphTransaction tx = graph.newTransaction()) {
            GraphTraversalSource g = tx.traversal();
            g.V().drop().iterate();
            tx.commit();
        }
    }

    static long verticesCount(JanusGraph graph) {
        try (JanusGraphTransaction tx = graph.newTransaction()) {
            GraphTraversalSource g = tx.traversal();
            return g.V().count().next();
        }
    }

    public static void addVerticesWithGeneratedIds() {

        System.out.printf("Add vertices with generated ids%n");

        try (JanusGraph graph = getJanusGraph(false)) {

            clearGraph(graph);

            try (JanusGraphTransaction tx = graph.newTransaction()) {
                GraphTraversalSource g = tx.traversal();

                System.out.printf("N: %d%n", N);
                long time = System.currentTimeMillis();
                for (int i = 1; i <= N; i++) {
                    String value = String.format("%s-%d", VALUE, i);
                    g.addV(LABEL).property(KEY, value).next();
                }
                tx.commit();
                System.out.printf("elapsed time: %d(ms)%n", System.currentTimeMillis() - time);
            }

            long vertices = verticesCount(graph);
            System.out.printf("added vertices: %d%n", vertices);
        }
    }

    public static void addVerticesWithCustomIds() {

        System.out.printf("Add vertices with custom ids%n");

        try (JanusGraph graph = getJanusGraph(true)) {

            clearGraph(graph);

            IDManager idManager = ((StandardJanusGraph) graph).getIDManager();

            try (JanusGraphTransaction tx = graph.newTransaction()) {
                GraphTraversalSource g = tx.traversal();

                System.out.printf("N: %d%n", N);
                long time = System.currentTimeMillis();
                for (int i = 1; i <= N; i++) {
                    long id = idManager.toVertexId(i);
                    String value = String.format("%s-%d", VALUE, i);
                    g.addV(LABEL).property(T.id, id).property(KEY, value).next();

                }
                tx.commit();
                System.out.printf("elapsed time: %d(ms)%n", System.currentTimeMillis() - time);
            }

            long vertices = verticesCount(graph);
            System.out.printf("added vertices: %d%n", vertices);
        }
    }
}
--------------------------
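
The "custom ID management" mentioned above can be as small as a process-wide counter. The following is a minimal sketch, assuming a single JVM is the only writer. The class name VertexIdSequence and its method are hypothetical and not part of JanusGraph; each counter value would still have to go through idManager.toVertexId(...) as in addVerticesWithCustomIds() before being used as T.id.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not a JanusGraph class): hands out unique, increasing counters. */
public class VertexIdSequence {

    // The sample above calls toVertexId() with counters starting at 1,
    // so this sketch starts at 1 as well.
    private final AtomicLong next = new AtomicLong(1);

    /** Returns the next unused counter; safe to call from several threads. */
    public long nextCounter() {
        return next.getAndIncrement();
    }

    public static void main(String[] args) {
        VertexIdSequence seq = new VertexIdSequence();
        for (int i = 0; i < 3; i++) {
            System.out.println(seq.nextCounter());
        }
    }
}
```

The counter does not survive a restart and is not coordinated between JVMs, so a persistent or multi-instance setup would need to store and share the last value somewhere. That bookkeeping is what the question hopes to avoid.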

Pavel Ershov

Sep 24, 2019, 1:02:50 PM
to JanusGraph users
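All entities need an assigned ID: edges, vertices, and properties, and the same goes for schema entities. But this pause should not occur often: by default, only once per 10,000 IDs.
You can trace the pool size and the fetch frequency here:
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/idassigner/StandardIDPool.java#L191
The pool may be increased for bulk loading, see
https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java#L694

I tried to speed up block acquisition for local storage backends here:
https://github.com/mad/janusgraph/commit/342a1814b26a8b63f9337f88145c0dcb8f628a09
It may increase graph initialization time when the schema is very large. With the current fix, both tests execute in 200 ms.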
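
To illustrate why the wait should be rare once a block is in hand, here is a toy model of block-based ID allocation. It is a sketch under stated assumptions, not JanusGraph's StandardIDPool: the class ToyIdPool and its blockSource function are hypothetical, the fetch is synchronous, and only the block size of 10,000 is taken from the message above.

```java
import java.util.function.LongUnaryOperator;

/** Toy model of block-based ID allocation (hypothetical; not JanusGraph code). */
public class ToyIdPool {

    private final long blockSize;
    private final LongUnaryOperator blockSource; // block index -> first ID of that block
    private long nextId;        // next ID to hand out
    private long blockEnd;      // exclusive end of the current block
    private long blocksFetched; // how many times the (slow) source was called

    public ToyIdPool(long blockSize, LongUnaryOperator blockSource) {
        this.blockSize = blockSize;
        this.blockSource = blockSource;
    }

    /** Returns a fresh ID, fetching a new block only when the current one is used up. */
    public long nextId() {
        if (nextId == blockEnd) {
            // In this model, this is the only place where a caller has to wait.
            nextId = blockSource.applyAsLong(blocksFetched);
            blockEnd = nextId + blockSize;
            blocksFetched++;
        }
        return nextId++;
    }

    public long blocksFetched() {
        return blocksFetched;
    }

    public static void main(String[] args) {
        long blockSize = 10_000;
        // Stand-in for the ID authority: block k starts at k * blockSize + 1.
        ToyIdPool pool = new ToyIdPool(blockSize, k -> k * blockSize + 1);
        for (int i = 0; i < 25_000; i++) {
            pool.nextId();
        }
        System.out.println("block fetches: " + pool.blocksFetched());
    }
}
```

In this model, 25,000 IDs cost three block fetches. At least one fetch has to be paid before the first ID is available, and with only 10 vertices that cost cannot be amortized.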


Alexander Scherbatiy

Sep 26, 2019, 11:08:36 AM
to janusgra...@googlegroups.com, Pavel Ershov
I tried using custom IDs for the created vertices by setting the
graph.set-vertex-id option.

Now most of the time is consumed by the code that creates indices and calls
ManagementSystem.getOrCreateVertexLabel() (it takes about 20% of the total
time for creating 1000 vertices and 3000 edges).

StandardJanusGraphTx.makeSchemaVertex() calls assignID() through
addProperty(), which waits in waitForIDBlockGetter() for 565 ms in total,
and it also calls assignID() directly, which waits in waitForIDBlockGetter()
for 334 ms in total on my laptop.

Does this mean that each type vertex and each property waits for its ID separately?

Is there a way to provide custom IDs for vertices and properties during
index creation (e.g. for the ManagementSystem.getOrCreateVertexLabel() call)?

Thanks,
Alexander.

On 24.09.2019 20:02, Pavel Ershov wrote:
> All entities needs assigned id: edges, vertices, properties and same for
> schema entities too. But this pause occurred not so frequently for each
> 10 000 ids by default
> You can trace pool size and gather frequency here
> <https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/database/idassigner/StandardIDPool.java#L191>
> That's pool may be increased for bulk loading, see
> https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/configuration/GraphDatabaseConfiguration.java#L694
>
> I try speed up block acquisition for local storages here
> <https://github.com/mad/janusgraph/commit/342a1814b26a8b63f9337f88145c0dcb8f628a09>,
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "JanusGraph users" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/janusgraph-users/ME9U6-0Xn1Y/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> janusgraph-use...@googlegroups.com
> <mailto:janusgraph-use...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/janusgraph-users/aa5daeac-ae57-4a30-ad14-91bd0e3a31b4%40googlegroups.com
> <https://groups.google.com/d/msgid/janusgraph-users/aa5daeac-ae57-4a30-ad14-91bd0e3a31b4%40googlegroups.com?utm_medium=email&utm_source=footer>.

akhilesh singh

Oct 1, 2019, 4:35:05 AM
to janusgra...@googlegroups.com
Can you try the two properties below?
ids.block-size=1000000000
ids.renew-percentage=0.3



--
Thanks
Akhilesh
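
For orientation, here is a small arithmetic sketch of what the two suggested values would mean. It assumes that ids.renew-percentage is the fraction of a block still unused at which the next block starts being reserved in the background; check the configuration reference for your version. The class and method names are hypothetical and not JanusGraph code.

```java
/** Hypothetical helper illustrating the arithmetic only; not JanusGraph code. */
public class RenewThreshold {

    /** Number of IDs that can be used from a block before renewal would start. */
    public static long idsUsedBeforeRenewal(long blockSize, double renewPercentage) {
        long remainingAtRenewal = Math.round(blockSize * renewPercentage);
        return blockSize - remainingAtRenewal;
    }

    public static void main(String[] args) {
        // Values suggested above.
        System.out.println(idsUsedBeforeRenewal(1_000_000_000L, 0.3));
        // Block size of 10,000 mentioned earlier in the thread, same percentage.
        System.out.println(idsUsedBeforeRenewal(10_000L, 0.3));
    }
}
```

Under that assumption, a block of 1,000,000,000 IDs would not trigger a renewal until 700,000,000 IDs have been used, so a test with 1000 vertices should never wait for a second block. By itself this does not shorten the wait for the very first block.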

Alexander Scherbatiy

Oct 1, 2019, 10:37:33 AM
to janusgra...@googlegroups.com, akhilesh singh
I tried these two properties and found that the timing of consecutive
in-memory JanusGraph creations depends on which one runs first.

The JanusGraph was created with options:

JanusGraphFactory.Builder builder = JanusGraphFactory.build()
        .set("storage.backend", "inmemory")
        .set("ids.authority.wait-time", "5")
        .set("ids.renew-timeout", "50")
        .set("ids.block-size", "1000000000")
        .set("ids.renew-percentage", "0.3");

if (customIds) {
    builder = builder.set("graph.set-vertex-id", "true");
}

If I first create the vertices with generated IDs and then with custom IDs,
I get:
[generated ids] vertices: 1000, elapsed time: 1255(ms), graph creation time: 593
[custom ids] vertices: 1000, elapsed time: 255(ms), graph creation time: 4

The run with generated IDs takes about 1 second, and graph creation
takes about 600 ms.

If I first create the vertices with custom IDs, I get:
[custom ids] vertices: 1000, elapsed time: 1125(ms), graph creation time: 590
[generated ids] vertices: 1000, elapsed time: 401(ms), graph creation time: 3

The run with custom IDs takes about 1 second, and graph creation takes
about 600 ms.

If I run only one of them, each takes about 1 second, and graph creation
takes about 600 ms.

Here is the full code that I used:
----------- JanusGraphSample.java -----------------
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.T;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphTransaction;
import org.janusgraph.graphdb.database.StandardJanusGraph;
import org.janusgraph.graphdb.idmanagement.IDManager;

public class JanusGraphSample {

    private static final String LABEL = "SampleLabel";
    private static final String KEY = "SampleKey";
    private static final String VALUE = "SampleValue";
    private static final int N = 1000;

    public static void main(String[] args) {
        addVerticesWithGeneratedIds();
        addVerticesWithCustomIds();
    }

    private static JanusGraph getJanusGraph(boolean customIds) {

        JanusGraphFactory.Builder builder = JanusGraphFactory.build()
                .set("storage.backend", "inmemory")
                .set("ids.authority.wait-time", "5")
                .set("ids.renew-timeout", "50")
                .set("ids.block-size", "1000000000")
                .set("ids.renew-percentage", "0.3");

        if (customIds) {
            builder = builder.set("graph.set-vertex-id", "true");
        }

        return builder.open();
    }

    static long verticesCount(JanusGraph graph) {
        try (JanusGraphTransaction tx = graph.newTransaction()) {
            GraphTraversalSource g = tx.traversal();
            return g.V().count().next();
        }
    }

    public static void addVerticesWithGeneratedIds() {

        long time = System.currentTimeMillis();
        try (JanusGraph graph = getJanusGraph(false)) {
            long creationTime = System.currentTimeMillis();

            try (JanusGraphTransaction tx = graph.newTransaction()) {
                GraphTraversalSource g = tx.traversal();

                for (int i = 1; i <= N; i++) {
                    String value = String.format("%s-%d", VALUE, i);
                    g.addV(LABEL).property(KEY, value).next();
                }
                tx.commit();
            }
            long elapsedTime = System.currentTimeMillis();
            long vertices = verticesCount(graph);
            System.out.printf("[generated ids] vertices: %d, elapsed time: %d(ms), graph creation time: %d%n",
                    vertices, elapsedTime - time, creationTime - time);
        }
    }

    public static void addVerticesWithCustomIds() {

        long time = System.currentTimeMillis();

        try (JanusGraph graph = getJanusGraph(true)) {
            long creationTime = System.currentTimeMillis();

            IDManager idManager = ((StandardJanusGraph) graph).getIDManager();

            try (JanusGraphTransaction tx = graph.newTransaction()) {
                GraphTraversalSource g = tx.traversal();

                for (int i = 1; i <= N; i++) {
                    long id = idManager.toVertexId(i);
                    String value = String.format("%s-%d", VALUE, i);
                    g.addV(LABEL).property(T.id, id).property(KEY, value).next();
                }
                tx.commit();
            }

            long elapsedTime = System.currentTimeMillis();
            long vertices = verticesCount(graph);
            System.out.printf("[custom ids] vertices: %d, elapsed time: %d(ms), graph creation time: %d%n",
                    vertices, elapsedTime - time, creationTime - time);
        }
    }
}
----------------------------------------------------

Thanks,
Alexander.
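
The numbers above suggest a one-time cost paid by whichever graph is opened first in the JVM, independent of the ID mode, though the thread does not confirm its cause. One way to make such a cost visible is to repeat the measured action and report each run separately. This is a minimal sketch using only the JDK: RunTimer and timeRuns are hypothetical names, and the workload in main is a placeholder for a call such as addVerticesWithGeneratedIds().

```java
import java.util.Arrays;

/** Hypothetical timing helper; not part of JanusGraph or the program above. */
public class RunTimer {

    /** Runs the action the given number of times and returns each run's duration in ms. */
    public static long[] timeRuns(int runs, Runnable action) {
        long[] millis = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // intended for measuring elapsed time
            action.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the method under test.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        long[] millis = timeRuns(5, workload);
        // The first entry includes any one-time startup cost; later entries do not.
        System.out.println("per-run ms: " + Arrays.toString(millis));
    }
}
```

If the first entry is consistently much larger than the rest, the difference is startup overhead rather than a property of the ID mode being measured.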