Hey,
We're evaluating several graph databases, and I'm running into what look like performance problems with Titan. So far I've only been evaluating the batch ingest rate on a dataset of 2M nodes, and I can only insert part of the data set before a SocketTimeoutException occurs. Inserts run at maybe 1000 vertices/second, but sources I've found online suggest I should expect much higher throughput (for example http://architects.dzone.com/articles/educating-planet-and-graph, which reports 1.2M edges/second). CPU and memory usage on the Titan nodes is not particularly high (maybe 50-80% CPU according to top, and 30% memory).
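As a rough sanity check on that rate, here is a back-of-the-envelope sketch (IngestEstimate and secondsToLoad are illustrative names, not part of Titan): at roughly 1000 vertices/second, the 2M vertices alone would take about 2000 seconds, around 33 minutes, before counting any edges.

```java
// Hypothetical helper: rough load-time estimate from an observed ingest rate.
final class IngestEstimate {

    /** Seconds needed to insert `elements` at `perSecond` elements/second, rounded up. */
    static long secondsToLoad(long elements, long perSecond) {
        if (perSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        return (elements + perSecond - 1) / perSecond;
    }

    public static void main(String[] args) {
        long seconds = secondsToLoad(2000000L, 1000L); // 2M vertices at ~1000/s
        System.out.println(seconds + " s (~" + (seconds / 60) + " min)");
    }
}
```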
Additionally, I keep getting exceptions during the insert process indicating socket timeouts with Cassandra (similar to https://github.com/thinkaurelius/titan/issues/250). I'm using BatchGraph and have tried various transaction sizes, but they all hit the same exceptions. Adjusting storage.buffer-size to 131072 helped, but the import still eventually dies. Setting storage.batch-loading doesn't seem to make any difference in performance.
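For reference, the settings mentioned in this post would look roughly like this if written out as a properties-style fragment. This is only a sketch: the keys and the buffer size are the ones named here, while the value `true` for storage.batch-loading is an assumption about how that flag gets enabled.

```
storage.backend=cassandrathrift
storage.hostname=xx.xx.xx.xx
storage.connection-timeout=10000
storage.buffer-size=131072
# assumption: boolean flag, shown here as enabled
storage.batch-loading=true
```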
I'm currently running the cluster on 4 Rackspace nodes with 8GB RAM and 4 cores. I'm using Titan 0.3.2 on Java 6, with the embedded Cassandra configuration for Titan. The timeout exceptions happen less often when I run the insertion program from one of the nodes instead of an external machine, but they still happen.
I've found advice in various places and gotten slightly better performance, but not by much:
https://groups.google.com/forum/#!topic/aureliusgraphs/FOBy4VBQP44
https://groups.google.com/forum/#!topic/aureliusgraphs/n2M2SS-X_2M
My import code looks roughly like this:
Configuration conf = new BaseConfiguration();
conf.setProperty("storage.backend", "cassandrathrift");
conf.setProperty("storage.hostname", "xx.xx.xx.xx");
conf.setProperty("storage.connection-timeout", "10000");
TitanGraph g = TitanFactory.open(conf);

// good enough for now
TitanType t = g.getType(V_INDEX);
if (t == null)
{
    g.makeType().name(V_INDEX).indexed(Vertex.class).unique(Direction.OUT).dataType(String.class).makePropertyKey();
    g.makeType().name("connects").makeEdgeLabel();
    g.commit();
}

// have tried various transaction sizes here
tg = new BatchGraph<TransactionalGraph>(g, VertexIDType.STRING, 25000);

line = reader.readLine();
String a, b, c;
String[] split;
Vertex aV, bV;
Edge edge;
String edgeID;
HashSet<String> edgeIDs = new HashSet<String>();
while (line != null)
{
    split = line.split("\t");
    a = split[0];
    b = split[1];
    c = split[10];

    aV = tg.getVertex(a);
    if (aV == null)
    {
        // sometimes the exception happens here
        aV = tg.addVertex(a);
        aV.setProperty(V_INDEX, a);
    }

    bV = tg.getVertex(b);
    if (bV == null)
    {
        // sometimes the exception happens here
        bV = tg.addVertex(b);
        bV.setProperty(V_INDEX, b);
    }

    edgeID = a + "|" + b;
    if (!edgeIDs.contains(edgeID)) {
        // sometimes the exception happens here
        edge = tg.addEdge(edgeID, aV, bV, "connects");
        edge.setProperty(E_A_INDEX, a);
        edge.setProperty(E_B_INDEX, b);
        edge.setProperty(E_C_INDEX, c);
        edgeIDs.add(edgeID);
    }

    line = reader.readLine();
}
tg.shutdown();
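Stripped of the Titan calls, the per-line logic above amounts to something like the following. This is a self-contained sketch using only the JDK, in case it helps to check the input handling separately from the database. EdgeLineParser and parse are hypothetical names, and skipping rows with fewer than 11 columns is an assumption that the code above does not make.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the import loop: same parsing and edge de-duplication,
// but it collects edges in memory instead of writing them to Titan.
final class EdgeLineParser {

    /** Returns one {a, b, c} triple per distinct "a|b" pair, in input order. */
    static List<String[]> parse(BufferedReader reader) throws IOException {
        List<String[]> edges = new ArrayList<String[]>();
        Set<String> edgeIDs = new HashSet<String>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] split = line.split("\t");
            if (split.length < 11) {
                continue; // assumption: skip short rows rather than fail on split[10]
            }
            String a = split[0];
            String b = split[1];
            String c = split[10];
            // Set.add returns false when the ID was already present
            if (edgeIDs.add(a + "|" + b)) {
                edges.add(new String[] { a, b, c });
            }
        }
        return edges;
    }

    public static void main(String[] args) throws IOException {
        // Two identical 11-column rows should yield a single edge.
        String row = "x\ty\t2\t3\t4\t5\t6\t7\t8\t9\tz";
        BufferedReader in = new BufferedReader(new StringReader(row + "\n" + row + "\n"));
        System.out.println(parse(in).size() + " distinct edge(s)");
    }
}
```

Running the parser on its own over the input file gives the number of distinct edges the import is expected to create, which can then be compared against what actually lands in the graph.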
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
at java.lang.Thread.run(Thread.java:662)
Caused by: com.thinkaurelius.titan.core.TitanException: Could not commit transaction due to exception during persistence
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:848)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.commit(TitanBlueprintsGraph.java:64)
at com.thinkaurelius.titan.graphdb.blueprints.TitanBlueprintsGraph.stopTransaction(TitanBlueprintsGraph.java:91)
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph.nextElement(BatchGraph.java:213)
at com.tinkerpop.blueprints.util.wrappers.batch.BatchGraph.addVertex(BatchGraph.java:338)
at main.Importer.main(Importer.java:448)
... 6 more
Caused by: com.thinkaurelius.titan.core.TitanException: Unexpected exception during backend operation
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:66)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.save(StandardTitanGraph.java:277)
at com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx.commit(StandardTitanTx.java:839)
... 12 more
Caused by: com.thinkaurelius.titan.core.TitanException: Permanent exception during backend operation
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:64)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.BufferTransaction.flushInternal(BufferTransaction.java:96)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.BufferTransaction.mutate(BufferTransaction.java:84)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.BufferedKeyColumnValueStore.mutate(BufferedKeyColumnValueStore.java:47)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.CachedKeyColumnValueStore.mutate(CachedKeyColumnValueStore.java:97)
at com.thinkaurelius.titan.diskstorage.locking.consistentkey.ConsistentKeyLockStore.mutate(ConsistentKeyLockStore.java:121)
at com.thinkaurelius.titan.diskstorage.BackendTransaction.mutateEdges(BackendTransaction.java:99)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.persist(StandardTitanGraph.java:315)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.access$000(StandardTitanGraph.java:45)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:270)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph$2.call(StandardTitanGraph.java:203)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:61)
... 14 more
Caused by: com.thinkaurelius.titan.diskstorage.PermanentStorageException: Permanent failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.convertException(CassandraThriftKeyColumnValueStore.java:270)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.mutateMany(CassandraThriftStoreManager.java:162)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.BufferTransaction$1.call(BufferTransaction.java:99)
at com.thinkaurelius.titan.diskstorage.keycolumnvalue.BufferTransaction$1.call(BufferTransaction.java:96)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:61)
... 25 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
at com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager.mutateMany(CassandraThriftStoreManager.java:160)
... 28 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 39 more
Any thoughts/comments you have would be appreciated. Thanks!
Dustin