Config to set Cassandra Thrift max length

Abirami Senthil

Mar 3, 2016, 6:50:45 AM

to Aurelius

Hi team,

I am using Titan 1.0.0 to read a graph from Cassandra using SparkGraphComputer.

My script is:

graph = GraphFactory.open('conf/hadoop-graph/read-cassandra.properties')

g = graph.traversal(computer(SparkGraphComputer))

g.V().has("name","root").both().count()

The vertex "root" has 200000 edges.

I am getting the exception below. How do I set the maximum length of a Thrift message? Do you have any reference?

==>hadoopgraph[cassandrainputformat->gryooutputformat]

==>graphtraversalsource[hadoopgraph[cassandrainputformat->gryooutputformat], sparkgraphcomputer]

==>1456989198485

[Stage 0:===============================> 23:14:37 ERROR org.apache.spark.executor.Executor - Exception in task 292.0 in stage 0.0 (TID 292)

java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Frame size (18440949) larger than max length (15728640)!

at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:402)

at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:408)

at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:331)

at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)

at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)

at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:177)

Configuration file:

#

# Hadoop Graph Configuration

#

gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph

gremlin.hadoop.graphInputFormat=com.thinkaurelius.titan.hadoop.formats.cassandra.CassandraInputFormat

gremlin.hadoop.graphOutputFormat=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat

gremlin.hadoop.memoryOutputFormat=org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat

gremlin.hadoop.deriveMemory=false

gremlin.hadoop.jarsInDistributedCache=true

gremlin.hadoop.inputLocation=none

gremlin.hadoop.outputLocation=output

gremlin.hadoop.inputLocationRequired=false

#

# Titan Cassandra InputFormat configuration

#

titanmr.ioformat.conf.storage.backend=cassandrathrift

titanmr.ioformat.conf.storage.hostname=10.25.152.154

titanmr.ioformat.conf.storage.port=9160

#

# Apache Cassandra InputFormat configuration

#

cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner

#

# SparkGraphComputer Configuration

#

spark.master=local[4]

spark.serializer=org.apache.spark.serializer.KryoSerializer

spark.executor.memory=1g

Matt Aldridge

Mar 3, 2016, 4:52:08 PM

to Aurelius

I haven't tried it with Titan 1.0 yet, but with titan-hadoop and Titan 0.5 I set the cassandra.thrift.framed.size_mb property to increase the frame size.
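For scale, the limit in the exception, 15728640 bytes, is exactly 15 × 1024 × 1024, and the rejected frame of 18440949 bytes is about 17.6 MB, so a value of 18 or more would have admitted that particular frame. A minimal sketch of the change in the HadoopGraph properties file follows, assuming the property is still read by the Cassandra input format when it is used from Titan 1.0 (not verified in this thread). The value 32 is an arbitrary choice that leaves headroom, not a recommendation.

```properties
#
# Apache Cassandra InputFormat configuration
#
cassandra.input.partitioner.class=org.apache.cassandra.dht.Murmur3Partitioner
# Assumption: this raises the Thrift frame limit used by the Hadoop record reader.
# The failing frame was 18440949 bytes (about 17.6 MB); 32 is an arbitrary value with headroom.
cassandra.thrift.framed.size_mb=32
```

The server-side limit (thrift_framed_transport_size_in_mb in cassandra.yaml) may also need to be at least as large, since a client-side setting alone would not change what the server accepts.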

Jean-Baptiste Musso

Mar 3, 2016, 6:24:12 PM

to aureliu...@googlegroups.com

Abirami,

You might want to have a look at this thread and the answer by Jason Plurad:

https://groups.google.com/d/msg/aureliusgraphs/LEiO42jt9Ao/HPCy0eJC_a8J

This was for Titan v0.5.x but I suppose this still applies for v1.0.0.
Basically, when using Hadoop (OLAP), you need to tweak:

titan.hadoop.input.conf.storage.cassandra.thrift.frame-size=20
titan.hadoop.output.conf.storage.cassandra.thrift.frame-size=20

Notice how this differs from:

cassandra.thrift.framed.size_mb

Jean-Baptiste

Abirami Senthil

Mar 4, 2016, 4:24:39 AM

to Aurelius

Thanks Jean-Baptiste.

The name is slightly different in Titan 1.0.0. I used:

storage.cassandra.frame-size-mb = 200

and it worked.

But I couldn't find the Thrift setting for the OLAP Hadoop graph. What needs to be added for the Hadoop graph to pick up the frame size setting?

John

Mar 10, 2016, 11:10:55 AM

to Aurelius

Hello Abirami,

Did you ever get past the OLAP Thrift frame size error? If so, can you please post the Thrift-related config options from your properties file?

Thanks.
