question about frame size setting


andys...@gmail.com

Nov 17, 2014, 2:40:35 AM
to aureliu...@googlegroups.com
Hi, I am using Titan-Hadoop to compute graph statistics, and the dataset has grown very large.
I set the following configuration:
cassandra.input.split.size=512
cassandra.thrift.framed.size_mb=490
cassandra.thrift.message.max_size_mb=491

When I ran the Groovy script in Gremlin and saw the frame size warning, I kept increasing the frame size. Eventually the warning went away, but then I saw the exception below:

2014-11-15 15:48:56,804 INFO [elasticsearch[Aireo][generic][T#15]] org.elasticsearch.client.transport: [Aireo] failed to get local cluster state for [#transport#-1][store28.antfact.com][inet[es1/172.19.141.7:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[es1/172.19.141.7:9300]][cluster/state] request_id [614] timed out after [16863ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:48:56,804 WARN [pool-23-thread-1] com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog: Could not read messages for timestamp [Timepoint[1416037719440000 μs]] (this read will be retried)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller.run(KCVSLog.java:669)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:48:56,804 INFO [elasticsearch[Aireo][generic][T#14]] org.elasticsearch.client.transport: [Aireo] failed to get local cluster state for [Colonel][B4AcCUaJTYi5VunJM9UpVA][es1][inet[/172.19.141.7:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [Colonel][inet[/172.19.141.7:9300]][cluster/state] request_id [613] timed out after [16863ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:49:53,162 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 380396; bufvoid = 1073741824
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 268435452(1073741808); kvend = 268423236(1073692944); length = 12217/67108864
2014-11-15 15:49:53,207 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2014-11-15 15:49:53,207 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2014-11-15 15:49:53,297 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0
2014-11-15 15:49:53,299 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.mapreduce.lib.chain.Chain.joinAllThreads(Chain.java:526)
at org.apache.hadoop.mapreduce.lib.chain.ChainMapper.run(ChainMapper.java:169)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigElement.toString(ConfigElement.java:64)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigOption.verify(ConfigOption.java:133)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigOption.get(ConfigOption.java:127)
at com.thinkaurelius.titan.diskstorage.configuration.BasicConfiguration.get(BasicConfiguration.java:59)
at com.thinkaurelius.titan.hadoop.FaunusPathElement.setConf(FaunusPathElement.java:57)
at com.thinkaurelius.titan.hadoop.FaunusPathElement.<init>(FaunusPathElement.java:48)
at com.thinkaurelius.titan.hadoop.FaunusVertex.<init>(FaunusVertex.java:53)
at com.thinkaurelius.titan.hadoop.StandardFaunusEdge.getVertex(StandardFaunusEdge.java:78)
at com.thinkaurelius.titan.hadoop.StandardFaunusEdge.getVertex(StandardFaunusEdge.java:86)
at com.thinkaurelius.titan.hadoop.FaunusElement.addRelation(FaunusElement.java:178)
at com.thinkaurelius.titan.hadoop.FaunusVertex.addEdge(FaunusVertex.java:231)
at com.thinkaurelius.titan.hadoop.formats.util.TitanHadoopGraph.readHadoopVertex(TitanHadoopGraph.java:82)
at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraHadoopGraph.readHadoopVertex(TitanCassandraHadoopGraph.java:30)
at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraRecordReader.nextKeyValue(TitanCassandraRecordReader.java:49)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.lib.chain.Chain$ChainRecordReader.nextKeyValue(Chain.java:156)
at org.apache.hadoop.mapreduce.lib.chain.ChainMapContextImpl.nextKeyValue(ChainMapContextImpl.java:78)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapreduce.lib.chain.Chain$MapRunner.run(Chain.java:321)

How can I solve this OOM problem?

   Regards,
   Andy

Matthias Broecheler

Nov 19, 2014, 2:27:38 AM
to aureliu...@googlegroups.com
You probably want to assign more heap space to the Hadoop task JVMs.
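For example, with MapReduce on YARN the per-task heap is controlled by properties along these lines (just a sketch: the property names are the standard Hadoop 2.x ones, but the values are placeholders you should size for your own cluster, set in mapred-site.xml or wherever your job picks up its Hadoop configuration):

# placeholder values: keep -Xmx somewhat below the container size in mb
mapreduce.map.memory.mb=4096
mapreduce.map.java.opts=-Xmx3584m
mapreduce.reduce.memory.mb=4096
mapreduce.reduce.java.opts=-Xmx3584m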

--
Matthias Broecheler
http://www.matthiasb.com