question about frame size setting


andys...@gmail.com

Nov 17, 2014, 2:40:35 AM
to aureliu...@googlegroups.com
Hi, I am using Titan-Hadoop to compute graph statistics, and the dataset has grown very large.
I set the following configuration:
cassandra.input.split.size=512
cassandra.thrift.framed.size_mb=490
cassandra.thrift.message.max_size_mb=491

When I ran the Groovy script in Gremlin and saw the frame size warning, I kept increasing the frame size. Eventually the warning went away, but then I saw the exception below:

2014-11-15 15:48:56,804 INFO [elasticsearch[Aireo][generic][T#15]] org.elasticsearch.client.transport: [Aireo] failed to get local cluster state for [#transport#-1][store28.antfact.com][inet[es1/172.19.141.7:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [][inet[es1/172.19.141.7:9300]][cluster/state] request_id [614] timed out after [16863ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:48:56,804 WARN [pool-23-thread-1] com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog: Could not read messages for timestamp [Timepoint[1416037719440000 μs]] (this read will be retried)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller.run(KCVSLog.java:669)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:48:56,804 INFO [elasticsearch[Aireo][generic][T#14]] org.elasticsearch.client.transport: [Aireo] failed to get local cluster state for [Colonel][B4AcCUaJTYi5VunJM9UpVA][es1][inet[/172.19.141.7:9300]], disconnecting...
org.elasticsearch.transport.ReceiveTimeoutTransportException: [Colonel][inet[/172.19.141.7:9300]][cluster/state] request_id [613] timed out after [16863ms]
at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:369)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-11-15 15:49:53,162 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 380396; bufvoid = 1073741824
2014-11-15 15:49:53,163 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 268435452(1073741808); kvend = 268423236(1073692944); length = 12217/67108864
2014-11-15 15:49:53,207 INFO [main] org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2014-11-15 15:49:53,207 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.deflate]
2014-11-15 15:49:53,297 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill 0
2014-11-15 15:49:53,299 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.hadoop.mapreduce.lib.chain.Chain.joinAllThreads(Chain.java:526)
at org.apache.hadoop.mapreduce.lib.chain.ChainMapper.run(ChainMapper.java:169)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
at java.lang.StringBuilder.append(StringBuilder.java:132)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigElement.toString(ConfigElement.java:64)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigOption.verify(ConfigOption.java:133)
at com.thinkaurelius.titan.diskstorage.configuration.ConfigOption.get(ConfigOption.java:127)
at com.thinkaurelius.titan.diskstorage.configuration.BasicConfiguration.get(BasicConfiguration.java:59)
at com.thinkaurelius.titan.hadoop.FaunusPathElement.setConf(FaunusPathElement.java:57)
at com.thinkaurelius.titan.hadoop.FaunusPathElement.<init>(FaunusPathElement.java:48)
at com.thinkaurelius.titan.hadoop.FaunusVertex.<init>(FaunusVertex.java:53)
at com.thinkaurelius.titan.hadoop.StandardFaunusEdge.getVertex(StandardFaunusEdge.java:78)
at com.thinkaurelius.titan.hadoop.StandardFaunusEdge.getVertex(StandardFaunusEdge.java:86)
at com.thinkaurelius.titan.hadoop.FaunusElement.addRelation(FaunusElement.java:178)
at com.thinkaurelius.titan.hadoop.FaunusVertex.addEdge(FaunusVertex.java:231)
at com.thinkaurelius.titan.hadoop.formats.util.TitanHadoopGraph.readHadoopVertex(TitanHadoopGraph.java:82)
at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraHadoopGraph.readHadoopVertex(TitanCassandraHadoopGraph.java:30)
at com.thinkaurelius.titan.hadoop.formats.cassandra.TitanCassandraRecordReader.nextKeyValue(TitanCassandraRecordReader.java:49)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.lib.chain.Chain$ChainRecordReader.nextKeyValue(Chain.java:156)
at org.apache.hadoop.mapreduce.lib.chain.ChainMapContextImpl.nextKeyValue(ChainMapContextImpl.java:78)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapreduce.lib.chain.Chain$MapRunner.run(Chain.java:321)

How can I solve this OOM problem?

   Regards,
   Andy

Matthias Broecheler

Nov 19, 2014, 2:27:38 AM
to aureliu...@googlegroups.com
You probably want to assign more heap space to the Hadoop task JVMs.
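For example, with MapReduce on YARN the per-task heap is controlled by properties along these lines (just a sketch: the property names are the standard Hadoop 2.x ones, but the values are placeholders you should size for your own cluster, set in mapred-site.xml or wherever your job picks up its Hadoop configuration):

# placeholder values: keep -Xmx somewhat below the container size in mb
mapreduce.map.memory.mb=4096
mapreduce.map.java.opts=-Xmx3584m
mapreduce.reduce.memory.mb=4096
mapreduce.reduce.java.opts=-Xmx3584m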

--
Matthias Broecheler
http://www.matthiasb.com