kafka.max.pull.minutes.per.task=-1
kafka.max.pull.hrs=1
Any other suggestions?
Thanks,
Roger
$ HADOOP_CONF_DIR=/etc/bdf/hadoop/conf hadoop jar /opt/bdf/hadoop/target/hadoop-0.0.1-SNAPSHOT.jar com.linkedin.camus.etl.kafka.CamusJob -P /etc/bdf/hadoop/camus.properties
14/02/19 22:40:57 INFO kafka.CamusJob: Dir Destination set to: /user/bigdatafoundry/sit/ingestion/data
14/02/19 22:40:58 INFO kafka.CamusJob: removing old execution: 2014-02-03-15-20-07
14/02/19 22:40:58 INFO kafka.CamusJob: Previous execution: hdfs://nameservice1/user/bigdatafoundry/sit/ingestion/metadata/history/2014-02-19-21-50-08
14/02/19 22:40:58 INFO kafka.CamusJob: New execution temp location: /user/bigdatafoundry/sit/ingestion/metadata/2014-02-19-22-40-58
14/02/19 22:40:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/02/19 22:40:59 INFO mapred.EtlInputFormat: Fetching metadata from broker 13.7.140.149:9092 with client id -stream-ingester for 0 topic(s) []
14/02/19 22:41:00 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
14/02/19 22:41:00 INFO compress.CodecPool: Got brand-new compressor [.deflate]
14/02/19 22:41:00 INFO mapred.EtlInputFormat: previous offset file:hdfs://nameservice1/user/bigdatafoundry/sit/ingestion/metadata/history/2014-02-19-21-50-08/offsets-previous
14/02/19 22:41:00 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:0 offset:40 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:1 offset:40 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:2 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:3 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:4 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:5 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:6 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.raw.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:7 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:0 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:1 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:2 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:3 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:4 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:5 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:6 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.EtlInputFormat: bdf.validated.msgs uri:tcp://13.7.140.149:9092 leader:168427565 partition:7 offset:0 latest_offset:125
14/02/19 22:41:00 INFO mapred.JobClient: Running job: job_201402042151_8376
14/02/19 22:41:01 INFO mapred.JobClient: map 0% reduce 0%
14/02/19 22:41:14 INFO mapred.JobClient: map 32% reduce 0%
14/02/19 22:41:17 INFO mapred.JobClient: map 39% reduce 0%
14/02/19 22:41:23 INFO mapred.JobClient: map 47% reduce 0%
14/02/19 22:41:25 INFO mapred.JobClient: map 100% reduce 0%
14/02/19 22:41:25 INFO mapred.JobClient: Job complete: job_201402042151_8376
14/02/19 22:41:26 INFO mapred.JobClient: Counters: 27
14/02/19 22:41:26 INFO mapred.JobClient: File System Counters
14/02/19 22:41:26 INFO mapred.JobClient: FILE: Number of bytes read=0
14/02/19 22:41:26 INFO mapred.JobClient: FILE: Number of bytes written=195637
14/02/19 22:41:26 INFO mapred.JobClient: FILE: Number of read operations=0
14/02/19 22:41:26 INFO mapred.JobClient: FILE: Number of large read operations=0
14/02/19 22:41:26 INFO mapred.JobClient: FILE: Number of write operations=0
14/02/19 22:41:26 INFO mapred.JobClient: HDFS: Number of bytes read=1217
14/02/19 22:41:26 INFO mapred.JobClient: HDFS: Number of bytes written=707799
14/02/19 22:41:26 INFO mapred.JobClient: HDFS: Number of read operations=1
14/02/19 22:41:26 INFO mapred.JobClient: HDFS: Number of large read operations=0
14/02/19 22:41:26 INFO mapred.JobClient: HDFS: Number of write operations=9
14/02/19 22:41:26 INFO mapred.JobClient: Job Counters
14/02/19 22:41:26 INFO mapred.JobClient: Launched map tasks=1
14/02/19 22:41:26 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=22230
14/02/19 22:41:26 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
14/02/19 22:41:26 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
14/02/19 22:41:26 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
14/02/19 22:41:26 INFO mapred.JobClient: Map-Reduce Framework
14/02/19 22:41:26 INFO mapred.JobClient: Map input records=912
14/02/19 22:41:26 INFO mapred.JobClient: Map output records=924
14/02/19 22:41:26 INFO mapred.JobClient: Input split bytes=1217
14/02/19 22:41:26 INFO mapred.JobClient: Spilled Records=0
14/02/19 22:41:26 INFO mapred.JobClient: CPU time spent (ms)=4890
14/02/19 22:41:26 INFO mapred.JobClient: Physical memory (bytes) snapshot=386433024
14/02/19 22:41:26 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1631154176
14/02/19 22:41:26 INFO mapred.JobClient: Total committed heap usage (bytes)=991821824
14/02/19 22:41:26 INFO mapred.JobClient: total
14/02/19 22:41:26 INFO mapred.JobClient: data-read=1096446
14/02/19 22:41:26 INFO mapred.JobClient: decode-time(ms)=137
14/02/19 22:41:26 INFO mapred.JobClient: event-count=1767
14/02/19 22:41:26 INFO mapred.JobClient: request-time(ms)=18248
14/02/19 22:41:26 INFO kafka.CamusJob: Group: File System Counters
14/02/19 22:41:26 INFO kafka.CamusJob: FILE: Number of bytes read: 0
14/02/19 22:41:26 INFO kafka.CamusJob: FILE: Number of bytes written: 195637
14/02/19 22:41:26 INFO kafka.CamusJob: FILE: Number of read operations: 0
14/02/19 22:41:26 INFO kafka.CamusJob: FILE: Number of large read operations: 0
14/02/19 22:41:26 INFO kafka.CamusJob: FILE: Number of write operations: 0
14/02/19 22:41:26 INFO kafka.CamusJob: HDFS: Number of bytes read: 1217
14/02/19 22:41:26 INFO kafka.CamusJob: HDFS: Number of bytes written: 707799
14/02/19 22:41:26 INFO kafka.CamusJob: HDFS: Number of read operations: 1
14/02/19 22:41:26 INFO kafka.CamusJob: HDFS: Number of large read operations: 0
14/02/19 22:41:26 INFO kafka.CamusJob: HDFS: Number of write operations: 9
14/02/19 22:41:26 INFO kafka.CamusJob: Group: Job Counters
14/02/19 22:41:26 INFO kafka.CamusJob: Launched map tasks: 1
14/02/19 22:41:26 INFO kafka.CamusJob: Total time spent by all maps in occupied slots (ms): 22230
14/02/19 22:41:26 INFO kafka.CamusJob: Total time spent by all reduces in occupied slots (ms): 0
14/02/19 22:41:26 INFO kafka.CamusJob: Total time spent by all maps waiting after reserving slots (ms): 0
14/02/19 22:41:26 INFO kafka.CamusJob: Total time spent by all reduces waiting after reserving slots (ms): 0
14/02/19 22:41:26 INFO kafka.CamusJob: Group: Map-Reduce Framework
14/02/19 22:41:26 INFO kafka.CamusJob: Map input records: 912
14/02/19 22:41:26 INFO kafka.CamusJob: Map output records: 924
14/02/19 22:41:26 INFO kafka.CamusJob: Input split bytes: 1217
14/02/19 22:41:26 INFO kafka.CamusJob: Spilled Records: 0
14/02/19 22:41:26 INFO kafka.CamusJob: CPU time spent (ms): 4890
14/02/19 22:41:26 INFO kafka.CamusJob: Physical memory (bytes) snapshot: 386433024
14/02/19 22:41:26 INFO kafka.CamusJob: Virtual memory (bytes) snapshot: 1631154176
14/02/19 22:41:26 INFO kafka.CamusJob: Total committed heap usage (bytes): 991821824
14/02/19 22:41:26 INFO kafka.CamusJob: Group: total
14/02/19 22:41:26 INFO kafka.CamusJob: data-read: 1096446
14/02/19 22:41:26 INFO kafka.CamusJob: decode-time(ms): 137
14/02/19 22:41:26 INFO kafka.CamusJob: event-count: 1767
14/02/19 22:41:26 INFO kafka.CamusJob: request-time(ms): 18248
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=0 offset=1 server= checksum=634667686 time=1392849674100
java.io.IOException: java.lang.IndexOutOfBoundsException
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.getWrappedRecord(EtlRecordReader.java:128)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.nextKeyValue(EtlRecordReader.java:255)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.IndexOutOfBoundsException
at java.io.ByteArrayInputStream.read(ByteArrayInputStream.java:163)
at org.apache.avro.io.DirectBinaryDecoder.doReadBytes(DirectBinaryDecoder.java:184)
at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:262)
at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:107)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:344)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:337)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:150)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:173)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:74)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:36)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.getWrappedRecord(EtlRecordReader.java:125)
... 12 more
[the identical java.io.IOException: java.lang.IndexOutOfBoundsException stack trace above follows each of these records; elided here for brevity]
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=1 offset=2 server= checksum=1167688791 time=1392849674105
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=2 offset=3 server= checksum=2124285611 time=1392849674106
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=3 offset=4 server= checksum=871654287 time=1392849674106
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=4 offset=5 server= checksum=69324309 time=1392849674107
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=5 offset=6 server= checksum=420984128 time=1392849674107
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=6 offset=7 server= checksum=3635816992 time=1392849674108
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=7 offset=8 server= checksum=3965401232 time=1392849674108
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=8 offset=9 server= checksum=168114254 time=1392849674109
topic=bdf.validated.msgs partition=0 leaderId=168427565 server= service= beginOffset=9 offset=10 server= checksum=1685958109 time=1392849674110
topic=bdf.validated.msgs partition=1 leaderId=168427565 server= service= beginOffset=77 offset=78 server= checksum=2967038340 time=1392849675153
java.lang.Exception: Java heap space
at org.apache.avro.util.Utf8.setByteLength(Utf8.java:77)
at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:260)
at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:107)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:344)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:337)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:150)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:173)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:74)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:36)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.getWrappedRecord(EtlRecordReader.java:125)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.nextKeyValue(EtlRecordReader.java:255)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.OutOfMemoryError: Java heap space
... 24 more
topic=bdf.validated.msgs partition=2 leaderId=168427565 server= service= beginOffset=32 offset=33 server= checksum=2297811482 time=1392849675328
java.lang.Exception: Java heap space
at org.apache.avro.util.Utf8.setByteLength(Utf8.java:77)
at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:260)
at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:107)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:344)
at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:337)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:150)
at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:173)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:144)
at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:135)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:74)
at com.parc.bdf.ingestion.hadoop.CamusMessageDecoder.decode(CamusMessageDecoder.java:36)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.getWrappedRecord(EtlRecordReader.java:125)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.nextKeyValue(EtlRecordReader.java:255)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:484)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:76)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:85)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:139)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:673)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:331)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.OutOfMemoryError: Java heap space
... 24 more
14/02/19 22:41:26 INFO kafka.CamusJob: Job finished
14/02/19 22:41:26 INFO kafka.CamusJob: ***********Timing Report*************
Job time (seconds):
pre setup 2.0 (7%)
get splits 1.0 (3%)
hadoop job 25.0 (86%)
commit 0.0 (0%)
Total: 0 minutes 29 seconds
Hadoop job task times (seconds):
min 19.0
mean 19.0
max 19.0
skew 19.0/19.0 = 1.00
Task wait time (seconds):
min 4.7
mean 4.7
max 4.7
Hadoop task breakdown:
kafka 96%
decode 1%
map output 0%
other 3%
Total MB read: 1
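A note on the failures above, with a sketch (my reading of the trace, not anything from the Camus or Avro source): both the IndexOutOfBoundsException and the OutOfMemoryError occur while BinaryDecoder is reading a string, which is a zig-zag varint length prefix followed by UTF-8 bytes. If the decoder is pointed at bytes that are not actually Avro in the expected schema (wrong offset, non-Avro payload, schema mismatch in CamusMessageDecoder), the "length" it reads is garbage: a huge value blows the heap on allocation, and a value larger than the remaining bytes overruns the buffer. The hypothetical snippet below simulates that failure mode with a defensive read that validates the length prefix before allocating; class and method names are mine, for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

public class SafeStringDecode {

    // Read an Avro-style zig-zag varint from the stream.
    static long readZigZagLong(ByteArrayInputStream in) {
        long n = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) throw new IllegalStateException("EOF inside varint");
            n |= (long) (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (n >>> 1) ^ -(n & 1); // undo zig-zag encoding
    }

    // Defensive string read: check the claimed length against the bytes
    // actually available before allocating, instead of trusting the prefix.
    static String readString(ByteArrayInputStream in) {
        long len = readZigZagLong(in);
        if (len < 0 || len > in.available()) {
            throw new IllegalArgumentException(
                "implausible string length " + len
                + " (have " + in.available() + " bytes)");
        }
        byte[] buf = new byte[(int) len];
        in.read(buf, 0, (int) len);
        return new String(buf, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Well-formed: zig-zag(5) = 0x0a, followed by "hello".
        byte[] good = {0x0a, 'h', 'e', 'l', 'l', 'o'};
        System.out.println(readString(new ByteArrayInputStream(good)));

        // Corrupt: length prefix decodes to a nonsense (negative) value.
        byte[] bad = {(byte) 0xff, (byte) 0xff, (byte) 0xff, 0x7f, 'x'};
        try {
            readString(new ByteArrayInputStream(bad));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If this is what is happening, the practical checks would be: confirm the messages on bdf.validated.msgs really are Avro serialized with the schema CamusMessageDecoder expects, and that the decoder starts reading at the correct byte offset within the Kafka message payload.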
--You received this message because you are subscribed to the Google Groups "Camus - Kafka ETL for Hadoop" group.