Getting errors when running Camus

Prachi G

Nov 24, 2015, 2:09:32 AM
to Camus - Kafka ETL for Hadoop
Hi,

When I run Camus, I am seeing ~20% data loss. In the logs, I found the following exceptions:

15/11/23 22:06:56 ERROR kafka.CamusJob: Error for EtlKey [topic=platform.enrch partition=47 leaderId= server= service= beginOffset=1351096487 offset=1351096488 msgSize=1784 server= checksum=163847142 time=1448345184294 service=unknown_service message.size=1784 server=unknown_server]: Topic not fully pulled, max task time reached at 2015-11-23T22:06:24.294-08:00, pulled 6847230 records

15/11/23 22:06:56 ERROR kafka.CamusJob: Errors from file [/user/prachi/exec/2015-11-24-05-05-16/errors-m-00081]


What should I try here?


When I ran hadoop fs -text /user/prachi/exec/2015-11-24-05-05-16/errors-m-00081, I got:


15/11/23 23:09:15 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
15/11/23 23:09:15 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
-text: Fatal internal error
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: com.linkedin.camus.etl.kafka.common.EtlKey
	at org.apache.hadoop.io.SequenceFile$Reader.getKeyClass(SequenceFile.java:2014)
	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1945)
	at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
	at org.apache.hadoop.fs.shell.Display$TextRecordInputStream.<init>(Display.java:222)
	at org.apache.hadoop.fs.shell.Display$Text.getInputStream(Display.java:152)
	at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:101)
	at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:306)
	at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:278)
	at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:260)
	at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:244)
	at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:190)
	at org.apache.hadoop.fs.shell.Command.run(Command.java:154)
	at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Caused by: java.io.IOException: WritableName can't load class: com.linkedin.camus.etl.kafka.common.EtlKey
	at org.apache.hadoop.io.WritableName.getClass(WritableName.java:77)
	at org.apache.hadoop.io.SequenceFile$Reader.getKeyClass(SequenceFile.java:2012)
	... 16 more
Caused by: java.lang.ClassNotFoundException: Class com.linkedin.camus.etl.kafka.common.EtlKey not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
	at org.apache.hadoop.io.WritableName.getClass(WritableName.java:75)
	... 17 more
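
(Side note on the -text failure itself: the FsShell can't deserialize the SequenceFile keys because com.linkedin.camus.etl.kafka.common.EtlKey is not on its classpath. A minimal sketch of a workaround that may make the file readable; the jar path/name below is a placeholder for whatever your Camus build produces:)

# put the Camus classes on the shell's classpath before re-running -text
export HADOOP_CLASSPATH=/path/to/camus-etl-kafka-shaded.jar
hadoop fs -text /user/prachi/exec/2015-11-24-05-05-16/errors-m-00081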

Prachi G

Dec 1, 2015, 8:21:31 AM
to Camus - Kafka ETL for Hadoop
Basically, the exception is coming from: https://github.com/linkedin/camus/blob/master/camus-etl-kafka/src/main/java/com/linkedin/camus/etl/kafka/mapred/EtlRecordReader.java#L335
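
Roughly, that line sits behind a time bound like the following (a paraphrased sketch, not the verbatim source; the class and helper names here are mine):

// Paraphrased sketch of the per-task time bound in EtlRecordReader (assumption, not verbatim).
class PullTimeBoundSketch {
    // kafka.max.pull.minutes.per.task=-1 should make the bound effectively infinite.
    static long computeMaxPullTime(int maxPullMinutesPerTask) {
        if (maxPullMinutesPerTask == -1) {
            return Long.MAX_VALUE; // no limit
        }
        return System.currentTimeMillis() + maxPullMinutesPerTask * 60_000L;
    }

    // When this turns true mid-pull, the reader gives up on the partition and
    // records "Topic not fully pulled, max task time reached".
    static boolean maxTaskTimeReached(long maxPullTime) {
        return System.currentTimeMillis() > maxPullTime;
    }
}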

But in my configs, I have set it to -1:

# Max minutes for each mapper to pull messages (-1 means no limit)
kafka.max.pull.minutes.per.task=-1

In fact, I also tried a very large positive number. Can someone please point out the mistake I am making?
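
For anyone cross-checking their setup: in the stock camus.properties example, this setting sits next to a few other limits that also cap how much each run pulls. The values below are the template's illustrative defaults (assuming the stock template), not a recommendation:

# max hadoop tasks to use; each task can pull multiple topic partitions
mapred.map.tasks=30
# max historical time that will be pulled from each partition, based on event timestamp
kafka.max.pull.hrs=1
# events with a timestamp older than this will be discarded
kafka.max.historical.days=3
# Max minutes for each mapper to pull messages (-1 means no limit)
kafka.max.pull.minutes.per.task=-1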