java.io.EOFException - not a SequenceFile

Maurice Wolter

26 May 2016, 03:35:36
to gobblin-users
Hi to you all,

I am running Gobblin Kafka-to-HDFS ingestion as a regular job via Jenkins. My job regularly fails with a java.io.EOFException. Because the next run always succeeds and the temporary files have been deleted by then, I have not yet been able to track down the cause of the error.
Has any of you experienced a similar issue?

(mxraw is the job name.)

Here's the stack trace:
May 26, 2016 8:16:08 AM com.google.common.util.concurrent.AbstractScheduledService$1$1 run
WARNING: Error while attempting to shut down the service after failure.
java.io.IOException: java.io.EOFException: hdfs://nameservice1/data/mxraw/.ingestion/working/mxraw/output/job_mxraw_1464243302979/task_mxraw_1464243302979_7.tst not a SequenceFile
	at gobblin.util.ParallelRunner.close(ParallelRunner.java:291)
	at gobblin.runtime.TaskStateCollectorService.collectOutputTaskStates(TaskStateCollectorService.java:145)
	at gobblin.runtime.TaskStateCollectorService.runOneIteration(TaskStateCollectorService.java:81)
	at gobblin.runtime.TaskStateCollectorService.shutDown(TaskStateCollectorService.java:102)
	at com.google.common.util.concurrent.AbstractScheduledService$1$1.run(AbstractScheduledService.java:175)
	at com.google.common.util.concurrent.Callables$3.run(Callables.java:93)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException: hdfs://nameservice1/data/mxraw/.ingestion/working/mxraw/output/job_mxraw_1464243302979/task_mxraw_1464243302979_7.tst not a SequenceFile
	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1852)
	at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
	at gobblin.util.ParallelRunner$3.call(ParallelRunner.java:160)
	at gobblin.util.ParallelRunner$3.call(ParallelRunner.java:154)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	... 3 more

16/05/26 08:16:13 INFO mapreduce.Job: Job job_1461182855125_53342 completed successfully
16/05/26 08:16:13 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=4210550
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=83616628
		HDFS: Number of bytes written=1833615219
		HDFS: Number of read operations=132219
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=13269
	Job Counters 
		Launched map tasks=30
		Other local map tasks=30
		Total time spent by all maps in occupied slots (ms)=3701607
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=1233869
		Total vcore-seconds taken by all map tasks=1233869
		Total megabyte-seconds taken by all map tasks=3790445568
	Map-Reduce Framework
		Map input records=30
		Map output records=0
		Input split bytes=4710
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=4946
		CPU time spent (ms)=998150
		Physical memory (bytes) snapshot=24812699648
		Virtual memory (bytes) snapshot=82111893504
		Total committed heap usage (bytes)=25757220864
	File Input Format Counters 
		Bytes Read=37174
	File Output Format Counters 
		Bytes Written=0
16/05/26 08:16:13 ERROR runtime.AbstractJobLauncher: Failed to launch and run job job_mxraw_1464243302979: java.lang.IllegalStateException: Expected the service to be TERMINATED, but the service has FAILED
java.lang.IllegalStateException: Expected the service to be TERMINATED, but the service has FAILED
	at com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:334)
	at com.google.common.util.concurrent.AbstractService.awaitTerminated(AbstractService.java:303)
	at com.google.common.util.concurrent.AbstractScheduledService.awaitTerminated(AbstractScheduledService.java:402)
	at gobblin.runtime.mapreduce.MRJobLauncher.runWorkUnits(MRJobLauncher.java:227)
	at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:261)
	at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:60)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:133)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: java.io.EOFException: hdfs://nameservice1/data/mxraw/.ingestion/working/mxraw/output/job_mxraw_1464243302979/task_mxraw_1464243302979_15.tst not a SequenceFile
	at gobblin.util.ParallelRunner.close(ParallelRunner.java:291)
	at gobblin.runtime.TaskStateCollectorService.collectOutputTaskStates(TaskStateCollectorService.java:145)
	at gobblin.runtime.TaskStateCollectorService.runOneIteration(TaskStateCollectorService.java:81)
	at com.google.common.util.concurrent.AbstractScheduledService$1$1.run(AbstractScheduledService.java:172)
	at com.google.common.util.concurrent.Callables$3.run(Callables.java:93)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException: hdfs://nameservice1/data/mxraw/.ingestion/working/mxraw/output/job_mxraw_1464243302979/task_mxraw_1464243302979_15.tst not a SequenceFile
	at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1852)
	at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
	at gobblin.util.ParallelRunner$3.call(ParallelRunner.java:160)
	at gobblin.util.ParallelRunner$3.call(ParallelRunner.java:154)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	... 3 more
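
A note on reading the message above: "not a SequenceFile" combined with an EOFException suggests that the reader hit end-of-file while still reading the file header, which would fit a task state file (.tst) that was empty or only partially written when the collector opened it. The following is a minimal sketch of that distinction, not Gobblin or Hadoop code. It relies only on the fact that a SequenceFile starts with the bytes 'S', 'E', 'Q' followed by one version byte; the class name `SequenceFileHeaderCheck`, the enum `HeaderStatus`, and the method `classify` are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper (not part of Gobblin or Hadoop): inspects the first
// four bytes of a stream and reports whether they look like a SequenceFile header.
public class SequenceFileHeaderCheck {

    /** Possible outcomes of inspecting the start of a file. */
    public enum HeaderStatus { VALID_MAGIC, TOO_SHORT, WRONG_MAGIC }

    // 'S', 'E', 'Q' plus one version byte.
    private static final int HEADER_LENGTH = 4;

    public static HeaderStatus classify(InputStream in) {
        byte[] header = new byte[HEADER_LENGTH];
        try {
            // readFully throws EOFException if fewer than 4 bytes are available,
            // which is the "empty or truncated file" case.
            new DataInputStream(in).readFully(header);
        } catch (EOFException e) {
            return HeaderStatus.TOO_SHORT;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        boolean magicMatches = header[0] == 'S' && header[1] == 'E' && header[2] == 'Q';
        return magicMatches ? HeaderStatus.VALID_MAGIC : HeaderStatus.WRONG_MAGIC;
    }

    public static void main(String[] args) {
        byte[] empty = {};                          // e.g. a file created but not yet written
        byte[] sequenceFile = {'S', 'E', 'Q', 6, 0}; // magic bytes plus a version byte
        byte[] plainText = {'t', 'e', 'x', 't'};     // long enough, but not a SequenceFile

        System.out.println(classify(new ByteArrayInputStream(empty)));        // TOO_SHORT
        System.out.println(classify(new ByteArrayInputStream(sequenceFile))); // VALID_MAGIC
        System.out.println(classify(new ByteArrayInputStream(plainText)));    // WRONG_MAGIC
    }
}
```

To apply the same check to a real file, one could pass in a stream opened with Hadoop's `FileSystem.open(Path)`, assuming the file still exists at that point. In the situation described here the working directory is cleaned up after the run, so the check would have to happen before that cleanup.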

Maurice Wolter

26 May 2016, 03:38:18
to gobblin-users
I forgot to mention the versions I am using:

Hadoop    2.6.0-cdh5.7.0
kafka        0.9.0-kafka-2.0.1
Gobblin   0.6.2

Sahil Takiar

27 May 2016, 12:40:24
to Maurice Wolter, gobblin-users
What version of Gobblin are you using? There used to be a bug that caused this, but it was fixed. Can you make sure you are using version 0.7.0?

--Sahil

Maurice Wolter

27 May 2016, 12:48:39
to gobblin-users, maurice...@gmail.com
Hey,
thanks for the hint. I am indeed still using 0.6.2, but I have already prepared everything for the update to 0.7, so I'll see next week whether it solves the problem.
Do you know the issue number or pull request? I would like to know more details on this.

-- Maurice

Sahil Takiar

27 May 2016, 12:50:27
to Maurice Wolter, gobblin-users
This post on the Google Group has more details about the 0.7.0 release: https://groups.google.com/d/msg/gobblin-users/O0hpEm59WCM/wQgRO1uaCQAJ
