No AbstractFileSystem for scheme: null


Gabriele Tiberti

Sep 10, 2015, 10:33:10 AM
to gobblin-users
Hello everybody!
I'm quite new to the Kafka/Gobblin environment. I'm in a situation where millions of messages arrive at Kafka, and I'd like to copy them to my HDFS in order to analyse them via a Spark batch job.
The messages in Kafka are JSON strings, and I want to store them as Avro files in the Hadoop cluster.
I'm trying to set up the system using the default classes KafkaSimpleSource and AvroHdfsDataWriter; my needs are really simple, so I think I can avoid writing custom classes.
I set up the job and the properties file, but I continuously receive this error:

WARN [KafkaSource] Previous offset for partition impression_2015-09-08:0 does not exist. This partition will start from the earliest offset: 0
WARN [KafkaSource] Avg event size for partition impression_2015-09-08:0 not available, using default size 1024
WARN [UserGroupInformation] PriviledgedActionException as:gabriele (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: null
WARN [UserGroupInformation] PriviledgedActionException as:gabriele (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: null
ERROR [AbstractJobLauncher] Failed to launch and run job job_KafkaHdfsTest_1441895484485: org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: null
org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: null

and after some rows in the stack trace: 

Exception in thread "main" java.lang.IllegalArgumentException: Missing required property writer.staging.dir

even though writer.staging.dir is set in the properties file.

Does anybody have a suggestion?
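For context: Hadoop resolves an AbstractFileSystem implementation from the scheme of the filesystem URI (hdfs, file, ...), so "scheme: null" usually means the URI it ended up with was empty or scheme-less. A minimal sketch of fully-qualified URIs, with hypothetical host and port that would need to match the actual NameNode:

```
# Hypothetical values; host and port are placeholders for your cluster's NameNode
fs.uri=hdfs://namenode.example.com:8020
writer.fs.uri=hdfs://namenode.example.com:8020
state.store.fs.uri=hdfs://namenode.example.com:8020
```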

Ziyang Liu

Sep 10, 2015, 11:46:19 AM
to gobblin-users
Hi Gabriele, what's the value of fs.uri in your job config?

-Ziyang

Seong Hwan Cho

Sep 10, 2015, 11:48:33 AM
to gobblin-users
The exception looks very similar to the one I'm hitting right now (please refer to the very next post).
I assume you are running in MR mode.
Is the property writer.staging.dir set in the job configuration file or in the gobblin-mapreduce.properties file?
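If it is the framework file that is being picked up in MR mode, a minimal sketch of the entries it would need (the paths below are placeholders, not values from this thread):

```
# gobblin-mapreduce.properties (hypothetical paths)
writer.staging.dir=/gobblin/work-dir/task-staging
writer.output.dir=/gobblin/work-dir/task-output
```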

Gabriele Tiberti

Sep 10, 2015, 11:51:00 AM
to gobblin-users
Hi Ziyang,
This is my job.pull file:

job.name=KafkaHdfsTest
job.group=Kafka
job.description=Kafka Extractor for Gobblin
job.lock.enabled=false

source.class=gobblin.source.extractor.extract.kafka.KafkaSimpleSource
converter.classes=gobblin.converter.IdentityConverter
extract.namespace=gobblin.extract.kafka

fs.uri=hdfs://xxx.xxx.xxx.com

writer.destination.type=HDFS
writer.output.format=AVRO
writer.fs.uri=hdfs://xxx.xxx.xxx.com
writer.staging.dir=/user/gabriele/gobblinStaging
writer.output.dir=/user/gabriele/gobblinTest

data.publisher.type=gobblin.publisher.BaseDataPublisher

topic.whitelist=impression_2015-09-08
topic.name=impression_2015-09-08
bootstrap.with.offset=earliest

kafka.brokers=xxx.xxx.xxx.com:9092

writer.builder.class=gobblin.writer.SimpleDataWriterBuilder

mr.job.max.mappers=20


and my properties for this job are:


# Thread pool settings for the task executor
taskexecutor.threadpool.size=2
taskretry.threadpool.coresize=1
taskretry.threadpool.maxsize=2

# File system URIs
fs.uri=hdfs://xxx.xxx.xxx.com
writer.fs.uri=${fs.uri}
state.store.fs.uri=${fs.uri}

# Writer related configuration properties
writer.destination.type=HDFS
writer.output.format=AVRO
writer.staging.dir=$GOBBLIN_WORK_DIR/task-staging
writer.output.dir=$GOBBLIN_WORK_DIR/task-output

# Data publisher related configuration properties
data.publisher.type=gobblin.publisher.BaseDataPublisher
data.publisher.final.dir=$GOBBLIN_WORK_DIR/job-output
data.publisher.replace.final.dir=false

# Directory where job/task state files are stored
state.store.dir=$GOBBLIN_WORK_DIR/state-store

# Directory where error files from the quality checkers are stored
qualitychecker.row.err.file=$GOBBLIN_WORK_DIR/err

# Directory where job locks are stored
job.lock.dir=$GOBBLIN_WORK_DIR/locks

# Directory where metrics log files are stored
metrics.log.dir=$GOBBLIN_WORK_DIR/metrics

# Interval of task state reporting in milliseconds
task.status.reportintervalinms=5000

# MapReduce properties
mr.job.root.dir=$GOBBLIN_WORK_DIR/working 

and I set GOBBLIN_WORK_DIR to a folder on hdfs://xxx..etc
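One likely pitfall with the properties file above: java.util.Properties performs no shell-style variable expansion, so unless Gobblin's launch script substitutes $GOBBLIN_WORK_DIR before the file is loaded, the literal string reaches the job. A minimal, self-contained sketch of that behaviour (plain JDK code, not Gobblin's):

```java
import java.io.StringReader;
import java.util.Properties;

public class PropsExpansionDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // java.util.Properties treats "$GOBBLIN_WORK_DIR" as ordinary text:
        props.load(new StringReader("writer.staging.dir=$GOBBLIN_WORK_DIR/task-staging"));
        // Prints the literal, unexpanded value:
        System.out.println(props.getProperty("writer.staging.dir"));
    }
}
```

Running this prints `$GOBBLIN_WORK_DIR/task-staging`, a path with no URI scheme, which would explain both the "scheme: null" warnings and the "Missing required property" error if the raw value is what the job sees.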

Vamsikrushna L

Dec 29, 2015, 1:25:20 AM
to gobblin-users
Hello,

I am getting the same error.
Please let me know if you have found the solution.

Thanks in advance!

Sahil Takiar

Jan 5, 2016, 11:58:48 PM
to Vamsikrushna L, gobblin-users


Vamsikrushna L

Jan 6, 2016, 1:22:26 AM
to gobblin-users, vamsi....@gmail.com
Hi Sahil,

Thanks a lot for your reply.
I was able to fix that problem, but now I am getting the exception below.
Please check this.

        at gobblin.runtime.AbstractJobLauncher.runWorkUnits(AbstractJobLauncher.java:579)
        at gobblin.runtime.mapreduce.MRJobLauncher$TaskRunner.run(MRJobLauncher.java:546)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Error: java.io.IOException: Not all tasks running in container attempt_1451645045308_0046_m_000000_1 completed successfully
        at gobblin.runtime.AbstractJobLauncher.runWorkUnits(AbstractJobLauncher.java:579)
        at gobblin.runtime.mapreduce.MRJobLauncher$TaskRunner.run(MRJobLauncher.java:546)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

Error: java.io.IOException: Not all tasks running in container attempt_1451645045308_0046_m_000000_2 completed successfully
        at gobblin.runtime.AbstractJobLauncher.runWorkUnits(AbstractJobLauncher.java:579)
        at gobblin.runtime.mapreduce.MRJobLauncher$TaskRunner.run(MRJobLauncher.java:546)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

WARN [AbstractJobLauncher] Not committing dataset  of job job_GobblinKafkaQuickStartMR_1452060688329 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED
ERROR [AbstractJobLauncher] Failed to launch and run job job_GobblinKafkaQuickStartMR_1452060688329: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStartMR_1452060688329
java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_GobblinKafkaQuickStartMR_1452060688329
        at gobblin.runtime.JobContext.commit(JobContext.java:346)
        at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:258)
        at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:60)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:133)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Failed to launch the job due to the following exception:
gobblin.runtime.JobException: Job job_GobblinKafkaQuickStartMR_1452060688329 failed

Prashant Bhardwaj

Jan 6, 2016, 4:50:35 AM
to gobblin-users, vamsi....@gmail.com
This is not the complete log. Please post the YARN logs for a better understanding. To access the YARN logs, use "yarn logs -applicationId <application ID>".

Vamsikrushna L

Jan 6, 2016, 6:29:43 AM
to gobblin-users, vamsi....@gmail.com
I was able to fix the problem.
It is working fine now.

On Wednesday, January 6, 2016 at 3:44:23 PM UTC+5:30, Vamsikrushna L wrote:
Hi Prashant,

Thanks a lot for your response.

PFA log.

Thanks and regards,
Vamsi.

Bala Kasaram

May 27, 2016, 3:19:57 AM
to gobblin-users, vamsi....@gmail.com
What was the problem? I am facing the same kind of issue. Can you tell me how you fixed it?

Sahil Takiar

May 27, 2016, 12:29:52 PM
to Bala Kasaram, gobblin-users, Vamsikrushna L