gobblin on dataproc


Henrique G. Abreu

May 19, 2017, 3:02:11 PM5/19/17
to gobblin-users
I'm trying to run Gobblin on Google Dataproc, but I'm facing quite a few issues. The latest one I'm stuck on is:

WARN [JobContext] Property task.data.root.dir is missing. 
WARN [KafkaSource] Previous offset for partition mytopic:0 does not exist. This partition will start from the earliest offset: 0 
WARN [KafkaSource] Previous offset for partition mytopic:1 does not exist. This partition will start from the earliest offset: 0 
WARN [TaskStateCollectorService] No output task state files found in gobblin/kafka2gcs/job_kafka2gcs_1495211213990/output/job_kafka2gcs_1495211213990 
WARN [SafeDatasetCommit] Not committing dataset  of job job_kafka2gcs_1495211213990 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED 
ERROR [JobContext] Iterator executor failure. 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Not committing dataset  of job job_kafka2gcs_1495211213990 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED
... stacktrace continues 

I have no idea why the "Iterator executor" failed. If anyone has any insights, please help.

Here is my gobblin job:

job.name=kafka2gcs
job.group=gkafka2gcs
job.description=Gobblin job to read messages from Kafka and save as is on GCS
job.lock.enabled=false

kafka.brokers=mykafka:9092
topic.whitelist=mytopic
bootstrap.with.offset=earliest

source.class=gobblin.source.extractor.extract.kafka.KafkaDeserializerSource
kafka.deserializer.type=BYTE_ARRAY
extract.namespace=nskafka2gcs

writer.builder.class=gobblin.writer.SimpleDataWriterBuilder
writer.destination.type=HDFS
mr.job.max.mappers=2
writer.output.format=txt
data.publisher.type=gobblin.publisher.BaseDataPublisher
metrics.enabled=false

fs.uri=file:///.
writer.fs.uri=${fs.uri}
mr.job.root.dir=gobblin
writer.output.dir=${mr.job.root.dir}/out
writer.staging.dir=${mr.job.root.dir}/stg

fs.gs.project.id=my-test-project
data.publisher.fs.uri=gs://my-bucket
state.store.fs.uri=${data.publisher.fs.uri}
data.publisher.final.dir=gobblin/pub
state.store.dir=gobblin/state

These are the commands I issue for dataproc:

gcloud dataproc clusters create myspark \
  --image-version 1.1 \
  --master-machine-type n1-standard-4 \
  --master-boot-disk-size 10 \
  --num-workers 2 \
  --worker-machine-type n1-standard-4 \
  --worker-boot-disk-size 10 \
  --scopes 'https://www.googleapis.com/auth/cloud-platform' \
  --initialization-actions gs://my-bucket/gobblin-setup.sh

gcloud dataproc jobs submit hadoop \
  --cluster=myspark \
  --class gobblin.runtime.mapreduce.CliMRJobLauncher \
  --properties mapreduce.job.user.classpath.first=true \
  -- -sysconfig /opt/gobblin-dist/conf/gobblin-mapreduce.properties -jobconfig gs://my-bucket/gobblin-kafka-gcs.job

To set up Dataproc with the Gobblin jars on the path, I pass this "gobblin-setup.sh" (below) as the cluster initialization action:

#!/bin/bash
sed -i 's%hdfs://myspark-m%file:///.%g' /usr/lib/hadoop/etc/hadoop/core-site.xml

cd /opt
gsutil cp gs://my-bucket/gobblin-distribution-0.10.0.tar.gz .
tar xvf gobblin-distribution-0.10.0.tar.gz

BASE="$PWD/gobblin-dist"
HADOOP_LIB="/usr/lib/hadoop/lib"

ROLE=$(/usr/share/google/get_metadata_value attributes/dataproc-role)
if [[ "${ROLE}" == 'Master' ]]; then
  rm $HADOOP_LIB/commons-cli-1*.jar $HADOOP_LIB/guava-1*.jar
  cp $BASE/lib/* $HADOOP_LIB
  #*/ fix syntax highlight
else
  rm $HADOOP_LIB/guava-1*.jar
  GOBBLIN_VERSION="0.10.0"
  FWDIR_LIB=$BASE/lib

  # copied this from gobblin-dist/bin/gobblin-mapreduce.sh
  cp \
    $FWDIR_LIB/gobblin-api-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-avro-json-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-codecs-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-core-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-core-base-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-crypto-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-crypto-provider-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-data-management-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-metastore-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-metrics-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-metrics-base-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-metadata-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/gobblin-utility-$GOBBLIN_VERSION.jar \
    $FWDIR_LIB/avro-1.8.1.jar \
    $FWDIR_LIB/avro-mapred-1.8.1.jar \
    $FWDIR_LIB/commons-lang3-3.4.jar \
    $FWDIR_LIB/config-1.2.1.jar \
    $FWDIR_LIB/data-2.6.0.jar \
    $FWDIR_LIB/gson-2.6.2.jar \
    $FWDIR_LIB/guava-15.0.jar \
    $FWDIR_LIB/guava-retrying-2.0.0.jar \
    $FWDIR_LIB/joda-time-2.9.3.jar \
    $FWDIR_LIB/javassist-3.18.2-GA.jar \
    $FWDIR_LIB/kafka_2.11-0.8.2.2.jar \
    $FWDIR_LIB/kafka-clients-0.8.2.2.jar \
    $FWDIR_LIB/metrics-core-2.2.0.jar \
    $FWDIR_LIB/metrics-core-3.1.0.jar \
    $FWDIR_LIB/metrics-graphite-3.1.0.jar \
    $FWDIR_LIB/scala-library-2.11.8.jar \
    $FWDIR_LIB/influxdb-java-2.1.jar \
    $FWDIR_LIB/okhttp-2.4.0.jar \
    $FWDIR_LIB/okio-1.4.0.jar \
    $FWDIR_LIB/retrofit-1.9.0.jar \
    $FWDIR_LIB/reflections-0.9.10.jar \
    $HADOOP_LIB
fi

BTW, I'm working with gobblin 0.10.0 on dataproc 1.1 (hadoop 2.7.3).

Any ideas?

Issac Buenrostro

May 19, 2017, 3:53:48 PM5/19/17
to Henrique G. Abreu, gobblin-users
Can you provide the full log of the application?


Henrique Abreu

May 19, 2017, 5:24:50 PM5/19/17
to Issac Buenrostro, gobblin-users
Sure, this is what I got:

Job [9f744870-e283-42c3-a416-55605263199f] submitted.
Waiting for job output...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
WARN [AbstractJobLauncher] Creating a job specific SharedResourcesBroker. Objects will only be shared at the job level.
WARN [JobContext] Property task.data.root.dir is missing.
WARN [KafkaSource] Previous offset for partition of_sample:0 does not exist. This partition will start from the earliest offset: 0
WARN [KafkaSource] Previous offset for partition of_sample:1 does not exist. This partition will start from the earliest offset: 0
WARN [TaskStateCollectorService] No output task state files found in gobblin/kafka2gcs/job_kafka2gcs_1495220348066/output/job_kafka2gcs_1495220348066
WARN [SafeDatasetCommit] Not committing dataset  of job job_kafka2gcs_1495220348066 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED
ERROR [JobContext] Iterator executor failure.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Not committing dataset  of job job_kafka2gcs_1495220348066 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED
       at java.util.concurrent.FutureTask.report(FutureTask.java:122)
       at java.util.concurrent.FutureTask.get(FutureTask.java:192)
       at gobblin.util.executors.IteratorExecutor.executeAndGetResults(IteratorExecutor.java:128)
       at gobblin.runtime.JobContext.commit(JobContext.java:432)
       at gobblin.runtime.JobContext.commit(JobContext.java:398)
       at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:431)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:89)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:66)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:111)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
Caused by: java.lang.RuntimeException: Not committing dataset  of job job_kafka2gcs_1495220348066 with commit policy COMMIT_ON_FULL_SUCCESS and state FAILED
       at gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:86)
       at gobblin.runtime.SafeDatasetCommit.call(SafeDatasetCommit.java:54)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
       at gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:35)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
       at java.lang.Thread.run(Thread.java:745)
ERROR [AbstractJobLauncher] Failed to launch and run job job_kafka2gcs_1495220348066: java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_kafka2gcs_1495220348066
java.io.IOException: Failed to commit dataset state for some dataset(s) of job job_kafka2gcs_1495220348066
       at gobblin.runtime.JobContext.commit(JobContext.java:438)
       at gobblin.runtime.JobContext.commit(JobContext.java:398)
       at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:431)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:89)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:66)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:111)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
Exception in thread "main" java.lang.reflect.InvocationTargetException
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at com.google.cloud.hadoop.services.agent.job.shim.HadoopRunClassShim.main(HadoopRunClassShim.java:19)
Caused by: gobblin.runtime.JobException: Job job_kafka2gcs_1495220348066 failed
       at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:480)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.launchJob(CliMRJobLauncher.java:89)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:66)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
       at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:111)
       ... 5 more
WARN [ServiceBasedAppLauncher] ApplicationLauncher has already stopped
ERROR: (gcloud.dataproc.jobs.submit.hadoop) Job [9f744870-e283-42c3-a416-55605263199f] entered state [ERROR] while waiting for [DONE].

Thanks,

Henrique G. Abreu

Issac Buenrostro

May 19, 2017, 5:28:40 PM5/19/17
to Henrique G. Abreu, gobblin-users
That is not the entire log unfortunately. Are you running in MR mode? If so, can you get the log of a failed mapper? If running in local mode, there should be other exceptions before this point.

Henrique Abreu

May 19, 2017, 6:02:14 PM5/19/17
to Issac Buenrostro, gobblin-users
I'm sorry, I don't really know how to get such a log. I did set up a `logdir`, though, and got this `gobblin-current.log`, attached. I hope this is it.

Yes, I'm trying to run in MR mode; Google Dataproc is a managed Hadoop cluster.
I "learned" how to invoke it in MR mode by reading `bin/gobblin-mapreduce.sh` and tried to pass the same parameters and settings to Dataproc.

Thanks a lot for taking a look.

Henrique G. Abreu
gobblin-current.log

Issac Buenrostro

May 19, 2017, 6:07:00 PM5/19/17
to Henrique G. Abreu, gobblin-users
This is the error it throws:

2017-05-19 21:46:13 UTC INFO  [main] org.apache.hadoop.mapreduce.Job  1380 - Job job_1495230148327_0001 failed with state FAILED due to: Application application_1495230148327_0001 failed 2 times due to AM Container for appattempt_1495230148327_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://myspark-m:8088/cluster/app/application_1495230148327_0001Then, click on links to logs of each attempt.
Diagnostics: java.io.FileNotFoundException: File file:/tmp/ffe94f06-2b51-44e2-b943-268cc58ee099/gobblin/kafka2gcs/job_kafka2gcs_1495230367575/job.state does not exist

Can you try to open http://myspark-m:8088/cluster/app/application_1495230148327_0001 and see if there are any logs in there?
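
If the web UI isn't reachable, and assuming YARN log aggregation is enabled on the cluster, you can usually pull the same container logs from the master node with something like:

# application id copied from the log above
yarn logs -applicationId application_1495230148327_0001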

Henrique Abreu

May 19, 2017, 6:57:59 PM5/19/17
to Issac Buenrostro, gobblin-users
Unfortunately there isn't. It just says:
"No logs available for container container_1495230148327_0001_01_000001"

I'm changing my settings to see if I get any different or more verbose error,  without much luck :-/

The best I got was this: when I comment out the `sed` line I use to edit some settings Dataproc puts in etc/hadoop/core-site.xml (settings I don't have on my local Hadoop setup, where Gobblin does work fine in MR mode), I get this other error :-/

2017-05-19 22:37:26 UTC INFO  [main] gobblin.kafka.client.Kafka08ConsumerClient  143 - Fetching topic metadata from broker mykafka:9092
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.KafkaSource  161 - Discovered topic of_sample
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  186 - Attempting to shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@7ab1127c[Shutting down, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
2017-05-19 22:37:27 UTC WARN  [pool-8-thread-1] gobblin.source.extractor.extract.kafka.KafkaSource  342 - Previous offset for partition of_sample:0 does not exist. This partition will start from the earliest offset: 0
2017-05-19 22:37:27 UTC INFO  [pool-8-thread-1] gobblin.source.extractor.extract.kafka.KafkaSource  465 - Created workunit for partition of_sample:0: lowWatermark=0, highWatermark=0, range=0
2017-05-19 22:37:27 UTC WARN  [pool-8-thread-1] gobblin.source.extractor.extract.kafka.KafkaSource  342 - Previous offset for partition of_sample:1 does not exist. This partition will start from the earliest offset: 0
2017-05-19 22:37:27 UTC INFO  [pool-8-thread-1] gobblin.source.extractor.extract.kafka.KafkaSource  465 - Created workunit for partition of_sample:1: lowWatermark=0, highWatermark=0, range=0
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  205 - Successfully shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@7ab1127c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.KafkaSource  185 - Created workunits for 1 topics in 0 seconds
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.workunit.packer.KafkaAvgRecordTimeBasedWorkUnitSizeEstimator  149 - For all topics not pulled in the previous run, estimated avg time to pull a record is 1.0 milliseconds
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.workunit.packer.KafkaWorkUnitPacker  237 - Created MultiWorkUnit for partitions [of_sample:0, of_sample:1]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.workunit.packer.KafkaWorkUnitPacker  311 - MultiWorkUnit 0: estimated load=0.003010, partitions=[]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.workunit.packer.KafkaWorkUnitPacker  311 - MultiWorkUnit 1: estimated load=0.003010, partitions=[[of_sample:0, of_sample:1]]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.source.extractor.extract.kafka.workunit.packer.KafkaWorkUnitPacker  297 - Min load of multiWorkUnit = 0.003010; Max load of multiWorkUnit = 0.003010; Diff = 0.000000%
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  186 - Attempting to shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@3d299393
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  205 - Successfully shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@3d299393
2017-05-19 22:37:27 UTC INFO  [main] gobblin.runtime.AbstractJobLauncher  386 - Starting job job_kafka2gcs_1495233445721
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  186 - Attempting to shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@5dc769f9[Shutting down, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  205 - Successfully shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@5dc769f9[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  186 - Attempting to shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@17fede14
2017-05-19 22:37:27 UTC INFO  [main] gobblin.util.ExecutorsUtils  205 - Successfully shutdown ExecutorService: com.google.common.util.concurrent.MoreExecutors$ListeningDecorator@17fede14
2017-05-19 22:37:27 UTC INFO  [TaskStateCollectorService STARTING] gobblin.runtime.TaskStateCollectorService  97 - Starting the TaskStateCollectorService
2017-05-19 22:37:27 UTC INFO  [main] gobblin.runtime.mapreduce.MRJobLauncher  229 - Launching Hadoop MR job Gobblin-kafka2gcs
2017-05-19 22:37:28 UTC INFO  [main] org.apache.hadoop.yarn.client.RMProxy  98 - Connecting to ResourceManager at myspark-m/10.128.0.2:8032
2017-05-19 22:37:29 UTC INFO  [main] org.apache.hadoop.mapreduce.JobSubmitter  249 - Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1495233380584_0001
2017-05-19 22:37:29 UTC INFO  [TaskStateCollectorService STOPPING] gobblin.runtime.TaskStateCollectorService  103 - Stopping the TaskStateCollectorService
2017-05-19 22:37:29 UTC WARN  [TaskStateCollectorService STOPPING] gobblin.runtime.TaskStateCollectorService  131 - No output task state files found in gobblin/kafka2gcs/job_kafka2gcs_1495233445721/output/job_kafka2gcs_1495233445721
2017-05-19 22:37:29 UTC INFO  [main] gobblin.runtime.mapreduce.MRJobLauncher  505 - Deleted working directory gobblin/kafka2gcs/job_kafka2gcs_1495233445721
2017-05-19 22:37:29 UTC ERROR [main] gobblin.runtime.AbstractJobLauncher  442 - Failed to launch and run job job_kafka2gcs_1495233445721: java.io.FileNotFoundException: File does not exist: hdfs://myspark-m/user/root/gobblin/kafka2gcs/job_kafka2gcs_1495233445721/job.state
java.io.FileNotFoundException: File does not exist: hdfs://myspark-m/user/root/gobblin/kafka2gcs/job_kafka2gcs_1495233445721/job.state
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)

Henrique G. Abreu

Dennis Huo

May 19, 2017, 7:14:24 PM5/19/17
to gobblin-users, ibue...@linkedin.com, hga...@gmail.com
Whenever there are filesystem-related errors, it's often because a framework makes assumptions about fs.defaultFS: you might specify an input/output path in GCS, and the framework stages some files based on that output location, but then later looks for them under an explicit hdfs:// path.

To help isolate whether the error is related to this, can you try re-running your job after first placing your data into the cluster's HDFS and using pure HDFS paths for input/output in your job?

You can either use "hadoop distcp gs://yourlocation/data/ hdfs:///some/hdfs/location" if the data is large or just do a "hadoop fs -cp" if it's small.
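
Concretely, with the bucket name from your job config (the exact source/destination paths below are just illustrative):

# large data: run a distributed copy
hadoop distcp gs://my-bucket/some/data hdfs:///user/root/some/data
# small data: a plain filesystem copy is enough
hadoop fs -cp gs://my-bucket/some/data hdfs:///user/root/some/data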

Also, is it any different if you SSH into your cluster and launch the job purely from the command line without using the Dataproc job submission API?


Dennis Huo

May 19, 2017, 7:27:22 PM5/19/17
to gobblin-users, ibue...@linkedin.com, hga...@gmail.com
It also looks like your job configuration references "file:///" in some places; "file:///" means Hadoop's LocalFileSystem interface, so the master node will interpret it as the master's local filesystem, while workers will interpret it as their own local filesystems.

Generally file:/// only works when you run in a single-machine test mode; any distributed cluster where workers are actually on separate nodes will require specifying either an hdfs:/// path for all the fs.uri settings or a gs:// directory.

Henrique G. Abreu

May 21, 2017, 12:23:50 PM5/21/17
to gobblin-users, ibue...@linkedin.com, hga...@gmail.com
Thanks a lot for the help guys.

That was it: using file:/// worked fine on my laptop because it was a single node, but failed on Dataproc.
Using only hdfs (for staging) and gs (for state and final output) instead fixed this issue.
That led me to a few other classpath issues, but I was finally able to run the job successfully after sorting those out.

I'll send a last email tomorrow with my final working configuration, just for reference in case a future novice like myself stumbles on this thread :-)
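
Roughly, the change was along these lines (an illustrative sketch based on the properties in my first message, not my exact final config):

# illustrative: staging/working dirs on the cluster's HDFS
fs.uri=hdfs://myspark-m
writer.fs.uri=${fs.uri}
mr.job.root.dir=/user/root/gobblin
writer.staging.dir=${mr.job.root.dir}/stg
writer.output.dir=${mr.job.root.dir}/out

# illustrative: state store and published output on GCS
state.store.fs.uri=gs://my-bucket
state.store.dir=gobblin/state
data.publisher.fs.uri=gs://my-bucket
data.publisher.final.dir=gobblin/pub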

Again, thank you very much.

Shirshanka Das

Jun 16, 2017, 6:29:20 PM6/16/17
to Henrique G. Abreu, gobblin-users, Issac Buenrostro
Hi Henrique,
   Could you post your final working configuration here for reference? 

thanks!
Shirshanka

