Gobblin now runs in MapReduce mode, but unfortunately I still could not get my XML files.
Here is the content of gobblin-current.log:
2016-02-25 12:09:42 CET INFO [main] org.apache.hadoop.conf.Configuration 996 - mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
2016-02-25 12:09:42 CET WARN [main] gobblin.runtime.JobContext 242 - Property task.data.root.dir is missing.
2016-02-25 12:09:42 CET INFO [main] gobblin.util.ClustersNames 73 - no default cluster mapping found
2016-02-25 12:09:42 CET INFO [main] org.apache.hadoop.conf.Configuration 996 - mapred.max.map.failures.percent is deprecated. Instead, use mapreduce.map.failures.maxpercent
2016-02-25 12:09:43 CET INFO [main] gobblin.metrics.GobblinMetrics 481 - Not reporting metrics to JMX
2016-02-25 12:09:43 CET INFO [main] gobblin.metrics.GobblinMetrics 430 - Not reporting metrics to log files
2016-02-25 12:09:43 CET INFO [main] gobblin.metrics.GobblinMetrics 492 - Not reporting metrics to Kafka
2016-02-25 12:09:43 CET INFO [main] gobblin.util.ExecutorsUtils 125 - Attempting to shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@4ef27d66[Shutting down, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:09:43 CET INFO [main] gobblin.util.ExecutorsUtils 144 - Successfully shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@4ef27d66[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:09:43 CET WARN [main] gobblin.password.PasswordManager 189 - Property encrypt.key.loc not set. Cannot decrypt any encrypted password.
2016-02-25 12:09:43 CET WARN [main] gobblin.password.PasswordManager 189 - Property encrypt.key.loc not set. Cannot decrypt any encrypted password.
2016-02-25 12:09:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper 147 - Attempting to connect to source via SFTP with privateKey: ~/.ssh/id_rsa knownHosts: null userName: ftpuser hostName: 134.106.13.145 port: 21 proxyHost: null proxyPort: -1
2016-02-25 12:09:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper$LocalFileIdentityStrategy 433 - Successfully set identity using local file ~/.ssh/id_rsa
2016-02-25 12:09:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper 171 - Known hosts path is not set, StrictHostKeyChecking will be turned off
2016-02-25 12:09:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper$JSchLogger 335 - Connecting to 134.106.13.145 port 21
2016-02-25 12:09:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper$JSchLogger 335 - Connection established
2016-02-25 12:14:43 CET INFO [main] gobblin.source.extractor.extract.sftp.SftpFsHelper$JSchLogger 335 - Disconnecting from 134.106.13.145 port 21
2016-02-25 12:14:43 CET ERROR [main] gobblin.source.extractor.extract.sftp.SftpFsHelper 195 - connection is closed by foreign host
com.jcraft.jsch.JSchException: connection is closed by foreign host
at com.jcraft.jsch.Session.connect(Session.java:269)
at com.jcraft.jsch.Session.connect(Session.java:183)
at gobblin.source.extractor.extract.sftp.SftpFsHelper.connect(SftpFsHelper.java:188)
at gobblin.source.extractor.extract.sftp.SftpLightWeightFileSystem.initialize(SftpLightWeightFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2316)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:366)
at gobblin.data.management.copy.CopySource.getSourceFileSystem(CopySource.java:245)
at gobblin.data.management.copy.CloseableFsCopySource.getSourceFileSystem(CloseableFsCopySource.java:55)
at gobblin.data.management.copy.CopySource.getWorkunits(CopySource.java:108)
at gobblin.runtime.SourceDecorator.getWorkunits(SourceDecorator.java:52)
at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:239)
at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:50)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
2016-02-25 12:14:43 CET ERROR [main] gobblin.runtime.SourceDecorator 59 - Failed to get work units for job job_SftpDistcp_1456398582443
java.lang.RuntimeException: java.io.IOException: gobblin.source.extractor.filebased.FileBasedHelperException: Cannot connect to SFTP source
at gobblin.data.management.copy.CopySource.getWorkunits(CopySource.java:148)
at gobblin.runtime.SourceDecorator.getWorkunits(SourceDecorator.java:52)
at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:239)
at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:50)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: gobblin.source.extractor.filebased.FileBasedHelperException: Cannot connect to SFTP source
at gobblin.source.extractor.extract.sftp.SftpLightWeightFileSystem.initialize(SftpLightWeightFileSystem.java:91)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2316)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:366)
at gobblin.data.management.copy.CopySource.getSourceFileSystem(CopySource.java:245)
at gobblin.data.management.copy.CloseableFsCopySource.getSourceFileSystem(CloseableFsCopySource.java:55)
at gobblin.data.management.copy.CopySource.getWorkunits(CopySource.java:108)
... 10 more
Caused by: gobblin.source.extractor.filebased.FileBasedHelperException: Cannot connect to SFTP source
at gobblin.source.extractor.extract.sftp.SftpFsHelper.connect(SftpFsHelper.java:196)
at gobblin.source.extractor.extract.sftp.SftpLightWeightFileSystem.initialize(SftpLightWeightFileSystem.java:89)
... 15 more
Caused by: com.jcraft.jsch.JSchException: connection is closed by foreign host
at com.jcraft.jsch.Session.connect(Session.java:269)
at com.jcraft.jsch.Session.connect(Session.java:183)
at gobblin.source.extractor.extract.sftp.SftpFsHelper.connect(SftpFsHelper.java:188)
... 16 more
2016-02-25 12:14:43 CET ERROR [main] gobblin.runtime.AbstractJobLauncher 307 - Failed to launch and run job job_SftpDistcp_1456398582443: gobblin.runtime.JobException: Failed to get work units for job job_SftpDistcp_1456398582443
gobblin.runtime.JobException: Failed to get work units for job job_SftpDistcp_1456398582443
at gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:246)
at gobblin.runtime.mapreduce.CliMRJobLauncher.run(CliMRJobLauncher.java:50)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at gobblin.runtime.mapreduce.CliMRJobLauncher.main(CliMRJobLauncher.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
2016-02-25 12:14:43 CET INFO [main] gobblin.util.ExecutorsUtils 125 - Attempting to shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@24f360b2[Shutting down, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:14:43 CET INFO [main] gobblin.util.ExecutorsUtils 144 - Successfully shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@24f360b2[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:14:43 CET INFO [main] gobblin.util.ExecutorsUtils 125 - Attempting to shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@4b21844c[Shutting down, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:14:43 CET INFO [main] gobblin.util.ExecutorsUtils 144 - Successfully shutdown ExecutorService: java.util.concurrent.ThreadPoolExecutor@4b21844c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1]
2016-02-25 12:14:43 CET INFO [main] gobblin.runtime.mapreduce.MRJobLauncher 443 - Deleted working directory /usr/local/gobblin/SFTP_HDFS/work_dir/working/SftpDistcp
Could someone please help?
Here is the content of my .pull file:
job.name=SftpDistcp
job.group=Distcp
job.description=Job to copy data from sftp to hdfs
# Source properties
source.filebased.fs.uri=sftp:///134.106.13.145:21
source.class=gobblin.data.management.copy.CloseableFsCopySource
source.conn.private.key=~/.ssh/id_rsa
source.conn.username=ftpuser
source.conn.host=134.106.13.145
source.conn.port=21
# Dataset properties
gobblin.dataset.pattern=/tmp/Gobblin-Test
# Publisher properties
#data.publisher.type=gobblin.data.management.copy.publisher.CopyDataPublisher
data.publisher.final.dir=/usr/local/hadoop_store/hdfs/datanode
# Writer properties
writer.builder.class=gobblin.data.management.copy.writer.FileAwareInputStreamDataWriterBuilder
-------
PS: connecting to the server requires a password, which is why I tried adding source.conn.password to my .pull file, but nothing changed.
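
One check I can still run, as a rough sketch: SFTP runs over SSH, which normally listens on port 22, while port 21 is the conventional FTP port. The five-minute gap in the log between "Connection established" and the disconnect would fit a server that never sends an SSH banner, but that is only a guess. The Python snippet below reads the first bytes the server sends after connecting and guesses the protocol from them. The function names read_banner and classify_banner are my own and are not part of Gobblin or JSch.

```python
import socket


def classify_banner(banner: bytes) -> str:
    """Guess the protocol from the first bytes a server sends after connect."""
    text = banner.decode("ascii", errors="replace").lstrip()
    if text.startswith("SSH-"):
        # SSH servers identify themselves with a line starting "SSH-".
        return "ssh"
    if text.startswith("220"):
        # FTP servers greet new connections with reply code 220.
        return "ftp"
    return "unknown"


def read_banner(host: str, port: int, timeout: float = 10.0) -> bytes:
    """Open a TCP connection and return whatever the server sends first."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        try:
            return sock.recv(256)
        except socket.timeout:
            # The server stayed silent; treat that as "no banner".
            return b""
```

Calling classify_banner(read_banner("134.106.13.145", 21)) with the host and port from my .pull file should show whether that port speaks SSH at all. If it reports "ftp", the SFTP settings are probably pointing at the wrong port or the wrong kind of server.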