Remote Hadoop Provisioner

Vahid Rad

Nov 6, 2021, 7:03:43 AM
to CDAP User
Hi guys,
I want to use the Remote Hadoop Provisioner. After I configure it and select the compute profile in the pipeline settings, the run fails with an error.


The error log is:
2021-11-06 12:42:52,278 - DEBUG [provisioning-task-2:i.c.c.i.p.t.ProvisioningTask@125] - Executing PROVISION subtask REQUESTING_CREATE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.
2021-11-06 12:42:52,295 - DEBUG [provisioning-task-2:i.c.c.i.p.t.ProvisioningTask@129] - Completed PROVISION subtask REQUESTING_CREATE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.
2021-11-06 12:42:52,467 - DEBUG [provisioning-task-0:i.c.c.i.p.t.ProvisioningTask@116] - Completed PROVISION task for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.
2021-11-06 12:42:55,412 - INFO  [program-start-3:i.c.c.i.a.r.d.DistributedProgramRunner@479] - Starting Workflow Program 'DataPipelineWorkflow' with Arguments [logical.start.time=1636189970464, system.profile.name=SYSTEM:vahid10], with debugging false
2021-11-06 12:42:55,486 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@249] - Create and copy launcher.jar
2021-11-06 12:42:55,496 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@278] - Done launcher.jar
2021-11-06 12:42:55,505 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@232] - Create and copy twill.jar
2021-11-06 12:42:56,640 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@240] - Done twill.jar
2021-11-06 12:42:56,644 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@521] - Create and copy application.jar
2021-11-06 12:43:48,571 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@529] - Done application.jar
2021-11-06 12:43:48,574 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@542] - Create and copy resources.jar
2021-11-06 12:43:48,605 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@545] - Done resources.jar
2021-11-06 12:43:48,607 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@585] - Populating Runnable LocalFiles
2021-11-06 12:43:48,609 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/cConf.xml
2021-11-06 12:43:48,610 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/conf/logback-container.xml
2021-11-06 12:43:48,610 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/1636189974782-0/artifacts.jar
2021-11-06 12:43:48,611 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/appSpec3153309792472349492.json
2021-11-06 12:43:48,611 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/namespaces/system/artifacts/cdap-data-pipeline/6.4.1.746bf979-4052-4ab4-aeca-eae2e4763138.jar
2021-11-06 12:43:48,611 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/hConf.xml
2021-11-06 12:43:48,611 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/1636189974782-0/artifacts.jar
2021-11-06 12:43:48,612 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1636189974560-0/program.options9160897069915741874.json
2021-11-06 12:43:48,612 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@595] - Done Runnable LocalFiles
2021-11-06 12:43:48,614 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@613] - Creating /root/cdap-sandbox-6.4.1/data/tmp/b7402204-3ee1-11ec-852c-2c768a554ecd2438488038000282515/runtime.config.jar508667471438943420/twillSpec.json
2021-11-06 12:43:48,632 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@633] - Done /root/cdap-sandbox-6.4.1/data/tmp/b7402204-3ee1-11ec-852c-2c768a554ecd2438488038000282515/runtime.config.jar508667471438943420/twillSpec.json
2021-11-06 12:43:48,634 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@644] - Creating /root/cdap-sandbox-6.4.1/data/tmp/b7402204-3ee1-11ec-852c-2c768a554ecd2438488038000282515/runtime.config.jar508667471438943420/logback-template.xml
2021-11-06 12:43:48,634 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@648] - Done /root/cdap-sandbox-6.4.1/data/tmp/b7402204-3ee1-11ec-852c-2c768a554ecd2438488038000282515/runtime.config.jar508667471438943420/logback-template.xml
2021-11-06 12:43:48,639 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@550] - Create and copy runtime.config.jar
2021-11-06 12:43:48,645 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@572] - Done runtime.config.jar
2021-11-06 12:43:48,954 - ERROR [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillRunnerService@527] - Fail to start program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd
java.io.IOException: Failed to SSH to ro...@192.168.1.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillPreparer.launch(RemoteExecutionTwillPreparer.java:117) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.AbstractRuntimeTwillPreparer.lambda$start$1(AbstractRuntimeTwillPreparer.java:466) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$0(RemoteExecutionTwillRunnerService.java:503) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:519) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 10 common frames omitted
2021-11-06 12:43:48,962 - WARN  [runtime-scheduler-2:i.c.c.i.a.r.d.r.RemoteExecutionTwillRunnerService@537] - Force termination of remote process for program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd failed
java.io.IOException: Failed to SSH to ro...@192.168.1.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.SSHRemoteProcessController.killProcess(SSHRemoteProcessController.java:107) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.SSHRemoteProcessController.kill(SSHRemoteProcessController.java:102) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$2(RemoteExecutionTwillRunnerService.java:535) ~[na:na]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[na:1.8.0_181]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$0(RemoteExecutionTwillRunnerService.java:505) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: socket is not established
    at com.jcraft.jsch.Util.createSocket(Util.java:394) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:215) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 15 common frames omitted
2021-11-06 12:43:51,100 - DEBUG [provisioning-task-4:i.c.c.i.p.t.ProvisioningTask@125] - Executing DEPROVISION subtask REQUESTING_DELETE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.
2021-11-06 12:43:51,171 - WARN  [provisioning-task-4:i.c.c.r.s.p.r.RemoteHadoopProvisioner@148] - Unable to clean up resources for program DataPipelineWorkflow run b7402204-3ee1-11ec-852c-2c768a554ecd on the remote cluster. The run directory may need to be manually deleted on cluster node 192.168.1.2
java.io.IOException: Failed to SSH to ro...@192.168.1.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.provision.DefaultSSHContext.createSSHSession(DefaultSSHContext.java:120) ~[na:na]
    at io.cdap.cdap.runtime.spi.ssh.SSHContext.createSSHSession(SSHContext.java:92) ~[na:na]
    at io.cdap.cdap.runtime.spi.ssh.SSHContext.createSSHSession(SSHContext.java:80) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.remote.RemoteHadoopProvisioner.createSSHSession(RemoteHadoopProvisioner.java:82) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.remote.RemoteHadoopProvisioner.deleteCluster(RemoteHadoopProvisioner.java:143) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.Provisioner.deleteClusterWithStatus(Provisioner.java:142) [na:na]
    at io.cdap.cdap.internal.provision.task.ClusterDeleteSubtask.execute(ClusterDeleteSubtask.java:42) [na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningSubtask.execute(ProvisioningSubtask.java:54) [na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningTask.lambda$executeOnce$1(ProvisioningTask.java:127) [na:na]
    at io.cdap.cdap.common.service.Retries.callWithRetries(Retries.java:185) ~[na:na]
    at io.cdap.cdap.common.service.Retries.callWithInterruptibleRetries(Retries.java:259) ~[na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningTask.executeOnce(ProvisioningTask.java:127) [na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.lambda$null$21(ProvisioningService.java:659) ~[na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.callWithProgramLogging(ProvisioningService.java:837) ~[na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.lambda$null$22(ProvisioningService.java:657) ~[na:na]
    at io.cdap.cdap.common.async.KeyedExecutor$2.run(KeyedExecutor.java:99) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:519) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 23 common frames omitted
2021-11-06 12:43:51,172 - DEBUG [provisioning-task-4:i.c.c.i.p.t.ProvisioningTask@129] - Completed DEPROVISION subtask REQUESTING_DELETE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.
2021-11-06 12:43:51,250 - DEBUG [provisioning-task-2:i.c.c.i.p.t.ProvisioningTask@116] - Completed DEPROVISION task for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.b7402204-3ee1-11ec-852c-2c768a554ecd.

asher 100

Nov 7, 2021, 4:43:04 PM
to CDAP User
Hi,

It seems that the CDAP server can't connect to your Hadoop NameNode server.
Can you connect to the Hadoop server using the private key you entered in the profile?
(with the command '$ ssh -i <id_rsa> <username>@<your_hadoop_server>')
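
A plain `ssh -i` test can succeed for reasons that do not apply to CDAP: an SSH agent or a default key in ~/.ssh may be offered alongside the named key, or the client may fall back to a password prompt. Below is a minimal sketch of a stricter check, assuming an OpenSSH client on the CDAP server. The function name `check_key_login` and its arguments are made up for illustration. `BatchMode=yes` disables prompts and `IdentitiesOnly=yes` restricts the client to the key named with `-i`.

```shell
#!/bin/sh
# check_key_login KEY_FILE USER HOST
# Hypothetical helper: tries a key-only, non-interactive SSH login.
# Returns 2 if the key file is unreadable, otherwise ssh's own exit status.
check_key_login() {
    key=$1
    user=$2
    host=$3
    if [ ! -r "$key" ]; then
        echo "cannot read private key: $key" >&2
        return 2
    fi
    # BatchMode=yes      : never prompt for a password or passphrase
    # IdentitiesOnly=yes : offer only the key given with -i, not agent keys
    # ConnectTimeout=5   : give up quickly if the host is unreachable
    ssh -i "$key" \
        -o BatchMode=yes \
        -o IdentitiesOnly=yes \
        -o ConnectTimeout=5 \
        "$user@$host" true
}

# Example (placeholders as in the command above):
# check_key_login <id_rsa> <username> <your_hadoop_server> && echo "key login works"
```

If this check fails while the plain command works, the login is probably succeeding through some other key or a password, which CDAP cannot use.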

On Saturday, November 6, 2021 at 8:03:43 PM UTC+9, v.fal...@gmail.com wrote:

BASANAGOWDA PATIL

Nov 10, 2021, 10:10:37 AM
to CDAP User
Use the command below to generate an SSH key:
ssh-keygen -m PEM -t rsa -b 4096 -C "ba...@domain.com"
Copy the public key into the authorized_keys file on the edge node.

Try the command below to check that you can log in:
ssh -i ~/.ssh/id_rsa ba...@edge.domain.com

Then configure the details below and try to connect:
User: basan
sshKey: copy the private key from the file /Users/basan/.ssh/id_rsa
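
The stack trace earlier in the thread shows CDAP using JSch 0.1.54. As far as I know, JSch releases before 0.1.55 cannot parse the newer `-----BEGIN OPENSSH PRIVATE KEY-----` file format, which would explain the `-m PEM` option above. Here is a minimal sketch for checking which format a key file is in by looking only at its first line. The function name `private_key_format` is hypothetical, and the check is a heuristic for RSA keys, not a full parser.

```shell
#!/bin/sh
# private_key_format KEY_FILE
# Hypothetical helper: prints "pem", "openssh", or "unknown" based on the
# first line of the private key file.
private_key_format() {
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        "-----BEGIN RSA PRIVATE KEY-----")
            echo "pem" ;;       # classic PEM, what ssh-keygen -m PEM writes for RSA
        "-----BEGIN OPENSSH PRIVATE KEY-----")
            echo "openssh" ;;   # newer OpenSSH container format
        *)
            echo "unknown" ;;
    esac
}

# Example:
# private_key_format ~/.ssh/id_rsa
```

If this prints `openssh`, regenerating the key with the `-m PEM` command above is the simplest fix. When pasting the key into the sshKey field, it is probably safest to include the whole file, BEGIN and END lines included.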


Thanks,
Basan

Vahid Rad

Dec 8, 2021, 1:40:29 AM
to CDAP User
Hi patilbasa,
First: thanks a lot for your attention and help.
Second: please tell me, what is the edge node? Is it the same as the CDAP server node?
Here is what I did:
The servers now connect to each other passwordless (but only after using your commands),
and then I configured the CDAP compute profile again,
but I get the same error again!
My OS: CentOS 7 (all servers)
CDAP server name: s1
CDAP server IP: 172.16.10.3
Cloudera Hadoop (NameNode) name: m1
Cloudera Hadoop (NameNode) IP: 172.16.10.2

The two servers have passwordless SSH set up between them.

Please guide me further.

Vahid Rad

Dec 8, 2021, 5:09:41 AM
to CDAP User
Hi Asher,
Thanks for your help.
Yes, I can connect with this command: ssh -i <id_rsa> <username>@<your_hadoop_server>

Please guide me further.

Vahid Rad

Dec 8, 2021, 7:07:47 AM
to CDAP User
Here is the SSH log from the Hadoop server (m1):

Dec  8 14:12:54 m1 sshd[27484]: error: Received disconnect from 172.16.10.3 port 37766:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
Dec  8 14:12:54 m1 sshd[27484]: Disconnected from  172.16.10.3 port 37766 [preauth]
Dec  8 14:12:54 m1 sshd[27488]: error: Received disconnect from  172.16.10.3 port 37770:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
Dec  8 14:12:54 m1 sshd[27488]: Disconnected from  172.16.10.3 port 37770 [preauth]
Dec  8 14:12:55 m1 sshd[27494]: error: Received disconnect from  172.16.10.3 port 37778:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
Dec  8 14:12:55 m1 sshd[27494]: Disconnected from  172.16.10.3 port 37778 [preauth]
Dec  8 14:12:55 m1 sshd[27497]: error: Received disconnect from  172.16.10.3 port 37780:3: com.jcraft.jsch.JSchException: Auth fail [preauth]
Dec  8 14:12:55 m1 sshd[27497]: Disconnected from  172.16.10.3 port 37780 [preauth]
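
The `Auth fail [preauth]` entries mean that sshd on m1 accepted the connection from 172.16.10.3 but rejected the credentials offered. One thing worth checking is whether the public half of the key pasted into the CDAP profile is actually listed in the login user's authorized_keys on m1. The following is a minimal sketch of that comparison. The function name `pubkey_authorized` and the file paths in the usage note are assumptions for illustration.

```shell
#!/bin/sh
# pubkey_authorized PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Hypothetical helper: succeeds if the base64 key blob from PUBKEY_FILE
# appears as a whole field on some line of AUTHORIZED_KEYS_FILE.
# Returns 1 if it is missing and 2 if PUBKEY_FILE has no key blob.
pubkey_authorized() {
    # A public key line looks like: "<type> <base64-blob> [comment]"
    blob=$(awk 'NF >= 2 { print $2; exit }' "$1")
    [ -n "$blob" ] || return 2
    # Compare field by field so options or comments on a line do not matter
    awk -v blob="$blob" '
        { for (i = 1; i <= NF; i++) if ($i == blob) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

One way to use it: derive the public key from the private key with `ssh-keygen -y -f <id_rsa> > cdap_key.pub` on the CDAP server, copy `cdap_key.pub` to m1, and run `pubkey_authorized cdap_key.pub ~/.ssh/authorized_keys` there as the user that CDAP logs in as.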

Vahid Rad

Dec 13, 2021, 1:29:02 AM
to CDAP User
My CDAP pipeline logs:

2021-12-13 09:46:03,222 - DEBUG [provisioning-task-6:i.c.c.i.p.t.ProvisioningTask@125] - Executing PROVISION subtask REQUESTING_CREATE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd.
2021-12-13 09:46:03,226 - DEBUG [provisioning-task-6:i.c.c.i.p.t.ProvisioningTask@129] - Completed PROVISION subtask REQUESTING_CREATE for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd.
2021-12-13 09:46:03,348 - DEBUG [provisioning-task-6:i.c.c.i.p.t.ProvisioningTask@116] - Completed PROVISION task for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd.
2021-12-13 09:46:06,001 - INFO [program-start-8:i.c.c.i.a.r.d.DistributedProgramRunner@479] - Starting Workflow Program 'DataPipelineWorkflow' with Arguments [logical.start.time=1639376162648, system.profile.name=USER:hajvahid], with debugging false
2021-12-13 09:46:06,024 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@249] - Create and copy launcher.jar
2021-12-13 09:46:06,025 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@278] - Done launcher.jar
2021-12-13 09:46:06,027 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@232] - Create and copy twill.jar
2021-12-13 09:46:06,027 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@240] - Done twill.jar
2021-12-13 09:46:06,028 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@521] - Create and copy application.jar
2021-12-13 09:46:06,028 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@529] - Done application.jar
2021-12-13 09:46:06,028 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@542] - Create and copy resources.jar
2021-12-13 09:46:06,043 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@545] - Done resources.jar
2021-12-13 09:46:06,043 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@585] - Populating Runnable LocalFiles
2021-12-13 09:46:06,043 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/cConf.xml
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/conf/logback-container.xml
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/1639376165501-0/artifacts.jar
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/appSpec1788042231698515657.json
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/namespaces/system/artifacts/cdap-data-pipeline/6.4.1.746bf979-4052-4ab4-aeca-eae2e4763138.jar
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/hConf.xml
2021-12-13 09:46:06,044 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/1639376165501-0/artifacts.jar
2021-12-13 09:46:06,045 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@592] - Added file file:/root/cdap-sandbox-6.4.1/data/tmp/1639376165470-0/program.options6623809234107657218.json
2021-12-13 09:46:06,045 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@595] - Done Runnable LocalFiles
2021-12-13 09:46:06,045 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@613] - Creating /root/cdap-sandbox-6.4.1/data/tmp/25c89d8a-5bdc-11ec-90c5-2c768a554ecd242575053672585158/runtime.config.jar2192106370460675325/twillSpec.json
2021-12-13 09:46:06,047 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@633] - Done /root/cdap-sandbox-6.4.1/data/tmp/25c89d8a-5bdc-11ec-90c5-2c768a554ecd242575053672585158/runtime.config.jar2192106370460675325/twillSpec.json
2021-12-13 09:46:06,048 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@644] - Creating /root/cdap-sandbox-6.4.1/data/tmp/25c89d8a-5bdc-11ec-90c5-2c768a554ecd242575053672585158/runtime.config.jar2192106370460675325/logback-template.xml
2021-12-13 09:46:06,058 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@648] - Done /root/cdap-sandbox-6.4.1/data/tmp/25c89d8a-5bdc-11ec-90c5-2c768a554ecd242575053672585158/runtime.config.jar2192106370460675325/logback-template.xml
2021-12-13 09:46:06,060 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@550] - Create and copy runtime.config.jar
2021-12-13 09:46:06,065 - DEBUG [runtime-scheduler-11:i.c.c.i.a.r.d.r.AbstractRuntimeTwillPreparer@572] - Done runtime.config.jar
2021-12-13 09:46:06,175 - ERROR [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillRunnerService@527] - Fail to start program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd
java.io.IOException: Failed to SSH to ro...@172.16.10.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillPreparer.launch(RemoteExecutionTwillPreparer.java:117) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.AbstractRuntimeTwillPreparer.lambda$start$1(AbstractRuntimeTwillPreparer.java:466) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$0(RemoteExecutionTwillRunnerService.java:503) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:519) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 10 common frames omitted
2021-12-13 09:46:06,185 - WARN [runtime-scheduler-11:i.c.c.i.a.r.d.r.RemoteExecutionTwillRunnerService@537] - Force termination of remote process for program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd failed
java.io.IOException: Failed to SSH to ro...@172.16.10.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.SSHRemoteProcessController.killProcess(SSHRemoteProcessController.java:107) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.SSHRemoteProcessController.kill(SSHRemoteProcessController.java:102) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$2(RemoteExecutionTwillRunnerService.java:535) ~[na:na]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) ~[na:1.8.0_181]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) ~[na:1.8.0_181]
    at io.cdap.cdap.internal.app.runtime.distributed.remote.RemoteExecutionTwillRunnerService$ControllerFactory.lambda$create$0(RemoteExecutionTwillRunnerService.java:505) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: socket is not established
    at com.jcraft.jsch.Util.createSocket(Util.java:394) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:215) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 15 common frames omitted
2021-12-13 09:46:07,564 - WARN [provisioning-task-8:i.c.c.r.s.p.r.RemoteHadoopProvisioner@148] - Unable to clean up resources for program DataPipelineWorkflow run 25c89d8a-5bdc-11ec-90c5-2c768a554ecd on the remote cluster. The run directory may need to be manually deleted on cluster node 172.16.10.2.
java.io.IOException: Failed to SSH to ro...@172.16.10.2:22
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:103) ~[na:na]
    at io.cdap.cdap.internal.provision.DefaultSSHContext.createSSHSession(DefaultSSHContext.java:120) ~[na:na]
    at io.cdap.cdap.runtime.spi.ssh.SSHContext.createSSHSession(SSHContext.java:92) ~[na:na]
    at io.cdap.cdap.runtime.spi.ssh.SSHContext.createSSHSession(SSHContext.java:80) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.remote.RemoteHadoopProvisioner.createSSHSession(RemoteHadoopProvisioner.java:82) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.remote.RemoteHadoopProvisioner.deleteCluster(RemoteHadoopProvisioner.java:143) ~[na:na]
    at io.cdap.cdap.runtime.spi.provisioner.Provisioner.deleteClusterWithStatus(Provisioner.java:142) [na:na]
    at io.cdap.cdap.internal.provision.task.ClusterDeleteSubtask.execute(ClusterDeleteSubtask.java:42) [na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningSubtask.execute(ProvisioningSubtask.java:54) [na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningTask.lambda$executeOnce$1(ProvisioningTask.java:127) [na:na]
    at io.cdap.cdap.common.service.Retries.callWithRetries(Retries.java:185) ~[na:na]
    at io.cdap.cdap.common.service.Retries.callWithInterruptibleRetries(Retries.java:259) ~[na:na]
    at io.cdap.cdap.internal.provision.task.ProvisioningTask.executeOnce(ProvisioningTask.java:127) [na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.lambda$null$21(ProvisioningService.java:659) ~[na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.callWithProgramLogging(ProvisioningService.java:837) ~[na:na]
    at io.cdap.cdap.internal.provision.ProvisioningService.lambda$null$22(ProvisioningService.java:657) ~[na:na]
    at io.cdap.cdap.common.async.KeyedExecutor$2.run(KeyedExecutor.java:99) ~[na:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_181]
Caused by: com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:519) ~[com.jcraft.jsch-0.1.54.jar:na]
    at com.jcraft.jsch.Session.connect(Session.java:183) ~[com.jcraft.jsch-0.1.54.jar:na]
    at io.cdap.cdap.common.ssh.DefaultSSHSession.<init>(DefaultSSHSession.java:100) ~[na:na]
    ... 23 common frames omitted
2021-12-13 09:46:07,702 - DEBUG [provisioning-task-8:i.c.c.i.p.t.ProvisioningTask@116] - Completed DEPROVISION task for program run program_run:default.test.-SNAPSHOT.workflow.DataPipelineWorkflow.25c89d8a-5bdc-11ec-90c5-2c768a554ecd.

Vahid Rad

Jan 10, 2022, 5:38:36 AM
to CDAP User
Can no one help solve this problem?

Vahid Rad

Jan 11, 2022, 1:35:34 AM
to CDAP User
I can't connect from the CDAP server to the Hadoop server with this command (Asher's command):
ssh -i <id_rsa> <username>@<your_hadoop_server>

but I can connect with Patilbasa's command:
ssh -i ~/.ssh/id_rsa <username>@<your_hadoop_server>

All servers have passwordless SSH to each other.
In the CDAP UI, in the compute profile section, I copied the contents of id_rsa into the private key field,
but now I get a different error:

2022-01-11 09:49:08,627 - DEBUG [runtime-scheduler-1:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@204] - Upload file file:/root/cdap-sandbox-6.5.1/data/tmp/runner.cache7090937319042871372/twill.jar to root@/172.16.10.28:22:/root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/.localized/3bc9b8c9737316e814a94387df393c2d-twill.jar

2022-01-11 09:49:12,107 - DEBUG [runtime-scheduler-1:i.c.c.i.a.r.d.r.RemoteExecutionTwillPreparer@215] - Expanding archive /root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/.localized/3bc9b8c9737316e814a94387df393c2d-twill.jar on host m.bigdata.local to /root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/twill.jar

2022-01-11 09:49:15,321 - ERROR [runtime-scheduler-1:i.c.c.i.a.r.d.r.RemoteExecutionTwillRunnerService@569] - Fail to start program run program_run:default.rad.-SNAPSHOT.workflow.DataPipelineWorkflow.5c5b7d1f-72a6-11ec-877b-000c29e3d00b

java.io.IOException: Commands execution failed with exit code (127) Commands: [mkdir -p /root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/twill.jar, cd /root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/twill.jar, jar xf /root/5c5b7d1f-72a6-11ec-877b-000c29e3d00b/.localized/3bc9b8c9737316e814a94387df393c2d-twill.jar], Output: Error: bash: jar: command not found
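
Exit code 127 is the shell's status for "command not found": the SSH login and file upload now succeed, but `jar` is not on the PATH of the non-interactive shell that CDAP opens on the Hadoop node. The `jar` tool ships with a JDK rather than a JRE, and non-interactive SSH sessions often skip the profile scripts that add a JDK's bin directory to the PATH. Here is a minimal sketch of a check, with hypothetical function names:

```shell
#!/bin/sh
# has_command NAME
# Succeeds if NAME resolves to a command in the current shell's PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# remote_has_command USER HOST NAME
# Hypothetical helper: runs the same check through a non-interactive SSH
# session, which is how the failing commands in the log were executed.
# NAME is interpolated into the remote command line, so pass only a
# plain command name such as "jar".
remote_has_command() {
    ssh -o BatchMode=yes "$1@$2" "command -v $3 >/dev/null 2>&1"
}

# Example (placeholders):
# remote_has_command <username> <your_hadoop_server> jar || echo "jar is not on the remote PATH"
```

If the remote check fails, likely fixes are installing a full JDK on the Hadoop node, or making the JDK's `jar` visible to non-interactive shells, for example by linking it into a directory that is already on the PATH.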