CDAP Pipeline Twill Program can not start


tomy tok

Mar 11, 2022, 2:23:38 AM
to CDAP User
Hi Everyone.

I want to run CDAP on k8s. I followed the documentation and finished setting up CDAP and a CDAP pipeline, but I have a problem: I can't start the pipeline, because the Twill program is force-stopped. I don't know why this happens.
From the error log, it seems to be a problem with Apache Twill. Could someone please help me?

-Logs:
2022-03-11 15:10:10,041 - INFO [main:i.c.c.i.a.r.d.AbstractTwillProgramController@73] - Twill program terminated: program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.c2f8fea7-a101-11ec-9dcb-1e3c233a7a10, twill runId: f0971a43-6cb7-4013-8de7-7fa093b2d8e2, status: null
2022-03-11 15:10:10,091 - ERROR [pcontroller-program:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow-c2f8fea7-a101-11ec-9dcb-1e3c233a7a10-1:i.c.c.i.a.r.AbstractProgramController@383] - Exception while executing method 'init' on listener io.cdap.cdap.internal.app.runtime.distributed.runtimejob.DefaultRuntimeJob$1@165333eb with executor org.apache.twill.common.Threads$1@10f9e461.
java.lang.NullPointerException: null
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1988) ~[na:1.8.0_242]
    at io.cdap.cdap.internal.app.runtime.distributed.runtimejob.DefaultRuntimeJob$1.error(DefaultRuntimeJob.java:219) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractListener.init(AbstractListener.java:41) ~[na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractProgramController$ListenerCaller.lambda$init$0(AbstractProgramController.java:389) [na:na]
    at org.apache.twill.common.Threads$1.execute(Threads.java:35) ~[org.apache.twill.twill-common-0.14.0.jar:0.14.0]
    at io.cdap.cdap.internal.app.runtime.AbstractProgramController$ListenerCaller.execute(AbstractProgramController.java:379) [na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractProgramController$ListenerCaller.init(AbstractProgramController.java:389) [na:na]
    at io.cdap.cdap.internal.app.runtime.AbstractProgramController.lambda$addListener$6(AbstractProgramController.java:210) [na:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
2022-03-11 15:10:25,975 - DEBUG [RuntimeClientService:i.c.c.i.a.r.m.RuntimeClientService@117] - Program program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.c2f8fea7-a101-11ec-9dcb-1e3c233a7a10 terminated. Shutting down runtime client service.



-cdap-cdap-app-fabric-0 pods log


2022-03-11 15:37:05,418 - DEBUG [runtime-scheduler-2:i.c.c.i.a.r.d.r.SSHRemoteExecutionService@99] - Service proxy file uploaded to remote runtime for program run program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10
2022-03-11 15:37:09,836 - INFO  [program-start-7:i.c.c.i.a.r.d.AbstractTwillProgramController@67] - Twill program running: program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10, twill runId: 99f80748-a105-11ec-8003-1e3c233a7a10
2022-03-11 15:37:49,004 - DEBUG [runtime-scheduler-5:i.c.c.i.a.r.d.r.SSHRemoteExecutionService@72] - Stopped ssh service for run program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10
2022-03-11 15:37:49,005 - INFO  [runtime-scheduler-5:i.c.c.i.a.r.d.AbstractTwillProgramController@73] - Twill program terminated: program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10, twill runId: 99f80748-a105-11ec-8003-1e3c233a7a10, status: SUCCEEDED
2022-03-11 15:37:49,006 - DEBUG [pcontroller-program:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow-99f80748-a105-11ec-8003-1e3c233a7a10-2:i.c.c.a.r.AbstractProgramRuntimeService@564] - RuntimeInfo removed: program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10
2022-03-11 15:37:54,690 - DEBUG [program.status:i.c.c.i.a.r.d.r.RemoteExecutionTwillController@137] - Force termination of remote process for program run program_run:default2.default_v1.-SNAPSHOT.workflow.DataPipelineWorkflow.99f80748-a105-11ec-8003-1e3c233a7a10
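
To isolate the entries for a single run from logs like these, a small script can split the text on the leading timestamp and filter by run ID and level. This is only a sketch: the function names are made up for illustration, and the timestamp pattern is an assumption based on the lines shown here, not a documented CDAP log format.

```python
import re

# Start of a log entry as it appears in the lines above, e.g.
# "2022-03-11 15:10:10,041 - INFO". The pattern is inferred from this
# thread only and may not match every CDAP log layout.
ENTRY_START = re.compile(
    r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - (?:TRACE|DEBUG|INFO|WARN|ERROR)\b"
)
LEVEL = re.compile(r"^\S+ \S+ - (\w+)")


def split_entries(text):
    """Split raw log text into entries, one per leading timestamp.

    Stack-trace lines stay attached to the entry that precedes them,
    whether or not the original line breaks survived.
    """
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    bounds = starts + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, bounds[1:])]


def filter_entries(text, run_id, levels):
    """Return entries that mention run_id and whose level is in levels."""
    selected = []
    for entry in split_entries(text):
        match = LEVEL.match(entry)
        if match and match.group(1) in levels and run_id in entry:
            selected.append(entry)
    return selected
```

For example, `filter_entries(log_text, "c2f8fea7-a101-11ec-9dcb-1e3c233a7a10", {"ERROR", "WARN"})` would keep only the warning and error entries for the failing run.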

I'm using cdap-sandbox-6.5.1 and a viewfs Hadoop system.

Thank you.




Sagar Kapare

Mar 15, 2022, 2:28:53 PM
to CDAP User
Hi,

Can you please confirm that CDAP on k8s was installed by following this doc: https://cdap.atlassian.net/wiki/spaces/DOCS/pages/911179793/Installing+CDAP+on+Kubernetes and also make sure all system services are running fine? The CDAP UI has a service dashboard that can be used to see whether there are any errors.
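
The same service check can be scripted instead of read off the dashboard. The sketch below assumes that CDAP's monitoring REST endpoint `/v3/system/services/status` is reachable through the router and returns a JSON object mapping service names to `OK` or `NOTOK`; the base URL and the function names are hypothetical, so adjust them to your deployment and verify the endpoint against the docs for your CDAP version.

```python
import json
from urllib.request import urlopen


def unhealthy_services(statuses):
    """Return the sorted names of services whose status is not "OK".

    statuses is assumed to be a dict such as
    {"appfabric": "OK", "metrics": "NOTOK"}.
    """
    return sorted(name for name, status in statuses.items() if status != "OK")


def fetch_statuses(base_url):
    """Fetch service statuses from a CDAP router.

    base_url is hypothetical, e.g. "http://localhost:11015" when the
    router port has been made reachable from your machine.
    """
    with urlopen(f"{base_url}/v3/system/services/status", timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical address; replace with your router's host and port.
    bad = unhealthy_services(fetch_statuses("http://localhost:11015"))
    print("All services OK" if not bad else "Not OK: " + ", ".join(bad))
```

An empty result means every service reported `OK`; anything listed is worth checking in that service's pod log first.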

Have you also set up the Remote Hadoop provisioner for launching pipelines? If so, can you attach the complete pipeline log?

Thanks and Regards,
Sagar

tomy tok

Mar 15, 2022, 10:42:28 PM
to CDAP User
Hi Sagar,
I have done all of that. This setup works fine on another HDFS system with k8s; it fails only on the federated HDFS system with k8s, and I don't know why. Is this the dashboard you mentioned?
dashboard.png
And this is the full error log.
Sensitive information has been replaced.

Thanks.

On Wednesday, March 16, 2022 at 3:28:53 AM UTC+9, Sagar Kapare wrote:
cdap-log.log