cdap-operator


Miraj Godha

May 22, 2019, 8:00:39 AM
to CDAP User
While following the cdap-operator steps described at https://github.com/cdapio/cdap-operator, I am getting the error below. Any pointers on where to look?


[root@dopsind366-vm-03 cdap-operator]# make run
go generate ./pkg/... ./cmd/...
warning: GOPATH set to GOROOT (/usr/local/go) has no effect
warning: GOPATH set to GOROOT (/usr/local/go) has no effect
go fmt ./pkg/... ./cmd/...
warning: GOPATH set to GOROOT (/usr/local/go) has no effect
go vet ./pkg/... ./cmd/...
warning: GOPATH set to GOROOT (/usr/local/go) has no effect
# runtime/cgo
exec: "gcc": executable file not found in $PATH
make: *** [vet] Error 2

Nitin Motgi

May 22, 2019, 8:08:21 AM
to cdap...@googlegroups.com
Looks like the Go compiler is not in your PATH.

N


--
You received this message because you are subscribed to the Google Groups "CDAP User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cdap-user+...@googlegroups.com.
To post to this group, send email to cdap...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cdap-user/b025ca43-b2b2-4627-9723-8a058875443b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Miraj Godha

May 22, 2019, 8:18:36 AM
to CDAP User
It appears to be in the PATH:



[root@dopsind366-vm-03 hello]# vi hello.go
[root@dopsind366-vm-03 hello]# cd $HOME/go/src/hello
[root@dopsind366-vm-03 hello]# go build

warning: GOPATH set to GOROOT (/usr/local/go) has no effect
[root@dopsind366-vm-03 hello]# ./hello
hello, world
[root@dopsind366-vm-03 hello]# go
Go is a tool for managing Go source code.

Usage:

go <command> [arguments]

The commands are:

bug         start a bug report
build       compile packages and dependencies
clean       remove object files and cached files
doc         show documentation for package or symbol
env         print Go environment information
fix         update packages to use new APIs
fmt         gofmt (reformat) package sources
generate    generate Go files by processing source
get         download and install packages and dependencies
install     compile and install packages and dependencies
list        list packages or modules
mod         module maintenance
run         compile and run Go program
test        test packages
tool        run specified go tool
version     print Go version
vet         report likely mistakes in packages


Regards,
Miraj


Terence Yim

May 22, 2019, 8:35:52 AM
to cdap...@googlegroups.com
Hi,

I think it is due to this:

exec: "gcc": executable file not found in $PATH
make: *** [vet] Error 2

Have you followed the Go setup described at https://golang.org/doc/install?

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Miraj Godha

May 22, 2019, 8:55:17 AM
to CDAP User
Hi Terence,

I have already installed Go, which is why the "make install" command ran successfully.

I am also able to test the Go installation successfully, as described here: https://golang.org/doc/install#testing

Below is the command prompt history:

[root@dopsind366-vm-03 hello]# vi hello.go
[root@dopsind366-vm-03 hello]# cd $HOME/go/src/hello
[root@dopsind366-vm-03 hello]# go build

warning: GOPATH set to GOROOT (/usr/local/go) has no effect


Regards,
Miraj



Terence Yim

May 22, 2019, 9:13:26 AM
to cdap...@googlegroups.com
Based on the error message, do you have “gcc” in your $PATH?

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Miraj Godha

May 22, 2019, 12:27:02 PM
to CDAP User
Yep, silly mistake.

I think we can add a note to the page specifically asking users to install gcc:

yum -y install gcc
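
For anyone else hitting the same `exec: "gcc"` error: the `go vet` step failed while building runtime/cgo, which needs a C compiler on the PATH. A minimal pre-flight sketch, assuming a yum-based host like the one above; the helper name `require_tool` is made up for illustration.

```shell
# Hypothetical helper: report whether a build tool is on PATH.
# Prints "found: <tool>" and returns 0, or prints to stderr and returns 1.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Usage before `make run` (left commented so sourcing has no side effects):
# require_tool go
# require_tool gcc || yum -y install gcc
```

Checking both `go` and `gcc` up front separates the "Go is not installed" case from the "Go is fine but cgo cannot find a C compiler" case.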

How can I check whether the CDAP container started successfully? I am not able to see it in the output of kubectl get pods --all-namespaces


Regards,
Miraj


Miraj Godha

May 22, 2019, 12:31:15 PM
to CDAP User
This is the message trace:

{"level":"info","ts":1558542615.161608,"logger":"entrypoint","msg":"setting up manager"}
{"level":"info","ts":1558542615.2129169,"logger":"entrypoint","msg":"Registering Components."}
{"level":"info","ts":1558542615.212966,"logger":"entrypoint","msg":"setting up scheme"}
{"level":"info","ts":1558542615.2131243,"logger":"entrypoint","msg":"Setting up controller"}
{"level":"info","ts":1558542615.2133787,"logger":"kubebuilder.controller","msg":"Starting EventSource","controller":"cdapmaster-ctrl","source":"kind source: /, Kind="}
{"level":"info","ts":1558542615.2135546,"logger":"entrypoint","msg":"setting up webhooks"}
{"level":"info","ts":1558542615.2135782,"logger":"entrypoint","msg":"Starting the Cmd."}
{"level":"info","ts":1558542615.314135,"logger":"kubebuilder.controller","msg":"Starting Controller","controller":"cdapmaster-ctrl"}
{"level":"info","ts":1558542615.4144776,"logger":"kubebuilder.controller","msg":"Starting workers","controller":"cdapmaster-ctrl","worker count":1}

Nothing is logged after this..

Regards,
Miraj

Terence Yim

May 22, 2019, 12:38:58 PM
to cdap...@googlegroups.com
Hi,

There are two ways to run the CDAP operator. You can either run it from the command line directly (which I believe is what you’ve already done), or you can build a docker image and run it in K8s (https://github.com/cdapio/cdap-operator/blob/develop/config/manager/manager.yaml).

After the controller is running, you can deploy the CDAP CRD https://github.com/cdapio/cdap-operator/blob/develop/config/crds/cdap_v1alpha1_cdapmaster.yaml to K8s, followed by deploying a CDAP CR to run CDAP inside K8s (see https://github.com/cdapio/cdap-operator/blob/develop/config/samples/cdap_v1alpha1_cdapmaster.yaml as a simple example).
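
The CRD-then-CR order described above can be sketched roughly as follows. This assumes the commands run from the root of a cdap-operator checkout, so the two relative paths match the files linked above. The function name `deploy_cdap` and the overridable `KUBECTL` variable are illustrative, not part of the operator.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to dry-run the sequence.
KUBECTL="${KUBECTL:-kubectl}"

# Register the CDAPMaster CRD first, then create a CR of that kind.
# The second apply only runs if the first one succeeded.
deploy_cdap() {
  "$KUBECTL" apply -f config/crds/cdap_v1alpha1_cdapmaster.yaml &&
    "$KUBECTL" apply -f config/samples/cdap_v1alpha1_cdapmaster.yaml
}

# Usage (commented out so sourcing has no side effects):
# deploy_cdap && kubectl get pods --all-namespaces
```

Until a CR has been applied, the operator has nothing to reconcile, so no CDAP pods would show up in `kubectl get pods`.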

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Nitin Nanda

May 28, 2019, 5:46:15 AM
to CDAP User
Hi,

Could anyone please suggest a fix? Except for the UI pods, nothing else seems to work.
Here is a quick summary:

I made sample secrets and reduced the PV sizes as below for testing purposes (on minikube):

---

echo -n 'cdap' > ./username.txt

echo -n 'root@123' > ./password.txt


kubectl create secret generic cdap-security --from-file=./username.txt --from-file=./password.txt -n system


administrators-MacBook-Pro:crds nitin.nanda$ kubectl get secret -n system
NAME                    TYPE                                  DATA   AGE
cdap-security           Opaque                                2      132m
default-token-swgzn     kubernetes.io/service-account-token   3      149m
webhook-server-secret   Opaque                                0      149m
administrators-MacBook-Pro:crds nitin.nanda$



administrators-MacBook-Pro:crds nitin.nanda$ kubectl get pvc -n system
NAME                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
cdap-master-1-logs-data-cdap-master-1-logs-0             Bound    pvc-833a59d5-8111-11e9-86eb-08002745e6c5   2Gi        RWO            standard       135m
cdap-master-1-messaging-data-cdap-master-1-messaging-0   Bound    pvc-81b043e0-8111-11e9-86eb-08002745e6c5   2Gi        RWO            standard       135m
cdap-master-1-metrics-data-cdap-master-1-metrics-0       Bound    pvc-82c0bce2-8111-11e9-86eb-08002745e6c5   2Gi        RWO            standard       135m
cdap-master-1-preview-data-cdap-master-1-preview-0       Bound    pvc-845bd9b2-8111-11e9-86eb-08002745e6c5   2Gi        RWO            standard       134m

administrators-MacBook-Pro:crds nitin.nanda$ 

---




administrators-MacBook-Pro:crds nitin.nanda$ kubectl get all -n system
NAME                                               READY   STATUS                  RESTARTS   AGE
pod/cdap-controller-0                              1/1     Running                 0          133m
pod/cdap-master-1-appfabric-77969dbfbb-cnbqc       0/1     CrashLoopBackOff        9          26m
pod/cdap-master-1-logs-0                           0/1     Init:CrashLoopBackOff   9          26m
pod/cdap-master-1-messaging-0                      0/1     Init:CrashLoopBackOff   9          26m
pod/cdap-master-1-metadata-86d5b9d66-69bv5         0/1     CrashLoopBackOff        9          26m
pod/cdap-master-1-metrics-0                        0/1     Init:CrashLoopBackOff   9          26m
pod/cdap-master-1-preview-0                        0/1     Init:CrashLoopBackOff   9          26m
pod/cdap-master-1-router-cf84fb6dd-7bvqf           0/1     CrashLoopBackOff        8          26m
pod/cdap-master-1-router-cf84fb6dd-jt2kp           0/1     CrashLoopBackOff        9          26m
pod/cdap-master-1-userinterface-5569c9786b-7tndl   1/1     Running                 0          26m
pod/cdap-master-1-userinterface-5569c9786b-xfvr5   1/1     Running                 0          26m
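
With this many pods it can help to print only the ones that are not healthy. A small sketch, assuming the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE); the function name `not_running` is made up for illustration.

```shell
# Read `kubectl get pods` output on stdin and print name and status
# for every pod whose STATUS column is anything other than "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage (commented out so sourcing has no side effects):
# kubectl get pods -n system | not_running
```

This only looks at the STATUS column, so a pod that reports Running but is not yet ready (READY 0/1) would not be listed.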

---------------------------------

1. Common exception on both the router and appfabric pods:

SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
2019-05-28 08:01:19,056 - INFO  [main:i.c.c.m.e.k.AbstractServiceMain@120] - Initializing master service class io.cdap.cdap.master.environment.k8s.RouterServiceMain
2019-05-28 08:01:29,675 - INFO  [main:i.c.c.m.e.k.KubeMasterEnvironment@97] - Initializing Kubernetes environment
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.main(AbstractServiceMain.java:114)
at io.cdap.cdap.master.environment.k8s.RouterServiceMain.main(RouterServiceMain.java:47)
Caused by: java.lang.RuntimeException: Exception raised when initializing master environment for k8s
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.init(AbstractServiceMain.java:148)
at io.cdap.cdap.common.runtime.DaemonMain.doMain(DaemonMain.java:37)
... 6 more
Caused by: io.kubernetes.client.ApiException: Forbidden
at io.kubernetes.client.ApiClient.handleResponse(ApiClient.java:882)
at io.kubernetes.client.ApiClient.execute(ApiClient.java:798)
at io.kubernetes.client.apis.CoreV1Api.readNamespacedPodWithHttpInfo(CoreV1Api.java:24591)
at io.kubernetes.client.apis.CoreV1Api.readNamespacedPod(CoreV1Api.java:24573)
at io.cdap.cdap.master.environment.k8s.KubeMasterEnvironment.getPodInfo(KubeMasterEnvironment.java:209)
at io.cdap.cdap.master.environment.k8s.KubeMasterEnvironment.initialize(KubeMasterEnvironment.java:102)
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.init(AbstractServiceMain.java:146)
... 7 more
administrators-MacBook-Pro:samples nitin.nanda$

------------- 
2. Common initialisation issue with the preview, metrics, metadata and logs pods:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  31m                  default-scheduler  Successfully assigned system/cdap-master-1-preview-0 to minikube
  Normal   Pulled     27m (x5 over 31m)    kubelet, minikube  Container image "gcr.io/cloud-data-fusion-images/cloud-data-fusion:latest" already present on machine
  Normal   Created    27m (x5 over 31m)    kubelet, minikube  Created container create-storage
  Normal   Started    27m (x5 over 30m)    kubelet, minikube  Started container create-storage
  Warning  BackOff    68s (x121 over 29m)  kubelet, minikube  Back-off restarting failed container
administrators-MacBook-Pro:crds nitin.nanda$

3. Common issue with the messaging pod:

Events:
  Type     Reason       Age                    From               Message
  ----     ------       ----                   ----               -------
  Normal   Scheduled    38m                    default-scheduler  Successfully assigned system/cdap-master-1-messaging-0 to minikube
  Warning  FailedMount  38m                    kubelet, minikube  MountVolume.SetUp failed for volume "cdap-security" : couldn't propagate object cache: timed out waiting for the condition
  Warning  FailedMount  38m                    kubelet, minikube  MountVolume.SetUp failed for volume "hadoop-conf" : couldn't propagate object cache: timed out waiting for the condition
  Warning  FailedMount  38m                    kubelet, minikube  MountVolume.SetUp failed for volume "cdap-conf" : couldn't propagate object cache: timed out waiting for the condition
  Normal   Pulled       34m (x5 over 38m)      kubelet, minikube  Container image "gcr.io/cloud-data-fusion-images/cloud-data-fusion:latest" already present on machine
  Normal   Created      34m (x5 over 38m)      kubelet, minikube  Created container create-storage
  Normal   Started      34m (x5 over 38m)      kubelet, minikube  Started container create-storage
  Warning  BackOff      3m21s (x148 over 37m)  kubelet, minikube  Back-off restarting failed container
administrators-MacBook-Pro:crds nitin.nanda$

Would highly appreciate any help!

Thanks
Nitin Nanda

Terence Yim

May 28, 2019, 7:41:56 AM
to cdap...@googlegroups.com
Hi,

The cdap-security.xml file has to be in XML format, not a simple text file. The minimal cdap-security.xml file (essentially an empty one) looks like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
</configuration>

Assuming it is inside a file called cdap-security.xml on your local disk, you can then create the secret with:

kubectl create secret generic cdap-security --from-file=cdap-security.xml

Then in the CDAP YAML file, specify the securitySecret:

securitySecret: cdap-security
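
The steps above can be sketched as one small script. Secrets are namespaced, so the secret has to be created in the namespace where the CDAP pods run. The `system` namespace in the usage line is an assumption taken from the earlier `kubectl get ... -n system` output, and both function names are illustrative.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to dry-run the command.
KUBECTL="${KUBECTL:-kubectl}"

# Write the minimal (empty) cdap-security.xml quoted above to path $1.
write_minimal_security_xml() {
  cat > "$1" <<'EOF'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
</configuration>
EOF
}

# Create the cdap-security secret from file $1 in namespace $2.
# The key inside the secret is the file's base name.
create_security_secret() {
  "$KUBECTL" create secret generic cdap-security --from-file="$1" -n "$2"
}

# Usage (commented out so sourcing has no side effects):
# write_minimal_security_xml cdap-security.xml
# create_security_secret cdap-security.xml system
```

If a `cdap-security` secret already exists from an earlier attempt, it would need to be deleted first, since `kubectl create` does not overwrite existing objects.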

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Nitin Nanda

May 28, 2019, 8:41:28 AM
to cdap...@googlegroups.com
Thanks Terence.

I tried the minimal XML you shared as well as the one we use (attached for reference).
The same errors persist for the router and appfabric pods.
Just curious: is a separate storage class required, or would the default work?

SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
2019-05-28 12:39:21,448 - INFO  [main:i.c.c.m.e.k.AbstractServiceMain@120] - Initializing master service class io.cdap.cdap.master.environment.k8s.RouterServiceMain
2019-05-28 12:39:33,009 - INFO  [main:i.c.c.m.e.k.KubeMasterEnvironment@97] - Initializing Kubernetes environment

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.main(AbstractServiceMain.java:114)
at io.cdap.cdap.master.environment.k8s.RouterServiceMain.main(RouterServiceMain.java:47)
Caused by: java.lang.RuntimeException: Exception raised when initializing master environment for k8s
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.init(AbstractServiceMain.java:148)
at io.cdap.cdap.common.runtime.DaemonMain.doMain(DaemonMain.java:37)
... 6 more
Caused by: io.kubernetes.client.ApiException: Forbidden
at io.kubernetes.client.ApiClient.handleResponse(ApiClient.java:882)
at io.kubernetes.client.ApiClient.execute(ApiClient.java:798)
at io.kubernetes.client.apis.CoreV1Api.readNamespacedPodWithHttpInfo(CoreV1Api.java:24591)
at io.kubernetes.client.apis.CoreV1Api.readNamespacedPod(CoreV1Api.java:24573)
at io.cdap.cdap.master.environment.k8s.KubeMasterEnvironment.getPodInfo(KubeMasterEnvironment.java:209)
at io.cdap.cdap.master.environment.k8s.KubeMasterEnvironment.initialize(KubeMasterEnvironment.java:102)
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.init(AbstractServiceMain.java:146)
... 7 more
administrators-MacBook-Pro:samples nitin.nanda$

Thanks & regards
Nitin Nanda


cdap-security.xml

Terence Yim

May 28, 2019, 8:49:23 AM
to cdap...@googlegroups.com
Currently the service account running CDAP needs to have permission for service discovery. If you can, just give the default service account cluster-admin permission.

kubectl create clusterrolebinding default-cluster-admin --clusterrole=cluster-admin --serviceaccount=default:default
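
One thing worth checking, offered as a possibility rather than a confirmed diagnosis: `--serviceaccount` takes the form `namespace:name`, so `default:default` grants the role to the default service account of the `default` namespace. The pods listed earlier in the thread run in the `system` namespace, so they would use that namespace's default service account instead. A sketch that binds and then verifies for a given namespace; the function names and the binding name are made up for illustration.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to dry-run the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Bind cluster-admin to the default service account of namespace $1.
# Very broad permissions: fine for a throwaway minikube, not for production.
bind_default_sa_admin() {
  "$KUBECTL" create clusterrolebinding "$1-default-cluster-admin" \
    --clusterrole=cluster-admin --serviceaccount="$1:default"
}

# Ask the API server whether that service account may read pods in $1.
can_read_pods() {
  "$KUBECTL" auth can-i get pods -n "$1" --as="system:serviceaccount:$1:default"
}

# Usage (commented out so sourcing has no side effects):
# bind_default_sa_admin system
# can_read_pods system
```

The `can-i` check is there because the Forbidden error in the stack trace comes from reading the pod's own info, so a "yes" answer for `get pods` is a quick way to confirm the binding applies to the right account.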

Terence

Terence Yim | Staff Software Engineer | tere...@google.com


Nitin Nanda

May 28, 2019, 9:34:17 AM
to cdap...@googlegroups.com
Thanks Terence. Same errors persist.

administrators-MacBook-Pro:samples nitin.nanda$ kubectl get clusterrolebinding  | grep defaul
default-cluster-admin                                  19m
administrators-MacBook-Pro:samples nitin.nanda$

My init containers are also crashing...
Init Containers:
  create-storage:
    Container ID:  docker://0efe63ecb6a82efa91764260706e3f58240c67143c70e0fec61e912d137d4abb
    Image:         gcr.io/cloud-data-fusion-images/cloud-data-fusion:latest
    Image ID:      docker-pullable://gcr.io/cloud-data-fusion-images/cloud-data-fusion@sha256:945d8f24155edff76d6515915a4a0ad19d0ed358d459b37fc78e8c5eb5e2bf6f
    Port:          <none>
    Host Port:     <none>
    Args:
      io.cdap.cdap.master.environment.k8s.StorageMain
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 28 May 2019 18:46:49 +0530
      Finished:     Tue, 28 May 2019 18:46:54 +0530
    Ready:          False
    Restart Count:  4
    Environment:    <none>
    Mounts:
      /data from cdap-master-1-logs-data (rw)
      /etc/cdap/conf from cdap-conf (ro)
      /etc/cdap/security from cdap-security (ro)
      /etc/hadoop/conf from hadoop-conf (ro)
      /etc/podinfo from podinfo (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-swgzn (ro)

I would appreciate it if you could please help.

Thanks & regards
Nitin Nanda

Terence Yim

May 28, 2019, 9:42:53 AM
to cdap...@googlegroups.com
Can you share the logs of the “create-storage” container?
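
In case it helps, init container logs can be fetched by naming the container with `-c`. A minimal sketch, assuming the pod and namespace names from the earlier output; the function name `init_logs` is made up for illustration.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to dry-run the command.
KUBECTL="${KUBECTL:-kubectl}"

# Print the logs of the create-storage init container of pod $1 in
# namespace $2. Any further arguments are passed through to kubectl logs.
init_logs() {
  pod="$1"
  ns="$2"
  shift 2
  "$KUBECTL" logs "$pod" -c create-storage -n "$ns" "$@"
}

# Usage (commented out so sourcing has no side effects):
# init_logs cdap-master-1-logs-0 system
# init_logs cdap-master-1-logs-0 system --previous
```

The `--previous` flag asks for the logs of the last terminated instance, which is usually the interesting one when the container is in CrashLoopBackOff.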

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Nitin Nanda

May 28, 2019, 10:05:29 AM
to cdap...@googlegroups.com
Thanks Terence.

Please find attached.

Thanks & regards
Nitin Nanda
Init-container-create-storage.txt

yao...@google.com

May 28, 2019, 7:24:02 PM
to CDAP User
Hi Nitin,

From the logs, CDAP is able to load the JDBC driver successfully, but the JDBC connection string is <JDBC Connection String>, which is not a valid JDBC string. Are you trying to use SQL as your storage backend? Did you use this sample YAML file: https://github.com/cdapio/cdap-operator/blob/13cae32086d7a9bb083ee6f0e6ddb9eb9b8f4aa4/config/samples/cdap_v1alpha1_cdapmaster.yaml? If so, you will need to set the correct JDBC connection string for the property "data.storage.sql.jdbc.connection.url".
Please let me know if you have any questions.
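
A quick way to catch this before deploying is to search the YAML for the literal placeholder. A minimal sketch; the function name `has_jdbc_placeholder` is made up for illustration, and the file path in the usage line assumes a cdap-operator checkout.

```shell
# Succeeds (exit status 0) if file $1 still contains the sample
# placeholder text instead of a real JDBC URL.
has_jdbc_placeholder() {
  grep -qF '<JDBC Connection String>' "$1"
}

# Usage (commented out so sourcing has no side effects):
# if has_jdbc_placeholder config/samples/cdap_v1alpha1_cdapmaster.yaml; then
#   echo "set data.storage.sql.jdbc.connection.url before applying" >&2
# fi
```

The `-F` flag makes grep treat the pattern as a fixed string, so the angle brackets and spaces are matched literally.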

Thanks,
Yaojie


Nitin Nanda

May 29, 2019, 1:34:57 PM
to cdap...@googlegroups.com
Thanks a lot, Terence and Yaojie.
PostgreSQL and ES are finally in place on my minikube cluster (screenshots attached). I had to tweak ES_JAVA_OPTS and other settings quite a bit.
The pods are in ContainerCreating state, pulling images (imagePullPolicy is IfNotPresent, but it is still taking time). Please stand by :)

Thanks & regards
Nitin Nanda
Screenshot 2019-05-29 at 9.28.29 AM.png
Screenshot 2019-05-29 at 11.58.48 AM.png
Screenshot 2019-05-29 at 9.41.09 PM.png

Nitin Nanda

May 30, 2019, 3:23:16 AM
to cdap...@googlegroups.com
Thanks Terence, Yaojie & Group, 

Now that ES (with single-node discovery) and PostgreSQL are up on my minikube cluster, the 'create-storage' containers seem to launch fine and exit gracefully (0). Please refer to the attachments.
However, all the pods except the UI have the same exception (see the attached logs), which is causing the CDAP operator to crash.
Please suggest what to do.

Any help would be highly appreciated.

Thanks & regards
Nitin Nanda



logs.txt
Screenshot 2019-05-30 at 11.51.22 AM.png
Screenshot 2019-05-30 at 12.46.19 PM.png

Nitin Nanda

Jun 3, 2019, 2:25:19 AM
to cdap...@googlegroups.com
Dear Group,

The CDAP cluster is up. However, I found two issues:

1. pod/cdap-master-1-appfabric, logs attached 
2. pod/cdap-master-1-preview-0, logs attached.

I request the CDAP team at Google to please advise on this.

Thanks & Regards
Nitin Nanda

appfabric.txt
preview.txt

Terence Yim

Jun 3, 2019, 12:05:13 PM
to cdap...@googlegroups.com
Hi,

For the exception in app-fabric, what is the config you set for the “locationURI”? Seems like you are pointing it to some GCS bucket? If that’s the case, you have to make sure your K8s cluster is configured to be able to talk to GCS.

I don’t see any error in the preview log, is there anything I should be looking for?

Terence

Terence Yim | Staff Software Engineer | tere...@google.com


Nitin Nanda

Jun 11, 2019, 5:16:11 PM
to cdap...@googlegroups.com
Thanks Terence.
I added a MinIO bucket URI, just for test purposes, and so far all looks good! No restarts for the past 12 minutes after adding this object store. Thanks a lot!
Now, for some reason, I'm hitting an auth issue after adding the URI. Maybe I'm missing something in site.xml?

Any pointers will be highly appreciated.

Thanks & regards
Nitin Nanda


Screenshot 2019-06-12 at 2.39.30 AM.png

Nitin Nanda

Jun 12, 2019, 1:16:26 PM
to cdap...@googlegroups.com
Dear Terence & Group,

I request your kind guidance on the points below:

1. Is an external Hadoop cluster still required to run apps/pipelines for CDAP on K8s? If yes, what are the prerequisites?
2. What changes are needed in core-site.xml to access gs://bucket? I'm getting the error below for the appfabric and preview pods:

1) Error in custom provider, java.lang.RuntimeException: java.io.IOException: Error getting access token from metadata server at: http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
Caused by: java.net.NoRouteToHostException: No route to host (Host unreachable)

3. Metric pods have service exceptions:
2019-06-12 16:27:48,846 - WARN  [MetricsAdminSubscriberService:i.c.c.m.s.AbstractMessagingPollingService@131] - Failed to contact service messaging.service. Will retry in next run.
io.cdap.cdap.common.ServiceUnavailableException: Service 'messaging.service' is not available. Please wait until it is up and running.
at io.cdap.cdap.common.internal.remote.RemoteClient.resolve(RemoteClient.java:113) ~[na:na]
2019-06-12 16:28:50,958 - WARN  [ProcessMetricsThread-topic:system.metrics7:i.c.c.m.p.MessagingMetricsProcessorService$ProcessMetricsThread@417] - Failed to process metrics. Will be retried in next iteration.
java.net.SocketTimeoutException: connect timed out

4. When is Spark on Kubernetes planned for apps/pipelines in CDAP on K8s?

I would appreciate a reply at the earliest.
metric-pod logs

Nitin Nanda

Jun 12, 2019, 10:41:17 PM
to cdap...@googlegroups.com
Dear Group,

Can someone please advise why it is mandatory to use locationURI: gs://somebucket if my CDAP operator is an on-prem deployment?
Also, I'm stuck at the error below for the appfabric pods. I'm surely missing something in the CDAP operator and Hadoop integration. Can anyone please advise?

2019-06-13 02:30:39,241 - INFO  [main:i.c.c.s.d.s.j.DataSourceProvider@93] - Creating the DataSource with jdbc url: jdbc:postgresql://gltest001-mgt-01.gvs.ggn/postgres?user=postgres&password=postgres

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.main(AbstractServiceMain.java:114)
at io.cdap.cdap.master.environment.k8s.AppFabricServiceMain.main(AppFabricServiceMain.java:77)
Caused by: com.google.inject.ProvisionException: Unable to provision, see the following errors:


1) Error in custom provider, java.lang.RuntimeException: java.io.IOException: Error getting access token from metadata server at: http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token
  while locating io.cdap.cdap.common.guice.FileContextProvider
  at io.cdap.cdap.common.guice.DFSLocationModule.configure(DFSLocationModule.java:44)
  while locating org.apache.hadoop.fs.FileContext
  while locating io.cdap.cdap.common.guice.DFSLocationModule$LocationFactoryProvider
  at io.cdap.cdap.common.guice.DFSLocationModule.configure(DFSLocationModule.java:45)
  while locating org.apache.twill.filesystem.LocationFactory
    for parameter 2 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminService.<init>(DatasetAdminService.java:69)
  while locating io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminService
    for parameter 0 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminOpHTTPHandler.<init>(DatasetAdminOpHTTPHandler.java:62)
  while locating io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminOpHTTPHandler
  while locating io.cdap.http.HttpHandler annotated with @com.google.inject.multibindings.Element(setName=@com.google.inject.name.Named(value=dataset.executor),uniqueId=23, type=MULTIBINDER, keyType=)
  at io.cdap.cdap.data.runtime.DataSetServiceModules$2$1.configure(DataSetServiceModules.java:132) (via modules: io.cdap.cdap.data.runtime.DataSetServiceModules$2 -> io.cdap.cdap.data.runtime.DataSetServiceModules$2$1 -> com.google.inject.multibindings.Multibinder$RealMultibinder)
  while locating java.util.Set<io.cdap.http.HttpHandler> annotated with @com.google.inject.name.Named(value=dataset.executor)
    for parameter 3 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetOpExecutorService.<init>(DatasetOpExecutorService.java:57)
  at io.cdap.cdap.data.runtime.DataSetServiceModules$2$1.configure(DataSetServiceModules.java:137) (via modules: io.cdap.cdap.data.runtime.DataSetServiceModules$2 -> io.cdap.cdap.data.runtime.DataSetServiceModules$2$1)

Thanks & regards
Nitin Nanda

Nitin Nanda

Jun 13, 2019, 4:02:45 AM
to cdap...@googlegroups.com
Dear Terence & CDAP experts,

I have a running HDP cluster and am trying to make it talk to CDAP on K8s.

[root@gltest001-mgt-01 samples]# hadoop fs -ls hdfs://reflex-platform-gltest001/
Found 10 items
drwxrwxrwx   - yarn   hadoop          0 2019-06-11 09:47 hdfs://reflex-platform-gltest001/app-logs
drwxr-xr-x   - hdfs   hdfs            0 2019-06-11 09:48 hdfs://reflex-platform-gltest001/apps
drwxr-xr-x   - yarn   hadoop          0 2019-06-11 09:47 hdfs://reflex-platform-gltest001/ats
drwxr-xr-x   - hdfs   hdfs            0 2019-06-11 09:47 hdfs://reflex-platform-gltest001/hdp
drwxr-xr-x   - mapred hdfs            0 2019-06-11 09:47 hdfs://reflex-platform-gltest001/mapred
drwxrwxrwx   - mapred hadoop          0 2019-06-11 09:48 hdfs://reflex-platform-gltest001/mr-history
drwxr-xr-x   - hdfs   hdfs            0 2019-06-11 09:47 hdfs://reflex-platform-gltest001/ranger
drwxrwxrwx   - spark  hadoop          0 2019-06-13 07:19 hdfs://reflex-platform-gltest001/spark2-history
drwxrwxrwx   - hdfs   hdfs            0 2019-06-11 09:48 hdfs://reflex-platform-gltest001/tmp
drwxr-xr-x   - hdfs   hdfs            0 2019-06-11 09:49 hdfs://reflex-platform-gltest001/user
[root@gltest001-mgt-01 samples]#

I set my locationURI: hdfs://reflex-platform-gltest001
My core-site.xml has the following:

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://reflex-platform-gltest001</value>
      <final>true</final>
    </property>

But now I'm stuck with a host exception. It looks like there is no DNS resolution for the HDFS nameservice.
Could someone please advise on integrating CDAP on K8s with Hadoop, and on which properties are needed in core-site.xml and cdap-site.xml?
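
On the DNS point, one possibility (an assumption, since the thread does not say how the HDP cluster is set up): if reflex-platform-gltest001 is an HA nameservice ID rather than a real hostname, the HDFS client resolves it from hdfs-site.xml and not from DNS. It does so through properties such as dfs.nameservices, dfs.ha.namenodes.<id>, dfs.namenode.rpc-address.<id>.<namenode> and dfs.client.failover.proxy.provider.<id>. In that case the Hadoop configuration mounted into the CDAP pods would need an hdfs-site.xml carrying those entries next to core-site.xml. A small sketch that checks whether a given hdfs-site.xml declares the nameservice; the function name `declares_nameservice` is made up for illustration.

```shell
# Succeeds (exit status 0) if hdfs-site.xml at path $1 lists nameservice $2
# in the value of dfs.nameservices. Assumes the <value> element sits within
# two lines after the <name> element, as in typical generated configs.
declares_nameservice() {
  grep -A2 -F '<name>dfs.nameservices</name>' "$1" | grep -qF "$2"
}

# Usage (commented out so sourcing has no side effects):
# declares_nameservice /etc/hadoop/conf/hdfs-site.xml reflex-platform-gltest001 \
#   || echo "nameservice not declared in hdfs-site.xml" >&2
```

Even with the nameservice declared, the NameNode hostnames in the rpc-address entries still have to be resolvable and reachable from inside the pods.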

2019-06-13 07:13:41,322 - INFO  [main:i.c.c.s.d.s.j.DataSourceProvider@155] - Successfully loaded org.postgresql.Driver from /opt/cdap/master/ext/jdbc
2019-06-13 07:13:41,328 - INFO  [main:i.c.c.s.d.s.j.DataSourceProvider@93] - Creating the DataSource with jdbc url: jdbc:postgresql://gltest001-mgt-01.gvs.ggn/postgres?user=postgres&password=postgres
2019-06-13 07:13:41,436 - ERROR [main:o.a.h.f.FileContext@469] - java.net.UnknownHostException: reflex-platform-gltest001
Exception in thread "main" java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at io.cdap.cdap.master.environment.k8s.AbstractServiceMain.main(AbstractServiceMain.java:114)
	at io.cdap.cdap.master.environment.k8s.AppFabricServiceMain.main(AppFabricServiceMain.java:77)
Caused by: com.google.inject.ProvisionException: Unable to provision, see the following errors:

1) Error in custom provider, java.lang.RuntimeException: java.net.UnknownHostException: reflex-platform-gltest001
  while locating io.cdap.cdap.common.guice.FileContextProvider
  at io.cdap.cdap.common.guice.DFSLocationModule.configure(DFSLocationModule.java:44)
  while locating org.apache.hadoop.fs.FileContext
  while locating io.cdap.cdap.common.guice.DFSLocationModule$LocationFactoryProvider
  at io.cdap.cdap.common.guice.DFSLocationModule.configure(DFSLocationModule.java:45)
  while locating org.apache.twill.filesystem.LocationFactory
    for parameter 2 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminService.<init>(DatasetAdminService.java:69)
  while locating io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminService
    for parameter 0 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminOpHTTPHandler.<init>(DatasetAdminOpHTTPHandler.java:62)
  while locating io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetAdminOpHTTPHandler
  while locating io.cdap.http.HttpHandler annotated with @com.google.inject.multibindings.Element(setName=@com.google.inject.name.Named(value=dataset.executor),uniqueId=23, type=MULTIBINDER, keyType=)
  at io.cdap.cdap.data.runtime.DataSetServiceModules$2$1.configure(DataSetServiceModules.java:132) (via modules: io.cdap.cdap.data.runtime.DataSetServiceModules$2 -> io.cdap.cdap.data.runtime.DataSetServiceModules$2$1 -> com.google.inject.multibindings.Multibinder$RealMultibinder)
  while locating java.util.Set<io.cdap.http.HttpHandler> annotated with @com.google.inject.name.Named(value=dataset.executor)
    for parameter 3 at io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetOpExecutorService.<init>(DatasetOpExecutorService.java:57)
  at io.cdap.cdap.data.runtime.DataSetServiceModules$2$1.configure(DataSetServiceModules.java:137) (via modules: io.cdap.cdap.data.runtime.DataSetServiceModules$2 -> io.cdap.cdap.data.runtime.DataSetServiceModules$2$1)
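
A note on the UnknownHostException above: if reflex-platform-gltest001 is an HDFS HA nameservice (a logical name) rather than a real hostname, the Hadoop client can only resolve it through the HA properties normally kept in hdfs-site.xml, not through DNS. Whether that is the cause here is not confirmed in this thread. The Python sketch below checks a Hadoop-style config for the standard client-side HA properties; the function name and the sample XML are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def missing_ha_properties(config_xml: str, nameservice: str) -> list:
    """Return the HDFS HA client properties that are absent for a nameservice."""
    root = ET.fromstring(config_xml)
    # Collect <property><name>/<value> pairs from a Hadoop *-site.xml document.
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()

    missing = []
    declared = [ns.strip() for ns in props.get("dfs.nameservices", "").split(",")]
    if nameservice not in declared:
        missing.append("dfs.nameservices")

    namenodes_key = "dfs.ha.namenodes." + nameservice
    namenode_ids = [n.strip() for n in props.get(namenodes_key, "").split(",") if n.strip()]
    if not namenode_ids:
        missing.append(namenodes_key)

    # Each NameNode ID needs an RPC address that points at a real host.
    for nn_id in namenode_ids:
        rpc_key = "dfs.namenode.rpc-address.%s.%s" % (nameservice, nn_id)
        if not props.get(rpc_key):
            missing.append(rpc_key)

    provider_key = "dfs.client.failover.proxy.provider." + nameservice
    if not props.get(provider_key):
        missing.append(provider_key)
    return missing


# Hypothetical config: only fs.defaultFS is set, as in the core-site.xml above.
CORE_SITE_ONLY = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://reflex-platform-gltest001</value>
  </property>
</configuration>"""

# Lists every HA property the client would still need for this nameservice.
print(missing_ha_properties(CORE_SITE_ONLY, "reflex-platform-gltest001"))
```

If the list is non-empty for the configuration that the CDAP pods actually load, the pods have no way to map the nameservice to NameNode hosts, and those hosts must in turn be resolvable from inside K8s.
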
Thanks in advance.

Regards,

Terence Yim

Jun 13, 2019, 12:51:58 PM
to cdap...@googlegroups.com
Hi Nitin,

The locationURI can also be an HDFS path if you run K8s alongside an on-prem Hadoop cluster that it can access.

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Terence Yim

Jun 13, 2019, 12:56:21 PM
to cdap...@googlegroups.com
Hi,

1. Yes, a Hadoop cluster is still needed for pipeline execution. The only requirement is that containers inside K8s can SSH into an edge node that can submit jobs to the Hadoop cluster. You can use the Remote Hadoop Provisioner to create a compute profile for pipeline execution.
2. You don’t need to use a GCS bucket if your deployment is on-prem. You can set the locationURI to “hdfs://<hdfs-name-node>/cdap”.
3. Does the error persist, or does it eventually go away? It is normal to see such an error in the first minute or so while CDAP is still starting up in K8s.
4. It is on our roadmap, but there is no concrete timeline defined yet.

Terence

Terence Yim | Staff Software Engineer | tere...@google.com


Nitin Nanda

Jun 13, 2019, 2:17:19 PM
to cdap...@googlegroups.com
Thanks Terence. Much appreciated!!
Yup, I used the same and the cluster is finally up.
There are 2 challenges:

1. The remote compute profile just gets stuck in the create state. This happens when I create it for the default user 'yarn'. The initialisation source is 'yarn-env.sh'. Are there any other prerequisites for this?

2. When I created my own Dataproc profile, I got the errors below:

2019-06-13 16:14:46,453 - ERROR [appfabric-executor-18:i.c.c.c.HttpExceptionHandler@70] - Unexpected error: request=GET /v3/namespaces/default/securekeys user=<null>:
java.lang.UnsupportedOperationException: Secure store is not configured. To use secure store please set "security.store.provider" property in cdap-site.xml.
at io.cdap.cdap.security.store.DummySecureStoreService.list(DummySecureStoreService.java:39) ~[na:na]
at io.cdap.cdap.security.store.DefaultSecureStoreService.list(DefaultSecureStoreService.java:71) ~[na:na]
at io.cdap.cdap.security.store.SecureStoreHandler.list(SecureStoreHandler.java:129) ~[na:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_212]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_212]
<<SNIPPED>>
2019-06-13 17:39:04,745 - INFO  [appfabric-executor-16:i.c.c.i.a.s.ProgramLifecycleService@426] - Attempt to run Workflow program DataPipelineWorkflow as user root
2019-06-13 17:39:05,609 - ERROR [provisioning-service-13:i.c.c.i.p.t.ProvisioningTask@114] - PROVISION task failed in REQUESTING_CREATE state for program run program_run:default.nitin-cdapo-test.-SNAPSHOT.workflow.DataPipelineWorkflow.23715730-8e02-11e9-ae64-02420ae95005.
com.fasterxml.jackson.core.JsonParseException: Unexpected character ('-' (code 45)) in numeric value: expected digit (0-9) to follow minus sign, for valid numeric value
 at [Source: java.io.ByteArrayInputStream@1a1eea20; line: 1, column: 3]
at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1378) ~[com.fasterxml.jackson.core.jackson-core-2.9.2.jar:2.9.2]
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:599) ~[com.fasterxml.jackson.core.jackson-core-2.9.2.jar:2.9.2]
at com.fasterxml.jackson.core.base.ParserBase.reportUnexpectedNumberChar(ParserBase.java:975) ~[com.fasterxml.jackson.core.jackson-core-2.9.2.jar:2.9.2]

Your guidance would be highly appreciated.

Thanks & regards
Nitin Nanda 
 

Terence Yim

Jun 13, 2019, 2:24:22 PM
to cdap...@googlegroups.com
Hi,

1. Typically you don’t need to specify an initialization action. When you say it is stuck, were there any logs from the app-fabric pod that might be related to the problem?
2. For the Dataproc profile, what did you specify in the “Service Account Key” field? Are you using a secure macro? If so, you need to configure a secure store for CDAP to store the secret. If you want to skip setting up the secure store for CDAP, you can just paste the service account key JSON as plain text into the configuration box (not recommended for a production setup; suitable only for development on a local machine).

Terence

Terence Yim | Staff Software Engineer | tere...@google.com

Nitin Nanda

Jun 13, 2019, 2:54:15 PM
to cdap...@googlegroups.com
Thanks Terence for the quick reply!
1. I didn't specify the initialisation action this time, and the result is the same. It stays in create mode (see attachment). Where should I look for logs on profile creation?
2. I used the "private key" value from the JSON. I still get the same errors and the pipeline fails. See the attached screenshot; the logs show nothing.
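
On point 2, the JsonParseException earlier in the thread ("Unexpected character ('-' (code 45)) in numeric value ... line: 1, column: 3") is what a JSON parser reports when handed text that starts with `-----BEGIN PRIVATE KEY-----`: it reads the first `-` as a minus sign and then expects a digit. That suggests the field expects the whole service account key file rather than only its `private_key` value; this is an inference from the error text, not something confirmed in the thread. A minimal Python sketch of the difference, using a made-up key file in which every value is a placeholder:

```python
import json

# Placeholder service account key file; every value here is fake.
KEY_FILE_JSON = json.dumps({
    "type": "service_account",
    "project_id": "example-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\nFAKE\n-----END PRIVATE KEY-----\n",
    "client_email": "example@example-project.iam.gserviceaccount.com",
})


def looks_like_key_file(text: str) -> bool:
    """True if text parses as a JSON object that contains a private_key field."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and "private_key" in parsed


# The whole file parses as JSON; the bare private_key value does not.
private_key_only = json.loads(KEY_FILE_JSON)["private_key"]
print(looks_like_key_file(KEY_FILE_JSON))     # True
print(looks_like_key_file(private_key_only))  # False
```

Running a check like this on the exact text pasted into the "Service Account Key" box shows quickly whether it is the full JSON document or only one field of it.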

Would appreciate any help.

Thanks & Regards
Nitin Nanda
Screenshot 2019-06-14 at 12.13.15 AM.png
Screenshot 2019-06-14 at 12.23.05 AM.png

Nitin Nanda

Jun 14, 2019, 5:53:05 AM
to cdap...@googlegroups.com
Thanks Terence.
The remote compute profile is not getting created; it stays in the create state. Below is what I did. Please correct me if I am wrong.
1. Created an SSH key for the 'yarn' user on my management node, which is running cdap-operator. I added the public key to .ssh/authorized_keys on my data node (gltest001-slv-01.gvs.ggn), which I'll use in the remote profile.
[yarn@gltest001-mgt-01 ~]$ ssh-keygen -f ~/.ssh/yarn    <<-- on my mgmt. node.
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/yarn/.ssh/yarn.
Your public key has been saved in /home/yarn/.ssh/yarn.pub.  <<<- added this public key to datanode (gltest001-slv-0). As shown below.
The key fingerprint is:
5a:c3:2d:09:44:76:d7:99:8c:1e:3a:54:8b:0a:64:c4 ya...@gltest001-mgt-01.gvs.ggn
The key's randomart image is:
+--[ RSA 2048]----+
| o+  .+.oooo .   |
| oE  o.ooo. o    |
|  .  .oo..       |
|   . .oo.o       |
|    .  .S .      |
|       o o       |
|      .          |
|                 |
|                 |
+-----------------+
[yarn@gltest001-slv-01 ~]$ vi .ssh/authorized_keys <<-- added pub key
[yarn@gltest001-slv-01 ~]$
[yarn@gltest001-slv-01 ~]$ cat .ssh/authorized_keys
sh-rsa 
### ya...@gltest001-mgt-01.gvs.ggn
[yarn@gltest001-slv-01 ~]$
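
One thing worth ruling out in the output above: the entry as pasted begins with `sh-rsa`, not `ssh-rsa`. That may only be a copy-paste or redaction artifact in this post, but an authorized_keys line whose key type is not recognized will not match any key, so key-based login would fail. A small Python sketch that flags such lines; the function name and the set of accepted key types are assumptions (a subset of common OpenSSH types), and the check ignores option prefixes that may legitimately precede the key type:

```python
# Common OpenSSH public key types (a subset, assumed sufficient for this check).
KNOWN_KEY_TYPES = {
    "ssh-rsa",
    "ssh-ed25519",
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
}


def suspicious_lines(authorized_keys_text: str) -> list:
    """Return (line_number, first_field) for entries with an unknown key type."""
    problems = []
    for number, line in enumerate(authorized_keys_text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and comments are allowed in authorized_keys.
        if not stripped or stripped.startswith("#"):
            continue
        first_field = stripped.split()[0]
        if first_field not in KNOWN_KEY_TYPES:
            problems.append((number, first_field))
    return problems


# Placeholder key material; the first line reproduces the truncated key type.
sample = (
    "sh-rsa AAAAB3placeholder yarn@mgmt-node\n"
    "ssh-rsa AAAAB3placeholder yarn@mgmt-node\n"
)
print(suspicious_lines(sample))  # [(1, 'sh-rsa')]
```

If the real file is fine, a direct `ssh -i ~/.ssh/yarn` login from the management node to the data node is the next thing to confirm before retrying the profile.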

2. I have the properties below set on my HDP cluster; this is a non-Kerberized setup. Please advise whether authorization and a key provider path need to be set here.

    <property>
      <name>hadoop.http.staticuser.user</name>
      <value>yarn</value>
    </property>

    <property>
      <name>hadoop.security.authorization</name>
      <value>false</value>
    </property>

    <property>
      <name>hadoop.security.key.provider.path</name>
      <value></value>
    </property>

Any help would be highly appreciated.

Thanks & regards
Nitin Nanda

ruchipa...@gmail.com

Apr 27, 2020, 8:08:34 AM
to CDAP User
Hi Terence,

Is Spark supported on CDAP for K8s, or is it not yet planned?




Terence Yim

Jun 15, 2020, 4:11:37 AM
to CDAP User
Hi,

It is not yet available.

Terence
