Not able to view applications in Datatorrent Monitor Tab


Sai Karthik

May 2, 2016, 8:25:46 PM
to Malhar
After launching the pi-demo or any other demo application successfully, when I click on the application id to monitor it, I get the error "An error occurred fetching data.". Basically, I am not able to see any applications in the Monitor tab.
I can view these running or finished applications in the Resource Manager UI. Am I missing any configuration?

My cluster details are
HDP 2.3.4
Java 1.8.0
DataTorrent Enterprise edition 30 day trial

Amol Kekre

May 2, 2016, 8:28:11 PM
to malhar...@googlegroups.com, us...@apex.incubator.apache.org

Sai,
I am redirecting this thread to users@apex. Please subscribe to the users@ mailing list. See http://apex.apache.org/community.html for details.

Thanks,
Amol


--
You received this message because you are subscribed to the Google Groups "Malhar" group.
To unsubscribe from this group and stop receiving emails from it, send an email to malhar-users...@googlegroups.com.
To post to this group, send email to malhar...@googlegroups.com.
Visit this group at https://groups.google.com/group/malhar-users.
For more options, visit https://groups.google.com/d/optout.

David Yan

May 2, 2016, 8:36:45 PM
to malhar...@googlegroups.com, us...@apex.incubator.apache.org
Hi Sai:
Can you monitor the dtgateway.log file and send us any stack trace that appears when the error occurs?

David

Sai Karthik

May 2, 2016, 8:52:37 PM
to malhar...@googlegroups.com
In dtgateway.log I only see the entries for the application being submitted. Even though the ResourceManager shows this application in the RUNNING state, these are the last entries for it in dtgateway.log:


2016-05-02 19:56:17,767 INFO com.datatorrent.common.util.AsyncFSStorageAgent: using /tmp/chkp1274056272325866897 as the basepath for checkpointing.
2016-05-02 19:56:17,958 INFO com.datatorrent.stram.StramClient: Set the environment for the application master
2016-05-02 19:56:17,958 INFO com.datatorrent.stram.StramClient: Setting up app master command
2016-05-02 19:56:17,976 INFO com.datatorrent.stram.StramClient: Completed setting up app master command ${JAVA_HOME}/bin/java -Djava.io.tmpdir=$PWD/tmp -Xmx768m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dt-heap-1.bin -Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=<LOG_DIR> -Ddt.attr.APPLICATION_PATH=hdfs://naga-spark-test-1.novalocal:8020/user/dtadmin/datatorrent/apps/application_1462233196804_0001 com.datatorrent.stram.StreamingAppMaster 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr
2016-05-02 19:56:17,980 INFO com.datatorrent.stram.StramClient: Submitting application: {name=PiDemo, queue=default, user=dtadmin (auth:SIMPLE), resource=<memory:1024, vCores:0>}



David Yan

May 2, 2016, 8:57:49 PM
to malhar...@googlegroups.com
Hi Sai,

Can you issue these two curl commands and send us the output?



Thanks,

David

Sai Karthik

May 2, 2016, 9:55:59 PM
to malhar...@googlegroups.com
Output for "curl http://localhost:9090/ws/v2/about":
{"version":"3.2.0-incubating","buildDate":"23.10.2015 @ 16:12:06 PDT","buildRevision":"rev: d61ca61 branch: release-3.2","buildVersion":"3.2.0-incubating from rev: d61ca61 branch: release-3.2 by Thomas Weise on 23.10.2015 @ 16:12:06 PDT","buildUser":"Thomas Weise","javaVersion":"1.8.0_51","gatewayUser":"dtadmin","hadoopLocation":"\/usr\/bin\/hadoop","jvmName":"13508@Gateway-FQDN","configDirectory":"\/opt\/datatorrent\/releases\/3.2.0\/conf","hadoopIsSecurityEnabled":false,"hostname":"Gateway-FQDN"}



HTTP/1.1 200 OK
Date: Tue, 03 May 2016 01:54:29 GMT
Content-Type: application/json
Transfer-Encoding: chunked


David Yan

May 2, 2016, 9:59:10 PM
to malhar...@googlegroups.com
Interesting. So if you do:


(without the -D -)
It returns nothing at all?

David


Sai Karthik

May 2, 2016, 10:04:14 PM
to malhar...@googlegroups.com
My bad. Output for "curl -D - http://localhost:9090/ws/v2/applications":
HTTP/1.1 200 OK
Date: Tue, 03 May 2016 02:02:17 GMT
Content-Type: application/json
Transfer-Encoding: chunked

{"apps":[]}




David Yan

May 3, 2016, 1:42:59 AM
to malhar...@googlegroups.com
Hi Sai,

The empty app list indicates that no Apex apps are running and the UI should not say "An error occurred...".
When you issued the curl command and received an empty "apps" list, were any Apex apps running at all in the cluster?

David 
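The check discussed here can also be scripted. The following is a minimal Python sketch, not part of the original exchange: it assumes the gateway is reachable at localhost:9090 as in the curl commands quoted in this thread, and `classify_apps_response` and `fetch_and_classify` are hypothetical helper names. It only distinguishes a non-200 status, an empty "apps" list, and a non-empty one.

```python
import json
import urllib.error
import urllib.request

# Gateway endpoint taken from the curl commands quoted in this thread.
GATEWAY_APPS_URL = "http://localhost:9090/ws/v2/applications"


def classify_apps_response(status, body):
    """Hypothetical helper: summarize a /ws/v2/applications response."""
    if status != 200:
        return "http error %d" % status
    apps = json.loads(body).get("apps") or []
    if not apps:
        return "gateway reports no apps"
    return "gateway reports %d app(s)" % len(apps)


def fetch_and_classify(url=GATEWAY_APPS_URL):
    """Call the gateway and classify what it returns."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return classify_apps_response(resp.status, resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, so report the status code instead.
        return classify_apps_response(err.code, "")
```

If this reports no apps while the ResourceManager shows Apex applications in the RUNNING state, the gateway's view is out of sync with the cluster, which is the situation described in this thread.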

Sai Karthik

May 3, 2016, 2:07:03 AM
to malhar...@googlegroups.com
Hi David,
Yes, when I issued the curl commands, the pi-demo and wordcount examples that come with the DataTorrent installation were running (I can see these applications in the RM). I launched them using the DataTorrent console. I can see the output for the wordcount example in the container logs. Do you think there are any permission issues? (I don't have Kerberos set up.)

David Yan

May 3, 2016, 2:30:51 PM
to malhar...@googlegroups.com
Hi Sai,

If the /ws/v2/applications REST call returns an empty list, the UI should not even show any application ids for you to click on.
It should not be a permissions problem either, because a simple curl call with no cookie does not give you a 401 error.

I need further help from you to troubleshoot. In your browser, can you open the "Developer tools" (in Chrome: Tools -> Developer tools), click on the "Network" tab, reload the Monitor page, and try to reproduce the "An error occurred..." scenario? Then send us the details (in particular the response) of any failed request (status codes in the 400s and 500s) in the HTTP call list.

Thanks,

David
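If the request list is long, the failed calls can be pulled out of a saved capture instead of being picked by hand. The following is a minimal Python sketch, not part of the original exchange: it assumes the Network tab contents were saved as a HAR file (a JSON format that Chrome can export), and `failed_requests` and `failed_requests_from_file` are hypothetical helper names.

```python
import json


def failed_requests(har):
    """Return (status, method, url, body) for every 4xx/5xx entry in a HAR dict."""
    failures = []
    for entry in har.get("log", {}).get("entries", []):
        status = entry["response"]["status"]
        if 400 <= status < 600:
            request = entry["request"]
            # The response body text is optional in HAR exports.
            body = entry["response"].get("content", {}).get("text", "")
            failures.append((status, request["method"], request["url"], body))
    return failures


def failed_requests_from_file(path):
    """Load a HAR file saved from the browser and list its failed requests."""
    with open(path, encoding="utf-8") as handle:
        return failed_requests(json.load(handle))
```

Each tuple carries the status code, method, URL, and response body, which are the details requested above.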

Sai Karthik

May 3, 2016, 3:17:21 PM
to malhar...@googlegroups.com
Hi David,
When I click the Launch button for the application, this request fails while the launch dialog is loading:
package.json =>
General
Request URL:http://dtGateway-FQDN.novalocal:9090/ws/v2/appPackages/dtadmin/wordcount-demo/3.2.0-incubating/resources/configUI/package.json
Request Method:GET
Status Code:404 Not Found
Remote Address:dt-Gateway-publicIP:9090

Response Headers
Content-Type:application/json
Date:Tue, 03 May 2016 18:50:39 GMT
Transfer-Encoding:chunked

Request Headers
Accept:application/json, text/plain, */*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Connection:keep-alive
Host:dtGateway-FQDN.novalocal:9090
Referer:http://dtGateway-FQDN.novalocal:9090/static/
User-Agent:Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36
Response: {"message":"Not found"}


After a successful launch, when I click on the application id, I am redirected to the Monitor tab with the error "An error occurred fetching data."

application_1462233196804_0005 => 
General 
Request URL:http://dtGateway-FQDN:9090/ws/v2/applications/application_1462233196804_0005
Request Method:GET
Status Code:404 Not Found
Remote Address:dt-Gateway-publicIP:9090

Response Headers
Content-Type:application/json
Date:Tue, 03 May 2016 19:03:40 GMT
Transfer-Encoding:chunked

Request Headers
Accept:application/json, text/plain, */*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8
Connection:keep-alive
Host:dtGateway-FQDN.novalocal:9090
Referer:http://dtGateway-FQDN:9090/static/
User-Agent:Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36

Response: {"message":"Application application_1462233196804_0005 not found"}

Hope this helps. Let me know if you need further details for debugging. My understanding is that dtManage is unable to communicate with the ResourceManager.

David Yan

May 3, 2016, 4:13:56 PM
to malhar...@googlegroups.com
Hi Sai,

Thanks for the detailed info. The first 404 is okay, but the second 404 indicates that dtGateway is unable to communicate with the app master.
With application_1462233196804_0005 still running, can you confirm that /ws/v2/applications still returns an empty list? Also, can you send the dtgateway.log and the application_1462233196804_0005 app master log entries around the time when the second 404 occurred?

Thanks,

David

David Yan

May 3, 2016, 4:24:55 PM
to malhar...@googlegroups.com
Also, if /ws/v2/applications returns an empty list while the app is running, please restart dtgateway and try again in the browser. A known bug in 3.2.0 can cause this problem; it was fixed in 3.3.0. You can download the latest version of RTS from the DataTorrent web site.

David
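To confirm this symptom before restarting, the ResourceManager's view can be compared with the gateway's. The following is a minimal Python sketch, not part of the original exchange. It assumes the ResourceManager response comes from the YARN REST endpoint /ws/v1/cluster/apps?states=RUNNING (usually on port 8088), and that each entry in the gateway's "apps" list carries an "id" field; the function names are hypothetical.

```python
def rm_app_ids(rm_response):
    """IDs from a parsed YARN ResourceManager /ws/v1/cluster/apps response."""
    # The ResourceManager returns "apps": null when no application matches.
    apps = rm_response.get("apps") or {}
    return {app["id"] for app in apps.get("app", [])}


def gateway_app_ids(gateway_response):
    """IDs from a parsed gateway /ws/v2/applications response.

    Assumption: each entry in "apps" has an "id" field.
    """
    return {app["id"] for app in gateway_response.get("apps") or []}


def missing_from_gateway(rm_response, gateway_response):
    """Applications YARN knows about that the gateway does not list."""
    return sorted(rm_app_ids(rm_response) - gateway_app_ids(gateway_response))
```

Note that the ResourceManager lists every YARN application, so on a shared cluster the result may include non-Apex jobs that the gateway is not expected to show.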

Sai Karthik

May 3, 2016, 4:37:25 PM
to malhar...@googlegroups.com
Haha, restart did work. Thanks for your time and patience David.

David Yan

May 3, 2016, 4:39:10 PM
to malhar...@googlegroups.com
Great! Please upgrade to 3.3.0 if possible to avoid running into this issue in the future.

MOOLAMREDDY SIDDA REDDY 15MCB1013

Sep 22, 2016, 5:19:25 AM
to Malhar - Deprecated-Use-Apache-Apex-Forums

Please help me resolve this error. How can I set my Hadoop location in the DataTorrent console? Please reply.
Thanks in advance.
Screenshot from 2016-09-22 14:48:03.png