Kylo - sample template - Data Ingest - Feed stuck on step 2 "Initialize cleanup Parameters"


vijayarajan marimuthu

Apr 12, 2017, 1:44:46 PM
to Kylo Community
Hi Kylo community,

I've installed Kylo on my HDP cluster.
After logging on successfully, I imported the sample template "Data Ingest" and created a new feed for "user data" as instructed in the demo video.

When I try to run the feed with the sample data (userdata2.csv), the feed keeps running for more than 2 hours and gets stuck on step 2, without showing any error.

I've attached a screenshot of the feed job steps. Please let me know where to start troubleshooting.

Thanks
Vijay





kylo_feed_job_steps.PNG

Matt Hutton

Apr 12, 2017, 2:05:45 PM
to Kylo Community

First check your installation. When you imported the Data Ingest template, did you check 'import re-usable template' when prompted? If not, please re-import. To verify everything is correctly set up, go to NiFi: you should see a process group called 'reusable_templates' with an embedded process group called 'standard-ingest'. Then verify your feed shows up in the top-level 'webapps' process group, with an embedded process group called 'userdata'. You should see its output ports wired to the input port of the standard-ingest process group in reusable_templates.

Also, it looks like the job you attached was a delete operation rather than an ingest. This would be triggered by selecting 'Delete feed' from the feed's menu. Once you verify your install, please try again. You can abandon the long-running job in the Operations Manager.
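If it helps, here is a quick command-line sanity check for the steps above (a sketch, assuming an unsecured NiFi REST API on localhost:8080; adjust the host and port for your cluster):

```shell
# Look for the reusable_templates process group in the NiFi flow.
# curl -f makes an unreachable or secured NiFi fall through to the message below.
curl -sf http://localhost:8080/nifi-api/flow/process-groups/root \
  | grep -o '"name":"reusable_templates"' \
  || echo "reusable_templates not found (or NiFi unreachable)"
```

If nothing is found, re-importing the template with the 'import re-usable template' option checked is the first thing to try.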

vijayarajan marimuthu

Apr 14, 2017, 11:49:26 AM
to Kylo Community
Thanks Matt.

After re-importing the template, that issue was resolved. But the job then failed in the "index schema service" feed at step 3, Query Hive Table Schema, with the following error:

ExecuteSQL[id=2bd64d86-2b1f-1ef0-f632-284eb777a3f7] Unable to execute SQL select query SELECT d.NAME DATABASE_NAME, d.OWNER_NAME OWNER, t.CREATE_TIME, t.TBL_NAME, t.TBL_TYPE, c.COLUMN_NAME, c.TYPE_NAME FROM hive.COLUMNS_V2 c JOIN hive.SDS s on s.CD_ID = c.CD_ID JOIN hive.TBLS t ON s.SD_ID=t.SD_ID JOIN hive.DBS d on d.DB_ID = t.DB_ID where d.name = 'webapps'and t.tbl_name = 'user_data_ingest'; for StandardFlowFileRecord[uuid=a982109a-1fc3-4a0c-9324-ad04c8733a71,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1492184461121-1, container=default, section=1], offset=281015, length=48],offset=0,name=349054303986753,size=48] due to org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not connect to address=(host=localhost)(port=3306)(type=master) : Connection refused); routing to failure: org.apache.nifi.processor.exception.ProcessException: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (Could not connect to address=(host=localhost)(port=3306)(type=master) : Connection refused) 


I've configured the /opt/kylo/kylo-services/conf/application.properties file with the following properties:

hive.metastore.datasource.driverClassName=org.mariadb.jdbc.Driver
hive.metastore.datasource.url=jdbc:mysql://xx.xxx.x.xxx:3306/hive

I'm not sure how host=localhost is appearing in the error instead of the configured IP address.

Please let me know how to change the database URL.




Greg Hart

Apr 14, 2017, 5:38:57 PM
to Kylo Community
Hi Vijayarajan,

The Query Hive Table Schema processor uses the MySQL controller service in NiFi. You can either modify the properties of this controller service, or create a new controller service and modify the processor to use it.
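For reference, here is a hypothetical example of the relevant settings on that NiFi controller service (the host, database name, and user below are illustrative; use your own values):

```
Database Connection URL     jdbc:mysql://10.0.0.5:3306/hive
Database Driver Class Name  org.mariadb.jdbc.Driver
Database User               hive
```

The key point is that this controller service is configured separately from Kylo's application.properties, so a URL fixed in the properties file still has to be corrected in NiFi as well.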

vijayarajan marimuthu

Apr 17, 2017, 1:28:19 PM
to Kylo Community
Thanks Greg,

I've modified the properties of the MySQL controller service, and now I'm getting the following error at step 16, Validate And Split Records:

2017-04-17 12:53:10,526 ERROR [Timer-Driven Process Thread-9] c.t.nifi.v2.spark.ExecuteSparkJob ExecuteSparkJob[id=2b1f1ef0-4e06-1bd6-cbce-5e6977fd5e56] ExecuteSparkJob for Validate And Split Records and flowfile: StandardFlowFileRecord[uuid=9b2ade6c-838f-4bbb-997b-efaceca2f93f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1492447947624-1,container=default, section=1], offset=140538, length=140429],offset=0,name=userdata2.csv,size=140429] completed with failed status 1

Greg Hart

Apr 17, 2017, 2:15:08 PM
to Kylo Community
Hi Vijayarajan,

Could you attach your /var/log/nifi/nifi-app.log file?

Thanks!

vijayarajan marimuthu

Apr 17, 2017, 3:50:08 PM
to Kylo Community
Greg,

I've attached the nifi-app.log file:
nifi-app.log

Greg Hart

Apr 17, 2017, 3:56:53 PM
to Kylo Community
It looks like you're running into this issue:

To fix the issue, you can take these steps:
  1. On the edge node, edit the file: /usr/hdp/current/spark-client/conf/spark-defaults.conf
  2. Add these configuration entries to the file:
spark.sql.hive.convertMetastoreOrc false
spark.sql.hive.convertMetastoreParquet false


vijayarajan marimuthu

Apr 18, 2017, 3:56:22 PM
to Kylo Community
Thanks Greg,

The data ingest job completed successfully, but the "index_schema_service" job got stuck on step 2:






job_status.PNG
job_steps.PNG

Greg Hart

Apr 18, 2017, 5:55:05 PM
to Kylo Community
Hi Vijayarajan,

Could you check in the NiFi UI and the NiFi logs to see if you're getting an error message?

Thanks!

vijayarajan marimuthu

Apr 20, 2017, 11:21:47 AM
to Kylo Community
Greg

Yes! I checked the NiFi UI and found that the MySQL controller service was disabled after I modified the IP address. Once I enabled it, the job completed successfully.

Thanks for your support!

Thanks
Vijay

dinesh.h...@gmail.com

Nov 14, 2018, 8:38:21 AM
to Kylo Community
Hi,

I'm having the same issue. Please help me solve it.

Regards,
Dinesh D


application.properties
nifi-app.log

dinesh.h...@gmail.com

Nov 14, 2018, 8:41:38 AM
to Kylo Community
Hi Vijay,

Could you please share your Kylo configuration file?

Regards,
Dinesh  D

dinesh.h...@gmail.com

Nov 14, 2018, 8:44:11 AM
to Kylo Community
Hi Vijay,

Please share the Data Ingest zip file.

Regards,
Dinesh D

ruslans.uralovs

Nov 15, 2018, 6:40:06 AM
to Kylo Community
From the NiFi log we can see that you are running Spark 1 code on Spark 2. This is most likely because your default Spark version is 2 but NiFi is configured with the Spark 1 Kylo jars.

2018-11-14 17:01:52,869 INFO [stream error] c.t.nifi.v2.spark.ExecuteSparkJob ExecuteSparkJob[id=9a51322c-dbf8-3edf-c126-a8ff931c5ac6] Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.SQLContext.table(Ljava/lang/String;)Lorg/apache/spark/sql/DataFrame;
2018-11-14 17:01:52,869 INFO [stream error] c.t.nifi.v2.spark.ExecuteSparkJob ExecuteSparkJob[id=9a51322c-dbf8-3edf-c126-a8ff931c5ac6] 	at com.thinkbiganalytics.spark.SparkContextService16.toDataSet(SparkContextService16.java:43)

You have two options:

1) Reconfigure the processors to run on Spark 1, if you have Spark 1 installed
2) Reconfigure NiFi for Spark 2, so your existing feeds run on your default Spark version 2


For option 1)
Go to NiFi, find all ExecuteSparkJob processors, and change their SparkHome property to point at Spark 1.
To avoid going into NiFi and updating properties manually, you can instead update the "nifi.executesparkjob.sparkhome" property in "/opt/kylo/kylo-services/conf/application.properties", restart kylo-services, and re-import your templates.
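Concretely, option 1) amounts to a one-line change in that file. The path below is the usual Spark 1 client home on HDP, but verify it on your own cluster:

```
# /opt/kylo/kylo-services/conf/application.properties
nifi.executesparkjob.sparkhome=/usr/hdp/current/spark-client
```

Then restart kylo-services and re-import the templates so the processors pick up the new value.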


For option 2)
To reconfigure the Kylo jars in NiFi for a different Spark version, run the following from the command line (note: no spaces around the `=` in the export):
export SPARK_PROFILE=<spark-profile, e.g. spark-v1 or spark-v2>
/opt/kylo/setup/nifi/update-nars-jars.sh <nifi-home> <kylo-setup-dir> <nifi-user> <nifi-user-group>

For example:
[root@sandbox ~]# export SPARK_PROFILE=spark-v2
[root@sandbox ~]# /opt/kylo/setup/nifi/update-nars-jars.sh -f /opt/nifi /opt/kylo/setup nifi users
The NIFI home folder is /opt/nifi using permissions  nifi:users
Updating the kylo nifi nar and jar files
Creating symlinks for NiFi version 1.6.0.jar compatible nars
Nar files and Jar files have been updated
[root@sandbox ~]# ll /opt/nifi/current/lib/app | grep kylo
lrwxrwxrwx 1 nifi users  96 Nov 15 10:44 kylo-spark-interpreter-jar-with-dependencies.jar -> /opt/nifi/data/lib/app/kylo-spark-interpreter-spark-v2-0.10.0-SNAPSHOT-jar-with-dependencies.jar
lrwxrwxrwx 1 nifi users  97 Nov 15 10:44 kylo-spark-job-profiler-jar-with-dependencies.jar -> /opt/nifi/data/lib/app/kylo-spark-job-profiler-spark-v2-0.10.0-SNAPSHOT-jar-with-dependencies.jar
lrwxrwxrwx 1 nifi users  96 Nov 15 10:44 kylo-spark-merge-table-jar-with-dependencies.jar -> /opt/nifi/data/lib/app/kylo-spark-merge-table-spark-v2-0.10.0-SNAPSHOT-jar-with-dependencies.jar
lrwxrwxrwx 1 nifi users  95 Nov 15 10:44 kylo-spark-multi-exec-jar-with-dependencies.jar -> /opt/nifi/data/lib/app/kylo-spark-multi-exec-spark-v2-0.10.0-SNAPSHOT-jar-with-dependencies.jar
lrwxrwxrwx 1 nifi users 101 Nov 15 10:44 kylo-spark-validate-cleanse-jar-with-dependencies.jar -> /opt/nifi/data/lib/app/kylo-spark-validate-cleanse-spark-v2-0.10.0-SNAPSHOT-jar-with-dependencies.jar
