Hi
I am trying to use Kylo 0.9.1 with a PostgreSQL database (9.5.5) for the Kylo metastore. However, running the Sample Feed example provided in the documentation it fails at the InitializeFeed processor (standard-ingest group), with the error:
2018-06-22 18:11:27,178 INFO [Timer-Driven Process Thread-3] c.t.ingest.GetTableDataSupport Executing full GetTableData query SELECT tbl."registration_dttm",tbl."id",tbl."first_name",tbl."last_name",tbl."email",tbl."gender",tbl."ip_address",tbl."cc",tbl."country",tbl."birthdate",tbl."salary",tbl."title",tbl."comments" FROM NA tbl
2018-06-22 18:11:27,181 ERROR [Timer-Driven Process Thread-3] c.t.nifi.v2.ingest.GetTableData GetTableData[id=d46dde68-e119-3ac1-005c-abcdab87b3df] Unable to execute SQL select from table due to StandardFlowFileRecord[uuid=6b8609b9-add1-4a9b-896b-fa3cd2d68021,claim=,offset=0,name=7357899322044403,size=0]; routing to failure: org.apache.nifi.processor.exception.ProcessException: IOException thrown from GetTableData[id=d46dde68-e119-3ac1-005c-abcdab87b3df]: java.io.IOException: SQL execution failure
.
.
Caused by: org.postgresql.util.PSQLException: ERROR: relation "na" does not exist
The PostgreSQL server log shows:
2018-06-22 18:11:27.169 BST,"kylo901","kylo901",2002,"10.60.47.228:47562",5b2d274b.7d2,58,"PARSE",2018-06-22 17:43:55 BST,23/69774,0,ERROR,42P01,"relation ""na"" does not exist",,,,,,"SELECT tbl.""registration_dttm"",tbl.""id"",tbl.""first_name"",tbl.""last_name"",tbl.""email"",tbl.""gender"",tbl.""ip_address"",tbl.""cc"",tbl.""country"",tbl.""birthdate"",tbl.""salary"",tbl.""title"",tbl.""comments"" FROM NA tbl",200,,"PostgreSQL JDBC Driver"
2018-06-22 18:11:54.063 BST,,,4636,"10.60.47.228:35232",5b2d2dda.121c,1,"",2018-06-22 18:11:54 BST,,0,LOG,00000,"connection received: host=xx.xx.xx.xxx port=35232",,,,,,,,,""
This behaviour seems to match the issue reported in KYLO-1219, but that issue is marked as resolved. Could you comment on whether the above symptoms do indeed suggest that this is due to case sensitivity in the PostgreSQL database, or suggest another possible root cause? Thank you.
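For what it's worth, the symptoms are consistent with PostgreSQL's identifier case folding: unquoted identifiers are folded to lower case, while double-quoted ones keep their case. That would explain why a query written as `FROM NA` makes the server look up a relation named "na". A minimal Python sketch of that folding rule (purely illustrative, not Kylo code):

```python
# Illustrative model of PostgreSQL's identifier case-folding rule:
# unquoted identifiers fold to lower case, double-quoted ones keep their case.
def fold_identifier(ident: str) -> str:
    """Return the name PostgreSQL would actually look up for this identifier."""
    if ident.startswith('"') and ident.endswith('"') and len(ident) >= 2:
        return ident[1:-1]      # quoted: case preserved
    return ident.lower()        # unquoted: folded to lower case

# The failing query uses the unquoted table name NA ...
print(fold_identifier('NA'))    # -> na  (so the server searches for relation "na")
# ... whereas a quoted "NA" would preserve the upper case:
print(fold_identifier('"NA"'))  # -> NA
```

Note that in the log, the column names are double-quoted (and so keep their case) while the table name `NA` is not, which matches the `relation "na" does not exist` error exactly.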
Regards,
Marek
Hi Greg,
Thanks for your response; I'm beginning to fear there may be a more fundamental issue at play here. Today I attempted to replace my PostgreSQL Kylo metastore with a MariaDB (10.3.7) metastore. Unfortunately I got a similar error, albeit without the na-to-NA name mismatch shown in my earlier Postgres example (nifi-app log attached as failed_standard_ingest_mariadb_nifi-app.log).
Checking the list of tables present in the kylo schema, I see that there is actually no table called NA, so I am left wondering whether this is some kind of temporary table, or whether there was an issue when creating the kylo schema during the initial build. On the latter point, I did encounter the reported issue with the handling of the 1970-01-01 00:00:01 timestamp in the MySQL build, and accordingly needed to edit and run kylo-db-update-script.sql to get the FEED table created, as the original build via Liquibase was incomplete. However, I have checked the kylo schema in both my Postgres and MySQL builds and see no table called NA in either. Below is a list of the tables present in the MariaDB example.
MariaDB [kylo]> show tables;
+--------------------------------+
| Tables_in_kylo |
+--------------------------------+
| AUDIT_LOG |
| BATCH_EXECUTION_CONTEXT_VALUES |
| BATCH_FEED_SUMMARY_COUNTS_VW |
| BATCH_JOB_EXECUTION |
| BATCH_JOB_EXECUTION_CTX_VALS |
| BATCH_JOB_EXECUTION_PARAMS |
| BATCH_JOB_EXECUTION_SEQ |
| BATCH_JOB_INSTANCE |
| BATCH_JOB_SEQ |
| BATCH_NIFI_JOB |
| BATCH_NIFI_STEP |
| BATCH_RELATED_FLOW_FILES |
| BATCH_STEP_EXECUTION |
| BATCH_STEP_EXECUTION_CTX_VALS |
| BATCH_STEP_EXECUTION_SEQ |
| CHECK_DATA_TO_FEED_VW |
| DATABASECHANGELOG |
| DATABASECHANGELOGLOCK |
| FEED |
| FEED_ACL_INDEX |
| FEED_CHECK_DATA_FEEDS |
| FEED_HEALTH_VW |
| FEED_SUMMARY_VIEW |
| GENERATED_KEYS |
| ITEM_LAST_MODIFIED |
| KYLO_ALERT |
| KYLO_ALERT_CHANGE |
| KYLO_VERSION |
| LATEST_FEED_JOB_END_TIME_VW |
| LATEST_FEED_JOB_VW |
| LATEST_FINISHED_FEED_JOB_VW |
| MODESHAPE_REPOSITORY |
| NIFI_EVENT |
| NIFI_FEED_PROCESSOR_STATS |
| NIFI_FEED_STATS |
| NIFI_FLOW_CACHE_CLUSTER_SYNC |
| NIFI_FLOW_CACHE_UPDATE_ITEM |
| NIFI_RELATED_ROOT_FLOW_FILES |
| SLA_ACTION_TEMPLATE |
| SLA_ASSESSMENT |
| SLA_DESCRIPTION |
| SLA_FEED |
| SLA_METRIC_ASSESSMENT |
| SLA_OBLIGATION_ASSESSMENT |
| TMP_BATCH_NIFI_JOB_DELETED |
| VELOCITY_TEMPLATE |
+--------------------------------+
46 rows in set (0.000 sec)
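One quick way to confirm the table really is absent regardless of case (since both servers treat identifier case differently) is a case-insensitive lookup against information_schema, which works on both MariaDB and PostgreSQL. This is just a diagnostic sketch; adjust the schema filter to wherever your kylo schema lives:

```sql
-- Look for any table named 'na' in any case, in any schema
SELECT table_schema, table_name
FROM information_schema.tables
WHERE LOWER(table_name) = 'na';
```

An empty result on both metastores would confirm the table genuinely does not exist, rather than existing under a differently-cased name.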
This also tallies with the tables present in the Kylo VirtualBox Sandbox provided, so this leaves me rather bemused as to where I may have gone wrong.
Regards,
Marek
https://kylo.readthedocs.io/en/v0.9.1/installation/RunSampleFeed.html
Naturally I've had to amend the default data_ingest.zip template to change a couple of parameters (such as the installation path from /opt to our standard /app, and also the database connection details), so I've attached the modified templates for the Postgres and MySQL Kylo metastore test cases that I've been attempting.
Regards,
Marek
# Set the nifi home folder
config.nifi.home=/opt/nifi
# Set the spark application jars location
config.nifi.kylo.applicationJarDirectory=/opt/nifi/current/lib/app
Hi Greg,
I believe the "NA" identifier is coming from the "Fetch RDBMS Data/GetTableData" processor (its 'Source Table' parameter). I've verified this by changing NA to some other arbitrary identifier and observing that the error in the nifi log changed accordingly.
Since the Source Database connection only allows me to set it to "Kylo MySQL", isn't this referring to the relational database (whether that be on Postgres or MariaDB) that, along with JBoss ModeShape, comprises the Kylo Metadata Server? Or have I fundamentally misunderstood what is going on here?
Strangely enough, if I compare this with the configuration in my Cloudera Sandbox environment, that also has 'Source Table = NA' in the "Fetch RDBMS Data/GetTableData" processor, yet I don't see any corresponding schema in the MySQL database that comes pre-canned with that deployment. Looking at the "Source Fields", it seems to be referencing a table whose columns correspond to the userdata1.csv file used in the Sample Feed example, so it would make sense that this processor is trying to read the metadata that defines the feed high-water mark. But where that information actually resides, and what the correct value for the 'Source Table' parameter should be, is the bit that is still eluding me.
Regards,
Marek