Data ingestion from Postgres to Kylo


mnan...@upgrade.com

May 4, 2017, 8:00:31 PM
to Kylo Community
I am new to Kylo. My use case is real-time replication from one Postgres RDS instance to another Postgres RDS instance (or Redshift).
Is there a way to do real-time replication from several databases (primarily Postgres RDS)?

If I am correct, it only works with MySQL.

Murali

Greg Hart

May 4, 2017, 8:13:03 PM
to Kylo Community
Hi Murali,

Welcome to the Kylo community!

To use PostgreSQL you'll need to download the JDBC driver from https://jdbc.postgresql.org/ and place it into /opt/nifi/mysql/ and /opt/kylo/kylo-services/plugin/. Then restart kylo-services. This will allow both Kylo and NiFi to be able to access PostgreSQL databases.
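The copy step above could be scripted roughly as follows. This is only a sketch: the helper name `install_driver` is made up for illustration, the two directories are the ones named in this post, and the script does not restart kylo-services for you.

```python
import shutil
from pathlib import Path

# Directories named in the post above; adjust them if your install differs.
DRIVER_DIRS = ["/opt/nifi/mysql", "/opt/kylo/kylo-services/plugin"]


def install_driver(jar_path, dest_dirs=DRIVER_DIRS):
    """Copy the JDBC driver jar into each directory and return the new paths.

    Hypothetical helper, not part of Kylo or NiFi.
    """
    jar = Path(jar_path)
    if not jar.is_file():
        raise FileNotFoundError(f"driver jar not found: {jar}")
    copied = []
    for directory in dest_dirs:
        dest = Path(directory) / jar.name
        shutil.copy2(jar, dest)  # fails if the directory does not exist
        copied.append(dest)
    return copied
```

Calling `install_driver("postgresql-42.0.0.jar")` with the downloaded jar would place a copy in both directories; kylo-services still needs to be restarted afterwards.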

Also, the sample templates that come with NiFi only provide the ability to copy data from a JDBC database into Hive. You'll need to provide your own NiFi template for copying data into a PostgreSQL or Redshift database.

Thanks!

Murali Nanduri

May 4, 2017, 8:19:40 PM
to Kylo Community
I updated my contact email.
I have the sandbox downloaded. Let me see how I can get this working. Even if the data ends up in Hive, that should be fine; I can use Spark on top of it for analytics.
Thanks anyway.

Jagrut Sharma

May 4, 2017, 8:24:26 PM
to Kylo Community
Hi Murali - Check out the PutSQL processor (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.PutSQL/); it should help you design your template.

Also, databases will generally have a more performant bulk load/sync utility that you can explore running via the ExecuteScript processor (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/).

Thanks.
--
Jagrut

Murali Nanduri

May 5, 2017, 8:24:13 PM
to Kylo Community
Greg,
I copied the Postgres JDBC driver to both directories as you mentioned. The Postgres DB still did not show up.

The next thing I did was uncomment the Postgres properties in kylo-services/conf/application.properties. After that, kylo-ui rejected the dladmin/thinkbig credentials.
If I connect to a local Postgres server, do I need to plug in the username, port, database, etc. in order to log in to kylo-ui?
Jagrut mentioned templates in his reply. Do I need those to test ingestion for a few tables?

These might sound like silly questions, but I am just starting with Kylo and still lack understanding.
All the video tutorials I see cover ingestion from CSV files. FYI, I am using the sandbox (VirtualBox).

Murali

Greg Hart

May 5, 2017, 8:36:06 PM
to Kylo Community
Hi Murali,

The PostgreSQL properties in application.properties are only used for storing metadata about feeds and not for accessing data from PostgreSQL.

May I assume that you're creating a Data Ingest feed and want to select a PostgreSQL database as the source? In that case, you'll need to create a Data Source with your PostgreSQL credentials. Click the Data Sources link on the left-hand side under the Admin section. Then click the orange button in the upper-right corner to create a new Data Source. Here's an example Data Source:

Name: PostgreSQL
Database Connection URL: jdbc:postgresql://localhost/mydatabase
Database Driver Class Name: org.postgresql.Driver
Database Driver Location(s): file:///opt/nifi/mysql/postgresql-42.0.0.jar
Database User: root
Password: hadoop
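
Before saving a Data Source like the one above, it may help to confirm that each Database Driver Location(s) entry points at something that actually exists on disk. The sketch below assumes the field holds a comma-separated list of file:// URLs or plain paths; the function names are made up for illustration and are not part of Kylo or NiFi.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def driver_location_paths(locations):
    """Split a 'Database Driver Location(s)' value into local paths.

    Assumes a comma-separated list of file:// URLs and/or plain paths.
    """
    paths = []
    for entry in locations.split(","):
        entry = entry.strip()
        if not entry:
            continue
        parsed = urlparse(entry)
        # file:///opt/x.jar becomes /opt/x.jar; plain paths pass through.
        path = unquote(parsed.path) if parsed.scheme == "file" else entry
        paths.append(Path(path))
    return paths


def missing_driver_files(locations):
    """Return the configured locations that do not exist on this machine."""
    return [p for p in driver_location_paths(locations) if not p.exists()]
```

For example, `missing_driver_files("file:///opt/nifi/mysql/postgresql-42.0.0.jar")` returns an empty list only if a file with exactly that name is present, so a mismatch between the configured name and the jar that was actually copied shows up immediately.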

Greg Hart

May 8, 2017, 11:19:36 PM
to Kylo Community
Hi Greg,
I got a chance to work on it.
I still do not see a place to add a new data source.
This is what I have done so far:
1) Copied postgresql-42.0.0.jre6.jar to the /opt/nifi/mysql and /opt/kylo/kylo-services/plugin directories.
The jar file is owned by root and has 755 permissions.
I restarted kylo-services, and I still do not see a place to add the data source.

I looked at the following thread, where you answered the same question.
In that thread you mentioned, in step 3, adding the password property nifi.service.<controller-service-name>.password.
I also looked at the /opt/kylo/kylo-services/conf directory, and there is a datasource-definitions.json file. Do I need to add a Postgres entry there?

http://stackoverflow.com/questions/42117944/how-to-add-a-database-source-to-kylo
Murali


Hi Murali,

The link you posted contains older instructions but should still work for what you're trying to do. You've already done step 2 and still need to do steps 1, 3, and 4. Please try it and let me know how it goes.

Greg Hart

May 10, 2017, 7:06:25 PM
to Kylo Community
I am getting the following error:
org.apache.nifi.reporting.InitializationException:can't load  Database Driver
My Postgres jar is in the /opt/nifi/mysql directory, and the file permissions are 755 (owner: nifi, group: users).
Murali 
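
One way to narrow down a "can't load Database Driver" error is to check whether the jar at the configured location really contains the configured driver class. A jar is a zip archive, so this can be done without Java. The sketch below is only a diagnostic aid, and the function name `jar_has_class` is made up for illustration.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar contains the compiled class, e.g. org.postgresql.Driver."""
    # org.postgresql.Driver is stored as org/postgresql/Driver.class
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, `jar_has_class("/opt/nifi/mysql/postgresql-42.0.0.jre6.jar", "org.postgresql.Driver")` should return True for an intact PostgreSQL driver jar. A `FileNotFoundError` or `zipfile.BadZipFile` here would point to a wrong path or a corrupted download rather than a NiFi problem.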

Hi Murali,

This looks like an Apache NiFi error. Please try asking in the Apache NiFi mailing lists or contacting Think Big for additional support options. If you have any more questions about Kylo, I'd be happy to try and answer them.

Thanks!